Taking a break from my posts and reading on thermodynamic computing to earmark here a few of the many interesting things going on in the ML/AI world.
Kolmogorov-Arnold networks (KAN)
Last week there was a preprint about a new deep learning architecture called Kolmogorov-Arnold networks (KANs) [1]. These are in contrast to multi-layer perceptrons (MLPs). What is really interesting is that instead of learning linear weights on the edges that are then passed to a fixed non-linear activation function at the nodes, here the non-linear functions themselves live on the edges and are learned (the paper parameterizes them as B-splines), while the nodes simply sum their inputs.
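To make that concrete, below is a minimal NumPy/SciPy sketch of what a single KAN layer does under my reading: every edge carries its own learnable univariate spline, and nodes only sum. This is purely illustrative; the names (`make_edge_fn`, `kan_layer`) and the clamped-knot construction are my own assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.interpolate import BSpline

# Illustrative sketch (my naming, not the paper's code): each edge
# (input p -> output q) of a KAN layer carries a learnable univariate
# function phi_{q,p}, here a cubic B-spline whose coefficients would be
# the trainable parameters. Nodes do nothing but sum incoming edges.

def make_edge_fn(coeffs, lo=-1.0, hi=1.0, k=3):
    """phi(x) = sum_i c_i B_i(x) on a clamped uniform knot grid."""
    n = len(coeffs)
    t = np.concatenate([np.full(k, lo),
                        np.linspace(lo, hi, n - k + 1),
                        np.full(k, hi)])  # len(t) = n + k + 1
    return BSpline(t, np.asarray(coeffs, dtype=float), k, extrapolate=True)

def kan_layer(x, edge_coeffs):
    """x: (n_in,) inputs; edge_coeffs: (n_out, n_in, n_coef).
    Returns (n_out,) with out[q] = sum_p phi_{q,p}(x[p])."""
    n_out, n_in, _ = edge_coeffs.shape
    out = np.zeros(n_out)
    for q in range(n_out):
        for p in range(n_in):
            out[q] += make_edge_fn(edge_coeffs[q, p])(x[p])
    return out

rng = np.random.default_rng(0)
coeffs = rng.normal(size=(3, 2, 8))  # 2 inputs -> 3 outputs, 8 coeffs/edge
print(kan_layer(np.array([0.3, -0.5]), coeffs))
```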
I didn't state why this should work: it turns out KANs are based on the Kolmogorov-Arnold representation theorem, which is a convenient way to represent any continuous real-valued multivariate function on a bounded domain as a composition of sums of univariate functions.
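For reference, the theorem states that any continuous function of $n$ variables on a bounded domain can be written as (the standard statement, as used in [1]):

$$
f(x_1, \ldots, x_n) = \sum_{q=1}^{2n+1} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),
$$

where each inner $\phi_{q,p}$ and outer $\Phi_q$ is a continuous univariate function.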
This is my best understanding of the KAN paper, but I've barely scratched the surface of the 48 pages. The other thing is that the function composition in the equation above isn't used as-is; the paper generalizes it by stacking KAN layers of arbitrary width and depth.
If I am getting this wrong above, please drop a comment. The graphical abstract conveys what I tried to describe above better than I can:
Abstract graphic for KANs from [1].
The thing that interests me is that KANs actually seem more intuitive. The network learns the representative non-linear functions that capture characteristics of the data. I guess MLPs do this as well, just less effectively? At least that's what the paper suggests: it shows much better scaling laws for KANs compared to MLPs.
I don't think I'll have much time to focus more on this, but I am very curious about the use of KANs over gated MLPs in things like graph neural networks. The authors of the preprint put together a very nice and usable Python package, pykan.
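For anyone who wants to poke at it, here is roughly the "hello world" from the pykan README at the time of writing; the exact API (e.g., `model.train` vs. `model.fit`, or the `create_dataset` helper) may have changed in later releases.

```python
# Roughly the quick-start from the pykan README at the time of writing;
# the API may have changed since (e.g., model.train vs. model.fit).
import torch
from kan import KAN, create_dataset

# Toy target from the paper: f(x, y) = exp(sin(pi*x) + y^2)
f = lambda x: torch.exp(torch.sin(torch.pi * x[:, [0]]) + x[:, [1]] ** 2)
dataset = create_dataset(f, n_var=2)

# width=[2, 5, 1]: 2 inputs, 5 hidden nodes, 1 output;
# grid=5 spline intervals per edge, k=3 for cubic splines.
model = KAN(width=[2, 5, 1], grid=5, k=3)
model.train(dataset, opt="LBFGS", steps=20)
```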
Update 16 May, 2024
The author of ref. [1] posted a review video of the paper.
AlphaFold 3
Now onto the field of computational biochemistry and molecular biology. Google DeepMind and Isomorphic Labs just published their research results for AlphaFold 3 [2]. Basically, the model predicts the structure of biomolecular complexes, proteins together with nucleic acids, small-molecule ligands, and so on, which gives them a computational means to develop and discover new drugs as well as to address other life-science challenges. Google is probably sitting on a very lucrative resource, so kudos to them for investing in it.
The thing that's a little bit off is how all these big tech companies are doing research and publishing in science journals, yet no one has access to or knowledge of exactly how these models work and are trained. It's not bad that they want to keep the information internal, but then they probably shouldn't publish. I should write a blog post on how big-tech scientists are now in the business of publishing papers; just a new kind of era, I guess.
I should mention that they did make a compute server available where non-commercial users can run queries against AlphaFold 3.
References
[1] Z. Liu, Y. Wang, S. Vaidya, F. Ruehle, J. Halverson, M. Soljačić, T.Y. Hou, M. Tegmark, KAN: Kolmogorov-Arnold Networks, (2024). arXiv:2404.19756.
[2] Google DeepMind and Isomorphic Labs, Accurate structure prediction of biomolecular interactions with AlphaFold 3, Nature (2024) 1–3. https://doi.org/10.1038/s41586-024-07487-w.
@misc{Bringuier_9MAY2024,
title = {KANs and AlphaFold 3},
author = {Bringuier, Stefan},
year = 2024,
month = may,
url = {https://www.diracs-student.blog/2024/05/kans-and-alphafold-3.html},
note = {Accessed: 2025-07-30},
howpublished = {Dirac's Student [Blog]},
}