Machine learning is a data analysis method that automates the construction of analytical models. It is a subset of Artificial Intelligence, based on the idea that systems can learn from data, identify patterns on their own, and make decisions with minimal human intervention. In today’s article we look at which Machine Learning libraries are the most popular in the world.
Most Popular Machine Learning Libraries – 2014/2019
While it is difficult to calculate precisely which machine learning libraries are the most used in the world, in this first part of the article we analyze some data. As a measure we use the popularity of the libraries on GitHub, calculated from the number of stars of each repository as exported from GitHub Archive.
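As a rough illustration of how a star-based ranking can be derived, the sketch below counts star events per repository and sorts the result. It assumes that each star shows up in the GitHub Archive export as an event of type `WatchEvent`, with the repository name stored under `repo.name`. The `count_stars` helper and the sample events are invented for this example and are not the code or data behind the video.

```python
from collections import Counter


def count_stars(events):
    """Count star events per repository.

    Assumes each event is a dict shaped like a GitHub Archive record,
    where starring a repository is logged with type "WatchEvent".
    """
    stars = Counter()
    for event in events:
        if event.get("type") == "WatchEvent":
            stars[event["repo"]["name"]] += 1
    return stars


# Hypothetical sample events, not real GitHub Archive data.
sample_events = [
    {"type": "WatchEvent", "repo": {"name": "tensorflow/tensorflow"}},
    {"type": "WatchEvent", "repo": {"name": "scikit-learn/scikit-learn"}},
    {"type": "PushEvent", "repo": {"name": "tensorflow/tensorflow"}},
    {"type": "WatchEvent", "repo": {"name": "tensorflow/tensorflow"}},
]

# most_common() returns (repository, stars) pairs sorted by star count.
ranking = count_stars(sample_events).most_common()
for position, (repo, stars) in enumerate(ranking, start=1):
    print(position, repo, stars)
```

Events of other types, such as the `PushEvent` in the sample, are ignored, so only stars contribute to the score.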
In December 2019, the most popular Machine Learning library according to GitHub data was TensorFlow, with a score of 141384. That is a very high figure compared to the second, Keras, and the third, scikit-learn. At the end of 2019 the first four positions were all held by libraries from the Python world: in fourth place we find PyTorch, with a value of 36434. From the fifth to the seventh position we find C++ libraries: Caffe, XGBoost and MXNet. Next comes another Python library, Fastai, with 17224. CNTK is in ninth place, and DarkNet, written in C, closes the ranking in tenth place. That is the picture for 2019. But if we turn the hands of time back to 2014, what was the situation a good seven years ago?
In January 2014 the most popular library, again according to GitHub, was scikit-learn, with a value of 1911. The first position therefore already belonged to Python, but in second place there was Vowpal Wabbit, a C++ library, with a value of 872. Among the top positions we also find Accord.NET (.NET), Torch7 (Lua) and Deeplearning4j (Java). It was only in late 2015 that TensorFlow overtook scikit-learn. The video shows how Python steadily gained popularity, to the point that at the end of 2019 its libraries held all of the top four positions.
Top 5 Machine Learning Frameworks To Use in 2021
TensorFlow is an open source, end-to-end platform for machine learning. It features a comprehensive and flexible ecosystem of tools, libraries and community resources that lets researchers advance the state of the art in ML and lets developers easily create and deploy ML-based applications.
Easy model building: easily create and train ML models using intuitive high-level APIs such as Keras with eager execution, which makes model iteration immediate and debugging easy. Source: https://www.tensorflow.org/
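To give an idea of what the high-level Keras API looks like in practice, here is a minimal sketch that fits a one-layer model to synthetic data. The data, layer size, learning rate and number of epochs are arbitrary choices made for this illustration, and the snippet assumes TensorFlow 2 is installed.

```python
import numpy as np
import tensorflow as tf

# TensorFlow 2 executes operations eagerly by default.
print("eager execution:", tf.executing_eagerly())

# Synthetic data for illustration only: learn y = 2x + 1.
x = np.linspace(-1.0, 1.0, 64, dtype="float32").reshape(-1, 1)
y = 2.0 * x + 1.0

# A single dense unit is enough to model a linear relationship.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1),
])
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.1),
    loss="mse",
)

# fit() runs the training loop and returns the per-epoch loss history.
history = model.fit(x, y, epochs=50, verbose=0)
print("first epoch loss:", history.history["loss"][0])
print("last epoch loss:", history.history["loss"][-1])
```

Defining, compiling and training the model takes only a few lines, and the loss history makes it easy to check that training is progressing.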
PyTorch is an open source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing, primarily developed by Facebook’s AI Research lab (FAIR). It is free and open-source software released under the Modified BSD license. Although the Python interface is more polished and the primary focus of development, PyTorch also has a C++ interface. Source: https://en.wikipedia.org/wiki/PyTorch
Scikit-learn (formerly scikits.learn and also known as sklearn) is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
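For a feel of the Python interface, the following minimal sketch trains a single linear layer with an explicit training loop. The synthetic data, learning rate and number of steps are assumptions made for this example only, and the snippet assumes PyTorch is installed.

```python
import torch

torch.manual_seed(0)  # make the random weight initialisation repeatable

# Synthetic data for illustration only: learn y = 3x - 2.
x = torch.linspace(-1.0, 1.0, 64).unsqueeze(1)
y = 3.0 * x - 2.0

model = torch.nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

losses = []
for _ in range(200):
    optimizer.zero_grad()        # clear gradients from the previous step
    loss = loss_fn(model(x), y)  # forward pass and loss computation
    loss.backward()              # backpropagate to compute gradients
    optimizer.step()             # update the weight and bias
    losses.append(loss.item())

print("first loss:", losses[0])
print("last loss:", losses[-1])
```

Unlike the Keras `fit` call, the loop is written out by hand here, which is the usual PyTorch style and makes each training step easy to inspect.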
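As a small illustration of two of the algorithm families mentioned above, the sketch below trains a random forest classifier and runs k-means on the Iris dataset bundled with scikit-learn. The dataset, split ratio and hyperparameters are arbitrary choices for this example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# The data comes back as plain NumPy arrays.
X, y = load_iris(return_X_y=True)
print(isinstance(X, np.ndarray), X.shape)

# Classification: hold out a quarter of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print("random forest accuracy:", accuracy)

# Clustering: group the same samples into three clusters, ignoring labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)
print("cluster sizes:", np.bincount(cluster_labels))
```

Both estimators follow the same `fit` pattern and work directly on NumPy arrays, which is what makes the library easy to combine with the rest of the scientific Python stack.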
Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance. Originally developed at the University of California, Berkeley’s AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since.
Torch is an open-source machine learning library, a scientific computing framework, and a scripting language based on the Lua programming language. It provides a wide range of algorithms for deep learning, and uses the scripting language LuaJIT with an underlying C implementation. As of 2018, Torch is no longer in active development. However, PyTorch, which is based on the Torch library, is actively developed as of December 2020. Source: https://en.wikipedia.org/wiki/Torch_(machine_learning)
Source and links
I used several sources for this article. The first is GitHub. I also consulted several articles available online, including:
https://towardsdatascience.com/best-python-libraries-for-machine-learning-and-deep-learning-b0bd40c7e8c
https://www.upgrad.com/blog/top-python-libraries-for-machine-learning/
https://www.activestate.com/blog/top-10-python-machine-learning-packages/
These are all very interesting articles that I recommend you read to get a better idea.
See the video here: Most Popular Machine Learning Libraries – 2014/2019: https://youtu.be/ZtOlJF_RQEY
Follow our channel for more videos: https://youtube.com/c/statisticsanddata
Visit the website for further information and articles: https://statisticsanddata.org/