Which programming language should you pick for your machine learning or deep learning project? These are your best options.
AI (artificial intelligence) opens up a world of possibilities for application developers. By taking advantage of machine learning or deep learning, you could produce far better user profiles, personalisation, and recommendations, or incorporate smarter search, a voice interface, or intelligent assistance, or improve your app in any number of other ways. You could even build applications that see, hear, and react.
Which programming language should you learn to plumb the depths of AI? You’ll want a language with many good machine learning and deep learning libraries, of course. It should also feature good runtime performance, good tools support, a large community of programmers, and a healthy ecosystem of supporting packages. That still leaves plenty of good options.
Here are my picks for the five best programming languages for AI development, along with three honorable mentions. Some of these languages are on the rise, while others seem to be slipping. Come back in a few months, and you might find these rankings have changed.
1. Python

At number one, it’s Python. How could it be anything else, really? While there are maddening things about Python—the significant whitespace, the massive split between Python 2.x and Python 3.x, the five different packaging systems that are all broken in different ways—if you’re doing AI work, you will almost certainly be using Python at some point.
The libraries available in Python are pretty much unparalleled in other languages. NumPy has become so ubiquitous it is almost a standard API for tensor operations, and Pandas brings R’s powerful and flexible dataframes to Python. For natural language processing (NLP), you have the venerable NLTK and the blazingly fast spaCy. For machine learning, there is the battle-tested scikit-learn. And when it comes to deep learning, all of the current libraries (TensorFlow, PyTorch, Chainer, Apache MXNet, Theano, etc.) are effectively Python-first projects.
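To give a flavor of how naturally these libraries compose, here is a minimal sketch (the data and column names are invented for illustration) that normalizes a small NumPy array and then wraps it in a pandas dataframe for analysis:

```python
import numpy as np
import pandas as pd

# NumPy: vectorized tensor operations on a small batch of data
features = np.array([[1.0, 2.0],
                     [3.0, 4.0],
                     [5.0, 6.0]])

# Standardize each column to zero mean and unit variance
normalized = (features - features.mean(axis=0)) / features.std(axis=0)

# pandas: wrap the same array in an R-style dataframe
df = pd.DataFrame(normalized, columns=["height", "weight"])

# Dataframe columns support vectorized comparisons, just like in R
df["above_average"] = df["height"] > 0

print(df)
```

The point is less the arithmetic than the interoperability: the NumPy array flows straight into the dataframe, and the same array-at-a-time idiom carries over to scikit-learn estimators and deep learning frameworks.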
If you’re reading cutting-edge deep learning research on arXiv, you will almost certainly find source code in Python. Then there are the other parts of the Python ecosystem. While IPython has evolved into Jupyter Notebook and become less Python-centric, you will still find that most Jupyter Notebook users, and most of the notebooks shared online, use Python.
There’s no getting around it. Python is the language at the forefront of AI research, the one you’ll find the most machine learning and deep learning frameworks for, and the one that almost everybody in the AI world speaks. For these reasons, Python is first among AI programming languages, despite the fact that your author curses the whitespace issues at least once a day.
2. Java and friends
The JVM family of languages (Java, Scala, Kotlin, Clojure, etc.) is also a great choice for AI application development. You have a wealth of libraries available for all parts of the pipeline, whether it’s natural language processing (CoreNLP), tensor operations (ND4J), or a full GPU-accelerated deep learning stack (DL4J). Plus you get easy access to big data platforms like Apache Spark and Apache Hadoop.
Java is the lingua franca of most enterprises, and with the new language constructs available in Java 8 and Java 9, writing Java code is not the hateful experience many of us remember. Writing an AI application in Java may feel a touch boring, but it can get the job done—and you can use all your existing Java infrastructure for development, deployment, and monitoring.
3. C/C++

C/C++ is unlikely to be your first choice when developing an AI application, but if you’re working in an embedded environment, and you can’t afford the overhead of a Java Virtual Machine or a Python interpreter, C/C++ is the answer. When you need to wring every last bit of performance from the system, then you need to head back to the terrifying world of pointers.
Thankfully, modern C/C++ can be pleasant to write (honest!). You have a choice of approaches. You can either dive in at the bottom of the stack, using libraries like CUDA to write your own code that runs directly on your GPU, or you can use TensorFlow or Caffe to obtain access to flexible high-level APIs. The latter also allow you to import models that your data scientists may have built with Python and then run them in production with all the speed that C/C++ offers.
Keep an eye out for what Rust does in this space in the year to come. Combining the speed of C/C++ with type and data safety, Rust is a great choice for achieving production performance without creating security headaches. And a TensorFlow binding is available for it already.
5. R

R comes in at the bottom of our top five, and is trending downward. R is the language that data scientists love. However, other programmers find R a little confusing when they first encounter it, due to its dataframe-centric approach. If you have a dedicated group of R developers, then it can make sense to use the integrations with TensorFlow, Keras, or H2O for research, prototyping, and experimentation, but I hesitate to recommend R for production usage, due to performance and operational concerns. While you can write performant R code that can be deployed on production servers, it will almost certainly be easier to take that R prototype and recode it in Java or Python.
Other AI programming options
A few years ago, Lua was riding high in the world of artificial intelligence. With the Torch framework, Lua was one of the most popular languages for deep learning development, and you’ll still come across a lot of historical deep learning work on GitHub that defines models with Lua/Torch. I think it’s a good idea to have a passing familiarity with Lua for the purposes of research and looking over people’s previous work. But with the arrival of frameworks like TensorFlow and PyTorch, the use of Lua has dropped off considerably.
Julia is a high-performance programming language that is focused on numerical computing, which makes it a good fit in the math-heavy world of AI. While it’s not all that popular as a language choice right now, wrappers like TensorFlow.jl and Mocha (heavily influenced by Caffe) provide good deep learning support. If you don’t mind that there isn’t a huge ecosystem out there just yet, Julia is worth a look for its focus on making high-performance calculations easy and swift.
As we were going to press, Chris Lattner, creator of the LLVM compiler and the Swift programming language, announced Swift for TensorFlow, a project that promises to combine the ease-of-use that Python provides with the speed and static type checking of a compiled language. As a bonus, Swift for TensorFlow also allows you to import Python libraries such as NumPy and use them in your Swift code almost as you would with any other library.
Swift for TensorFlow is in a very early stage of development right now, but being able to write modern programming constructs and get compile-time guarantees of speed and safety is a tantalizing prospect. Even if you don’t go out and learn Swift just yet, I would recommend that you keep an eye on this project.
Read the original article at InfoWorld.