Learning to create Python multi-threaded and multi-process applications
This article is not aimed at experienced Python developers. It is a brief overview of Python’s multi-threading features for those who have recently started learning the language.
Unfortunately, there is not much material about multithreading in Python. Moreover, I quite often meet Python beginners who have never heard of the GIL, for example. In this article, I want to cover the most basic multi-threading features of Python, explain what the GIL is, and show how to work with it and around it.
Python is an excellent programming language that combines many programming paradigms. Most of the tasks a developer faces can be solved easily, elegantly, and concisely in Python. For most of those tasks, a single-threaded solution is enough. Single-threaded programs are usually predictable and easy to debug. The same cannot be said of multi-threaded and multi-process Python applications.
Python Multi-threaded application
Python has a threading module that includes everything needed for multi-threaded programming: various kinds of locks, a semaphore, and an event mechanism. In other words, it covers the needs of the vast majority of multi-threaded applications. Moreover, all of these tools are extremely easy to use. To demonstrate, let’s look at an application that runs two threads. The first thread prints “0” ten times, the second prints “1” ten times, and they do so strictly in turn.
    import threading

    def writer(x, event_for_wait, event_for_set):
        for i in range(10):
            event_for_wait.wait()   # wait for our turn
            event_for_wait.clear()  # reset the event for the next round
            print(x)
            event_for_set.set()     # signal the neighbor thread

    # init events
    e1 = threading.Event()
    e2 = threading.Event()

    # init threads
    t1 = threading.Thread(target=writer, args=(0, e1, e2))
    t2 = threading.Thread(target=writer, args=(1, e2, e1))

    # start threads
    t1.start()
    t2.start()

    e1.set()  # trigger the first event

    # wait for both threads to finish
    t1.join()
    t2.join()
No magic, no voodoo code. As you can see, the code is neat and consistent. We created each thread from a function, which is very convenient for small tasks. The code is also quite flexible. Suppose, for example, that we add one more thread that prints “2”. We then get the following:
    import threading

    def writer(x, event_for_wait, event_for_set):
        for i in range(10):
            event_for_wait.wait()   # wait for our turn
            event_for_wait.clear()  # reset the event for the next round
            print(x)
            event_for_set.set()     # signal the neighbor thread

    # init events
    e1 = threading.Event()
    e2 = threading.Event()
    e3 = threading.Event()

    # init threads
    t1 = threading.Thread(target=writer, args=(0, e1, e2))
    t2 = threading.Thread(target=writer, args=(1, e2, e3))
    t3 = threading.Thread(target=writer, args=(2, e3, e1))

    # start threads
    t1.start()
    t2.start()
    t3.start()

    e1.set()  # trigger the first event

    # wait for all threads to finish
    t1.join()
    t2.join()
    t3.join()
Here, we have added a new event and a new thread, and changed the parameters passed to the threads at start-up (a more general solution is also possible, for example with the help of MapReduce, but that is beyond the scope of this article). As you can see, there is nothing complicated or magical here. Everything is simple and clear. Let’s move on.
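Before moving on, here is one way the pattern might be generalized to any number of threads. This is only a minimal sketch: it chains the events in a ring rather than using MapReduce, and `run_ring` and the `out` list (used instead of printing, so that the order can be inspected) are names introduced purely for illustration.

```python
import threading


def writer(x, event_for_wait, event_for_set, rounds, out):
    """Append x to out `rounds` times, each time waiting for our turn."""
    for _ in range(rounds):
        event_for_wait.wait()    # wait for our turn
        event_for_wait.clear()   # reset the event for the next round
        out.append(x)
        event_for_set.set()      # pass the turn to the next thread


def run_ring(n_threads, rounds=10):
    """Run n_threads writers that take turns in a fixed circular order."""
    events = [threading.Event() for _ in range(n_threads)]
    out = []
    threads = [
        threading.Thread(
            target=writer,
            # thread i waits on its own event and signals the next one in the ring
            args=(i, events[i], events[(i + 1) % n_threads], rounds, out),
        )
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    events[0].set()              # trigger the first event
    for t in threads:
        t.join()
    return out


if __name__ == "__main__":
    print(run_ring(3, rounds=2))
```

Calling `run_ring(3, rounds=2)` returns `[0, 1, 2, 0, 1, 2]`, because only one event is set at any given time.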
Global Interpreter Lock
There are two common reasons for using threads. The first is to make better use of modern multi-core processors and thereby improve application performance. The second is to split the application logic into parallel sections that are fully or partly asynchronous (for example, when you need to ping several servers simultaneously).
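The second use case might look roughly like the sketch below. Note the assumptions: `fake_ping` is a hypothetical function that uses `time.sleep` as a stand-in for a real network call, and the host names are placeholders.

```python
import threading
import time


def fake_ping(host, delay, results):
    """Hypothetical stand-in for a network ping: wait, then record a result."""
    time.sleep(delay)            # a real implementation would wait on the network here
    results[host] = "ok"


def ping_all(hosts, delay=0.3):
    """Ping all hosts concurrently; return the results and the elapsed time."""
    results = {}
    threads = [
        threading.Thread(target=fake_ping, args=(host, delay, results))
        for host in hosts
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    hosts = ["a.example", "b.example", "c.example", "d.example"]
    results, elapsed = ping_all(hosts)
    print(results)
    print("elapsed: %.2f s" % elapsed)
```

Because the threads spend their time waiting rather than computing, the total time should be close to a single delay rather than the sum of all delays.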
As for the first reason, we run into a limitation of Python (more precisely, of CPython, the reference implementation) called the Global Interpreter Lock (GIL). The GIL means that at any given moment only one thread can execute Python code. It was introduced to keep threads from competing for the same variables. The thread that is currently running has access to the whole environment. This feature of the Python thread implementation significantly simplifies working with threads and provides a certain degree of thread safety.
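That thread safety has limits. The GIL does not make compound operations such as `count += 1` atomic, so shared state that several threads modify is still best protected with one of the locks from the threading module. The following is a minimal sketch; `add_many` and `count_with_threads` are illustrative names, not part of any library.

```python
import threading


def add_many(state, lock, times):
    """Increment the shared counter `times` times, holding the lock each time."""
    for _ in range(times):
        with lock:               # only one thread at a time may update the counter
            state["count"] += 1


def count_with_threads(n_threads=4, times=10_000):
    """Increment one counter from several threads and return the final value."""
    state = {"count": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=add_many, args=(state, lock, times))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"]


if __name__ == "__main__":
    print(count_with_threads())
```

With the lock in place, the result is always `n_threads * times`.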
There is a catch, though. It may seem that a multi-threaded Python application should take exactly as long as a single-threaded one doing the same work. In practice, you run into an unpleasant issue. Consider the following code to see what I mean:
    with open('test1.txt', 'w') as fout:
        for i in range(1000000):
            print(1, file=fout)

This application just writes a million lines containing “1” and takes about 0.35 s on my local machine. Now, let’s consider another program for comparison:

    from threading import Thread

    def writer(filename, n):
        with open(filename, 'w') as fout:
            for i in range(n):
                print(1, file=fout)

    t1 = Thread(target=writer, args=('test2.txt', 500000))
    t2 = Thread(target=writer, args=('test3.txt', 500000))

    t1.start()
    t2.start()

    t1.join()
    t2.join()
The second application creates two threads. Each thread writes half a million lines containing “1” to its own file. The total amount of work is the same, yet the timing shows an interesting effect: the running time varies from 0.7 seconds to 7 seconds. What is the reason?
This happens because a thread releases the GIL when it does not need the CPU (for example, while it waits for I/O). At that moment, both threads may try to acquire it. Meanwhile, the operating system knows that there are several cores, and it can make the situation worse by trying to distribute the threads among them.
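To observe the effect on your own machine, a small timing harness along the following lines may help. It is a sketch under stated assumptions: it replaces the file-writing workload with a purely CPU-bound countdown loop, and `count_down`, `timed`, `run_sequential`, and `run_threaded` are names made up for this example. The actual numbers depend on the interpreter version, the operating system, and the hardware, so no expected output is given.

```python
import threading
import time


def count_down(n):
    """A purely CPU-bound task: count down from n to zero."""
    while n > 0:
        n -= 1


def timed(fn):
    """Return the wall-clock time in seconds taken by calling fn()."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def run_sequential(n):
    """Do all the work in the calling thread."""
    count_down(n)


def run_threaded(n):
    """Split the same amount of work between two threads."""
    half = n // 2
    threads = [threading.Thread(target=count_down, args=(half,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    n = 2_000_000
    print("sequential: %.3f s" % timed(lambda: run_sequential(n)))
    print("threaded:   %.3f s" % timed(lambda: run_threaded(n)))
```

On an interpreter with a GIL, the threaded variant is not expected to be faster than the sequential one, since only one thread executes Python code at a time.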
Update: starting with Python 3.2, the GIL has an improved implementation, and this problem is partially solved. The idea is that a thread that has just lost control waits for a short period before it can acquire the GIL again.
“So is it extremely difficult to write an efficient multi-threaded application in Python?” you may ask. Keep calm, there is always a solution.
Python Multi-process applications
To work around the problem described in the previous section, Python provides the subprocess module. We can write a program that is meant to run in parallel and launch several copies of it from separate threads of another application. This can significantly increase performance, because the threads that are subject to the GIL only wait for the launched processes to finish. However, this approach has many problems of its own. The main one is that it becomes difficult to transfer data between the processes: we have to serialize objects and set up communication through pipes or other tools. This adds overhead and makes the code hard to understand.
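To make the idea and its cost concrete, here is a minimal sketch. It assumes that the child program is a one-line Python script passed with `-c` and that data is exchanged as JSON over the standard input and output pipes; `CHILD_CODE`, `run_child`, and `sum_squares_parallel` are illustrative names.

```python
import json
import subprocess
import sys
import threading

# The child program: read a JSON list from stdin, print the sum of squares as JSON.
CHILD_CODE = (
    "import json, sys; "
    "nums = json.load(sys.stdin); "
    "print(json.dumps(sum(n * n for n in nums)))"
)


def run_child(numbers, results, index):
    """Launch one child interpreter and store its answer in results[index]."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=json.dumps(numbers),   # serialize the input and send it through a pipe
        capture_output=True,
        text=True,
        check=True,
    )
    results[index] = json.loads(completed.stdout)


def sum_squares_parallel(chunks):
    """Process each chunk in its own child process, one waiting thread per child."""
    results = [None] * len(chunks)
    threads = [
        threading.Thread(target=run_child, args=(chunk, results, i))
        for i, chunk in enumerate(chunks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(sum_squares_parallel([[1, 2, 3], [4, 5]]))
```

Even in this tiny example, a noticeable share of the code deals with serialization and pipes rather than with the computation itself.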
That is why another approach is useful. Python also has the multiprocessing module, whose interface is quite similar to that of threading. For example, processes can be created from functions in the same way, and the methods for working with them are almost the same as for threads. However, synchronizing processes and sharing data between them requires other tools, namely queues and pipes. Counterparts of the locks, events, and semaphores from threading are available here as well.
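The following sketch shows how processes and queues might fit together. The `worker` and `square_all` functions are hypothetical names, and the use of `None` as a "no more work" marker is just one common convention.

```python
import multiprocessing


def worker(tasks, results):
    """Square numbers taken from `tasks` until the None sentinel arrives."""
    for n in iter(tasks.get, None):
        results.put((n, n * n))


def square_all(numbers, n_workers=2):
    """Distribute the numbers among worker processes and collect the squares."""
    tasks = multiprocessing.Queue()
    results = multiprocessing.Queue()
    workers = [
        multiprocessing.Process(target=worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    for n in numbers:
        tasks.put(n)
    for _ in workers:
        tasks.put(None)          # one sentinel per worker
    # Collect all results before joining, so the workers can flush their queues.
    squares = dict(results.get() for _ in numbers)
    for w in workers:
        w.join()
    return squares


if __name__ == "__main__":
    print(square_all([1, 2, 3, 4]))
```

The objects placed on a queue are pickled behind the scenes, so the serialization mentioned earlier still happens, but the module takes care of it.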
In addition, the multiprocessing module provides a shared memory mechanism: the Value and Array classes, whose instances can be shared between processes. For more convenient work with shared variables, you can use the Manager classes. They are more flexible and convenient, but slower.
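As an illustration of shared memory, the sketch below increments a single integer `Value` from several processes. The function names are made up for this example; the type code `"i"` denotes a signed integer.

```python
import multiprocessing


def add_many(counter, times):
    """Increment the shared counter `times` times under its built-in lock."""
    for _ in range(times):
        with counter.get_lock():     # the increment itself is not atomic
            counter.value += 1


def count_in_processes(n_processes=4, times=1000):
    """Increment one shared integer from several processes."""
    counter = multiprocessing.Value("i", 0)   # shared signed int, initially 0
    processes = [
        multiprocessing.Process(target=add_many, args=(counter, times))
        for _ in range(n_processes)
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    return counter.value


if __name__ == "__main__":
    print(count_in_processes())
```

The shared object is passed to the child processes as an argument at creation time, which works regardless of how the platform starts new processes.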
Furthermore, the multiprocessing module lets you create pools of processes. This mechanism is very convenient for implementing the master-worker pattern or a parallel map.
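A parallel map over a pool of processes can be sketched in a few lines. Here `square` and `parallel_squares` are illustrative names, and the pool size of two is an arbitrary choice.

```python
import multiprocessing


def square(n):
    """The work done by each worker for a single item."""
    return n * n


def parallel_squares(numbers, processes=2):
    """Apply `square` to every number using a pool of worker processes."""
    with multiprocessing.Pool(processes) as pool:
        return pool.map(square, numbers)   # results come back in input order


if __name__ == "__main__":
    print(parallel_squares(range(10)))
```

The function passed to `pool.map` has to be defined at module level so that the worker processes can find it.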
Among the main problems of working with multiprocessing, I should point out that the module is somewhat platform-dependent. Because operating systems handle processes differently, the code is subject to several restrictions. For example, Windows has no fork mechanism. Therefore, the point where the processes are created has to be wrapped in the following:
    if __name__ == '__main__':
In any case, this construct is good coding style.
What’s more …
There are other libraries and approaches for building parallel applications in Python. For example, you can use Hadoop with Python, or various MPI bindings for Python (pyMPI, mpi4py). It is even possible to use wrappers around existing C++ and Fortran libraries. Frameworks and libraries such as Pyro, Twisted, and Tornado are also worth mentioning. However, all of these are topics for other articles.