import time

This notebook is based on David Beazley’s excellent screencast from 2019.

We’ll start by creating some basic loops.
def countdown(n):
    while n > 0:
        print(n)
        time.sleep(1)
        n -= 1

countdown(5)

5
4
3
2
1
def countup(n):
    x = 0
    while x < n:
        print(x)
        time.sleep(1)
        x += 1

countdown(3)
countup(3)

3
2
1
0
1
2
Now let’s look at how this works with Python threads.
import threading

t1 = threading.Thread(target=countdown, args=(5,))
t2 = threading.Thread(target=countup, args=(5,))
t1.start()
t2.start()
t1.join()
t2.join()

5
0
4
1
3
2
2
3
1
4
One problem with concurrency in Python is the GIL, which allows only one thread to execute Python bytecode at a time. But that’s a compute-bound concern: if you have 4 CPU cores and want to crunch numbers on all of them in parallel, the GIL is the problem.
But let’s think about I/O constraints instead - like holding open 5,000 network connections.
The GIL gets released during blocking I/O calls. When a thread calls time.sleep(), reads from a socket, or waits on a database response, CPython releases the GIL, which means another thread can grab it and run while the first one is blocked waiting for a response.
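A quick way to see this (an illustrative sketch, not part of the original notebook): start two threads that each block in time.sleep(1). If the GIL serialized them, the total wall time would be about 2 seconds; because sleep releases the GIL, it's about 1.

```python
import threading
import time

def wait_one_second():
    time.sleep(1)  # releases the GIL while blocked

start = time.monotonic()
threads = [threading.Thread(target=wait_one_second) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start
print(f"{elapsed:.2f}s")  # ~1s, not ~2s: the two sleeps overlapped
```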
So for I/O-heavy workloads, Python threads actually do achieve real concurrency: the GIL isn’t the bottleneck. The bottleneck is OS overhead.
Python threads are real OS threads: not a Python abstraction, but threads created via pthread_create (which on Linux ends in a clone syscall). The OS allocates each thread its own stack (typically 1-8MB reserved), registers it with the kernel, and saves/restores its state on every context switch. With 5,000 threads (for our network connections example) and an 8MB default stack, that’s roughly 40GB of reserved stack address space.
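You can poke at the stack reservation from Python (a hedged sketch: the minimum size and alignment rules are platform-dependent, so this may raise ValueError on some systems):

```python
import threading

# 0 means "use the platform default" (often 8 MB on Linux).
default = threading.stack_size()

# Request a 64 KiB stack for threads created from here on.
# The docs require at least 32 KiB when a nonzero size is given.
threading.stack_size(64 * 1024)

ran = []
t = threading.Thread(target=lambda: ran.append(True))
t.start()
t.join()
print(ran)  # [True] -- the thread ran fine on a much smaller stack

threading.stack_size(default)  # restore the default for later threads
```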
So, if threads are mostly just waiting, do we really need the OS to manage them? Can we track 5,000 waiting connections ourselves, in userspace, and just check which ones have data ready?
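The standard-library selectors module is one way to do exactly that kind of userspace bookkeeping. A minimal sketch (the socketpair here is a stand-in for a real network connection, not part of the original notebook):

```python
import selectors
import socket

# Register connections with a selector, then ask which ones are ready,
# instead of parking one OS thread per connection.
sel = selectors.DefaultSelector()
a, b = socket.socketpair()
sel.register(a, selectors.EVENT_READ)

b.send(b"ping")  # make one side readable

# select() returns only the registered sockets that have data waiting.
events = sel.select(timeout=1)
for key, mask in events:
    data = key.fileobj.recv(1024)
    print(data)  # b'ping'

sel.unregister(a)
a.close()
b.close()
sel.close()
```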
# Problem: how to achieve concurrency without threads?
# Issue: figure out how to switch between tasks

# So the problem with this:
def countdown(n):
    while n > 0:
        print(n)
        time.sleep(1)
        n -= 1

def countup(n):
    x = 0
    while x < n:
        print(x)
        time.sleep(1)
        x += 1
# If we don't want to create full OS threads with their own stack
# and execution state, then we can't have two Python while loops
# switching on and off. We need to restructure this.

from collections import deque
class Scheduler:
    def __init__(self):
        self.ready = deque()  # Functions ready to execute

    def call_soon(self, func):
        self.ready.append(func)

    def run(self):
        while self.ready:
            func = self.ready.popleft()
            func()
sched = Scheduler()

def countdown(n):
    if n > 0:
        print(n)
        time.sleep(1)
        # we need a lambda because our Scheduler
        # expects a function with no args.
        sched.call_soon(lambda: countdown(n-1))

sched.call_soon(lambda: countdown(5))
sched.run()

5
4
3
2
1
So basically we’ve created something like a recursive function call, except each step runs from the scheduler’s loop rather than on a growing call stack (a trampoline).
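One way to see that this isn't ordinary recursion (a sketch, with the printing and sleeping stripped out): push the same pattern far past Python's default recursion limit of 1,000. A directly recursive countdown would raise RecursionError here; the scheduled version never deepens the stack, because each call returns before the scheduler starts the next one.

```python
from collections import deque

class Scheduler:
    def __init__(self):
        self.ready = deque()

    def call_soon(self, func):
        self.ready.append(func)

    def run(self):
        while self.ready:
            self.ready.popleft()()

sched = Scheduler()
steps = []

def countdown(n):
    if n > 0:
        steps.append(n)
        sched.call_soon(lambda: countdown(n - 1))  # scheduled, not called

sched.call_soon(lambda: countdown(5000))  # 5x the default recursion limit
sched.run()
print(len(steps))  # 5000 steps, no RecursionError
```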
# the default argument here is kind of a hack;
# you could probably write some internal helper
# function to avoid doing this
def countup(stop, x=0):
    if x < stop:
        print(x)
        time.sleep(1)
        sched.call_soon(lambda: countup(stop, x+1))

sched.call_soon(lambda: countdown(5))
sched.call_soon(lambda: countup(5))
sched.run()

5
0
4
1
3
2
2
3
1
4