Rust and Python
I have lately been working on coupled cluster Monte Carlo (1). Our pilot code is pure Python: easy to write, but the performance is not optimal. We are thus in the process of rewriting the most computationally intensive parts of the code in a compiled language.
Originally, that compiled language was going to be C++. pybind11 indeed works perfectly for writing Python extensions. Then I learned about Rust and decided to try that instead:
- I thought it would be a good idea to learn a new language;
- The memory safety guarantees and their potential benefits for parallel and concurrent programming are very appealing;
- There is no runtime library apart from GLIBC;
- The tooling is excellent: unobtrusive and shipped with the language.
Both Python and Rust can talk to C, so that’s ultimately what we’ll have to do to get our extension to work properly. Fortunately, we don’t have to go all the way to C. There are 2 tools that can help in writing Python extensions in Rust:
- Using CFFI. Slightly lower level and it can involve a lot of boilerplate.
- Using PyO3. Aims at doing what pybind11 (and Boost.Python before it) did for C++ extensions. It is more bleeding edge and what I chose to work with.
This presentation has more information on writing Python extensions in Rust.
In this post I’ll explain:
- How to set up a mixed-language development environment.
- How to deploy a manylinux1-compatible package using Travis CI for tagged releases.
The public example repository is on GitHub: robertodr/rustafarian.
Development environment
Rust dependencies
These are for developers only. First and foremost, we need the Rust toolchain. rustup to the rescue!
$ curl https://sh.rustup.rs -sSf | sh -s -- -y --no-modify-path --default-toolchain nightly
$ source "$HOME"/.cargo/env
Other dependencies will be specified in the Cargo.toml file and installed when we build the extension. Note that we chose nightly as the default Rust toolchain. As mentioned above, PyO3 is bleeding edge and uses features of the Rust language that are only available in nightly builds.
Python dependencies
These are both for final users (e.g., Click for the command-line interface) and for developers (e.g., pytest for testing). Preferably one should only install Python dependencies in virtual environments, never globally. I am partial towards Pipenv, but Conda could work as well.
Running pipenv install --dev with the following Pipfile:
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true
[dev-packages]
maturin = ">=0.7.6"
pytest = ">=4.0"
[packages]
click = ">=7.0"
will create the virtual environment and install all needed packages. With pipenv shell, you can jump into the virtual environment. The maturin package is essential: this is the tool that will orchestrate building the extension and packaging for PyPI.
Workflow
OK, now we have a working development environment! To build the extension:
- Make sure that the Rust toolchain is on PATH (source "$HOME"/.cargo/env, as above).
- Make sure that you are in the Python virtual environment (pipenv shell).
- Invoke the build tool (maturin develop).
The Cargo.toml lists the Rust dependencies of our project and these will be automatically downloaded and compiled if not already available.
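The first two checks in the list above are easy to get wrong when switching terminals, so here is a minimal sketch of how they could be scripted. The helper names `missing_tools` and `in_virtualenv` are my own for illustration and are not part of maturin or the example repository.

```python
import shutil
import sys


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # venv and recent virtualenv releases point sys.prefix at the environment
    # and keep the original installation in sys.base_prefix; older virtualenv
    # releases set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    absent = missing_tools(["cargo", "rustc", "maturin"])
    if absent:
        print("Not on PATH:", ", ".join(absent))
    if not in_virtualenv():
        print("Not inside a Python virtual environment")
```

Running the script before building prints nothing when the environment is ready.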
And DONE! maturin will compile the module and link it into a dynamic shared object (DSO) alongside the Python code resulting in the following directory tree:
rustafarian/
├── cli.py
├── __init__.py
├── module
│   ├── double.py
│   └── __init__.py
└── rustafarian.cpython-37m-x86_64-linux-gnu.so
We can thus run the tests with pytest, or import rustafarian and use it in an interactive Python session spawned in our development environment.
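The long file name of the DSO is not arbitrary: the interpreter only imports extension modules whose names end in one of its own suffixes. As a rough sketch, the following lists the files in a package directory that the running interpreter would accept as compiled extensions. The function name `find_extension_modules` is hypothetical, and the check looks at file names only.

```python
import importlib.machinery
from pathlib import Path


def find_extension_modules(package_dir):
    """List files in `package_dir` whose names end in one of the extension
    module suffixes recognised by the running interpreter."""
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    return sorted(
        path.name
        for path in Path(package_dir).iterdir()
        if path.is_file() and path.name.endswith(suffixes)
    )


if __name__ == "__main__":
    # Assumes the script is run from the root of the example repository.
    print(find_extension_modules("rustafarian"))
```

If the list comes back empty after a build, the extension was most likely compiled for a different interpreter than the one in the virtual environment.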
Running ldd on the DSO shows that there is no Rust runtime to link to:
$ ldd rustafarian/rustafarian.cpython-37m-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffe979ac000)
libdl.so.2 => /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libdl.so.2 (0x00007efd23df5000)
librt.so.1 => /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/librt.so.1 (0x00007efd23deb000)
libpthread.so.0 => /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libpthread.so.0 (0x00007efd23dca000)
libgcc_s.so.1 => /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libgcc_s.so.1 (0x00007efd23bb4000)
libc.so.6 => /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6 (0x00007efd239fe000)
/nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib64/ld-linux-x86-64.so.2 (0x00007efd23e63000)
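Reading ldd output by eye gets tedious once there are several wheels to check. Below is a small sketch that extracts the library names from such output and flags anything outside an allow-list. The names `linked_libraries`, `unexpected_libraries` and `EXPECTED` are mine, and the allow-list is simply the set of libraries that appear in the output above, not an official policy.

```python
# Libraries seen in the ldd output above; an assumption for this sketch,
# not an authoritative list.
EXPECTED = {
    "linux-vdso.so.1",
    "libdl.so.2",
    "librt.so.1",
    "libpthread.so.0",
    "libgcc_s.so.1",
    "libc.so.6",
    "ld-linux-x86-64.so.2",
}


def linked_libraries(ldd_output):
    """Extract library file names from the text printed by ldd."""
    names = []
    for line in ldd_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Lines look like "libc.so.6 => /path/libc.so.6 (0x...)" or, for the
        # vDSO and the dynamic loader, "name-or-path (0x...)".
        first_field = line.split()[0]
        names.append(first_field.rsplit("/", 1)[-1])
    return names


def unexpected_libraries(ldd_output, expected=EXPECTED):
    """Return linked libraries that are not in the allow-list."""
    return [name for name in linked_libraries(ldd_output) if name not in expected]


SAMPLE = """\
linux-vdso.so.1 (0x00007ffe979ac000)
libgcc_s.so.1 => /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libgcc_s.so.1 (0x00007efd23bb4000)
libc.so.6 => /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6 (0x00007efd239fe000)
/nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib64/ld-linux-x86-64.so.2 (0x00007efd23e63000)
"""

if __name__ == "__main__":
    print(unexpected_libraries(SAMPLE))
```

In practice the text would come from running ldd through `subprocess.run` rather than from a hard-coded sample.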
Packaging and deployment
It is now time to distribute this little Rust+Python package by deploying it to PyPI. PEP 517 introduced the pyproject.toml file to specify build systems for Python packages. Ours will specify maturin as our build tool and thus look like:
[build-system]
requires = ["maturin"]
build-backend = "maturin"
As you can see, this file mentions neither the user dependencies (Click, in our case) nor the console scripts to be installed alongside the module. This information is contained in the Cargo.toml:
[package.metadata.maturin]
requires-python = ">=3.6"
requires-dist = ["click>=7.0"]
scripts = {rustafarian = "rustafarian.cli:cli"}
classifier = ["Programming Language :: Python"]
Why is this so? maturin will generate the appropriate Python packaging information based on the contents of Cargo.toml, thus replacing any Python-related packaging tool or configuration file.
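The scripts entry uses the usual "module:function" notation for console scripts: the part before the colon is imported, and the part after it is looked up on the imported module. A minimal sketch of that lookup, with a helper name of my own choosing (`resolve_entry_point`), could look like this. It illustrates the notation and is not what maturin or pip actually run.

```python
import importlib


def resolve_entry_point(spec):
    """Resolve a "package.module:attribute" string to the object it names."""
    module_name, _, attribute_path = spec.partition(":")
    target = importlib.import_module(module_name)
    # The attribute part may be dotted, e.g. "module:Class.method";
    # an empty attribute part resolves to the module itself.
    for attribute in filter(None, attribute_path.split(".")):
        target = getattr(target, attribute)
    return target


if __name__ == "__main__":
    # Standard-library stand-in for "rustafarian.cli:cli".
    dumps = resolve_entry_point("json:dumps")
    print(dumps({"answer": 42}))
```

With the package installed, `resolve_entry_point("rustafarian.cli:cli")` would return the function that the rustafarian command invokes.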
Automated deploy to PyPI with Travis CI
maturin has a publish subcommand to upload a package to PyPI. The process can be automated: whenever a new tag is pushed to GitHub, Travis will run a CI job and, if it succeeds, deploy the artifact to PyPI:
deploy:
  provider: script
  script: ".ci/deploy_to_pypi.sh"
  on:
    python: "3.6"
    tags: true
    repo: robertodr/rustafarian
The deployment script contains the following:
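#!/usr/bin/env bash
# Build the wheels inside a manylinux1 container and upload them to PyPI.
# Use https://hub.docker.com/r/robertodr/maturin to deploy
# Abort on errors and on an unset MATURIN_PASSWORD.
set -euo pipefail
docker run \
  --env MATURIN_PASSWORD="$MATURIN_PASSWORD" \
  --rm \
  -v "$(pwd)":/io \
  robertodr/maturin \
  publish \
  --interpreter python3.6 python3.7 \
  --username robertodr \
  --password "$MATURIN_PASSWORD"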
where MATURIN_PASSWORD is the password to the PyPI account, set as an encrypted variable via the Travis CI web UI. We build in a Docker container to ensure compliance with the manylinux1 policy. The Docker image is derived from the official manylinux1_x86_64 image, which guarantees linkage against a sufficiently old version of GLIBC. The Dockerfile for the image can be found in the example repository.
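A quick sanity check on the result is to look at the platform tag in the wheel's file name, which follows the PEP 427 layout distribution-version(-build)-python-abi-platform.whl. The sketch below splits a name into those parts. The helper names and the wheel file names used in the example are made up for illustration and are not taken from an actual build.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split a wheel file name into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # An optional build tag may sit between the version and the Python tag.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel name: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], python_tag, abi_tag, platform_tag)


def is_manylinux1(filename):
    """Return True if any platform tag of the wheel is a manylinux1 tag."""
    # Several platform tags can be joined with "." in a single file name.
    platform_tags = parse_wheel_filename(filename).platform_tag.split(".")
    return any(tag.startswith("manylinux1_") for tag in platform_tags)


if __name__ == "__main__":
    # Hypothetical file name, for illustration only.
    print(parse_wheel_filename("rustafarian-0.1.0-cp37-cp37m-manylinux1_x86_64.whl"))
```

A wheel built outside the container would typically carry a plain linux_x86_64 platform tag, which PyPI does not accept.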
Moving forward
This post should get you started using the excellent PyO3 and maturin projects for your Rust+Python mixed-language programming.
In an upcoming post, I’ll explain how to automate the environment setup using direnv and pin dependencies using Nix.
Acknowledgments
Thanks to @__radovan for reading and commenting on an early draft.
References
(1) Scott, C. J. C.; Di Remigio, R.; Crawford, T. D.; Thom, A. J. W. Diagrammatic Coupled Cluster Monte Carlo. J. Phys. Chem. Lett. 2019, 10 (5), 925–935. https://doi.org/10.1021/acs.jpclett.9b00067.