Embedded Python: How to use it effectively

Hi,

I wanted to write a little about this, as I feel that the embedded Python distribution available since Python 3.5 is perhaps underappreciated by many in the community.

As an experienced Python dev, I understand the benefits of using a properly installed Python, or using virtualenvs. And I agree that those should be the main ways that developers, especially more inexperienced ones, should work with Python.

However, the embedded Python distribution has clear benefits in certain cases. For example, if you want to distribute software written in Python but don’t want to compile it to an EXE using py2exe or similar, the embedded environment is an excellent way to let end users run your software without installing a Python environment first, or even needing to know that it is written in Python. It also gives you a clean and isolated Python distribution which won’t interfere with others on the system.

For those who understand this, I think it’s perfectly fine to leverage this environment.

However, there are a few issues with the embedded environment which can be cumbersome and annoying.

Here are a few of them:

Unable to import packages from the current working directory

This is not a problem if your current working directory is the same directory in which python.exe exists.

However, let’s take this example case: let’s say that you want to ensure that you can run on both 32- and 64-bit systems. In this case, if you want to embed the Python environment, you may end up with two such environments. You may have some launch scripts which select the correct environment based upon environment variables (e.g. PROCESSOR_ARCHITECTURE and/or PROCESSOR_ARCHITEW6432). In this case, maybe you have these two environments in subdirectories, or at least in some directory separate from your Python sources.
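As a rough sketch of the selection logic such a launch script might use (shown in Python purely for illustration; a real launcher would more likely be a batch file, since no Python is running yet at that point). The function name and the "py64" directory name are hypothetical, and treating only AMD64 as 64-bit is an assumption:

```python
import os


def pick_embedded_dir(environ=os.environ):
    """Return the (hypothetical) subdirectory holding the matching embedded Python."""
    # In a 32-bit process on 64-bit Windows, PROCESSOR_ARCHITECTURE reports x86
    # and PROCESSOR_ARCHITEW6432 holds the real architecture, so check it first.
    arch = environ.get("PROCESSOR_ARCHITEW6432") or environ.get("PROCESSOR_ARCHITECTURE", "")
    # Assumption: anything other than AMD64 falls back to the 32-bit environment.
    return "py64" if arch.upper() == "AMD64" else "py32"
```

The result would then be joined with python.exe to build the command line to launch.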

If you’re in the directory containing your sources, and attempt to run one of the embedded Pythons via the ‘-m’ flag (e.g. “py32\python.exe -m mypackage.main”), you’ll find that it doesn’t work. Even when you can clearly see your code in the current directory, it won’t work.

The problem has to do with how sys.path is calculated. Normally it includes '' (the empty string) as the first entry, which means the current working directory. After that it’ll add several other directories, including some it may discover via the Windows registry. Now, for an embedded environment, this type of behavior is a problem: we don’t want to pull anything from any system-installed Python distributions. So, the embedded Python includes a _pth file (e.g. python36._pth) which overrides this behavior and explicitly specifies the paths which should be included in sys.path. (Source)

However, this itself has a problem. It doesn’t include the current working directory. It includes ‘.’, which specifically refers to the directory containing the _pth file. If you run Python from a different directory, then that will not be the same thing as the current working directory. Thus, things may break unless all your code is placed directly within the embedded Python environment.

Thankfully, if you know about the _pth file, then you can change this rather easily. (Or after you spend hours or days trying to figure this out and eventually stumble across the above-mentioned doc, since the embedded distribution docs, while being on the same page as the other info, don’t explain or link back to that section.)
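If you want to confirm that this is what you’re hitting, a small diagnostic along these lines can help. This is only a sketch; the helper name is made up for illustration, and it simply checks whether any sys.path entry resolves to the current working directory:

```python
import os
import sys


def cwd_is_importable(path_entries=None, cwd=None):
    """Return True if the given path list would search the current directory."""
    entries = sys.path if path_entries is None else path_entries
    cwd = os.path.normcase(os.path.abspath(cwd or os.getcwd()))
    # An empty entry means "whatever the current working directory is".
    return any(
        entry == "" or os.path.normcase(os.path.abspath(entry)) == cwd
        for entry in entries
    )


if __name__ == "__main__":
    print(sys.path)
    print("cwd importable:", cwd_is_importable())
```

Running this with the embedded python.exe from your source directory should show whether the current directory is on the path at all.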

There are two steps to fixing this particular issue:

  1. Edit the _pth file, uncommenting the “import site” line.
  2. Create a sitecustomize.py, with the contents:
    import sys
    sys.path.insert(0, '')

This fixes the issue. Now you can run the embedded Python and get normal behavior regarding code in the current working directory, but without the side effects of pulling from system-installed Python environments that you may get from simply deleting the _pth file.
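If you build these environments more than once, the two steps can be scripted. The following is a minimal sketch, not a tested tool: the function name is hypothetical, and it assumes that sitecustomize.py goes next to python.exe (which works because the stock _pth file lists ‘.’) and that the _pth file contains a “#import site” line:

```python
from pathlib import Path


def enable_cwd_imports(embed_dir):
    """Uncomment 'import site' in the _pth file and write sitecustomize.py."""
    embed_dir = Path(embed_dir)
    for pth in embed_dir.glob("python*._pth"):
        lines = pth.read_text().splitlines()
        # Only the commented-out "import site" line is changed; other
        # comments and path entries are left exactly as they were.
        fixed = [
            "import site" if line.strip().lstrip("#").strip() == "import site" else line
            for line in lines
        ]
        pth.write_text("\n".join(fixed) + "\n")
    # Assumption: the embedded root is on sys.path, so this file gets imported.
    (embed_dir / "sitecustomize.py").write_text("import sys\nsys.path.insert(0, '')\n")
```

For example, enable_cwd_imports("py32") would patch the environment in the py32 subdirectory.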

Missing Tkinter library

From my experience, the most conspicuous missing library in the embedded environment is Tkinter. (Indeed, I don’t know of any other libs normally available in the python.org distribution of Python 3 which aren’t also included in the embedded environment, or at least none that I needed.)

Unfortunately, there does not seem to be a wheel or similar for Tkinter. There is a very strong assumption that people don’t use this embedded environment. (Or, perhaps not enough people have used it and work just hasn’t been done to make Tkinter something that can be optionally installed easily via pip or similar.)

The simplest solution I know is just to copy the missing files from a regular Python installation into the embedded environment. Note that I’m not a lawyer and I’m not 100% sure whether this is okay – if you do so though, it’s at your own risk. Hopefully someone upstream will see this and do the appropriate work to break this out.

Assuming it is okay, what seemed to work in my case was copying these files into the root of the embedded environment, alongside python.exe:

  1. The tkinter Python package from the Lib directory. (Copy the directory, not just its contents.)
  2. The _tkinter.pyd extension library from the DLLs directory.
  3. tcl86t.dll and tk86t.dll, also from the DLLs directory.
  4. The tcl directory. (Again, copy the directory, not just its contents.)

I haven’t tested this procedure, but I believe this is what I did previously during some experiments and I believe this worked.
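For what it’s worth, the copy steps above could be scripted roughly like this. It is a sketch with the same caveats as the manual procedure: the function name is hypothetical, the source is assumed to be a regular installation of the same version and architecture, and dirs_exist_ok requires Python 3.8 or later for whichever Python runs the script:

```python
import shutil
from pathlib import Path


def copy_tkinter(full_install, embed_dir):
    """Copy the Tkinter pieces listed above from a regular install."""
    full_install, embed_dir = Path(full_install), Path(embed_dir)
    # Directories are copied whole, keeping their own names.
    for rel in ("Lib/tkinter", "tcl"):
        src = full_install / rel
        shutil.copytree(src, embed_dir / src.name, dirs_exist_ok=True)
    # The extension module and the Tcl/Tk DLLs go next to python.exe.
    for name in ("_tkinter.pyd", "tcl86t.dll", "tk86t.dll"):
        shutil.copy2(full_install / "DLLs" / name, embed_dir / name)
```

The DLL names match the ones listed above; other Python versions may ship differently named Tcl/Tk libraries, so check the DLLs directory before relying on this.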

Installing packages

As mentioned in the docs, pip is not included in the embedded environment.

I have not tried to install pip natively into the environment. In my case, I found this unnecessary. I didn’t need my end users to use pip, I just needed to use pip for getting the appropriate dependency libraries for what I was working on.

Pip has a --target flag which will let you install packages into the directory of your choice. This works very well. The simplest solution is to use this and set --target to the embedded Python directory. Be sure to use the pip from a matching Python version, and especially a matching architecture: if you pull libraries with binary components, you don’t want to accidentally install 64-bit DLLs into a 32-bit distribution. I also like using the --no-compile flag so that pip doesn’t precompile the installed modules into .pyc files, but that’s mostly a matter of personal preference.
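As an illustration, here is one way to drive that from a build script. The helper names are hypothetical; the python argument is assumed to point at a regular installation (with pip available) whose version and architecture match the embedded environment:

```python
import subprocess
import sys


def pip_target_command(target_dir, *packages, python=sys.executable):
    """Build the pip command line that installs packages into target_dir."""
    return [
        python, "-m", "pip", "install",
        "--target", str(target_dir),
        "--no-compile",
        *packages,
    ]


def install_into(target_dir, *packages, python=sys.executable):
    """Run pip; raises CalledProcessError if the install fails."""
    subprocess.run(pip_target_command(target_dir, *packages, python=python), check=True)
```

Something like install_into("py32", "requests") would then place the package directly in the py32 directory, where “requests” just stands in for whatever dependency you need.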

Conclusion

So, with the above, you should be able to set up an embedded Python environment with more flexibility while preserving the isolation from system-installed Python environments. I hope this helps someone out as, I hate to say, the “current directory” issue took me a while to figure out despite everything technically being documented.

Cheers!
