Monthly Archives: September 2016

Adaptive Payments is Moving to Limited Release – What You Need to Know


On October 6th we will begin the process of moving the Adaptive Payments product into a limited release mode. Limited release means a few things for Adaptive Payments in this case:

  • Adaptive Payments will be restricted to select partners for approved use cases and should not be used for new integrations without guidance from PayPal.
  • The Adaptive Payments documentation will only be accessible from a single new location, the documentation directory.
  • All references to Adaptive Payments as a solution within the documentation will be removed or replaced with the best current solutions.

Adaptive Payments will continue to be a fully supported offering, so integrations will continue to function without interruption. Our end goal with this project is to migrate all existing users of Adaptive Payments on to the modern products that will be the core of our future development APIs, namely Braintree v.zero and the PayPal REST APIs.

Why we’re moving to a limited release

This is the first step in a far-reaching effort to continue to provide modern offerings, and the best solutions, for developers. To that end, we are centralizing our efforts behind the development and optimization of the products built to support the future of the payments industry.

Much of the functionality provided by Adaptive Payments is available in newer solutions that fully support the latest in consumer experiences from PayPal, such as mobile optimization and One Touch. However, Adaptive Payments is still the best solution for a small set of use cases, which is why we will continue to offer it as a limited release product for select partners.

Existing applications and Adaptive Payments users

For existing users of Adaptive Payments, your applications will continue to work during this process, and you will not experience any disruption in service. That said, we will be working hard in the coming months to provide migration guides, advice, and support for moving existing developers toward Braintree v.zero or the PayPal REST APIs, which will be continually updated to support new features and payment methods as they arise.

If you are not currently using Adaptive Payments for your payment integration, we do not recommend building a new service with Adaptive Payments as the payment integration mechanism.

Stay tuned for updates.

Python by the C side


C shells by the C shore

Mahmoud’s note: This will be my last post on the PayPal Engineering blog. If you’ve enjoyed this sort of content, subscribe to my blog, pythondoeswhat.com, or follow me on Twitter. It’s been fun!

All the world is legacy code, and there is always another, lower layer to peel away. These realities cause developers around the world to go on regular pilgrimage, from the terra firma of Python to the coasts of C. From zlib to SQLite to OpenSSL, whether pursuing speed, efficiency, or features, the waters are powerful, and often choppy. The good news is, when you’re writing Python, C interactions can be a day at the beach.

 

A brief history

As the name suggests, CPython, the primary implementation of Python used by millions, is written in C. Python core developers embraced and exposed Python’s strong C roots, taking a traditional tack on portability, contrasting with the “write once, debug everywhere” approach popularized elsewhere. The community followed suit, developing several methods for linking to C. Years of these interactions have made Python a wonderful environment for interfacing with operating systems, data processing libraries, and everything the C world has to offer.

This has given us a lot of choices, and we’ve tried all of the standouts:

| Approach            | Vintage | Representative User | Notable Pros                                        | Notable Cons                                                         |
|---------------------|---------|---------------------|-----------------------------------------------------|----------------------------------------------------------------------|
| C extension modules | 1991    | Standard library    | Extensive documentation and tutorials. Total control. | Compilation, portability, reference management. High C knowledge.   |
| SWIG                | 1996    | crfsuite            | Generate bindings for many languages at once        | Excessive overhead if Python is the only target.                     |
| ctypes              | 2003    | oscrypto            | No compilation, wide availability                   | Accessing and mutating C structures cumbersome and error prone.      |
| Cython              | 2007    | gevent, kivy        | Python-like. Highly mature. High performance.       | Compilation, new syntax and toolchain.                               |
| cffi                | 2013    | cryptography, pypy  | Ease of integration, PyPy compatibility             | New/high-velocity.                                                   |

There’s a lot of history and detail that doesn’t fit into a table, but every option falls into one of three categories:

  1. Writing C
  2. Writing code that translates to C
  3. Writing code that calls into libraries that present a C interface

Each has its merits, so we’ll explore each category, then finish with a real, live, worked example.

Writing C

Python’s core developers did it and so can you. Writing C extensions to Python gives an interface that fits like a glove, but also requires knowing, writing, building, and debugging C. The bugs are much more severe, too, as a segmentation fault that kills the whole process is much worse than a Python exception, especially in an asynchronous environment with hundreds of requests being handled within the same process. Not to mention that the glove is also tailored to CPython, and won’t fit quite right, or at all, in other execution environments.

At PayPal, we’ve used C extensions to speed up our service serialization. And while we’ve solved the build and portability issue, we’ve lost track of our share of references and have moved on from writing straight C extensions for new code.

Translating to C

After years of writing C, certain developers decide that they can do better. Some of them are certainly onto something.

Going Cythonic

Cython is a superset of the Python programming language that has been turning type-annotated Python into C extensions for nearly a decade, longer if you count its predecessor, Pyrex. Apart from its maturity, the points that matter to us are:

  • Every Python file is a valid Cython file, enabling incremental, iterative optimization
  • The generated C is highly portable, building on Windows, Mac, and Linux
  • It’s common practice to check in the generated C, meaning that builders don’t need to have Cython installed.

Not to mention that the generated C often makes use of performance tricks that are too tedious or arcane to write by hand, partially motivated by scientific computing’s constant push. And through all that, Cython code maintains a high level of integration with Python itself, right down to the stack trace and line numbers.
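For a taste of the incremental approach, here is a minimal, hypothetical sketch, not code from our stack: starting from pure Python, a couple of cdef type declarations are enough to drop the hot loop down to C speed.

    # fib.pyx -- a toy example; build with: cythonize fib.pyx
    def fib(int n):
        """Return the n-th Fibonacci number using a C-typed loop."""
        cdef long a = 0, b = 1
        cdef int i
        for i in range(n):
            a, b = b, a + b
        return a

Because every Python file is a valid Cython file, the types can be added one function at a time, measuring as you go.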

PayPal has certainly benefited from these efforts through high-performance Cython users like gevent, lxml, and NumPy. It wasn’t always this way, however: our first go with Cython, in 2011, didn’t stick, but since 2015 all native extensions have been written and rewritten to use Cython.

A sip, not a SWIG

An early contributor to Python at PayPal got us started using SWIG, the Simplified Wrapper and Interface Generator, to wrap PayPal C++ infrastructure. It served its purpose for a while, but every modification was a slog compared to more Pythonic techniques. It wasn’t long before we decided it wasn’t our cup of tea.

Long ago, SWIG may have rivaled extension modules as Python programmers’ method of choice. These days, it seems to suit the needs of C library developers looking for a fast and easy way to wrap their C bindings for multiple languages. It also says something that searching for SWIG usage in Python nets as many SWIG-replacement libraries as SWIG usage itself.

Calling into C

So far all our examples have involved extra build steps, portability concerns, and quite a bit of writing languages other than Python. Now we’ll dig into some approaches that more closely match Python’s own dynamic nature: ctypes and cffi.

Both ctypes and cffi leverage C’s Foreign Function Interface (FFI), a sort of low-level API that declares callable entrypoints to compiled artifacts like shared objects (.so files) on Linux/FreeBSD/etc. and dynamic-link libraries (.dll files) on Windows. Shared objects take a bit more work to call, so ctypes and cffi both use libffi, a C library that enables dynamic calls into other C libraries.

Shared libraries in C have some gaps that libffi helps fill. A Linux .so, Windows .dll, or OS X .dylib is only going to provide symbols: a mapping from names to memory locations, usually function pointers. Dynamic linkers do not provide any information about how to use these memory locations. When dynamically linking shared libraries to C code, header files provide the function signatures; as long as the shared library and application are ABI compatible, everything works fine. The ABI is defined by the C compiler, and is usually carefully managed so as not to change too often.

However, Python is not a C compiler, so it has no way to properly call into C even with a known memory location and function signature. This is where libffi comes in. If symbols define where to call the API, and header files define what API to call, libffi translates these two pieces of information into how to call the API. Even so, we still need a layer above libffi that translates native Python types to C and vice versa, among other tasks.

ctypes

ctypes is an early and Pythonic approach to FFI interactions, most notable for its inclusion in the Python standard library.

ctypes works, it works well, and it works across CPython, PyPy, Jython, IronPython, and most any Python runtime worth its salt. Using ctypes, you can access C APIs from pure Python with no external dependencies. This makes it great for scratching that quick C itch, like a Windows API that hasn’t been exposed in the os module. If you have an otherwise small module that just needs to access one or two C functions, ctypes allows you to do so without adding a heavyweight dependency.

For a while, PayPal Python code used ctypes after moving off of SWIG. We found it easier to call into vanilla shared objects built from C++ with an extern "C" interface than to deal with the SWIG toolchain. ctypes is still used incidentally throughout the code for exactly this: unobtrusively calling into certain shared objects that are widely deployed. A great open-source example of this use case is oscrypto, which does exactly this for secure networking. That said, ctypes is not ideal for huge libraries or libraries that change often: porting signatures from headers to Python code is tedious and error-prone.
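For a flavor of that workflow, here is a minimal ctypes sketch, assuming a Unix-like system where the C standard library can be located with find_library; note how the signature must be transcribed by hand:

    import ctypes
    import ctypes.util

    # Load the C standard library (name resolution varies by platform)
    libc = ctypes.CDLL(ctypes.util.find_library("c"))

    # Transcribe time(2)'s signature by hand; ctypes can't read headers
    libc.time.argtypes = [ctypes.c_void_p]
    libc.time.restype = ctypes.c_long

    print(libc.time(None))  # seconds since the Unix epoch

Multiply that hand transcription by hundreds of functions and structs, and the tedium becomes clear.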

cffi

cffi, our most modern approach to C integration, comes out of the PyPy project. They were seeking an approach that would lend itself to the optimization potential of PyPy, and they ended up creating a library that fixes many of the pains of ctypes. Rather than handcrafting Python representations of the function signatures, you simply load or paste them in from C header files.
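A minimal sketch of the same kind of call through cffi, again assuming a Unix-like system (ffi.dlopen(None) loads the C standard library there; time_t is declared as long for brevity):

    from cffi import FFI

    ffi = FFI()
    # The declaration is pasted from a header, not hand-built in Python
    ffi.cdef("long time(long *t);")

    libc = ffi.dlopen(None)  # the C standard library, on Unix
    print(libc.time(ffi.NULL))  # seconds since the Unix epoch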

For all its convenience, cffi’s approach has its limits. C is really almost two languages, taking into account preprocessor macros. A macro performs string replacement, which opens a Fun World of Possibilities, as straightforward or as complicated as you can imagine. cffi’s approach is limited around these macros, so applicability will depend on the library with which you are integrating.

On the plus side, cffi does achieve its stated goal of outperforming ctypes under PyPy, while remaining comparable to ctypes under CPython. The project is still quite young, and we are excited to see where it goes next.

A Tale of 3 Integrations: PKCS11

We promised an example, and we almost made it three.

PKCS11 is a cryptography standard for interacting with many hardware and software security systems. The 200-plus-page core specification covers many things, including the official client interface: a large set of C header-style information. There are a variety of pre-existing bindings, but each device has its own vendor-specific quirks, so what are we waiting for?

Metaprogramming

As stated earlier, ctypes is not great for sprawling interfaces. The drudgery of converting function signatures invites transcription bugs. We somewhat automated it, but the approach was far from perfect.

Our second approach, using cffi, worked well for our first version’s supported feature subset, but unfortunately PKCS11 uses its own CK_DECLARE_FUNCTION macro instead of regular C syntax for defining functions. Therefore, cffi’s approach of skipping #define macros results in syntactically invalid C code that cannot be parsed. On the other hand, other macro symbols are compiler or operating system intrinsics (e.g., __cplusplus, _WIN32, __linux__), so even if cffi attempted to evaluate every macro, we would immediately run into problems.

So in short, we’re faced with a hard problem. The PKCS11 standard is a gnarly piece of C. In particular:

  1. Many hundreds of important constant values are created with #define
  2. Macros are defined, then re-defined to something different later on in the same file
  3. pkcs11f.h is included multiple times, even once as the body of a struct

In the end, the solution that worked best was to write a rigorous parser for the particular conventions used by the slow-moving standard and generate Cython, which generates C, which finally gives us access to the complete client, with an added performance bonus in certain cases. Biting this bullet took all of a day and a half, and we’ve been very satisfied with the result; it’s all thanks to a special trick up our sleeves.

Parsing Expression Grammars

Parsing expression grammars (PEGs) combine the power of a true parser generating an abstract syntax tree, not unlike the one used for Python itself, with the convenience of regular expressions. One might think of PEGs as recursive regular expressions. There are several good libraries for Python, including parsimonious and parsley. We went with the former for its simplicity.
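For a flavor of parsimonious, here is a tiny, self-contained grammar, unrelated to PKCS11, whose parse() call returns the full syntax tree:

    from parsimonious.grammar import Grammar

    grammar = Grammar(r"""
        greeting = hello " " name
        hello    = "hello" / "hi"
        name     = ~"[a-z]+"
    """)

    tree = grammar.parse("hello world")
    print(tree)  # pretty-prints the resulting abstract syntax tree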

For this application, we defined two grammars, one for pkcs11f.h and one for pkcs11t.h:

PKCS11F GRAMMAR

    file = ( comment / func / " " )*
    func = func_hdr func_args
    func_hdr = "CK_PKCS11_FUNCTION_INFO(" name ")"
    func_args = arg_hdr " (" arg* " ); #endif"
    arg_hdr = " #ifdef CK_NEED_ARG_LIST" (" " comment)?
    arg = " " type " " name ","? " " comment
    name = identifier
    type = identifier
    identifier = ~"[A-Z_][A-Z0-9_]*"i
    comment = ~"(/\*.*?\*/)"ms

PKCS11T GRAMMAR

    file = ( comment / define / typedef / struct_typedef / func_typedef / struct_alias_typedef / ignore )*
    typedef = " typedef" type identifier ";"
    struct_typedef = " typedef struct" identifier " "? "{" (comment / member)* " }" identifier ";"
    struct_alias_typedef = " typedef struct" identifier " CK_PTR"? identifier ";"
    func_typedef = " typedef CK_CALLBACK_FUNCTION(CK_RV," identifier ")(" (identifier identifier ","? comment?)* " );"
    member = identifier identifier array_size? ";" comment?
    array_size = "[" ~"[0-9]"+ "]"
    define = "#define" identifier (hexval / decval / " (~0UL)" / identifier / ~" \([A-Z_]*\|0x[0-9]{8}\)" )
    hexval = ~" 0x[A-F0-9]{8}"i
    decval = ~" [0-9]+"
    type = " unsigned char" / " unsigned long int" / " long int" / (identifier " CK_PTR") / identifier
    identifier = " "? ~"[A-Z_][A-Z0-9_]*"i
    comment = " "? ~"(/\*.*?\*/)"ms
    ignore = ( " #ifndef" identifier ) / " #endif" / " "

Short, but dense, in true grammatical style. Looking at the whole program, it’s a straightforward process, sketched in code after this list:

  1. Apply the grammars to the header files to get our abstract syntax tree.
  2. Walk the AST and sift out the semantically important pieces, function signatures in our case.
  3. Generate code from the function signature data structures.
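Here is a skeletal version of that pipeline using parsimonious’s NodeVisitor; PKCS11F_GRAMMAR (the grammar string above) and header_text are assumed inputs, and the visitor is deliberately simplified, not our production code:

    from parsimonious.grammar import Grammar
    from parsimonious.nodes import NodeVisitor

    # 1. Apply the grammar to the header text to get the AST
    tree = Grammar(PKCS11F_GRAMMAR).parse(header_text)

    # 2. Walk the AST and sift out the semantically important pieces
    class NameCollector(NodeVisitor):
        """Collects every `name` node; a real version would separate
        function names from argument names and capture types too."""
        def __init__(self):
            self.names = []

        def visit_name(self, node, visited_children):
            self.names.append(node.text)

        def generic_visit(self, node, visited_children):
            return visited_children or node

    collector = NameCollector()
    collector.visit(tree)

    # 3. Generate code from the collected signature data
    for name in collector.names:
        print("would generate a Cython wrapper for", name)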

Bringing such a massive standard to bear with only 200 lines of code, plus the portability and performance of Cython, all through the power of PEGs, ranks as one of the high points of Python in practice at PayPal.

Wrapping up

It’s been a long journey, but we stayed afloat and we’re happy to have made it. To recap:

  • Python and C are hand-in-glove, made for one another.
  • Different C integration techniques have their applications, our stances are:
    • ctypes for dynamic calls to small, stable interfaces
    • cffi for dynamic calls to larger interfaces, especially when targeting PyPy
    • Old-fashioned C extensions if you’re already good at them
    • Cython-based C extensions for the rest
    • SWIG pretty much never
  • Parsing Expression Grammars are great!

All of this encapsulates perfectly why we love Python so much. Python is a great starter language, but it also has serious chops as a systems language and ecosystem. That bottom-to-top, rags-to-riches, books-to-bits story is what makes it the ineffable, incomparable language that it is.

C you around!

Kurt and Mahmoud

Spark in Flames – Profiling Spark Applications Using Flame Graphs


When your organization runs multiple jobs on a Spark cluster, resource utilization becomes a priority. Ideally, computations receive sufficient resources to complete in an acceptable time and release resources for other work.

In order to make sure applications do not waste any resources, we want to profile their threads and try to spot any problematic code. Common profiling methods are difficult to apply to a distributed application running on a cluster.

This post suggests an approach to profiling Spark applications. The form of thread profiling used is sampling: capturing stack traces and aggregating them into meaningful data, in this case displayed as flame graphs. Flame graphs show, in a clear and user-friendly way, what percentage of CPU time is spent running which methods, rendered as interactive SVG files (see http://www.brendangregg.com/flamegraphs.html and http://techblog.netflix.com/2015/07/java-in-flames.html). This solution is a derivation of the Hadoop jobs thread-profiling solution described in a post by Ihor Bobak, which relies on Etsy’s statsd-jvm-profiler.

OVERVIEW

Since Spark applications run on a cluster, and computation is split up into different executors (on different machines), each running their own processes, profiling such an application is trickier than profiling a simple application running on a single JVM. We need to capture stack traces on each executor’s process and collect them to a single place in order to compute Flame Graphs for our application.

To achieve this, we specify the statsd-jvm-profiler Java agent in the executor processes to capture stack traces. We configure the agent to report to InfluxDB, a time-series database, which centrally stores all stack traces with timestamps. Next, we run a script which dumps the stack traces from InfluxDB to text files, which we then feed to the flame graph SVG rendering script.

You can generate flame graphs for specific executors, or an aggregation of all executors.

As with other profiling methods, your application will incur a slight performance overhead from the profiling itself, so beware of constantly running your applications in production with the Java agent.

 

Spark Application Profiling Overview

RESULT

Below is an example of the result of the solution described. What you see is a flame graph, aggregated over stack traces taken from all executors running a simple Spark application, which contains a problematic method named inefficientMethod in its code. By studying the flame graph, we see that the application spends 41.29% of its threads running this method. The second column in the graph, in which we find inefficientMethod, is the application’s user code; the other columns are Spark native threads which appear in most of the stack traces, so the focus of our profiling is on the column containing the user code. Focusing on that second column, we see that inefficientMethod takes up most of the user code’s CPU run time, while the other parts of the user code are just a tiny sliver reaching the top of the flame graph.


SETUP

1. Download and install InfluxDB 1.0.0

1.0.0 is currently in beta; however, 0.13 has concurrency bugs which cause it to fail (see https://github.com/influxdata/influxdb/issues/6235).

https://influxdata.com/downloads/#influxdb

Run it:

# linux
sudo service influxdb start

Access the DB using either the web UI (http://localhost:8083/) or shell:

# linux
/usr/bin/influx

Create a database to store the stack traces in:

CREATE DATABASE profiler

Create a user:

CREATE USER profiler WITH PASSWORD 'profiler' WITH ALL PRIVILEGES

2. Build/download statsd-jvm-profiler jar

https://github.com/etsy/statsd-jvm-profiler

Deploy the jar to the machines which are running the executor processes. One way to do this is to use spark-submit’s --jars attribute, which will deploy it to the executors.

--jars /path/to/statsd-jvm-profiler-2.1.0-jar-with-dependencies.jar

Specify the Java agent in your executor processes. This can be done, for example, by using spark-submit’s --conf attribute:

-javaagent:statsd-jvm-profiler-2.1.0-jar-with-dependencies.jar=server=<INFLUX_DB_HOST>,port=<INFLUX_DB_PORT>,reporter=InfluxDBReporter,database=<INFLUX_DB_DATABASE_NAME>,username=<INFLUX_DB_USERNAME>,password=<INFLUX_DB_PASSWORD>,prefix=<TAG_VALUE_1>.<TAG_VALUE_2>.….<TAG_VALUE_N>,tagMapping=<TAG_NAME_1>.<TAG_NAME_2>.….<TAG_NAME_N>

3. Download influxdb_dump.py

https://github.com/aviemzur/statsd-jvm-profiler/blob/master/visualization/influxdb_dump.py

Install all required python modules that influxdb_dump.py imports.

Run influxdb_dump.py to create text files with stack traces as input for flame graphs:

python influxdb_dump.py -o "<INFLUX_DB_HOST>" -u <INFLUX_DB_USERNAME> -p <INFLUX_DB_PASSWORD> -d <INFLUX_DB_DATABASE_NAME> -t <TAGS> -e <VALUES> -x "<OUTPUT_DIR>"

4. Download flamegraph.pl

https://github.com/brendangregg/FlameGraph/blob/master/flamegraph.pl

Generate flame graphs using the text files you dumped from DB:

flamegraph.pl <INPUT_FILES> > <OUTPUT_FILES>
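The dumped text files are in the folded-stack format flamegraph.pl expects: one line per unique stack, with frames separated by semicolons, followed by a space and a sample count. Roughly, with illustrative frame names:

    java.lang.Thread.run;org.apache.spark.executor.Executor$TaskRunner.run;inefficientMethod 412
    java.lang.Thread.run;org.apache.spark.executor.Executor$TaskRunner.run;otherUserCode 18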

Example

The following is an example of running and profiling a Spark application.

Submit spark application:

spark-submit \
--deploy-mode cluster \
--master yarn \
--class com.mycorporation.MySparkApplication \
--conf "spark.executor.extraJavaOptions=-javaagent:statsd-jvm-profiler-2.1.0-jar-with-dependencies.jar=server=influxdbhost.mycorporation.com,port=8086,reporter=InfluxDBReporter,database=profiler,username=profiler,password=profiler,prefix=MyNamespace.MySparkApplication,tagMapping=namespace.application" \
--name MySparkApplication \
--jars /path/to/profiling/statsd-jvm-profiler-2.1.0-jar-with-dependencies.jar \
MySparkApplication.jar

Dump stack traces from DB:

python influxdb_dump.py -o "influxdbhost.mycorporation.com" -u profiler -p profiler -d profiler -t namespace.application -e MyNamespace.MySparkApplication -x "my_stack_traces"

Generate Flame Graph (In this case for all executors):

flamegraph.pl my_stack_traces/all_*.txt > my_flame_graph.svg

Python Packaging at PayPal


Year after year, Pythonists all over are churning out more code than ever. People are learning, the ecosystem is flourishing, and everything is running smoothly, right up until packaging. Packaging Python is fundamentally un-Pythonic. It can be a tough lesson to learn, but across all environments and applications, there is no one obvious, right way to deploy. Frankly, it’s hard to think of an area where Python’s Zen applies less.

At PayPal, we write and deploy our fair share of Python, and we wanted to devote a couple minutes to our story and give credit where credit is due. For conclusion seekers, without doubt or further ado: Continuum Analytics’ Anaconda Python distribution has made our lives so much easier. For small- and medium-sized teams, no matter the deployment scale, Anaconda has big implications. But let’s talk about how we got here.

Beginnings

Right now, PayPal Python Infrastructure provides equitable support for Windows, OS X, Linux, and Solaris, supporting various combinations of 32-bit and 64-bit Python 2.6, Python 2.7, and PyPy 5.

Glossing over the primordial days, when Kurt and I started building the Python platform at PayPal, we didn’t know we would be building the first cross-platform stack the company had ever seen. It was December 2012, and we just wanted to see every developer unwrap a brand-new laptop running PayPal Python services locally.

What ensued was the most intense engineering sprint I had ever experienced. We ported critical functionality previously available only in shared objects we had been calling into with ctypes. Several key parts were available in binary form only and had to be disassembled. But with the New Year, 2013, we were feeling like a whole new stack. All the PayPal-specific parts of our framework were pure-Python and portable; we just needed to install a few open-source libraries, like gevent, greenlet, maybe lxml. Just pip install, right?

Up the hill

In an environment where Python is still a new technology to most, pip is often not available, let alone understood. This learning curve can represent a major hurdle to many. We wanted more people to be able to write Python, and even more to be able to run it, as many places as possible, regardless of whether they were career Pythonists. So with a judicious shake of Python simplicity, we adopted a policy of “vendoring in” all of our core dependencies, including compiled extensions, like gevent.

This model yields somewhat larger repositories, but the benefits outweighed a few extra seconds of clone time. Of all the local development stories, there is still no option more empowering than the fully self-contained repository: clone and run. A process so seamless, it’s like a miniature demo that goes perfectly every time. In a world of multi-hour C++ and Java builds, it might as well be magic.
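As a rough picture, a vendored repository might be laid out like this (a hypothetical layout, not our actual tree):

    myservice/
        myservice/       # pure-Python application code
        vendor/          # core dependencies checked into the repo
            gevent/      # compiled extensions prebuilt per platform
            greenlet/
            lxml/
        run.py           # clone and run; no install step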

“So what’s the problem?”

Static builds. Every few months (or every CVE) the Python team would have to sit down to refresh, regression test, and certify a new set of libraries. New libraries were added sparingly, which is great for auditability, but not so great for flexibility. All of this is fine for a tight set of networking, cryptography, and serialization libraries, but no way could we support the dozens of dependencies necessary for machine learning and other advanced Python use cases.

And then came Anaconda. With the Anaconda Python distribution, Continuum is doing effectively what our team had been doing, but for free, for everyone, for hundreds of libraries. Finally, there was a standard option that made Python even simpler for our developers.

Adopting and adapting

As soon as we had the opportunity, we made Anaconda a supported platform for development. From then on, regardless of platform, Python beginners got one of two introductions: Install Anaconda, or visit our shared Jupyter Notebook, also backed by Anaconda.

The good kind of escalation

Today, Anaconda has gone beyond development environments to enable production PayPal machine learning applications for the better part of a year. And it’s doing so with more optimizations than we can shake a stick at, including running all the intensive numerical operations on Intel’s MKL. From now on, Python applications exist on a moving walkway to production perfection.

This was realized through two Anaconda packaging models that work for us. The first preinstalls a complete Anaconda on top of one of PayPal’s base Docker images. This works, and is buzzword-compliant, but for reasons outside the scope of this post, also entails maintaining a single large Docker image with the dependencies of all our downstream users.

As with all packaging, there’s always another way. One alternative approach that has worked well for us involves a little Continuum project known as Miniconda. This minimalist distribution has just enough to make Python and conda work. At build time, our applications package Miniconda, the bzip2 conda archives of the dependencies, and a Python installer, wrapped up with a CalVer filename. At deploy time, we install Miniconda, then conda install the dependencies. No downloads, no compilation, no outside dependencies. The code is only a little longer than the description of the process. Conda envs are more powerful than virtualenvs, and have a better cross-platform, cross-dev/prod story, as well. Developers enjoy the increased control, smaller packages, and applicability across both standard and containerized environments.
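A minimal sketch of the deploy-time half of this model follows, assuming the application bundle already contains the Miniconda installer and the conda archives of its dependencies; every path, filename, and version below is illustrative, not our production tooling:

    import subprocess

    PREFIX = "/opt/myapp/env"  # hypothetical per-application prefix

    # Install Miniconda in batch mode (-b) into the private prefix (-p)
    subprocess.check_call(
        ["bash", "Miniconda3-latest-Linux-x86_64.sh", "-b", "-p", PREFIX])

    # Install the bundled conda archives offline: no downloads,
    # no compilation, no outside dependencies (filenames illustrative)
    subprocess.check_call(
        [PREFIX + "/bin/conda", "install", "--offline", "--yes",
         "gevent-1.1.2-py27_0.tar.bz2", "greenlet-0.4.10-py27_0.tar.bz2"])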

Packages to come

As stated in Enterprise Software with Python, packaging and deployment is not the last step. The key to deployment success is uniform, well-specified environments, with minimal variation between development and production. Or use Anaconda and call it good enough! We sincerely thank the Anaconda contributors for their open-source contributions, and hope that their reach spreads to ever more environments and runtimes.