# An Intro to Plotnine

## Getting started with Plotnine, Python's ggplot2 clone

Many data scientists, analysts and visualization gurus start their careers (or academic work) using the R language and statistical framework. And the large majority of those people, this author included, become intimately familiar with R's most popular visualization library: `ggplot2`. The syntax of most Python visualization libraries is pretty different from `ggplot2`, so to make the transition easier, there have been a few attempts at recreating `ggplot2` in Python.

The most recent of those efforts is `plotnine` (documentation, GitHub), a library that describes itself as "A grammar of graphics for Python" (in other words, a clone of `ggplot2`).

## A Basic Chart

Wildcard imports are usually frowned upon because they pollute the global namespace, but the common way to import the library so you can use it as you would in R is `from plotnine import *`. If you're using the Google Colaboratory environment, `plotnine` is not included as of this post, so you'll have to install it with the command `!pip install plotnine`.

```python
# Load plotnine.
from plotnine import *

# Import vega datasets and load the iris dataset.
from vega_datasets import data

df = data.iris()

# Create a simple scatter plot.
# Note: the parens wrapping the statement allow you to end a line with `+`
# without escaping it with a backslash.
(ggplot(df, aes('petalWidth', 'petalLength')) +
 geom_point())
```

Let's break that down quickly:

• Use `ggplot()` to create the base figure.
• `ggplot()` takes your data as its first argument and the "aesthetic mapping" as the second; basically, how you want to map your data to the figure and axes.
• `aes()` defines your mapping, the first argument being the `x` and the second the `y`. You can also name them explicitly, e.g. `aes(x='petalWidth', y='petalLength')`.
• We add layers to the plot using the plus sign `+`; the main layer here being the points we want to add for each x, y pair. We use `geom_point()` to do that.

## Simple Style Changes

Style changes are easy and intuitive in Plotnine. For the marks themselves, just add arguments to the `geom_<type>()` function.

```python
(ggplot(df, aes('petalWidth', 'petalLength')) +
 geom_point(color='darkgreen', size=4)
)
```

## Adding More Dimensions to the Aesthetic

What about adding another dimension to the chart, e.g. the species of the flower? Again, it's very simple and pretty intuitive: we just add another mapping to the aesthetic (`aes()`). For example, `aes(..., color='species')` to map different colors to the `species` column of the dataset.

Just to see how powerful the grammar of graphics is, let's also add trendlines with confidence bands by adding `stat_smooth(method='lm')`.

```python
(ggplot(df, aes('petalWidth', 'petalLength', color='species')) +
 geom_point() +
 stat_smooth(method='lm')
)
```

This library is immensely powerful, with an intuitive and consistent API. There is much more to show, which we'll follow up on in future posts. Hopefully this gives you a basic feel for `plotnine`.