This tutorial will guide you through the process of integrating WebDataRocks Pivot Table with Jupyter Notebook – a web-based interactive application for sharing notebooks with live code, visualizations, text, and other media. As a result, you will get a notebook empowered with a pivot table for interactive data exploration and data analysis.
Before installing the Jupyter Notebook, make sure Python is installed on your machine. Jupyter requires Python 3.3 or greater, or Python 2.7.
Next, start using Jupyter in one of the following ways:
To integrate WebDataRocks Pivot Table with Jupyter Notebook, follow the next steps:
Step 1. Start the notebook server. Then, create a new Notebook or open the existing one.
You can start the notebook server in the following ways:
Step 2. Import the following Python libraries into the notebook:
IPython.display
– an API for display tools in IPython. From this module, import only the HTML
class for rendering the HTML content in the notebook.json
– a module for serializing and de-serializing Python objects.pandas
– a library for working with data frames.from IPython.display import HTML import json import pandas as pd
Step 3. Define a function that accepts JSON-formatted string and renders it as a pivot table on the HTML page:
def pivot(webdatarocks_json_object): code = ''' <link href="https://cdn.webdatarocks.com/latest/webdatarocks.min.css" rel="stylesheet"/> <script src="https://cdn.webdatarocks.com/latest/webdatarocks.toolbar.min.js"></script> <script src="https://cdn.webdatarocks.com/latest/webdatarocks.js"></script> <h1>WebDataRocks Integration with Jupyter Notebook</h1> <div id="pivot-container"></div> <script> new WebDataRocks({0}); </script> '''.format(webdatarocks_json_object) return HTML(code)
Step 4. Create a pandas DataFrame and fill it with data. The data can come from a CSV/JSON/Excel file, SQL database, or other type of storage. For simplicity, we will use the static inline data:
df = pd.DataFrame( [["Apple pie", "20"], ["Lemon cake", "30"]], index=["row 1", "row 2"], columns=["Product", "Quantity"], )
Step 5. Convert the DataFrame to a JSON string:
json_data = df.to_json(orient='records')
Note that it’s important to set the orient
parameter to 'records'
. This way, the object will be translated into a list-like structure: [{column -> value}, … , {column -> value}]
. This structure corresponds to the format of a JSON data source accepted by WebDataRocks.
Step 6. Create an instance of WebDataRocks using the dictionary as follows:
webdatarocks = { "container": "#pivot-container", "width": "100%", "height": 430, "toolbar": True, "report": { "dataSource": { "type": "json", "data": json.loads(json_data) }, "slice": { "rows": [ { "uniqueName": "Product" } ], "columns": [ { "uniqueName": "[Measures]" } ], "measures": [ { "uniqueName": "Quantity", "aggregation": "sum" } ] } } }
Here we have specified the initialization parameters and set the slice. For connecting to a data source, we have decoded JSON using the json.loads()
method and set the response to the dataSource.data
property of the pivot table.
Step 7. Use json.dumps()
to encode the webdatarocks
object to a JSON-formatted string.
webdatarocks_json_object = json.dumps(webdatarocks)
Step 8. Pass the JSON-formatted string with the pivot table’s configuration to the previously defined pivot
function:
pivot(webdatarocks_json_object)
Step 9. Run the code contained in the notebook’s cells by selecting Run > Run All Cells. You can also run the cells separately, one after another.
The pivot table component will be rendered as an output of the cell where the pivot
function is called.
Step 1. Download or clone our sample project from GitHub:
git clone https://github.com/WebDataRocks/pivot-jupyter-notebook/
Step 2. Start the notebook server. You can do it in the following ways:
Step 3. Import the WebDataRocks_in_Jupyter_Notebook.ipynb
file into the Jupyter working directory.
Step 4. Run the code in the notebook’s cells by selecting Run > Run All Cells. You can also run the cells separately, one after another.
The pivot table component will be rendered as an output of the cell where the pivot
function is called.