In 2016 the food delivery wars in the United States were in full swing. Grubhub was the incumbent with around 50% of the market. The company had recently merged with Seamless and locked up business ordering, but two major players were nipping at its heels for consumer market share: UberEats and DoorDash were both growing very quickly. DoorDash was aggressively signing up restaurants in the suburbs, and UberEats was leveraging billions in funding and its massive driver network to get its food delivery business off to a strong start.
The diner data and analytics team at Grubhub had fewer than half a dozen members, mostly managing instrumentation and analysis of the diner apps via Google Analytics tags. The team knew it needed more nuanced data collection to gain deeper insight into diner behavior and to extend A/B experiment analysis beyond session-level insights to diner lifetime insights. It wasn't sure about the best way to design, instrument, trigger, and interpret diner behavioral data.
The challenge
How do we capture deeper diner behavioral data to evaluate product experiments and improve the user experience and the business?
The approach
This effort was a significant investment of time and energy from many teams across data analytics, data engineering, front-end engineering, product engineering, data science, product management, and finance.
I was fortunate to have had exposure to a very simple version of a clickstream analytics framework built and launched during my time at Etsy, so I understood what was possible and where to begin.
Establish a framework. The first step was to establish a framework for breaking the user experience down into defined, repeatable components; defining naming conventions and web events tied to each of those components; and capturing when those components were triggered, loaded, and visible to diners (see the sketch after these steps).
Roll out the framework. Each new front-end feature had a parallel development process to instrument its front-end clickstream analytics tagging.
Trigger the key events. With every app or page load, a series of clickstream events is triggered to capture which components of the application were loaded and visible to a diner and what, if anything, the diner engaged with and clicked on.
Capture only what's necessary. Metadata tied to the component the diner was seeing (e.g. the menu item, restaurant, cuisine, image, price) was captured and surfaced server-side so that the front-end analytics tag could stay as simple and light as possible, capturing only what the front-end application uniquely knew: the position at which the diner saw the component and whether the diner engaged with it.
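To make these steps concrete, here is a minimal sketch in TypeScript of what such a front-end tag might look like. The component naming convention, event names, fields, endpoint, and helper functions are hypothetical illustrations under these assumptions, not Grubhub's actual schema.

```typescript
// Hypothetical component naming convention: "<page>.<module>.<component>"
// e.g. "restaurant_menu.popular_items.menu_item_card"
type ComponentId = string;

// The front-end tag stays lean: it records only what the client uniquely
// knows -- which component fired, where the diner saw it, and whether the
// diner engaged with it. Richer metadata (menu item, restaurant, cuisine,
// price, image) is resolved server-side from the entity id.
interface ClickstreamEvent {
  event: "impression" | "click"; // component became visible vs. diner engaged
  component: ComponentId;        // which UI component fired the event
  entityId?: string;             // server-resolvable id (e.g. a menu item id)
  position: number;              // position at which the diner saw it
  sessionId: string;
  timestamp: number;             // client epoch millis
}

// Fired when a component is rendered and visible to the diner.
function trackImpression(component: ComponentId, entityId: string, position: number): void {
  send({ event: "impression", component, entityId, position, sessionId: getSessionId(), timestamp: Date.now() });
}

// Fired when the diner clicks or otherwise engages with the component.
function trackClick(component: ComponentId, entityId: string, position: number): void {
  send({ event: "click", component, entityId, position, sessionId: getSessionId(), timestamp: Date.now() });
}

// Transport and session plumbing are stand-ins for whatever the real apps used.
function send(e: ClickstreamEvent): void {
  navigator.sendBeacon("/clickstream", JSON.stringify(e));
}
function getSessionId(): string {
  return window.sessionStorage.getItem("sessionId") ?? "anonymous";
}
```

Keeping the client payload this small is what made the parallel instrumentation work for each new feature manageable: engineers only had to name the component and report its position and engagement.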
The result
The resulting clickstream analytics framework, data set, and team were extensive.
More than 300M web events were captured each day, ingested and transformed through an ETL process, and married to critical back-end server-side data such as the diner's profile and their exposure to active A/B experiments.
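As a hedged illustration of what "married to server-side data" means in practice, an enriched record after the ETL join might look something like the following; the field names are assumptions for the sake of the example.

```typescript
// Illustrative shape of a clickstream event after the ETL join:
// the raw front-end event enriched with server-side context.
interface EnrichedEvent {
  // From the front-end tag
  event: "impression" | "click";
  component: string;
  position: number;
  timestamp: number;

  // Joined server-side: what the component actually showed
  restaurantId?: string;
  menuItemId?: string;
  cuisine?: string;
  price?: number;

  // Joined server-side: who saw it and which experiments they were in
  dinerId: string;
  dinerOrderCount: number;             // from the diner profile
  experiments: Record<string, string>; // experiment name -> variant
}
```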
This data was used to analyze A/B experiments on more impactful metrics such as diner lifetime value, rather than just the in-session behavior that many third-party experimentation platforms are limited to.
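As a rough sketch of that kind of analysis (not the actual pipeline), comparing a lifetime-value-style metric across experiment variants could be as simple as aggregating each diner's downstream order value by the variant they were exposed to; the names and the 90-day window here are illustrative assumptions.

```typescript
// Toy illustration: compare downstream order value per diner across
// experiment variants, rather than only in-session conversion.
interface DinerOutcome {
  dinerId: string;
  variant: string;        // variant the diner was exposed to
  orderValue90d: number;  // e.g. total order value in the 90 days after exposure
}

function meanValueByVariant(outcomes: DinerOutcome[]): Map<string, number> {
  const totals = new Map<string, { sum: number; n: number }>();
  for (const o of outcomes) {
    const t = totals.get(o.variant) ?? { sum: 0, n: 0 };
    t.sum += o.orderValue90d;
    t.n += 1;
    totals.set(o.variant, t);
  }
  const means = new Map<string, number>();
  totals.forEach((t, variant) => means.set(variant, t.sum / t.n));
  return means;
}
```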
This data was also fed into machine learning models such as the company's diner search and sort algorithm, improving the order in which restaurants and dishes were shown to diners. It also unlocked in-depth ad hoc analysis of diner behavior and of restaurant and menu item performance.