When it comes to delivering eatables to customers through an app, on-time delivery matters the most. Munchies, a Pakistani e-commerce snacking solution, was founded in 2019 by Unilever and VentureDive with the aim of delivering snacks instantly to customers through an app. The app lets them choose from a variety of snacks including ice cream, chips, chocolates, and more.
Being the only on demand delivery app for snacks in Pakistan, Munchies quickly got consumers’ attention and began receiving a reasonable number of orders daily. Munchies, however, faced challenges with on-time delivery and with giving the customer an accurate Estimated Time of Arrival (ETA). In other words, orders were taking longer than the ETA shown to the customer: the estimate was inaccurate.
Munchies on demand delivery system
Similar to other on demand delivery apps for the food industry, Munchies was designed to focus on customers, stores, and riders. Whenever a customer places an order, a rider is sent a request to accept the order. After an order is accepted by a rider, a store is shortlisted to collect order items from. Eventually, when all order items are collected, the rider is all set to leave for the dropoff location and deliver the order to the customer. The figure below shows the current dispatch system.
Problems with Munchies snack delivery workflow
While the on demand delivery flow looks manageable, it hides many complexities. For example, if an item is not available at the store, the rider might need to ask the customer either to cancel that item or to let the rider go to another store, which increases the ETA. Similarly, the larger the basket size (the number of items in an order), the longer it takes to collect all the items. Long queues at stores during the pandemic can also change the ETA.
The existing system estimated the time of arrival with Google Maps, specifically the Google Distance Matrix API, which estimates the travel time from order acceptance to arrival at the store and then from the store to the customer’s dropoff location. The total ETA is simply the sum of these two plus a fixed buffer of a few minutes. While Google is no doubt an efficient way to get an ETA, for Munchies it only considers the latitude and longitude of the rider, store, and customer’s dropoff location. It is not aware of how Munchies works or of the complexities discussed above.
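As a rough sketch of the calculation just described, the baseline ETA can be written as a plain sum. The function name and the 5-minute default buffer are assumptions for illustration (the buffer was only "a few minutes"), and the two travel times stand in for values that would come from the Distance Matrix API, which is not called here.

```python
def baseline_eta_minutes(rider_to_store_min: float,
                         store_to_customer_min: float,
                         buffer_min: float = 5.0) -> float:
    """Sum of the two travel legs plus a fixed buffer.

    buffer_min defaults to 5.0 as an assumed stand-in for "a few minutes".
    """
    return rider_to_store_min + store_to_customer_min + buffer_min


# Illustrative numbers only: 8 min to the store, 12 min to the customer.
eta = baseline_eta_minutes(8.0, 12.0)
print(f"Baseline ETA: {eta} min")  # prints "Baseline ETA: 25.0 min"
```

Nothing in this calculation depends on basket size, store queues, or item availability, which is why it tends to underestimate the real delivery time.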
The obstacle was that Google provided an accurate travel time for a given pair of coordinates, for example 25 minutes, but because of how Munchies works, an order would almost always take longer than Google’s estimate. This did not leave a good impression on the customer. Another reason for longer ETAs was that some areas had a large number of orders but relatively few stores. This led to the suggestion that adding stores in those areas could reduce ETA and speed up delivery.
VentureDive’s strategy to resolve the ETA prediction issue
To overcome this obstacle, the data science team at VentureDive was asked to dig deeper and come up with a viable solution.
After a thorough analysis of the data with respect to ETA, the team identified a number of key problems with using Google Maps. The main problem was that while Google is aware of the traffic conditions and roads of the city, it knows nothing about the internal operations of Munchies. For example, there might be a specific area in the city that takes too long, or there might be specific stores responsible for longer ETAs. Munchies stores order data daily, but Google uses none of it for predicting ETA other than latitudes and longitudes.
The main purpose was to make use of the historical data we had at our disposal to predict ETA. The team’s solution was simple but efficient: an in-house system (a prediction model) that predicts ETA for orders.
Benefits of accurate ETA prediction for on demand delivery
The benefit of this solution is that it uses several more features to predict ETA than just latitude and longitude. The features are as follows:
- Time & Location
When an order is placed, we can tell our prediction system when it was placed and which specific area or store it is for.
- Basket size
We can also specify the basket size (the number of items in an order), or even the rider information or vehicle type.
- Weather
We can add weather data, which, of course, has an impact on delivery time.
The foremost benefit of having an in-house prediction system is that, since it uses historical data, it automatically takes into account all the complexities discussed above. It is also cost-efficient and entirely under our control: we can make changes whenever they are needed, which is not the case with Google Maps.
Building the ETA Prediction solution to enable on-time deliveries
Munchies’ on demand delivery model is designed so that cities are divided into service areas, and each service area has its own dedicated delivery fleet that is instructed to remain near the restaurants to make delivery as quick as possible.
The team started by exploring the data and analyzing it specifically in terms of ETA. Data was already being collected from the app, so the data acquisition stage was relatively simple. In the analysis phase, the team’s research produced some interesting insights. The research highlighted that some areas in the city were consistently taking longer than expected. Similarly, orders from some stores were also taking too long to deliver. The most fascinating finding was that the dispatch algorithm currently in use could be improved, which would reduce ETA.
With the research done, it was time to build a machine learning model that would be trained on order data and then used to predict the ETA of future orders. To build this model, the team first processed the data, which included:
- Feature Engineering
The data had some timestamp features, so the month, week, day, hour, or even minute and second can be extracted from a single DateTime feature, which can help both the model’s performance and the data analysis.
- Outliers Detection and Removal
Real-world data is always messy and comes with outliers, but not all machine learning models can handle them. Using statistical techniques, outliers were removed from the order data, since they affect not only the analysis but also the model’s performance.
- Missing Values
Another problem with real-world data is that it contains some missing values. The same was the case with our data. Again, with the help of statistical techniques, the team solved the problem of missing values.
- Encoding Categorical Features
There were some features that were categorical in nature. Since many machine learning models cannot process categorical data, these features needed to be converted into numerical form, and for this reason, encoding techniques were used.
- Drop Irrelevant Columns
Not all features are important for a machine learning model. Columns such as timestamps and IDs fall into this category, so they can be dropped.
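The processing steps above can be sketched in plain Python as follows. This is a minimal illustration, not the team's actual pipeline: the record fields, the sample values, the IQR rule for outliers, median imputation, and one-hot encoding are all assumptions chosen as common examples of the "statistical techniques" and "encoding techniques" mentioned.

```python
from datetime import datetime
from statistics import median, quantiles

# Hypothetical order records; field names and values are illustrative only.
raw_orders = [
    {"order_id": 1, "placed_at": "2021-03-05 18:42:00", "basket_size": 3, "vehicle": "bike", "delivery_min": 31.0},
    {"order_id": 2, "placed_at": "2021-03-05 19:10:00", "basket_size": 1, "vehicle": "bike", "delivery_min": 28.0},
    {"order_id": 3, "placed_at": "2021-03-06 13:05:00", "basket_size": None, "vehicle": "car", "delivery_min": 35.0},
    {"order_id": 4, "placed_at": "2021-03-06 21:30:00", "basket_size": 2, "vehicle": "bike", "delivery_min": 30.0},
    {"order_id": 5, "placed_at": "2021-03-07 12:00:00", "basket_size": 4, "vehicle": "car", "delivery_min": 240.0},
    {"order_id": 6, "placed_at": "2021-03-07 17:45:00", "basket_size": 5, "vehicle": "bike", "delivery_min": 33.0},
    {"order_id": 7, "placed_at": "2021-03-08 20:15:00", "basket_size": 3, "vehicle": "car", "delivery_min": 27.0},
    {"order_id": 8, "placed_at": "2021-03-08 22:00:00", "basket_size": 6, "vehicle": "bike", "delivery_min": 36.0},
]


def add_time_features(order):
    """Feature engineering: derive hour, weekday, and month from the timestamp."""
    ts = datetime.strptime(order["placed_at"], "%Y-%m-%d %H:%M:%S")
    return {**order, "hour": ts.hour, "weekday": ts.weekday(), "month": ts.month}


def remove_outliers(orders, key, k=1.5):
    """Drop rows whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles([o[key] for o in orders], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [o for o in orders if low <= o[key] <= high]


def impute_median(orders, key):
    """Replace missing (None) values with the median of the observed values."""
    fill = median(o[key] for o in orders if o[key] is not None)
    return [{**o, key: fill if o[key] is None else o[key]} for o in orders]


def one_hot(orders, key):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({o[key] for o in orders})
    encoded = []
    for o in orders:
        row = {name: value for name, value in o.items() if name != key}
        for category in categories:
            row[f"{key}_{category}"] = 1 if o[key] == category else 0
        encoded.append(row)
    return encoded


def drop_columns(orders, keys):
    """Remove columns that are no longer needed, such as IDs and raw timestamps."""
    return [{name: v for name, v in o.items() if name not in keys} for o in orders]


def preprocess(orders):
    rows = [add_time_features(o) for o in orders]
    rows = remove_outliers(rows, "delivery_min")
    rows = impute_median(rows, "basket_size")
    rows = one_hot(rows, "vehicle")
    return drop_columns(rows, {"order_id", "placed_at"})


clean_rows = preprocess(raw_orders)
print(len(raw_orders), "->", len(clean_rows), "rows")
```

Note the ordering: the time features are extracted before the raw timestamp column is dropped, and the 240-minute order is removed as an outlier before the median is computed.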
Once the data was cleaned, meaning it had no outliers, missing values, or unencoded categorical features, it was time to move on to the machine learning part, where the data was first split into training, validation, and testing sets.
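A three-way split of this kind could look like the sketch below. The 70/15/15 ratios, the fixed seed, and the function name are assumptions; the actual proportions used for Munchies are not stated.

```python
import random


def train_val_test_split(rows, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle a copy of the rows and cut it into train, validation, and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Stand-in for 100 preprocessed order rows.
orders = list(range(100))
train, val, test = train_val_test_split(orders)
print(len(train), len(val), len(test))  # prints "70 15 15"
```

The validation set is what hyperparameters are tuned against, while the test set is held back for the final comparison.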
Overcoming the hurdles & challenges
One of the challenges in machine learning is choosing an appropriate algorithm. Since predicting ETA (a continuous value) is a regression problem, we had to choose a regression algorithm. The obvious choice was XGBoost, since it is widely used in the industry and usually outperforms other algorithms. Even so, we tried three different algorithms, and XGBoost outperformed the others. We evaluated our models using regression metrics. A number of metrics could be used, including RMSLE, RMSE, MAE, and R2 score. We used some of these metrics for our case.
Initially, our models were not performing well, and there was an enormous error between the actual and predicted values. This led us to tune the models’ hyperparameters, which improved the results and reduced the error between the actual and predicted time of arrival.
One final step was to test the model in production. We deployed our model in production to compare it with the existing system. After months of comparison between the two, it turned out that the in-house solution was outperforming the existing one in estimating arrival time by an adequate margin.
Long story short, with the help of data science and machine learning, our team was able to find the root cause of the problem. We provided viable solutions to the development team on how to reduce ETA to enable timely, on demand delivery of snacks, and built an in-house model that started predicting accurate arrival times for orders. Ultimately, this improved our customer experience in a cost-effective manner.
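For reference, the four metrics named above can be computed as in the sketch below. The delivery times and predictions are made-up numbers for illustration, not Munchies results, and the function names are our own.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error, in the same unit as the target (minutes here)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large misses more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def rmsle(y_true, y_pred):
    """Root mean squared log error; requires non-negative values."""
    squared = [(math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Illustrative values only (minutes).
actual = [30.0, 42.0, 25.0, 38.0]
model_pred = [28.0, 40.0, 27.0, 36.0]      # hypothetical in-house model output
baseline_pred = [25.0, 30.0, 22.0, 28.0]   # hypothetical map-based estimate

for name, pred in [("model", model_pred), ("baseline", baseline_pred)]:
    print(name, round(mae(actual, pred), 2), round(rmse(actual, pred), 2),
          round(rmsle(actual, pred), 3), round(r2_score(actual, pred), 3))
```

MAE is the easiest of these to explain to a non-technical audience, since it reads directly as "minutes off on average".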
Looking to build a high-quality, robust & reliable on demand delivery solution? Explore Movanos, our fully white-labeled delivery management system that can be customized for your specific use case.