A Sequential Decision-Making Process for Smart Vending Machine Replenishment Systems

Feb 1, 2019

Vending machines offer an easily accessible selection of beverages and snacks, helping people quickly quiet their rumbling stomachs while on the go. However, they can also be frustrating: coins or bills get rejected, purchased items get stuck in the machine, products are out of stock, and more.

Today’s vending machines don’t look like the ordinary snack dispensers of the past. With the rapid evolution of vending technology, smart vending machines now engage shoppers like never before, with features such as touch-screen controls, video, audio, gesture-based interactions, and cashless payment. They can also interact with customers’ smartphones.

JD’s Smart Vending Machine is designed to enable purchases by simply opening the door, picking up the desired product and walking away. By integrating a range of advanced technologies, including computer vision and intelligent sensors, the system can identify consumers, recognize when they take goods from the shelves and automatically manage stock replenishment. The system is designed to help merchants manage their inventory intelligently and efficiently by automating many aspects of the process, from sending notifications when stock needs to be replenished to automatically changing product prices during promotional periods.

While new vending technologies, such as JD’s Smart Vending Machine, have greatly enhanced the shopping experience, vending machines can still hold only a limited amount of inventory. Shelf capacity constraints mean that new stock must be added frequently, resulting in high operational costs.

In this post, we will discuss a sequential decision-making process that addresses the challenge of vending machine replenishment.

The complexity of the problem lies in a series of decision issues. First, we need an accurate demand forecast to generate inputs for the replenishment model; then we optimize the replenishment model to maximize the total expected revenue; finally, we build a truckload model on top of the replenishment policy to determine which items and how many units should be put into a truck for a single replenishment.

Demand Forecast

Uncovering and identifying patterns in demand is a challenge faced by many companies. The main challenge is how quickly demand can change. Furthermore, sales data contain what we call censored observations: once a product is out of stock, we have no way of knowing whether more people tried to buy it, so the observed sales are ‘censored’ with regard to the true demand. Since shelf capacity is very limited, this situation can happen quite often. What we do know is that the true demand was greater than or equal to the actual sales.
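To make the censoring effect concrete, here is a minimal sketch in Python. The helper `observe_sales`, the demand figures and the shelf stock are all made up for illustration; the only thing taken from the text is the rule that a sold-out day reveals a lower bound on demand rather than demand itself.

```python
def observe_sales(true_demand, stock):
    """Return (sales, censored) for one day.

    Sales can never exceed the stock on the shelf. The day is flagged as
    censored when the shelf ends the day empty, i.e. sales == stock.
    """
    sales = min(true_demand, stock)
    censored = sales == stock
    return sales, censored


# Hypothetical true daily demand for one product, with 5 units on the shelf.
daily_demand = [3, 7, 5, 9, 2]
stock = 5

observations = [observe_sales(d, stock) for d in daily_demand]

# Averaging raw sales ignores the customers who found an empty shelf.
naive_mean = sum(sales for sales, _ in observations) / len(observations)
true_mean = sum(daily_demand) / len(daily_demand)

print(observations)
print(naive_mean, true_mean)
```

In this toy data the average of the recorded sales is lower than the average of the true demand, which is why the sold-out days need special treatment in the likelihood.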

To express the likelihood of different sales scenarios occurring on a daily basis, we define product set I for the products for which we observe uncensored demand (positive remaining stock at the end of the day) and product set J for those for which we observe censored demand (zero stock at the end of the day). Letting D_i represent the true demand for product i, we can express the likelihood of a sales scenario by

Here obs = {x_1, …, x_n} is the set of daily sales observations, one per product, and θ is the latent parameter vector that describes the distribution of the true demand. Figure 1 shows how to derive the likelihood of a sales scenario for an illustrative case.
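With x_i denoting the observed sales of product i, and assuming that demands are independent across products given θ (an assumption made here for illustration), a likelihood consistent with these definitions takes the standard censored-data form. Products in I contribute the probability of the exact demand, and products in J contribute the probability that demand was at least the observed sales:

```latex
L(\theta \mid \mathrm{obs})
  = \prod_{i \in I} P\left(D_i = x_i \mid \theta\right)
    \prod_{j \in J} P\left(D_j \ge x_j \mid \theta\right)
```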

Figure 1 Example of an observation and its likelihood

Maximizing this likelihood yields the Maximum Likelihood Estimator (MLE): the value of the latent parameter that best explains the observed sales. The MLE then allows us to model the distribution of the true demand from the censored information.
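As a rough sketch of what such an estimate could look like, the Python snippet below fits a single product's daily demand rate from censored sales. It assumes Poisson demand with one rate parameter and independent days, which the post does not specify, and the function names and data are hypothetical. Uncensored days contribute P(D = x) and sold-out days contribute P(D >= x).

```python
import math


def log_pmf(k, lam):
    """log P(D = k) for D ~ Poisson(lam)."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def log_tail(k, lam):
    """log P(D >= k) for D ~ Poisson(lam), floored to avoid log(0)."""
    below = sum(math.exp(log_pmf(j, lam)) for j in range(k))
    return math.log(max(1.0 - below, 1e-300))


def log_likelihood(lam, observations):
    """observations: list of (sales, sold_out) pairs for one product."""
    return sum(
        log_tail(sales, lam) if sold_out else log_pmf(sales, lam)
        for sales, sold_out in observations
    )


def fit_rate(observations, iters=200):
    """Ternary search for the rate that maximizes the log-likelihood."""
    lo = 1e-6
    # Heuristic upper bound for the search interval (an assumption).
    hi = 10.0 * max(sales for sales, _ in observations) + 10.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(m1, observations) < log_likelihood(m2, observations):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


uncensored = [(3, False), (5, False), (4, False)]
with_stockout = [(3, False), (5, True), (4, False)]

print(fit_rate(uncensored))      # close to the sample mean of the sales
print(fit_rate(with_stockout))   # larger, because one day sold out
```

With no sold-out days the estimate matches the plain average of the sales. Once a day is flagged as sold out, the estimate moves above that average. If every day is sold out, the likelihood keeps increasing with the rate and the search simply returns the upper bound, so at least some uncensored days are needed for a meaningful fit.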

Replenishment Optimization

Now that we have a model to describe the distribution of the demand, we can decide how much of each product to put on shelves. This is a constrained optimization problem: we need to maximize the expected revenue of the vending machine, subject to the constraint that only a limited number of products fit inside it.

This problem can be cast as a classical knapsack problem. For each product, we know its revenue and volume. Given the shelf capacity of the vending machine and the distribution of product demand, we want to maximize the total expected revenue while keeping the total occupied volume within the maximum capacity. To do so, we build a probability matrix M[i,j] that maps product i to each possible demand level j: the entry M[i,j] is the probability that product i is bought more than j times during the next replenishment cycle. Then we can solve the following problem:
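One way to read this formulation: stocking q_i units of product i yields expected sales of M[i,0] + M[i,1] + … + M[i,q_i-1], since the (j+1)-th unit is sold exactly when demand exceeds j. The objective is then to maximize the sum over products of r_i times that quantity, subject to the sum of v_i * q_i staying within capacity C. The symbols r_i, v_i, q_i and C are our own notation, and the Python sketch below is one possible dynamic-programming solution under the extra assumption that volumes and capacity are integers. The numbers in the example are invented.

```python
def optimize_stock(revenue, volume, M, capacity):
    """Choose how many units of each product to stock.

    revenue[i]: revenue per unit of product i
    volume[i]:  integer shelf volume per unit of product i
    M[i][j]:    probability that product i is bought more than j times
    capacity:   integer shelf capacity
    Returns (best expected revenue, list of quantities).
    """
    n = len(revenue)
    best = [0.0] * (capacity + 1)   # best[c]: optimum within volume c so far
    choice = []                     # choice[i][c]: units of product i chosen
    for i in range(n):
        new_best = best[:]          # stocking zero units of product i
        pick = [0] * (capacity + 1)
        gain = 0.0
        for q in range(1, len(M[i]) + 1):
            gain += revenue[i] * M[i][q - 1]   # expected revenue of q units
            used = q * volume[i]
            if used > capacity:
                break
            for c in range(used, capacity + 1):
                candidate = best[c - used] + gain
                if candidate > new_best[c]:
                    new_best[c] = candidate
                    pick[c] = q
        best = new_best
        choice.append(pick)

    # Walk back through the stored choices to recover the quantities.
    quantities = [0] * n
    c = capacity
    for i in reversed(range(n)):
        quantities[i] = choice[i][c]
        c -= quantities[i] * volume[i]
    return best[capacity], quantities


# Two hypothetical products sharing a shelf of capacity 4.
value, quantities = optimize_stock(
    revenue=[2.0, 3.0],
    volume=[1, 2],
    M=[[0.9, 0.5, 0.1], [0.8, 0.3]],
    capacity=4,
)
print(value, quantities)
```

Because each additional unit of a product is less likely to sell than the previous one, the marginal value of shelf space falls as a product's stock grows, which is what pushes the solution toward a mix of products rather than filling the shelf with a single item.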

Truckload Dispatch

Now that we know what we would like to place in the machine, we need to decide what to put in the trucks that will stock the machines. The key decision is when, and how often, to replenish. The decision rule is straightforward: we send a truck if the total expected lost sales exceed the shipment cost K. We evaluate the expected lost sales iteratively, one day at a time. For instance, we first consider replenishment for 1 day: if the expected lost sales are already greater than K, we send a truck to replenish for 1 day; otherwise we consider replenishment for 2 days. This process continues, day by day, until the expected lost sales for n days exceed K, at which point a truck is loaded and dispatched to replenish for n days.
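The day-by-day rule can be sketched in Python as follows. This is only one reading of the procedure: it assumes Poisson demand with a constant daily rate per product, values each lost sale at the product's unit revenue, and measures expected lost sales cumulatively over n days without replenishment. The function names, the `(stock, daily_rate, unit_revenue)` tuples and the `max_days` cap are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """P(D = k) for D ~ Poisson(lam)."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def expected_lost_units(stock, lam):
    """E[max(D - stock, 0)] for D ~ Poisson(lam).

    Uses E[max(D - s, 0)] = E[D] - E[min(D, s)], where
    E[min(D, s)] is the sum of P(D > j) for j = 0 .. s-1.
    """
    cdf = 0.0
    expected_sold = 0.0
    for j in range(stock):
        cdf += poisson_pmf(j, lam)
        expected_sold += 1.0 - cdf
    return max(lam - expected_sold, 0.0)


def days_until_dispatch(products, shipment_cost, max_days=365):
    """Smallest n whose expected lost sales over n days exceed the cost.

    products: list of (stock, daily_rate, unit_revenue) tuples.
    Returns None if the threshold is not reached within max_days.
    """
    for n in range(1, max_days + 1):
        lost = sum(
            price * expected_lost_units(stock, rate * n)
            for stock, rate, price in products
        )
        if lost > shipment_cost:
            return n
    return None


# One hypothetical machine with two products and a shipment cost K = 5.
machine = [(10, 3.0, 1.5), (6, 1.0, 2.0)]
print(days_until_dispatch(machine, shipment_cost=5.0))
```

A higher shipment cost K pushes the dispatch further out, while faster-selling or higher-revenue products pull it closer, which matches the trade-off described above.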

In this post, we reviewed a three-step sequential decision-making process for smart vending machine replenishment. First, we identify the product demand pattern; second, we link the demand pattern with an optimization problem for replenishment; and third, we decide when to dispatch a truck for replenishment. While supporting a much better shopping experience at vending machines, this more sophisticated decision-making process also improves demand forecast accuracy, helping the company increase its revenue by 5% compared with traditional rule-based replenishment decisions.

Original Medium Post

Titouan Jehl, PhD

Data Scientist

PhD in Industrial Engineering and Operations Research, with research focused on decision-making algorithms for the transportation industry. Currently working as a Research Scientist at Lyft.