Identification of patterns/profiles and solid waste production prediction in the city of Lisbon. 


To identify patterns that support the prediction of urban waste production from a variety of context information (e.g., events, climate conditions).

Research question

What is the expected mixed waste production for a set of mixed waste collection circuits for every week of the following month?


Analytical Service Specifications


Analytical Model Code


Analytical Service Dashboard


Challenge Brainstorm Session


Business understanding

To understand the needs of the Urban Hygiene department regarding its daily operational activities, several meetings took place. These meetings identified the need for a service that predicts mixed waste production for a specific set of waste collection circuits.

The service is intended to let the Urban Hygiene department anticipate the production of mixed waste and thereby optimize its waste collection operations, namely the number of trucks needed to collect the waste in a specific period.

Accordingly, it was defined that the service should be able to predict the amount of mixed waste that will be produced in each week of the following month.


Data Understanding

The literature identifies that waste production can be affected by external factors which transcend the standard patterns found in the day-to-day waste deposition.

Several factors such as population income or Gross Domestic Product (GDP) (Dangi, Pretz, Urynowicz, Gerow, & Reddy, 2011; Liu & Wu, 2010; Namlis & Komilis, 2019), population size (Estay-Ossandon & Mena-Nieto, 2018), the average age of the population (Callan & Thomas, 2017), education and household size (Mattar, Abiad, Chalak, Diab, & Hassan, 2018) have been pointed out in the literature as affecting waste production. An additional factor that has not been widely studied is tourism (Diaz-Farina, Díaz-Hernández, & Padrón-Fumero, 2020).

Indeed, the fluctuations in tourist numbers can be one of the main drivers in waste generation (Estay-Ossandon & Mena-Nieto, 2018) in touristic locations. For example, small businesses related to tourism, hotels, restaurants, and cafés constitute the main driver for food waste generation in touristic locations (Wang, Filimonau, & Li, 2021). As tourism is associated with holidays and special events, these also impact short-term fluctuations of waste generation (Han et al., 2018; Johnson et al., 2017).

For the #2 Waste management use case we had access to about 4 years of mixed waste collection data, covering the period from 01/01/2017 to 30/10/2020. Each observation corresponds to a waste collection load and records the day of collection, the waste collection circuit (a set of waste collection points from which waste trucks collect the waste), the type of waste collected (i.e., mixed, paper, plastic, or glass) and the total amount of waste collected in that load (in kg).
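Since the research question asks for weekly predictions per circuit, the individual loads have to be summed by circuit and week at some point. The sketch below shows one way to do that with ISO weeks. The record layout, the sample values and the second circuit id (7760) are illustrative assumptions, not taken from the dataset.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (collection day, circuit id, waste type, load in kg).
loads = [
    (date(2019, 1, 7), "7757", "mixed", 5200.0),
    (date(2019, 1, 9), "7757", "mixed", 4800.0),
    (date(2019, 1, 9), "7757", "paper", 900.0),
    (date(2019, 1, 14), "7757", "mixed", 5100.0),
    (date(2019, 1, 8), "7760", "mixed", 3000.0),
]


def weekly_mixed_totals(records):
    """Sum mixed-waste loads (kg) per circuit and ISO (year, week)."""
    totals = defaultdict(float)
    for day, circuit, waste_type, kg in records:
        if waste_type != "mixed":
            continue  # only mixed waste is modelled in this use case
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        totals[(circuit, iso[0], iso[1])] += kg
    return dict(totals)


weekly = weekly_mixed_totals(loads)
for key in sorted(weekly):
    print(key, weekly[key])
```

Keying on the ISO year together with the ISO week avoids mixing up the weeks that straddle the turn of the year.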

Table 9 summarizes the number of waste collection loads in the dataset for mixed waste.

Table 9. Summary of mixed waste collected in Lisbon between 2017 and 31st October 2020

Figure 15 presents the amount of undifferentiated (mixed) waste collected from 01/01/2017 to 30/10/2020. The mixed waste collected varies over time between 4000 and 5000 tons. However, after the end of March 2020 there was a substantial decrease in the waste collected, due to the lockdown imposed because of the COVID-19 pandemic.

For mixed waste, 23343 georeferenced collection points are grouped into 114 circuits, each regularly covered by a waste collection vehicle (see Figure 16 for the spatial distribution of the mixed waste collection points in Lisbon).

Figure 15. Undifferentiated waste collected in Lisbon between 01/01/2017 to 30/10/2020
Figure 16. Spatial distribution of the mixed waste collection points in the city of Lisbon

Data preparation

As the truck loads are reported at circuit level, and to obtain a better spatial representation of each circuit, mixed waste collection areas were created using Thiessen polygons (Figure 17A).

Each polygon corresponds to the collection area associated with one collection point. These polygons were then dissolved by circuit id, creating a mixed waste collection area for each collection circuit (Figure 17B).

Waste collection areas that were scattered across the study area and smaller than 3 ha were merged into the adjacent waste collection area.

This resulted in mixed waste collection areas for 97 circuits.
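The idea behind the Thiessen and dissolve steps can be illustrated without GIS software. A Thiessen polygon contains every location that is closer to its collection point than to any other, so assigning grid cells to their nearest point and summing cell areas by circuit id approximates the dissolved circuit areas. This is a raster approximation written for illustration only, not the vector workflow used in the study. The coordinates, circuit labels and cell size are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical collection points: (x, y) in metres and the circuit they belong to.
points = [
    ((50.0, 50.0), "A"),
    ((150.0, 50.0), "A"),
    ((50.0, 150.0), "B"),
    ((150.0, 150.0), "B"),
]


def circuit_areas(points, width, height, cell=10.0):
    """Approximate dissolved Thiessen areas (m^2) per circuit on a regular grid."""
    areas = defaultdict(float)
    nx, ny = int(width / cell), int(height / cell)
    for i in range(nx):
        for j in range(ny):
            centre = ((i + 0.5) * cell, (j + 0.5) * cell)
            # Nearest collection point decides which Thiessen polygon the cell is in.
            _, circuit = min(points, key=lambda p: math.dist(centre, p[0]))
            # Summing cells by circuit id plays the role of the dissolve step.
            areas[circuit] += cell * cell
    return dict(areas)


areas = circuit_areas(points, width=200.0, height=200.0)

# Flag areas below the 3 ha threshold (1 ha = 10,000 m^2) as merge candidates.
small = sorted(c for c, a in areas.items() if a < 3 * 10_000)
print(areas, small)
```

Merging a flagged area into its neighbour additionally needs adjacency information, which this sketch does not model.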



Figure 17. Creation of undifferentiated collection areas. A) Example of the Thiessen polygons created for each one of the collection points (represented by the black dots). Colours indicate different collection circuits. B) Collection circuit areas estimated from the Thiessen polygons.
Table 10. Features selected for modelling the use case #2 Waste management


Several models of the ARIMA family (Box, Jenkins, & Reinsel, 2008) were trained and tested, namely ARIMA, SARIMA, ARIMAX and SARIMAX. The models were trained with mixed waste collection data from 10/04/2017 to 31/10/2019 and tested with data from 01/11/2019 to 31/10/2020. Figure 18 presents an example of the results for mixed waste collection circuit 7757, using the ARIMA, ARIMAX, SARIMA and SARIMAX models with the parameters at circuit level. Holidays were used as the exogenous variable for the ARIMAX and SARIMAX models.

Figure 19 presents the observed and predicted values of the SARIMAX model for the train and test data of mixed waste collection circuit 7757.
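The date-based split and the holiday regressor described above can be sketched as follows. The split dates come from the text. The weekly series, the choice of keying weeks by their Monday, the 0/1 encoding of the holiday variable and the short holiday list are assumptions made for illustration.

```python
from datetime import date, timedelta

TRAIN_START, TRAIN_END = date(2017, 4, 10), date(2019, 10, 31)
TEST_START, TEST_END = date(2019, 11, 1), date(2020, 10, 31)

# Illustrative subset of holidays; a real run would use the full calendar.
HOLIDAYS = {date(2019, 12, 25), date(2020, 1, 1), date(2020, 5, 1)}


def holiday_flag(week_start):
    """Return 1 if any holiday falls in the 7 days starting at week_start, else 0."""
    days = (week_start + timedelta(days=k) for k in range(7))
    return int(any(d in HOLIDAYS for d in days))


def split(weekly_kg):
    """Split {week_start: kg} into train/test lists of (week_start, kg, holiday_flag)."""
    train, test = [], []
    for week_start in sorted(weekly_kg):
        row = (week_start, weekly_kg[week_start], holiday_flag(week_start))
        # A week is assigned by its start date, even if it straddles the cut-off.
        if TRAIN_START <= week_start <= TRAIN_END:
            train.append(row)
        elif TEST_START <= week_start <= TEST_END:
            test.append(row)
    return train, test


# Hypothetical weekly totals (kg) for one circuit, keyed by the Monday of each week.
weekly_kg = {
    date(2019, 10, 21): 41000.0,
    date(2019, 10, 28): 40500.0,
    date(2019, 11, 4): 39800.0,
    date(2019, 12, 23): 45200.0,
}
train, test = split(weekly_kg)
print(len(train), len(test), test[-1])
```

The third element of each row is what would be passed as the exogenous input to an ARIMAX or SARIMAX implementation, alongside the weekly series itself.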


Figure 18. Observed and predicted values for training (top figure) and test datasets (bottom figure) using ARIMA, ARIMAX, SARIMA and SARIMAX models for mixed waste collection circuit 7757 with the time-series parameters at circuit level
Figure 19. Observed and predicted values of SARIMAX model for the train (top figure) and test (bottom figure) sample for mixed waste collection circuit 7757

The quality of the models was assessed by computing the mean absolute error (MAE), the root mean squared error (RMSE) and the mean absolute percentage error (MAPE) (de Myttenaere, Golden, Le Grand, & Rossi, 2016).
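For reference, the three error measures can be computed as in the minimal sketch below, which uses their standard definitions. The observed and predicted values are made-up weekly totals in kg, not results from the study.

```python
import math


def mae(observed, predicted):
    """Mean absolute error, in the units of the data."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error, in the units of the data."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (undefined if an observation is 0)."""
    ratios = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100.0 * ratios / len(observed)


# Hypothetical weekly totals (kg) for one circuit.
observed = [40000.0, 50000.0, 40000.0, 50000.0]
predicted = [44000.0, 45000.0, 40000.0, 55000.0]

print(mae(observed, predicted))
print(rmse(observed, predicted))
print(mape(observed, predicted))
```

MAE and RMSE are expressed in kg and therefore depend on the size of the circuit, whereas MAPE is scale-free, which makes it the more convenient measure for comparing circuits.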

Table 11 presents the weekly mean values of MAE, RMSE and MAPE for the 95 mixed waste collection circuits analysed. Judging forecast quality by MAPE on the scale developed by Lewis (1982), i.e., highly accurate forecast (MAPE <= 10%), good forecast (MAPE = 11% - 20%), reasonable forecast (MAPE = 21% - 50%) and inaccurate forecast (MAPE > 50%), all models presented, on average, a good forecast.
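The Lewis (1982) bands can be applied mechanically to a mean MAPE, as in the sketch below. The bands in the text are given with integer bounds, so treating 10, 20 and 50 as continuous cut-offs is an assumption of this example. The per-circuit MAPE values are hypothetical, as is the circuit id 7760.

```python
def lewis_category(mape_percent):
    """Map a MAPE value (in percent) to the forecast-quality band of Lewis (1982)."""
    if mape_percent <= 10:
        return "highly accurate"
    if mape_percent <= 20:
        return "good"
    if mape_percent <= 50:
        return "reasonable"
    return "inaccurate"


# Hypothetical per-circuit MAPE values (percent) for one model.
mape_by_circuit = {"7757": 12.0, "7760": 18.0, "7761": 9.0}

mean_mape = sum(mape_by_circuit.values()) / len(mape_by_circuit)
print(mean_mape, lewis_category(mean_mape))
```

Classifying the mean can hide individual circuits that fall into a worse band, so applying `lewis_category` per circuit as well gives a fuller picture.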

Table 11. Mean MAE, mean RMSE and mean MAPE for the 95 mixed waste collection circuits analysed


To validate the results with the Lisbon City Council Urban Hygiene Department, a dashboard with several reports was built on top of a star-schema dimensional model.

Figure 20. Star schema dimensional model for the #2 Waste management use case


Box, G., Jenkins, G., & Reinsel, G. (2008). Time Series Analysis: Forecasting and Control (4th ed.). Hoboken, NJ: Wiley.

Callan, S. J., & Thomas, J. M. (2017). Analyzing demand for disposal and recycling services: A systems approach. Eastern Economic Journal, 32(2), 221–240. http://www.jstor.org/stable/40326269

Dangi, M. B., Pretz, C. R., Urynowicz, M. A., Gerow, K. G., & Reddy, J. M. (2011). Municipal solid waste generation in Kathmandu, Nepal. Journal of Environmental Management, 92(1), 240–249. https://doi.org/10.1016/J.JENVMAN.2010.09.005

de Myttenaere, A., Golden, B., Le Grand, B., & Rossi, F. (2016). Mean Absolute Percentage Error for regression models. Neurocomputing, 192, 38–48. https://doi.org/10.1016/j.neucom.2015.12.114

Diaz-Farina, E., Díaz-Hernández, J. J., & Padrón-Fumero, N. (2020). The contribution of tourism to municipal solid waste generation: A mixed demand-supply approach on the island of Tenerife. Waste Management, 102, 587–597. https://doi.org/10.1016/J.WASMAN.2019.11.023

Estay-Ossandon, C., & Mena-Nieto, A. (2018). Modelling the driving forces of the municipal solid waste generation in touristic islands. A case study of the Balearic Islands (2000–2030). Waste Management, 75, 70–81. https://doi.org/10.1016/J.WASMAN.2017.12.029

Han, Z., Liu, Y., Zhong, M., Shi, G., Li, Q., Zeng, D., … Xie, Y. (2018). Influencing factors of domestic waste characteristics in rural areas of developing countries. Waste Management, 72, 45–54. https://doi.org/10.1016/J.WASMAN.2017.11.039

Johnson, N. E., Ianiuk, O., Cazap, D., Liu, L., Starobin, D., Dobler, G., & Ghandehari, M. (2017). Patterns of waste generation: A gradient boosting model for short-term waste prediction in New York City. Waste Management, 62, 3–11. https://doi.org/10.1016/j.wasman.2017.01.037

Lewis, C. D. (1982). Industrial and business forecasting methods. London: Butterworths.

Liu, C., & Wu, X.-W. (2010). Factors influencing municipal solid waste generation in China: A multiple statistical analysis study. Waste Management & Research, 29(4), 371–378. https://doi.org/10.1177/0734242X10380114

Mattar, L., Abiad, M. G., Chalak, A., Diab, M., & Hassan, H. (2018). Attitudes and behaviors shaping household food waste generation: Lessons from Lebanon. Journal of Cleaner Production, 198, 1219–1223. https://doi.org/10.1016/J.JCLEPRO.2018.07.085

Namlis, K. G., & Komilis, D. (2019). Influence of four socioeconomic indices and the impact of economic crisis on solid waste generation in Europe. Waste Management, 89, 190–200. https://doi.org/10.1016/J.WASMAN.2019.04.012

Wang, L.-E., Filimonau, V., & Li, Y. (2021). Exploring the patterns of food waste generation by tourists in a popular destination. Journal of Cleaner Production, 279, 123890. https://doi.org/10.1016/J.JCLEPRO.2020.123890