  • Vaibhav Verma

How Big Data and Advanced Analytics can transform manufacturing

Updated: Jan 3

Big Data

Big Data is characterized by three V’s: volume, velocity, and variety. It is defined as data of greater variety, arriving in high volumes and at high velocity. Big Data means more complex data sets, often from new data sources. These data sets are so voluminous that legacy data processing software can’t manage them, but they can be used to address business problems that could not be tackled before (Oracle).

Advanced Analytics

The application of statistics and other mathematical tools to business data to assess and gain insights is referred to as Advanced Analytics. In the manufacturing sector, advanced analytics can be used by operation managers to gain a deeper understanding of historical process data, identify patterns and relationships among discrete process steps and inputs, and then optimize the relevant factors that have the greatest effect on the yield of the output (Eric Auschitzky, Markus Hammer, and Agesan Rajagopaul, 2014).
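One simple way to look for relationships between process inputs and yield is to rank the inputs by their correlation with the output. The sketch below is only an illustration of that idea: the parameter names, the numbers, and the helper functions `pearson` and `rank_inputs_by_correlation` are all hypothetical, and a strong correlation is a hint for further investigation rather than proof of cause.

```python
from math import sqrt

# Hypothetical historical process records (assumed values, not real plant data):
# each list is one input parameter, measured for the same six batches.
history = {
    "temperature": [180, 185, 190, 195, 200, 205],
    "pressure": [30, 32, 31, 29, 30, 31],
    "mix_time": [12, 11, 13, 12, 14, 13],
}
yield_pct = [81.0, 83.5, 85.0, 87.5, 90.0, 91.5]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_inputs_by_correlation(inputs, output):
    """Return (name, r) pairs sorted by |r| with the output, strongest first."""
    scores = [(name, pearson(values, output)) for name, values in inputs.items()]
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)


ranking = rank_inputs_by_correlation(history, yield_pct)
for name, r in ranking:
    print(f"{name:12s} r = {r:+.2f}")
```

In this made-up data set, temperature tracks yield most closely, so it would be the first factor an operations manager might examine more carefully.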

The earliest adopters of Big Data were in the fields of science and technology, finance, and marketing, because their data was better structured and available in digital form. This wasn’t the case in the manufacturing sector, but the Industrial Internet of Things (IIoT) has provided the missing piece of the puzzle.

Big Data and Advanced Analytics layout in a manufacturing company

Source: (Chu, 2016)

Description of the layout (Chu, 2016):

1. The on-premise gateway manages the authentication and security between the remote Internet of Things (IoT) gateway and the facility edge nodes. An edge node is a computer that acts as an end-user portal for communication with other nodes. It is the intermediary between nodes, remote data, and control sources for the transfer of data. It may also perform analytics processing and handle data normalization. The gateway isn’t directly connected to the Internet and is located behind a firewall.

2. The data center IoT gateway manages the two-way communication between remote devices and gateways. It is responsible for authentication, security, and connectivity. The gateway isn’t directly connected to the Internet and is located behind a firewall.

3. The data ingestion layer handles the incoming data from a variety of sources and distributes it further.

4. The stream processing layer performs real-time analytics on the incoming data. It doesn't hold data for longer periods.

5. The data is held in the Big Data system. However, many companies prefer to collect and hold the data in a Data Warehouse for security and performance reasons.
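The flow through layers 3 to 5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the architecture described by Chu (2016): the class `StreamProcessor`, the function `ingest`, the sensor names, the window size, and the tolerance are all hypothetical. The point it shows is that the stream layer keeps only a short window of recent readings, while every reading is also passed on to long-term storage.

```python
from collections import deque


class StreamProcessor:
    """Keeps only recent readings and flags values far from their rolling mean."""

    def __init__(self, window_size=5, tolerance=10.0):
        # A bounded deque drops the oldest reading automatically,
        # so the stream layer never holds data for long.
        self.window = deque(maxlen=window_size)
        self.tolerance = tolerance

    def process(self, reading):
        """Return True if the reading deviates from the rolling mean of the
        previous readings by more than the tolerance."""
        is_alert = False
        if self.window:
            rolling_mean = sum(self.window) / len(self.window)
            is_alert = abs(reading - rolling_mean) > self.tolerance
        self.window.append(reading)
        return is_alert


def ingest(sources):
    """Merge readings from several sources into one stream of (source, value)."""
    for source_name, values in sources.items():
        for value in values:
            yield source_name, value


# Hypothetical sensor sources and readings (assumed for illustration).
sources = {
    "press_01": [50.0, 51.0, 49.5, 75.0, 50.5],
    "press_02": [30.0, 30.5, 29.5, 30.0, 31.0],
}

processors = {name: StreamProcessor() for name in sources}
long_term_store = []  # stand-in for the Big Data system or Data Warehouse
alerts = []

for source_name, value in ingest(sources):
    long_term_store.append((source_name, value))  # layer 5: keep everything
    if processors[source_name].process(value):    # layer 4: real-time check
        alerts.append((source_name, value))

print(alerts)
```

Only the spike of 75.0 on `press_01` is flagged here, while all ten readings end up in the long-term store.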

Use cases of Big Data in vehicle manufacturing

Centro Ricerche FIAT (CRF) is a private research center in Italy and represents Fiat Chrysler Automobiles (FCA). In the European Horizon 2020 I-BiDaaS project (I-BiDaas), CRF identified two use cases involving complex data sets (Edward Curry, 2022). The following two cases are based on the study published in the referenced book.

1. Process of Aluminium Die-Casting

The aim is to predict during the casting process whether an engine block will be produced correctly, so that further processing of defective parts and scrap can be avoided. This would result in cost savings for the manufacturers.

The “Production process of aluminium die-casting” use case generated complex datasets from the production of the engine blocks. In the die-casting process, molten aluminium is injected into a die cavity mounted in a machine, where it solidifies quickly. The flow behavior of the molten metal inside the cavity is influenced by a large number of interconnected process parameters, which in turn affect quality and productivity (Winkler, 2015) (Fiorese, 2016) (Chandrasekaran, 2019).

The data provided for the analyses consist of casting process parameters such as piston speed in the first and second phases, intensification pressures, and others. Additionally, thermal images of the engine block casting process were also provided. Keeping in mind the complexity of the process, it is important to design the parameters and temperature carefully and also control them because they have a direct impact on the quality of the casting.
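To make the prediction task concrete, the sketch below classifies a casting as "ok" or "scrap" from three process parameters using a nearest-centroid rule. The source does not say which model the project used, so this technique is an assumption chosen for brevity. The sample values, the units, and the helpers `fit` and `predict` are hypothetical as well, and the thermal images are left out.

```python
from math import dist
from statistics import mean, pstdev

# Hypothetical labelled castings (assumed values): piston speed in phase 1,
# piston speed in phase 2, intensification pressure, and the outcome.
castings = [
    ((0.20, 3.5, 900.0), "ok"),
    ((0.22, 3.6, 920.0), "ok"),
    ((0.21, 3.4, 910.0), "ok"),
    ((0.35, 2.5, 700.0), "scrap"),
    ((0.33, 2.6, 720.0), "scrap"),
    ((0.36, 2.4, 690.0), "scrap"),
]


def fit(samples):
    """Standardize each parameter and compute one centroid per outcome label."""
    columns = list(zip(*(features for features, _ in samples)))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]  # guard against zero spread

    def scale(features):
        return tuple((x - m) / s for x, m, s in zip(features, means, stds))

    centroids = {}
    for label in {label for _, label in samples}:
        rows = [scale(f) for f, lab in samples if lab == label]
        centroids[label] = tuple(mean(col) for col in zip(*rows))
    return scale, centroids


def predict(model, features):
    """Return the label whose centroid is closest to the scaled parameters."""
    scale, centroids = model
    point = scale(features)
    return min(centroids, key=lambda label: dist(point, centroids[label]))


model = fit(castings)
print(predict(model, (0.21, 3.5, 905.0)))
print(predict(model, (0.34, 2.5, 705.0)))
```

Standardizing the parameters first keeps the pressure, which has the largest numeric range, from dominating the distance.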

2. Maintenance and Monitoring of Production Assets

Data was retrieved from sensors installed on machines along the vehicle production lines. The focus was on welding lines, in which robots are used to assemble vehicle components. Because the types of components and vehicles change continuously, the welding lines need to be flexible. The sensor data was collected on data servers and grouped into two categories: SCADA and MES.

SCADA (Supervisory Control and Data Acquisition) is a category of software applications for controlling industrial processes. The SCADA datasets contain daily vehicle production, process, and control parameters. Manufacturing execution systems (MES) are computerized systems used to track and document production on the shop floor. The MES dataset contains specific data related to the type of vehicle being produced.

The aim of the data analysis is to predict unnecessary actions and to improve the efficiency of manufacturing plants by reducing production losses.
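Before any such analysis, the SCADA and MES records have to be brought together. The following sketch joins the two by production date and compares average output per vehicle type. The field names, the values, and the choice of the date as join key are assumptions made for illustration; the source does not describe the actual schema.

```python
# Hypothetical records; field names and the join key ("date") are assumptions.
scada_records = [
    {"date": "2021-03-01", "units_produced": 410, "weld_current_ka": 9.8},
    {"date": "2021-03-02", "units_produced": 385, "weld_current_ka": 10.4},
    {"date": "2021-03-03", "units_produced": 402, "weld_current_ka": 9.9},
]
mes_records = [
    {"date": "2021-03-01", "vehicle_type": "A"},
    {"date": "2021-03-02", "vehicle_type": "B"},
    {"date": "2021-03-03", "vehicle_type": "A"},
]


def join_by_date(scada, mes):
    """Attach the MES fields to each SCADA record that has a matching date."""
    mes_by_date = {record["date"]: record for record in mes}
    return [
        {**record, **mes_by_date[record["date"]]}
        for record in scada
        if record["date"] in mes_by_date
    ]


def mean_output_by_type(rows):
    """Average daily output for each vehicle type."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["vehicle_type"], []).append(row["units_produced"])
    return {vtype: sum(units) / len(units) for vtype, units in grouped.items()}


joined = join_by_date(scada_records, mes_records)
output_by_type = mean_output_by_type(joined)
print(output_by_type)
```

A lower average for one vehicle type, as for type "B" in this made-up data, would be a starting point for looking into where production losses occur.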

Results and lessons for the manufacturing industry

An integrated platform was developed for processing and extracting actionable knowledge from Big Data in manufacturing under the I-BiDaaS project. The following guidelines are suggested for the implementation of Big Data Analytics in the manufacturing sector:

1. Data storage and uptake from different data sources and its preparation:

A production line has digital instruments installed, and certain devices set the operating values. During the production process, they also adjust and control parameters. It is important to understand how the data is transferred from the data sources and managed over time, and how access to it is granted. This underlines the importance of breaking data silos by extracting data from several devices and levels. It may be necessary to engage different departments of the same company or of different companies.

2. Data Cleaning

It is important to identify which parts of the generated dataset are useful for analysis, and which parts are inaccurate, incomplete, or irrelevant.
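A cleaning step of this kind could look like the sketch below. The field names, the valid ranges, and the function `clean` are hypothetical. The function drops incomplete and out-of-range records and keeps only the fields that are relevant for the analysis.

```python
def clean(records, required, valid_ranges):
    """Split records into kept and rejected ones.

    required:     fields that must be present and not None
    valid_ranges: {field: (low, high)} for fields listed in `required`
    """
    kept, rejected = [], []
    for record in records:
        if any(record.get(field) is None for field in required):
            rejected.append((record, "incomplete"))
            continue
        out_of_range = [
            field
            for field, (low, high) in valid_ranges.items()
            if not low <= record[field] <= high
        ]
        if out_of_range:
            rejected.append((record, "inaccurate"))
            continue
        # Keep only the relevant fields and drop everything else.
        kept.append({field: record[field] for field in required})
    return kept, rejected


# Hypothetical raw records (assumed values).
raw = [
    {"piston_speed": 3.5, "pressure": 900.0, "operator_note": "ok"},
    {"piston_speed": None, "pressure": 910.0, "operator_note": ""},
    {"piston_speed": 3.4, "pressure": -5.0, "operator_note": "sensor fault?"},
    {"piston_speed": 3.6, "pressure": 920.0, "operator_note": ""},
]

kept, rejected = clean(
    raw,
    required=["piston_speed", "pressure"],
    valid_ranges={"pressure": (0.0, 2000.0)},
)
print(len(kept), [reason for _, reason in rejected])
```

Returning the rejected records together with a reason makes it easier to check later whether the cleaning rules were too strict.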

3. Fabrication of realistic synthetic data for experimentation and testing

Company data is confidential, which makes sharing it externally a challenge. Manufacturers should therefore evaluate whether realistic synthetic data can be fabricated for experimenting with analytical models. The models can be developed on the synthetic data and then tested with anonymized data.
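A very simple way to fabricate such data is to sample each parameter from a normal distribution with the same mean and standard deviation as the real measurements. This is a deliberately naive sketch and not the method used in the I-BiDaaS project: it treats the columns as independent and so ignores correlations between parameters. The records and the function `fabricate` are hypothetical.

```python
import random
from statistics import mean, stdev

# Hypothetical confidential records (assumed values).
real_records = [
    {"piston_speed": 3.5, "pressure": 900.0},
    {"piston_speed": 3.6, "pressure": 920.0},
    {"piston_speed": 3.4, "pressure": 910.0},
    {"piston_speed": 3.5, "pressure": 905.0},
    {"piston_speed": 3.6, "pressure": 915.0},
]


def fabricate(records, n, seed=0):
    """Draw n synthetic records, sampling each column independently from a
    normal distribution fitted to the real column."""
    rng = random.Random(seed)  # fixed seed keeps experiments reproducible
    columns = {key: [record[key] for record in records] for key in records[0]}
    stats = {key: (mean(values), stdev(values)) for key, values in columns.items()}
    return [
        {key: rng.gauss(mu, sigma) for key, (mu, sigma) in stats.items()}
        for _ in range(n)
    ]


synthetic = fabricate(real_records, n=1000, seed=42)
print(synthetic[0])
```

The synthetic records share the overall level and spread of the real ones without repeating any individual measurement, which is what makes them easier to share for model development.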

4. To increase the speed of data analytics, batch and stream analytics should be used

After collecting and analyzing the data, it is necessary to understand which Big Data technologies suit the business requirements. Batch processing refers to the processing of a high volume of data in a batch within a specific period. Stream processing refers to the processing of a continuous stream of data immediately as it is produced. (Difference between Batch Processing and Stream Processing, 2022)
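The difference can be illustrated with the same calculation done both ways. In the sketch below, `batch_mean` needs the complete data set before it can run, while `RunningMean` updates its result with every new reading. The names and the sample readings are hypothetical.

```python
def batch_mean(values):
    """Batch style: compute the mean once the whole data set is available."""
    return sum(values) / len(values)


class RunningMean:
    """Stream style: update the mean incrementally as each value arrives."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


# Hypothetical temperature readings (assumed values).
readings = [20.0, 22.0, 21.0, 25.0]

stream = RunningMean()
intermediate = [stream.update(value) for value in readings]

print(batch_mean(readings))  # available only after all readings are collected
print(intermediate)          # an up-to-date result after every reading
```

Both approaches end at the same value. The stream version simply makes a current result available at every step without storing the full history.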

5. Advanced visualization tools displaying the results for the end users

Visualization tools provide expert and non-expert end users (e.g. engineers, operators, and manufacturers) with the insights, value, and operational knowledge extracted from the available data.

The CRF use case shows that existing production lines can be improved to maximize product quality through the solutions provided by Big Data technologies. Improved process efficiency can reduce costs and save energy, helping manufacturers achieve greater competitiveness and sustainability.


Chandrasekaran, R. C. (2019). Reduction of scrap percentage of cast parts by optimizing the process parameters. Procedia Manufacturing. doi:

Chu, L. P. (2016). Data Science for Modern Manufacturing. In L. P. Chu, Data Science for Modern Manufacturing (p. 29). O'Reilly.

Difference between Batch Processing and Stream Processing. (2022, October 29). Retrieved from Geeksforgeeks:

Edward Curry, S. A. (2022). Technologies and Applications for Big Data Value. Springer. doi:

Eric Auschitzky, Markus Hammer, and Agesan Rajagopaul. (2014, July 1). How big data can improve manufacturing. Retrieved from McKinsey:

Fiorese, E. &. (2016). Process parameters affecting quality of high-pressure diecast Al-Si alloy.

I-BiDaas. (n.d.). Retrieved from

Oracle. (n.d.). Big Data. Retrieved from Oracle:

Winkler, M. K. (2015). Correlation between process parameters and quality characteristics in aluminum high-pressure die casting. NADCA.
