Preventing COVID-19 Pandemic Using Big Data
Taiwan has kept the pandemic under control and is now moving into a "new life" stage of pandemic prevention. International media have repeatedly reported on Taiwan’s outstanding outbreak response, especially its handling of the cluster infection on the Diamond Princess cruise ship, from which 2,700 passengers had entered Taiwan. The Executive Yuan team analyzed the passengers’ footprints using their mobile phone signals, combining the base station signal records of telecommunication operators with the geographic positioning of mobile phones. Using big data analysis technology, the team successfully located more than 620,000 people who had come into contact with the passengers, and then followed up with and screened them. In pandemic prevention work that races against time like this, only new-generation cloud data warehouse technology can be relied on to process large amounts of data quickly and in real time.
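To make the idea of matching footprints more concrete, here is a minimal sketch of how contacts could be found from base station records. It is not the Executive Yuan team’s actual method; the record layout (`Sighting`), the identifiers, and the rule "same base station, overlapping time window" are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Sighting(NamedTuple):
    phone_id: str    # anonymised handset identifier (hypothetical)
    station_id: str  # base station the handset was attached to
    start: int       # attachment start, in minutes since an arbitrary epoch
    end: int         # attachment end, same unit


def find_contacts(sightings, passenger_ids):
    """Return phone_ids that shared a base station with any passenger
    during an overlapping time interval."""
    passenger_ids = set(passenger_ids)

    # Index the passengers' sightings by base station for fast lookup.
    by_station = defaultdict(list)
    for s in sightings:
        if s.phone_id in passenger_ids:
            by_station[s.station_id].append(s)

    contacts = set()
    for s in sightings:
        if s.phone_id in passenger_ids:
            continue
        for p in by_station.get(s.station_id, ()):
            # Two intervals overlap when each starts before the other ends.
            if s.start < p.end and p.start < s.end:
                contacts.add(s.phone_id)
                break
    return contacts


sightings = [
    Sighting("passenger-1", "BS-101", 0, 60),
    Sighting("resident-A", "BS-101", 30, 90),   # overlaps with passenger-1
    Sighting("resident-B", "BS-101", 60, 120),  # arrives as the passenger leaves
    Sighting("resident-C", "BS-202", 10, 50),   # different base station
]
print(sorted(find_contacts(sightings, {"passenger-1"})))
```

At the scale described above, the same join would be expressed as a query in a data warehouse rather than as an in-memory loop; the sketch only shows the matching logic.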
Fast, scalable BigQuery requires no infrastructure management and can save enterprises roughly half the cost
Cloud data warehouses have been around for seven to eight years. As enterprises move more and more application services to the cloud, they have also adopted hybrid cloud and multi-cloud architectures and have experienced the benefits of cloud computing first-hand, such as real-time processing, lower cost, and easier management. Many enterprises have also started to gradually move applications such as business intelligence (BI) and data warehousing to the cloud, or to adopt new-generation cloud data warehouse services directly. According to the “2019 Technology Spending Intentions Survey” by the Enterprise Strategy Group (ESG), 47% of the enterprises that have already adopted cloud IaaS/PaaS services will also use them to run BI queries and big data analysis.
Traditional data warehouse systems built on-premises require large software and hardware construction costs as well as manpower for infrastructure maintenance. In contrast, serverless cloud data warehouse services allow enterprises to focus on data analysis tasks without having to worry about system updates and upgrades or related security issues.
First, also according to ESG, BigQuery, the data warehouse service of Google Cloud, can save enterprises 41% to 52% in total cost of ownership compared with traditional data warehouse systems or with migrating data warehouse systems to IaaS.
Secondly, before data is queried with BigQuery, high-speed streaming data can first be written to Cloud Bigtable for processing. Today, data streams in from many sources, such as website user behavior, real-time mobile devices, and IoT messages. When BigQuery performs machine learning, data labels can be read directly, which is equivalent to converting unstructured data into structured data and speeds up modeling.
Thirdly, it takes care of both automation and high availability. Through the Data Transfer Service tool, SaaS data can be loaded into BigQuery automatically on a schedule for analysis. BigQuery also automatically keeps highly available replicas in storage at several locations, and enterprises do not have to pay additional fees or make additional adjustments or settings.
In the past, on-premises data warehouse systems always required computing resources to be estimated and allocated in advance; if queries turned out to run too slowly, it was inconvenient to expand resources at short notice.
The feature that most distinguishes Google BigQuery from other cloud data warehouse services is that developers do not have to configure in advance how many machines a cluster needs. Google Cloud adjusts resources dynamically, so developers only need to focus on writing SQL and do not have to manage or tune the cluster’s operation or computing resources.
3 key points for evaluating cloud data warehouse services
CloudMile’s professional technical consultant team indicated that approximately 40% of enterprises across different industries in Taiwan have currently adopted cloud data warehousing. It is mainly used for big data analysis to support BI decisions, or for machine learning to produce predictions, and the use cases can be divided into three categories according to the data attributes. The first is system log data from IoT devices or machine equipment; manufacturers can use it to predict material needs on the production line and use historical data to predict future changes. The second is online transaction and click behavior collected from users, which the e-commerce industry analyzes in real time. The third is existing databases, which can be integrated into a single data warehouse for aggregated analysis.
Recently, CloudMile helped a carrier company convert the database of its legacy application system into a cloud data warehouse on Google BigQuery. Using historical data on the geographic locations where customers hailed services, in combination with machine learning technology, the company predicts which locations will have passengers requiring services in a few hours and dispatches vehicles to these locations in advance. With BigQuery, not only was computing performance faster than with the earlier on-premises execution, but the company also gained a 30%~40% return on investment.
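As a rough illustration of the dispatch idea, the sketch below ranks pickup locations by their historical average demand at a given hour. It is a deliberately simple stand-in for the machine learning model mentioned above, not CloudMile’s or the customer’s implementation; the record format `(date, hour, cell_id)` and the function name `rank_pickup_cells` are hypothetical.

```python
from collections import defaultdict


def rank_pickup_cells(history, target_hour, top_n=2):
    """history: iterable of (date, hour, cell_id) pickup records.
    Returns the top_n (cell_id, average pickups per day) pairs
    for target_hour, highest average first."""
    counts = defaultdict(int)
    days = set()
    for day, hour, cell in history:
        days.add(day)
        if hour == target_hour:
            counts[cell] += 1
    if not days:
        return []
    # Average over all observed days, including days with no pickups.
    averages = {cell: n / len(days) for cell, n in counts.items()}
    # Sort by descending average, then by cell id for a stable order.
    ranked = sorted(averages.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


history = [
    ("2020-05-01", 18, "station"),
    ("2020-05-01", 18, "station"),
    ("2020-05-01", 18, "mall"),
    ("2020-05-01", 9, "office"),
    ("2020-05-02", 18, "station"),
    ("2020-05-02", 18, "station"),
    ("2020-05-02", 18, "mall"),
    ("2020-05-02", 18, "mall"),
]
# Cells to pre-position vehicles at before 18:00.
print(rank_pickup_cells(history, target_hour=18))
```

A production model would likely add features such as weekday, weather, and events; the point here is only that the prediction starts from an aggregation over historical pickup locations, which is the kind of query a data warehouse is built for.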
As regulations gradually loosen in the future, cloud data warehouse services are expected to become mainstream; many traditional data warehouse providers have already entered the cloud data warehouse market. When evaluating different cloud data warehouse services, enterprises should look for the following service features and partner capabilities:
The cloud data warehouse service must be able to adjust resources dynamically and must not require storage and computing machines to be configured in advance, so that developers can focus on SQL development.
The partner must be able to help optimize cloud data warehouse performance. Because performance is closely tied to cloud service fees, a service provider that offers the following services helps increase the return on investment of the cloud data warehouse service. It should advise on writing SQL, for example which constructs should be avoided or used carefully. It should also review the data structure and the design of the dataset, as well as the design and placement of tables, and suggest how they should be split. Finally, it should recommend and make good use of the free features of the data warehouse service; for example, the cache of Google BigQuery can accelerate queries. Making good use of partition settings and keeping the data to be queried in the cache reduces query time and the total amount of data processed by queries, which further lowers the cost of using the BigQuery service.
In addition to providing SQL writing suggestions before analysis, the partner should continue to help with optimization and provide consultation after the customer has performed the analysis.
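The link between partitioning, caching, and cost in the second point can be sketched with a small model. It assumes on-demand pricing billed by bytes scanned and that a result served from cache is not billed; the price constant, the daily partition layout, and the function names are illustrative assumptions, not figures from the article or from Google’s price list.

```python
from datetime import date

TIB = 1024 ** 4
# Hypothetical on-demand price per TiB scanned; check the current price list.
PRICE_PER_TIB_USD = 5.0


def bytes_scanned(partition_sizes, start=None, end=None):
    """partition_sizes maps a partition date to its stored bytes.
    Without a date filter every partition is read; with one, only the
    partitions inside [start, end] are read (partition pruning)."""
    return sum(
        size
        for day, size in partition_sizes.items()
        if (start is None or day >= start) and (end is None or day <= end)
    )


def query_cost_usd(scanned, cached=False):
    # Simplifying assumption: a result served from cache is not billed.
    return 0.0 if cached else scanned / TIB * PRICE_PER_TIB_USD


# Ten daily partitions of 0.5 TiB each (made-up sizes).
table = {date(2020, 5, d): TIB // 2 for d in range(1, 11)}

full_scan = bytes_scanned(table)
pruned = bytes_scanned(table, start=date(2020, 5, 9), end=date(2020, 5, 10))

print(query_cost_usd(full_scan))            # query without a partition filter
print(query_cost_usd(pruned))               # query limited to two days
print(query_cost_usd(pruned, cached=True))  # repeated identical query
```

In this model, filtering on the partition column cuts the scanned data from 5 TiB to 1 TiB, which is why advice on table design and query writing translates directly into lower fees.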
The pandemic has not only changed many consumer behavior patterns, as people reduce interpersonal interaction and maintain social distancing, but has also accelerated enterprises’ adoption of cloud services and the pace of digital transformation. Now that everybody knows data is king and is working hard to collect data for big data analysis, only those who spend less and analyze faster and more accurately than others will have the chance to lead the industry and seize opportunities!