Preparing and using data for analysis
The primary purpose of ingesting and storing data is to analyze and report on it in support of organizational decision making. How well the analysis process is managed can make or break a data engineering project. You could have the most efficient ingestion and storage systems, but if your analysis and presentation layers are lacking, that is all the stakeholders will remember.
Topics include:
- Preparing data for visualization. GCP recommends a few techniques for building a high-performance architecture to power BI tools. These include BigQuery materialized views, time granularity considerations, and data loss prevention.
- Sharing data by defining rules to share data, publishing datasets, reports, and visualizations, and using Google's Analytics Hub.
- Exploring and analyzing data. This includes data discovery and preparing data for feature engineering.
GCP Professional Data Engineer Certification Preparation Guide (Nov 2023)
Preparing data for visualization
GCP offers many data visualization tools and best practices that help you deliver a high-quality product while respecting the security and privacy requirements of your stakeholders and users.
Topics include:
- Connecting to tools
- Precalculating fields
- BigQuery materialized views (view logic)
- Determining granularity of time data
- Troubleshooting poorly performing queries
- Identity and Access Management (IAM) and Cloud Data Loss Prevention (Cloud DLP)
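To make the ideas of precalculated fields and time granularity more concrete, here is a minimal sketch in plain Python. It rolls raw events up into hourly or daily buckets, which is the kind of pre-aggregation a materialized view can perform so that a BI tool reads a small summary instead of scanning every event. The function names (`truncate`, `rollup`) and the sample data are hypothetical and chosen only for illustration; in BigQuery this logic would normally be written in SQL rather than Python.

```python
from collections import defaultdict
from datetime import datetime


def truncate(ts: datetime, granularity: str) -> datetime:
    """Truncate a timestamp to the start of its hour or day."""
    if granularity == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if granularity == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported granularity: {granularity}")


def rollup(events, granularity="day"):
    """Sum amounts per time bucket; events are (timestamp, amount) pairs."""
    totals = defaultdict(float)
    for ts, amount in events:
        totals[truncate(ts, granularity)] += amount
    # Return buckets in chronological order.
    return dict(sorted(totals.items()))


# Hypothetical sample data: (event timestamp, sale amount)
events = [
    (datetime(2023, 11, 1, 9, 15), 10.0),
    (datetime(2023, 11, 1, 9, 45), 5.0),
    (datetime(2023, 11, 1, 17, 30), 2.5),
    (datetime(2023, 11, 2, 8, 0), 4.0),
]

daily = rollup(events, "day")
hourly = rollup(events, "hour")
```

The trade-off is that coarser buckets produce fewer rows and faster dashboards, but detail finer than the chosen granularity cannot be recovered from the summary. Choosing the granularity therefore depends on the finest time resolution the reports actually need.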
Sharing data
A cloud-native philosophy enables a high degree of portability for data and processes. GCP's BigQuery architecture has built-in tools for sharing data efficiently and safely with audiences across your organization and around the world.
Topics include:
- Defining rules to share data
- Publishing datasets
- Publishing reports and visualizations
- Analytics Hub
Exploring and analyzing data
Exploring and analyzing data is a useful precursor to feature engineering when preparing for machine learning development. Learn some techniques for data discovery and feature engineering in GCP's common data stack.
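As a rough illustration of how data discovery feeds into feature engineering, the sketch below profiles a column (row count, nulls, distinct values, range) and then applies min-max scaling to produce a simple numeric feature. It uses only the Python standard library; the helper names (`profile`, `min_max_scale`) and the sample rows are hypothetical and do not correspond to any specific GCP API.

```python
def profile(rows, column):
    """Summarize one column: row count, nulls, distinct values, and range."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


def min_max_scale(values):
    """Rescale numeric values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no signal; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample rows
rows = [
    {"user": "a", "age": 10},
    {"user": "b", "age": 20},
    {"user": "c", "age": None},
    {"user": "d", "age": 30},
]

age_profile = profile(rows, "age")
ages = [r["age"] for r in rows if r["age"] is not None]
scaled_ages = min_max_scale(ages)
```

Profiling first is what reveals the null in `age`, so the missing value can be handled deliberately (dropped here) before the scaled feature is computed.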