Streaming raw Google Analytics data means duplicating every hit (event, page view, transaction, etc.) that you send to Google Analytics into a separate database, in this case BigQuery.
You get all the raw, unfiltered hits, and you can either process this data further or use it directly for any kind of analysis.
Advantages of the Raw Data Streaming to BigQuery
- Having full unsampled data
- The data is available for analysis within a couple of minutes of being collected
- You can collect an unlimited amount of data. The free version of Google Analytics is limited to 10 million hits per month per property, 200,000 hits per user per day and 500 hits per session (see https://developers.google.com/analytics/devguides/collection/analyticsjs/limits-quotas?hl=en)
- You can build custom views in BigQuery and merge your BigQuery data with data from other sources
- If you build your own custom data processing, you can always reprocess the data as you wish
Drawbacks of the Raw Data Streaming to BigQuery
- The data you get is not processed. You do not have a daily number of users and sessions available in a separate table. The streaming table still contains the clientId field, so you can calculate these numbers yourself, but the calculation takes time.
- The data is not filtered. All dev, test and staging hits are streamed together with the other hits and reduce the accuracy of the data. You need to take some action to clean the data.
- The pre-configured reports that exist in Google Analytics are not available. You need to either connect Data Studio or write SQL queries to analyze the data.
- Reporting on complicated segments is not possible without processing the data first.
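As a rough illustration of the first two drawbacks, here is a minimal Python sketch of the kind of processing you would have to do yourself: it drops hits from non-production hostnames and counts distinct clientId values per day. The row layout (the `hostname` and `timestamp` fields) and the `PRODUCTION_HOSTS` set are assumptions made for this example, not the actual schema of a streaming table.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical list of hostnames that count as production traffic.
PRODUCTION_HOSTS = {"www.example.com"}


def daily_users(hits):
    """Count distinct clientId values per UTC day, skipping non-production hits."""
    clients_by_day = defaultdict(set)
    for hit in hits:
        if hit["hostname"] not in PRODUCTION_HOSTS:
            continue  # drop dev, test and staging hits
        day = datetime.fromtimestamp(hit["timestamp"], tz=timezone.utc).date()
        clients_by_day[day.isoformat()].add(hit["clientId"])
    return {day: len(clients) for day, clients in sorted(clients_by_day.items())}


# Made-up raw hits: two from the same client on one day, one on the next day,
# and one staging hit that should be ignored.
hits = [
    {"clientId": "111.222", "hostname": "www.example.com", "timestamp": 1577836800},
    {"clientId": "111.222", "hostname": "www.example.com", "timestamp": 1577840400},
    {"clientId": "333.444", "hostname": "www.example.com", "timestamp": 1577923200},
    {"clientId": "555.666", "hostname": "staging.example.com", "timestamp": 1577836800},
]
print(daily_users(hits))
```

In practice you would more likely express the same logic as a scheduled BigQuery query, but the steps (filter first, then count distinct clients per day) stay the same.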
After weighing the benefits and the disadvantages of raw data streaming to BigQuery, I conclude that it makes sense to implement it if you have a data scientist on your team and quite a lot of traffic on the site. In that case the raw data will be really useful for running advanced analysis and building sophisticated reports.
Export Processed Google Analytics Data to BigQuery Using Google Analytics Reporting API
Another way for free Google Analytics users to get high-quality data into BigQuery is the Google Analytics Reporting API (https://developers.google.com/analytics/devguides/reporting/core/v4/?hl=en). With the API you can query a custom timeframe and get exactly the metrics and dimensions you are interested in.
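To make this more concrete, here is a minimal sketch of how the request bodies for a day-by-day export could be built in Python. The field names follow the Reporting API v4 `reports.batchGet` request format; the helper names, the view ID and the chosen metrics and dimensions are placeholders for this example. Sending the request (authentication, HTTP client) is left out on purpose.

```python
from datetime import date, timedelta


def build_daily_request(view_id, day, metrics, dimensions):
    """Build a reports.batchGet body that covers exactly one day."""
    day_str = day.isoformat()
    return {
        "reportRequests": [
            {
                "viewId": view_id,
                "dateRanges": [{"startDate": day_str, "endDate": day_str}],
                "metrics": [{"expression": m} for m in metrics],
                "dimensions": [{"name": d} for d in dimensions],
                # Ask for the highest precision the API offers.
                "samplingLevel": "LARGE",
                "pageSize": 10000,
            }
        ]
    }


def daily_requests(view_id, start, end, metrics, dimensions):
    """One request body per day between start and end (both inclusive)."""
    number_of_days = (end - start).days + 1
    return [
        build_daily_request(view_id, start + timedelta(days=i), metrics, dimensions)
        for i in range(number_of_days)
    ]


# Placeholder view ID; replace it with your own.
bodies = daily_requests(
    "12345678",
    date(2020, 1, 1),
    date(2020, 1, 3),
    metrics=["ga:users", "ga:sessions"],
    dimensions=["ga:date", "ga:country"],
)
print(len(bodies), bodies[0]["reportRequests"][0]["dateRanges"])
```

Querying one day per request keeps each request small, which is what makes the daily export useful for large timeframes.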
Advantages of Exporting Google Analytics Data to BigQuery with API
- Daily export of data to BigQuery allows avoiding sampling when analyzing large timeframes
- Processed metrics such as users, sessions, bounce rate are available for export.
- Geo dimensions are available for export
- Demographics and interests dimensions can be available for export depending on the amount of traffic you get (Google Analytics does not allow querying these types of dimensions on small segments of users)
- Data for segments and cohorts can be exported
- All the exported data has already been filtered by Google Analytics
Disadvantages of Exporting Google Analytics Data to BigQuery with API
- The exported data has a delay of about 48 hours. Google Analytics needs time to process the data, so you have to wait for some time before exporting it.
- Your data might get sampled before you even export it. Even if you export the data daily, there is a chance that a request reaches the limit of 100k sessions. However, if that is the case, you have probably already reached the limit of 10 million hits per month per property and need to either switch to Google Analytics 360 or split your traffic between multiple Google Analytics properties.
- Google Analytics Reporting API has its limits and quotas (https://developers.google.com/analytics/devguides/reporting/core/v4/limits-quotas?hl=en).
- Some metrics and dimensions are not compatible and cannot be queried together.
- It’s not possible to query “everything”. It’s necessary to design the requests and database structure before you start exporting the data.
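Two of the points above, the processing delay and the risk of sampling, are easy to guard against in an export script. The sketch below assumes a report already parsed from a Reporting API v4 response into a Python dict. In that format the `samplesReadCounts` and `samplingSpaceSizes` fields appear in the report data only when the result was sampled. The helper names and the two-day constant are choices made for this example.

```python
from datetime import date, timedelta

# The wait described in the list above, expressed in whole days.
PROCESSING_DELAY_DAYS = 2


def latest_exportable_day(today):
    """Most recent day that Google Analytics should have finished processing."""
    return today - timedelta(days=PROCESSING_DELAY_DAYS)


def is_sampled(report):
    """Return True if a parsed v4 report carries sampling information."""
    data = report.get("data", {})
    return bool(data.get("samplesReadCounts")) or bool(data.get("samplingSpaceSizes"))


# Made-up report fragments for illustration.
unsampled_report = {"data": {"rowCount": 2, "rows": []}}
sampled_report = {
    "data": {
        "rowCount": 2,
        "rows": [],
        "samplesReadCounts": ["499630"],
        "samplingSpaceSizes": ["15328013"],
    }
}

print(latest_exportable_day(date(2020, 1, 10)))
print(is_sampled(unsampled_report), is_sampled(sampled_report))
```

If `is_sampled` returns True for a daily export, one option is to split the request further, for example by adding a dimension filter, before writing anything to BigQuery.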
Taking the benefits and drawbacks together, the Google Analytics Reporting API is a good solution for avoiding data sampling on large timeframes. It is also useful for combining Google Analytics data with data from third-party systems such as marketing platforms or a CRM.
Depending on your needs and the resources you have for data analysis and tracking development, you can use either of these options or combine both of them.
If you plan to export the Google Analytics data with the Reporting API, it makes sense to track clientID and userID as custom dimensions so that you have more options for merging your data later.
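As a sketch of what such a merge could look like, the Python example below flattens a parsed Reporting API v4 report into plain rows and joins them with CRM records on the client ID. It assumes that the client ID is stored in the custom dimension `ga:dimension1`; the index depends on your own setup, and the CRM fields are invented for illustration.

```python
def flatten_rows(report):
    """Turn a parsed v4 report into a list of {name: value} dictionaries."""
    header = report["columnHeader"]
    dimension_names = header.get("dimensions", [])
    metric_names = [
        entry["name"] for entry in header["metricHeader"]["metricHeaderEntries"]
    ]
    rows = []
    for row in report.get("data", {}).get("rows", []):
        record = dict(zip(dimension_names, row["dimensions"]))
        # metrics[0] holds the values for the first (and here only) date range.
        record.update(zip(metric_names, row["metrics"][0]["values"]))
        rows.append(record)
    return rows


def merge_with_crm(ga_rows, crm_by_client_id, client_dimension="ga:dimension1"):
    """Attach CRM fields to every GA row whose client ID is known to the CRM."""
    merged = []
    for row in ga_rows:
        crm_record = crm_by_client_id.get(row[client_dimension], {})
        merged.append({**row, **crm_record})
    return merged


# Made-up report and CRM data.
report = {
    "columnHeader": {
        "dimensions": ["ga:date", "ga:dimension1"],
        "metricHeader": {
            "metricHeaderEntries": [{"name": "ga:sessions", "type": "INTEGER"}]
        },
    },
    "data": {
        "rows": [
            {"dimensions": ["20200101", "111.222"], "metrics": [{"values": ["3"]}]},
            {"dimensions": ["20200101", "333.444"], "metrics": [{"values": ["1"]}]},
        ]
    },
}
crm = {"111.222": {"customer_segment": "newsletter"}}

for merged_row in merge_with_crm(flatten_rows(report), crm):
    print(merged_row)
```

Rows without a CRM match are kept unchanged, which mirrors a left join. Once both datasets are in BigQuery, the same merge can be done there with a SQL join on the client ID column.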