The Virtual Data Warehouse

OCTOBER 1, 2019

by Rod Beecham
Partnering Lead at Zetaris

Every organisation wants real-time capability to join, clean and analyse comprehensive data presented in a way that is easy to understand and act on. IT experts continue to work on solutions to achieve this. But, as the volume and variety of data increase exponentially, these solutions continue to prove temporary and partial.

The main problems, with which every organisation will be familiar, are:

  1. The many forms that data takes, necessitating massive back-end efforts in data transformation and consolidation (which also means, of course, data duplication);
  2. Vendor restrictions, which can lock much of the organisation’s data in a proprietary format;
  3. Legacy systems which must be re-engineered to make the data they contain portable;
  4. Limitations on the capacity of outside systems and BI tools to ingest all the data;
  5. Security issues arising from the assimilation of data from different, specific-purpose locations;
  6. The difficulty of envisioning and planning for all the uses to which consolidated data can be put, meaning that the potential benefits of the data transformation project are never realised; and
  7. Cost, which blows out when the missing pieces are identified and re-work begins.

Physical Data Warehouses (PDWs) and Logical Data Warehouses (LDWs) have proved of limited value in solving them. Cloud-based solutions are growing – Gartner estimates that the shift to the cloud will affect more than $1.3 trillion in IT spending by 2022 – but, hitherto, the cloud has merely represented a different location for the problems listed above.

Enter the Virtual Data Warehouse.

The Virtual Data Warehouse (VDW) is a breakthrough solution that eliminates the headaches associated with a traditional data warehousing effort. The VDW gives users a complete view of all data via a single Web interface. The need to extract, transform and load (ETL) data from its sources is eliminated because the data is accessed in place: it is not moved. Any outside system or BI tool can be used to examine the data because the VDW will feed whatever is being used at the front end. The VDW integrates with, and conforms to, the organisation’s data governance and data security systems. The VDW retrieves and joins data of all kinds – internal, external, structured, unstructured – in real time to facilitate rapid query turnaround and deeper analysis, driving down costs, improving efficiency, and allowing broader and deeper insights into the organisation’s customers, marketspace and operations.
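To make "accessed in place" concrete, here is a minimal sketch in Python. It is not how any particular VDW product is implemented: it uses two SQLite files as stand-ins for separate source systems, and the file names, table names and the function federated_revenue_by_customer are all hypothetical. The point is only that a single query layer can join data that stays in its original stores, with no copy into a central warehouse.

```python
import os
import sqlite3
import tempfile


def make_source(path, ddl, insert_sql, rows):
    """Create a stand-in source system as its own SQLite file."""
    con = sqlite3.connect(path)
    con.execute(ddl)
    con.executemany(insert_sql, rows)
    con.commit()
    con.close()


def federated_revenue_by_customer(crm_path, billing_path):
    """Join two separate sources through one query layer, copying nothing."""
    # The in-memory connection plays the role of the virtual layer:
    # it holds no tables of its own, only references to the sources.
    hub = sqlite3.connect(":memory:")
    hub.execute("ATTACH DATABASE ? AS crm", (crm_path,))
    hub.execute("ATTACH DATABASE ? AS billing", (billing_path,))
    rows = hub.execute(
        """
        SELECT c.name, SUM(i.amount) AS revenue
        FROM crm.customers AS c
        JOIN billing.invoices AS i ON i.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    hub.close()
    return rows


# Two hypothetical source systems, each in its own file.
workdir = tempfile.mkdtemp()
crm_path = os.path.join(workdir, "crm.db")
billing_path = os.path.join(workdir, "billing.db")

make_source(
    crm_path,
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)",
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
make_source(
    billing_path,
    "CREATE TABLE invoices (customer_id INTEGER, amount REAL)",
    "INSERT INTO invoices VALUES (?, ?)",
    [(1, 100.0), (1, 50.0), (2, 75.0)],
)

result = federated_revenue_by_customer(crm_path, billing_path)
print(result)
```

A production virtualisation layer would also have to deal with heterogeneous engines, query push-down and security, none of which this toy example attempts.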

The technical challenges of a cloud-based VDW are Application Programming Interfaces (APIs) and network connectivity. Workload design and testing must be thorough to ensure compatibility with all public cloud provider APIs. But the elimination of ETL removes both the lock-in risks associated with hard-wiring data consumers to back-end data and the cost of re-writing ETL pipelines. Implemented carefully, the VDW will interface directly with Azure SQL Data Warehouse, Google BigQuery, Amazon Redshift and Snowflake, as well as with on-premises Teradata, Oracle and DB2, and leverage machine-learned optimisation to drive down processing costs.
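The lock-in point can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: VirtualCatalog, bind, read and the two reader functions are hypothetical names, not part of any vendor API. It shows a consumer that asks for a logical dataset name, so the back end behind that name can be swapped without touching the consumer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Row = Tuple[str, float]
Fetcher = Callable[[], List[Row]]


@dataclass
class VirtualCatalog:
    """Hypothetical mapping from logical dataset names to back-end readers."""

    _fetchers: Dict[str, Fetcher] = field(default_factory=dict)

    def bind(self, logical_name: str, fetcher: Fetcher) -> None:
        # Re-binding a name replaces the back end behind it.
        self._fetchers[logical_name] = fetcher

    def read(self, logical_name: str) -> List[Row]:
        return self._fetchers[logical_name]()


def top_customer(catalog: VirtualCatalog) -> str:
    """A data consumer: it knows the logical name 'sales', not the back end."""
    rows = catalog.read("sales")
    return max(rows, key=lambda row: row[1])[0]


def read_from_warehouse_a() -> List[Row]:
    # Stand-in for a query against one back end.
    return [("Acme", 150.0), ("Globex", 75.0)]


def read_from_warehouse_b() -> List[Row]:
    # Stand-in for the same data served by a different back end.
    return [("Globex", 75.0), ("Acme", 150.0)]


catalog = VirtualCatalog()
catalog.bind("sales", read_from_warehouse_a)
before = top_customer(catalog)

# Swap the back end; the consumer code above is unchanged.
catalog.bind("sales", read_from_warehouse_b)
after = top_customer(catalog)

print(before, after)
```

In a hard-wired design, top_customer would embed connection details and SQL for one specific warehouse, and a migration would mean rewriting it along with every pipeline that feeds it.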

Rod Beecham has worked in the IT industry for thirty years as a programmer and project manager, testing and deploying software, overhauling network operations and reviewing systems architecture.

As Partnering Lead at Zetaris he drives industry uptake of analytical data virtualisation technology, a space in which Zetaris has been recognised as a leader by Gartner.