We have a large relational database (~500 GB, hundreds of tables). Each night we rebuild a set of summary tables from scratch, which takes about 10 hours. A web interface then queries these summary tables to build reports.
There is a business reason for doing a complete rebuild of the summary tables each night, and using views (in the Oracle sense) is not an option at this time. If I wanted to leverage Big Data technologies to speed up the summary table rebuild, what would be the first step toward getting all of the data into some big data storage technology? Ideally, in the end, we want to keep the summary tables in a relational database so that reporting works the same without modification. It is only the crunching of the data and the building of these relational summary tables where we need a significant performance increase.
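
To make the shape of the workflow concrete, here is a toy sketch of the three stages I have in mind: extract the raw rows, crunch them outside the database, then write the result back as an ordinary relational summary table. Everything in it is a stand-in chosen for illustration: `sqlite3` plays the role of our real database, the `orders` and `order_summary` tables and their columns are hypothetical, and the in-memory Python aggregation is only a placeholder for whatever big data engine would do the real work.

```python
import sqlite3
from collections import defaultdict


def rebuild_summary(conn):
    """Full rebuild of a hypothetical summary table in three stages."""
    # 1. Extract: pull the raw rows out of the relational database.
    rows = conn.execute("SELECT region, amount FROM orders").fetchall()

    # 2. Crunch: placeholder for the external engine. Here it is plain
    #    Python that computes a row count and a total per region.
    totals = defaultdict(lambda: [0, 0.0])
    for region, amount in rows:
        totals[region][0] += 1
        totals[region][1] += amount

    # 3. Load back: recreate the summary table from scratch so the
    #    reporting queries keep reading a normal relational table.
    with conn:
        conn.execute("DROP TABLE IF EXISTS order_summary")
        conn.execute(
            "CREATE TABLE order_summary ("
            "region TEXT PRIMARY KEY, order_count INTEGER, total_amount REAL)"
        )
        conn.executemany(
            "INSERT INTO order_summary VALUES (?, ?, ?)",
            [(region, count, total) for region, (count, total) in totals.items()],
        )


# Tiny in-memory database standing in for the real 500 GB one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("east", 10.0), ("east", 5.0), ("west", 7.5)],
)

rebuild_summary(conn)
summary = conn.execute(
    "SELECT region, order_count, total_amount FROM order_summary ORDER BY region"
).fetchall()
print(summary)  # [('east', 2, 15.0), ('west', 1, 7.5)]
```

The reporting side only ever sees `order_summary`, which is the property we want to preserve. My question is about stages 1 and 2 at real scale.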