Hi,

Currently in CarbonData we have datamaps such as preaggregate, lucene, bloom, and mv, and we have lazy and non-lazy methods to load data into datamaps. However, lazy load is not allowed for the preaggregate, lucene, and bloom datamaps, but it is allowed for the mv datamap. In a lazy load of the mv datamap, every rebuild (load to the datamap) loads the complete data of the main table and overwrites the existing segment in the datamap, based on the datamap query.
Reloading the complete main table on every rebuild is very costly in terms of performance, and we also need to support both lazy and non-lazy load for all the datamaps. Lazy load can help reduce the actual data load time to the main table, since the datamap load no longer has to happen as part of it, and the user can trigger the lazy load for the datamaps present on that table whenever they want. Basically, we need not overwrite the existing data every time we load to a datamap; instead, we can load the new data incrementally into new segments, similar to the main table. This will help to get better performance.

I will attach the design document in a subsequent mail. Please give your inputs or get back to me for any clarifications.

Regards,
Akash