Hi,

We have a use case where we plan to keep a SparkContext alive in a server and run queries against it. The data flows in continuously, arriving in batches of constant duration (say, one hour). We want to exploit SchemaRDD and its benefits of columnar caching and compression. Is there a way to append a new (uncached) batch to the older (cached) data without evicting the older data from the cache and without re-caching the whole dataset?
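Concretely, something along these lines is what we have in mind. This is only a minimal sketch of one possible approach, assuming Spark 1.1+ (for registerTempTable/cacheTable); the paths, table names, and appendBatch helper are hypothetical placeholders. The idea is to cache each hourly batch on its own and expose the (uncached) union as the queryable table, so older batches keep their columnar cached blocks and only the newest batch is cached when it arrives:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{SQLContext, SchemaRDD}

object IncrementalCacheSketch {
  def main(args: Array[String]): Unit = {
    val sc  = new SparkContext(new SparkConf().setAppName("incremental-cache"))
    val sql = new SQLContext(sc)

    // Cached hourly batches accumulated so far.
    var batches = Seq.empty[SchemaRDD]

    // Called once per incoming batch; `path` and the table names are
    // hypothetical placeholders for however the batch actually arrives.
    def appendBatch(path: String): Unit = {
      val batch = sql.parquetFile(path)
      val name  = s"batch_${batches.size}"
      batch.registerTempTable(name)
      sql.cacheTable(name)              // columnar, compressed in-memory cache
      batches = batches :+ batch

      // Re-register the union of all batches as the queryable table. The
      // union itself is not cached, so adding a batch never re-caches the
      // older data; queries should be served from each batch's own cache.
      batches.reduce(_ unionAll _).registerTempTable("events")
    }

    appendBatch("hdfs:///data/hour=00")  // hypothetical batch locations
    appendBatch("hdfs:///data/hour=01")
    sql.sql("SELECT COUNT(*) FROM events").collect().foreach(println)
  }
}

Is this the right pattern, or is there a better-supported way to achieve the same incremental caching?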
Thanks and Regards,
Archit Thakur
Sr. Software Developer, Guavus, Inc.