Hi,

We have a use case where we plan to keep a SparkContext alive in a
server and run queries against it. The issue is that we have continuously
flowing data that arrives in batches of constant duration (say, 1 hour).
We want to exploit SchemaRDD and its benefits of columnar caching and
compression. Is there a way to append a new (uncached) batch to the
older (cached) batch without evicting the older data from the cache and
without re-caching the whole dataset?
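
For concreteness, here is a minimal sketch of the pattern we have in mind,
assuming the Spark 1.1 SchemaRDD API (registerTempTable, cacheTable,
unionAll); the case class, table name, and HDFS paths below are made up
purely for illustration:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    case class Event(ts: Long, value: String)

    object BatchAppend {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("BatchAppend"))
        val sqlContext = new SQLContext(sc)
        import sqlContext.createSchemaRDD

        // Hour-0 batch: register it and pull it into the columnar in-memory cache.
        val oldBatch = sc.textFile("hdfs:///data/hour-00")
          .map(_.split(','))
          .map(f => Event(f(0).toLong, f(1)))
        oldBatch.registerTempTable("events")
        sqlContext.cacheTable("events")
        sqlContext.sql("SELECT COUNT(*) FROM events").collect()  // materialize the cache

        // Hour-1 batch arrives, uncached.
        val newBatch = sc.textFile("hdfs:///data/hour-01")
          .map(_.split(','))
          .map(f => Event(f(0).toLong, f(1)))

        // unionAll gives one SchemaRDD over both batches, but re-registering and
        // re-caching it rebuilds the columnar buffers for the old data as well;
        // that full re-cache is exactly what we would like to avoid.
        val combined = oldBatch.unionAll(newBatch)
        combined.registerTempTable("events")
        sqlContext.cacheTable("events")
      }
    }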

Thanks and Regards,


Archit Thakur.
Sr Software Developer,
Guavus, Inc.
