Hi Manhua,

Thanks for the inputs.

1. There is no need for a separate mechanism to invalidate the cache during
pre-prime. I agree that the cache has a size limit, but we already have an
eviction policy: when a later query needs segments that are not cached, it
will evict entries as required and load those segments.

2.
i. For pre-prime configuration, we can already specify the database name or
table name. I have noted the request for regex support and will update the
design document after weighing other use cases and the impact.
ii. During data load, there is no need to load the whole table or read any
pre-prime configuration. The load-time pre-prime just takes the newly
loaded segment and loads it into the cache.

3. For command support, could you please explain with more use cases? With
the current design, index server startup already loads the cache, and even
a plain count(*) loads all the segments, so I think a new command is not
necessary.
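
To make points 1 and 2 concrete, here is a minimal Python sketch of the idea.
It is only an illustration, not CarbonData code: the class and method names
are hypothetical, the index load is stubbed out, and the cache is bounded by
entry count purely to keep the example short. It shows that a load-time
pre-prime which touches only the new segment can rely on ordinary LRU
eviction, so no extra invalidation step is needed.

```python
from collections import OrderedDict


class SegmentCache:
    """Toy LRU cache of segment indexes (hypothetical, for illustration)."""

    def __init__(self, max_segments):
        self.max_segments = max_segments
        self._entries = OrderedDict()  # segment id -> loaded index (stubbed)

    def _load(self, segment_id):
        # Stand-in for reading the index files of one segment.
        return f"index-of-{segment_id}"

    def get(self, segment_id):
        # Cache hit: mark the segment as most recently used.
        if segment_id in self._entries:
            self._entries.move_to_end(segment_id)
            return self._entries[segment_id]
        # Cache miss: load the segment, then evict the least recently
        # used entries while the cache is over its limit.
        index = self._load(segment_id)
        self._entries[segment_id] = index
        while len(self._entries) > self.max_segments:
            self._entries.popitem(last=False)
        return index

    def preprime_on_load(self, new_segment_id):
        # Load-time pre-prime: cache only the new segment; no table-level
        # configuration is read and nothing is invalidated explicitly.
        self.get(new_segment_id)

    def cached_segments(self):
        # Segment ids ordered from least to most recently used.
        return list(self._entries)


cache = SegmentCache(max_segments=2)
for seg in ["0", "1", "2"]:
    cache.preprime_on_load(seg)
print(cache.cached_segments())
```

With a limit of two entries, the third load pushes out segment "0". A later
query on segment "0" simply reloads it and evicts the oldest remaining entry.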

Please get back to me if you need any clarification.

Thanks

Regards,
Akash R Nilugal

On Fri, Aug 16, 2019, 4:26 PM Akash Nilugal <akashnilu...@gmail.com> wrote:

> Hi All,
>
> I have raised a JIRA and attached the design doc there. Please refer to
> CARBONDATA-3492.
>
> Regards,
> Akash
>
> On Thu, Aug 15, 2019, 5:33 PM Akash Nilugal <akashnilu...@gmail.com>
> wrote:
>
>> Hi Community,
>>
>> Currently, we have an index server, which provides distributed caching of
>> the datamaps in a separate Spark application.
>>
>> Caching of the datamaps in the index server starts only when the first
>> query is fired on the table: a count(*) loads all the datamaps, while a
>> filter query loads only the required ones.
>>
>>
>> The bottleneck here is that, until a query is fired on the table, its
>> datamaps are not cached. Consider a scenario where we load data into a
>> table for a whole day and query it the next day: all the segments start
>> loading into the cache, so the first query is slow.
>>
>>
>> What if we load the datamaps into the cache, that is, pre-prime the cache,
>> without waiting for any query on the table? We could load the cache after
>> every data load completes, or load the cache for all the segments at once,
>> so that the first query does not have to do this work and runs faster.
>>
>>
>> I have attached the design document for pre-priming the cache in the
>> index server. Please have a look; any suggestions or inputs are most
>> welcome.
>>
>>
>>
>> https://drive.google.com/file/d/1YUpDUv7ZPUyZQQYwQYcQK2t2aBQH18PB/view?usp=sharing
>>
>>
>>
>> Regards,
>>
>> Akash R Nilugal
>>
>
