Hi Jacky,

The default blocklet size is currently 64 MB, so with a block size of 256 MB there are at most 4 blocklets per block.
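
To make the estimate for the driver-side array concrete, here is a minimal sketch of the arithmetic. The class and method names are hypothetical (they are not CarbonData APIs), and the count of 1000 blocks is only an illustrative assumption; the 256 MB and 64 MB figures are the ones mentioned above.

```java
// Hypothetical helper for a back-of-the-envelope estimate; not part of CarbonData.
final class BlockletEstimate {

    /** Upper bound on blocklets in one block: ceil(blockSizeMb / blockletSizeMb). */
    static long maxBlockletsPerBlock(long blockSizeMb, long blockletSizeMb) {
        if (blockSizeMb <= 0 || blockletSizeMb <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // Integer ceiling division.
        return (blockSizeMb + blockletSizeMb - 1) / blockletSizeMb;
    }

    /** Upper bound on array entries if the driver keeps one entry per blocklet. */
    static long maxIndexEntries(long blockCount, long blockSizeMb, long blockletSizeMb) {
        return Math.multiplyExact(blockCount, maxBlockletsPerBlock(blockSizeMb, blockletSizeMb));
    }

    public static void main(String[] args) {
        // 256 MB block, 64 MB blocklet: at most 4 blocklets per block.
        System.out.println(maxBlockletsPerBlock(256, 64));   // prints 4
        // Assumed 1000 blocks (illustrative only): at most 4000 entries.
        System.out.println(maxIndexEntries(1000, 256, 64));  // prints 4000
    }
}
```

So the array length on the driver side is bounded by roughly 4 times the number of blocks with the current defaults.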

Regards,
Ravindra.

On 17 May 2017 at 19:59, Jacky Li <jacky.li...@qq.com> wrote:

> +1 for both proposals 1 & 2.
>
> For point 2, do you have an idea of roughly how many blocklets there are
> within one block? This will help to estimate the length of the array on
> the driver side.
>
> Regards,
> Jacky
>
> > On 17 May 2017, at 7:33 PM, Ravindra Pesala <ravi.pes...@gmail.com> wrote:
> >
> > Hi,
> >
> > *1. Current problem.*
> > 1. Creating the Btree for an index file takes a lot of space on the Java
> > heap. We create multiple objects for each leaf node, so the Btree takes
> > more heap memory than the actual size of the index file. The LRU cache
> > also considers only the index file size rather than the size of the
> > objects, which affects its eviction process.
> > 2. Currently we load one Btree on the driver side to find the blocks and
> > another Btree on the executor side to find the blocklets. After we
> > increased the blocklet size to 128 MB and decreased the table_block size
> > to 256 MB, the number of nodes in the driver-side Btree and in the
> > executor-side Btree is not much different, so reading the same
> > information twice is overhead.
> > Also, the chance of loading the Btree on each executor for every query
> > is higher, because there is no guarantee that the same block goes to the
> > same executor every time. It will be worse in the case of dynamic
> > containers.
> >
> > *2. Proposed solution.*
> > 1. To reduce the Java heap used by the Btree, we can remove the Btree
> > data structure, use a simple single array, and do a binary search on it.
> > We should also move these cached arrays to unsafe (offheap/onheap) to
> > reduce the burden on GC.
> > 2. Unify the two Btrees into a single Btree and load it on the driver
> > side, so that only one lookup is needed to find the blocklets directly
> > and executors are not required to load the Btree for every query.
> > We can consider moving this to a separate metadata service eventually,
> > once the memory footprint is reduced.
> >
> > First I will work on point 1, reducing the Btree size; after that I will
> > consider merging the Btrees.
> >
> > Please comment on it.
> >
> > --
> > Thanks & Regards,
> > Ravindra.

--
Thanks & Regards,
Ravi
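
For reference, point 1 of the proposed solution (a single array with binary search instead of a Btree) could look roughly like the sketch below. This is only an illustration under simplifying assumptions, not the actual design: the class name `FlatBlockletIndex` is hypothetical, each blocklet is described by one `long` min key and one `long` max key, the ranges are sorted and do not overlap, and the arrays live on the heap. The real index has multi-column min/max values, and moving the arrays to unsafe memory is not shown.

```java
import java.util.Arrays;

// Hypothetical sketch: one entry per blocklet in parallel primitive arrays,
// so no per-leaf-node objects are created on the heap.
final class FlatBlockletIndex {
    private final long[] minKeys;     // sorted ascending, one per blocklet
    private final long[] maxKeys;     // maxKeys[i] >= minKeys[i]
    private final int[] blockletIds;  // identifier returned for entry i

    FlatBlockletIndex(long[] minKeys, long[] maxKeys, int[] blockletIds) {
        if (minKeys.length != maxKeys.length || minKeys.length != blockletIds.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        this.minKeys = minKeys.clone();
        this.maxKeys = maxKeys.clone();
        this.blockletIds = blockletIds.clone();
    }

    /** Returns the id of the blocklet whose [min, max] range contains key, or -1. */
    int find(long key) {
        int pos = Arrays.binarySearch(minKeys, key);
        if (pos < 0) {
            // Not an exact match: binarySearch returned -(insertionPoint) - 1,
            // so the last entry with minKey <= key is at insertionPoint - 1.
            pos = -pos - 2;
        }
        if (pos < 0 || key > maxKeys[pos]) {
            return -1; // key is below the first range or falls in a gap
        }
        return blockletIds[pos];
    }

    public static void main(String[] args) {
        // Four blocklets, as in a 256 MB block with 64 MB blocklets.
        FlatBlockletIndex index = new FlatBlockletIndex(
            new long[] {0, 100, 200, 300},
            new long[] {99, 199, 299, 399},
            new int[] {0, 1, 2, 3});
        System.out.println(index.find(150)); // prints 1
        System.out.println(index.find(400)); // prints -1
    }
}
```

With primitive arrays, the cached size is close to the size of the raw index data, which should also make the LRU cache accounting more accurate.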