>> A more fundamental question: is it possible that all those datasets share a global budget in a multi-tenant way?

In principle, the budget should just be an upper bound. If a dataset doesn't need that much, it shouldn't pre-allocate all "storage.memorycomponent.numpages" pages.
However, the current implementation pre-allocates all in-memory pages upfront:
https://github.com/apache/incubator-asterixdb-hyracks/blob/master/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/VirtualBufferCache.java#L247

I think we should fix it to allocate memory dynamically when needed. (The disk buffer cache already does that.)

Best,
Yingyi

On Thu, Mar 10, 2016 at 2:46 PM, Jianfeng Jia <[email protected]> wrote:
> Dear Devs,
>
> I have some questions about the memory management of the in-memory
> components for different datasets.
>
> The current AsterixDB instance backing the cloudberry demo goes down
> every few days. It always throws an exception like the following:
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: Failed
> to open index with resource ID 7 since it does not exist.
>
> As described in ASTERIXDB-1337, each dataset has a fixed budget no matter
> how small or big it is. The number of datasets that can be loaded at the
> same time is therefore also fixed: $number =
> storage.memorycomponent.globalbudget/storage.memorycomponent.numpages. My
> question is: if we have more than $number datasets, will eviction
> happen? Will it evict the entire victim dataset? Based on the symptom of
> the above exception, it seems the metadata gets evicted. Could we protect
> the metadata from eviction?
>
> A more fundamental question: is it possible that all those datasets share
> a global budget in a multi-tenant way?
> In my workload there is one main dataset (~10Gb) and five tiny auxiliary
> datasets (each <20M). In addition, the client will create a bunch of
> temporary datasets, depending on how many concurrent users there are, and
> each temp dataset will be "refreshed" for a new query. (The refresh is
> done by dropping and re-creating the temp dataset.) It's hard to find one
> storage.memorycomponent.numpages value that makes every dataset happy.
>
> Best,
>
> Jianfeng Jia
> PhD Candidate of Computer Science
> University of California, Irvine
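
To make the "budget as an upper bound" idea from the reply concrete, here is a minimal Java sketch of lazy page allocation against a shared global budget. It is not the actual VirtualBufferCache code: the class names GlobalPageBudget and LazyPagePool, their methods, and the use of heap ByteBuffers are all assumptions made for illustration. Each dataset's pool allocates a page only when one is first requested, and every allocation is charged against one global counter.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical global budget shared by all datasets, counted in pages. */
final class GlobalPageBudget {
    private final int maxPages;
    private final AtomicInteger usedPages = new AtomicInteger();

    GlobalPageBudget(int maxPages) {
        this.maxPages = maxPages;
    }

    /** Reserves one page; returns false when the global budget is exhausted. */
    boolean tryReserve() {
        while (true) {
            int used = usedPages.get();
            if (used >= maxPages) {
                return false;
            }
            if (usedPages.compareAndSet(used, used + 1)) {
                return true;
            }
        }
    }

    void release(int pages) {
        usedPages.addAndGet(-pages);
    }

    int usedPages() {
        return usedPages.get();
    }
}

/** Hypothetical per-dataset pool: maxPages is an upper bound, not a pre-allocation. */
final class LazyPagePool {
    private final int pageSize;
    private final int maxPages;
    private final GlobalPageBudget budget;
    private final Deque<ByteBuffer> freePages = new ArrayDeque<>();
    private int allocated;

    LazyPagePool(int pageSize, int maxPages, GlobalPageBudget budget) {
        this.pageSize = pageSize;
        this.maxPages = maxPages;
        this.budget = budget;
    }

    /** Returns a page, allocating it on first use; null means the caller must flush or evict. */
    synchronized ByteBuffer acquire() {
        ByteBuffer page = freePages.poll();
        if (page != null) {
            return page;
        }
        if (allocated >= maxPages || !budget.tryReserve()) {
            return null;
        }
        allocated++;
        return ByteBuffer.allocate(pageSize);
    }

    /** Hands a page back to this pool for reuse; it stays charged to the budget. */
    synchronized void recycle(ByteBuffer page) {
        page.clear();
        freePages.push(page);
    }

    /** Returns this pool's pages to the global budget, e.g. when a temp dataset is dropped.
     *  Assumes no page from this pool is still in use by a caller. */
    synchronized void close() {
        freePages.clear();
        budget.release(allocated);
        allocated = 0;
    }

    synchronized int allocatedPages() {
        return allocated;
    }
}
```

With this shape, a tiny auxiliary dataset that touches only a few pages consumes only those pages, and dropping a temp dataset returns its pages to the shared budget. What should happen when acquire() returns null (flush, evict a victim, or block) is left open here, as it is in the thread.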
