Dear Devs, I have some questions about the memory management of the in-memory components for different datasets.

The current AsterixDB instance backing the Cloudberry demo goes down every few days. It always throws an exception like the following:

Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: Failed to open index with resource ID 7 since it does not exist.

As described in ASTERIXDB-1337, each dataset has a fixed memory budget no matter how small or big it is. The number of datasets that can be loaded at the same time is therefore also fixed:

$number = storage.memorycomponent.globalbudget / storage.memorycomponent.numpages

My first question: if we have more than $number datasets, will eviction happen? If so, will it evict the entire victim dataset?

Based on the symptom of the exception above, it seems that the metadata gets evicted. Could we protect the metadata from eviction?

A more fundamental question: is it possible for all of those datasets to share a global budget in a multi-tenant way?

In my workload there is one main dataset (~10 GB) and five tiny auxiliary datasets (each < 20 MB). In addition, the client creates a number of temporary datasets, depending on how many concurrent users there are, and each temp dataset is "refreshed" for every new query. (The refresh is done by dropping and re-creating the temp dataset.) It is hard to find one storage.memorycomponent.numpages value that makes every dataset happy.

Best,

Jianfeng Jia
PhD Candidate of Computer Science
University of California, Irvine
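
P.S. To make the arithmetic above concrete, here is a minimal sketch of how quickly the workload could run past $number. It assumes that numpages is a page count and must be multiplied by a page size to get the per-dataset budget in bytes. The page_size_bytes parameter, the function names, and all of the numbers are made up for illustration; they are not the settings of the demo cluster, and the sketch does not model what AsterixDB actually does when the limit is exceeded.

```python
# Back-of-the-envelope sketch of the dataset limit described above.
# All values are hypothetical examples, not real cluster settings.

def max_resident_datasets(global_budget_bytes: int,
                          num_pages: int,
                          page_size_bytes: int) -> int:
    """Number of datasets that can hold an in-memory component at once,
    assuming each dataset reserves num_pages * page_size_bytes."""
    per_dataset_bytes = num_pages * page_size_bytes
    return global_budget_bytes // per_dataset_bytes


def datasets_needed(concurrent_users: int) -> int:
    """Workload from the email: 1 main dataset, 5 auxiliary datasets,
    and (assumed here) one temporary dataset per concurrent user."""
    return 1 + 5 + concurrent_users


def first_user_count_over_limit(limit: int) -> int:
    """Smallest number of concurrent users whose datasets exceed the limit."""
    users = 0
    while datasets_needed(users) <= limit:
        users += 1
    return users


# Hypothetical configuration: 1 GiB global budget, 256 pages of 128 KiB.
GLOBAL_BUDGET = 1 << 30
NUM_PAGES = 256
PAGE_SIZE = 128 * 1024

limit = max_resident_datasets(GLOBAL_BUDGET, NUM_PAGES, PAGE_SIZE)
print("datasets that fit at once:", limit)
print("concurrent users that exceed it:", first_user_count_over_limit(limit))
```

With these made-up numbers the tiny auxiliary and temporary datasets each reserve as much memory as the main dataset, which is the reason a single numpages value is hard to pick.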
