Dear Devs,

I have some questions about the memory management of the in-memory components 
for different datasets.

The current AsterixDB instance backing the cloudberry demo goes down every few 
days. It always throws an exception like the following: 
Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: Failed to 
open index with resource ID 7 since it does not exist.

As described in ASTERIXDB-1337, each dataset has a fixed budget no matter how 
small or big it is. The number of datasets that can be loaded at the same time 
is therefore also fixed: $number = 
storage.memorycomponent.globalbudget/storage.memorycomponent.numpages. My 
question is: if we have more than $number datasets, will eviction happen? If 
so, will it evict the victim dataset entirely? Based on the symptom of the 
above exception, it seems the metadata got evicted. Could we protect the 
metadata from eviction? 
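
To make the arithmetic concrete, here is a minimal sketch of how I understand 
the limit. It assumes that numpages is counted in pages of a fixed size, so the 
per-dataset budget is numpages times the page size. The function name, the 
page_size_bytes parameter, and all the values are hypothetical and only for 
illustration; they are not taken from my actual configuration.

```python
def max_resident_datasets(global_budget_bytes: int,
                          num_pages: int,
                          page_size_bytes: int) -> int:
    """Estimate how many datasets fit in memory at the same time.

    Assumption: every dataset gets the same fixed budget of
    num_pages * page_size_bytes, regardless of its actual size.
    """
    per_dataset_budget = num_pages * page_size_bytes
    if per_dataset_budget <= 0:
        raise ValueError("num_pages and page_size_bytes must be positive")
    # Integer division: a dataset that only partially fits does not count.
    return global_budget_bytes // per_dataset_budget


# Hypothetical values: 512 MiB global budget, 256 pages of 128 KiB each.
limit = max_resident_datasets(512 * 1024 * 1024, 256, 128 * 1024)
print(limit)
```

With these made-up numbers each dataset reserves 32 MiB, so opening one more 
dataset than the printed limit is the point at which I would expect eviction 
to start.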

A more fundamental question: is it possible for all those datasets to share a 
global budget in a multi-tenant way? 
In my workload there is one main dataset (~10 GB) and five tiny auxiliary 
datasets (each smaller than 20 MB). In addition, the client creates a number of 
temporary datasets, depending on how many concurrent users there are, and each 
temp-dataset is "refreshed" for a new query. (The refresh is done by dropping 
and re-creating the temp-dataset.) It’s hard to find one 
storage.memorycomponent.numpages value that makes every dataset happy.



Best,

Jianfeng Jia
PhD Candidate of Computer Science
University of California, Irvine
