Ulrich Kramer created CALCITE-6728:
--------------------------------------
Summary: Lazy loading of database tables and schemas
Key: CALCITE-6728
URL: https://issues.apache.org/jira/browse/CALCITE-6728
Project: Calcite
Issue Type: Wish
Components: core
Affects Versions: 1.38.0
Reporter: Ulrich Kramer
We are facing the issue that our database contains 500000 schemas with up to
500000 tables each.
Caching all these schemas and tables is not an option:
1. It would require a lot of memory.
2. The cache would have to be evicted very often, since it's likely that
one of these tables is changed every second.
We worked around this issue by introducing a form of lazy loading, in the
following way:
Instead of having pairs of methods to look up tables and schemas (e.g.
{{getTable}} and {{getTableNames}}), we introduced a new {{Lookup}}
interface, which bundles these two methods
{code:java}
public interface Lookup<T> {
  /**
   * Returns an entity with a given name, or null if not found.
   * The name is matched case-sensitively.
   *
   * @param name Name
   * @return Entity, or null
   */
  @Nullable T get(String name);

  /**
   * Returns a named entity with a given name, ignoring case, or null if not
   * found.
   *
   * @param name Name
   * @return Entity, or null
   */
  @Nullable Named<T> getIgnoreCase(String name);

  /**
   * Returns the names of the entities that match the given pattern.
   *
   * @param pattern Pattern that the names must match
   * @return Names of the entities
   */
  Set<String> getNames(LikePattern pattern);
}
{code}
and modified the {{Schema}} interface accordingly
{code:java}
public interface Schema {
  /**
   * Returns a lookup object to find tables.
   *
   * @return Lookup
   */
  Lookup<Table> tables();

  /**
   * Returns a lookup object to find schemas.
   *
   * @return Lookup
   */
  Lookup<? extends Schema> subSchemas();
  ...
}
{code}
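To illustrate why bundling the two methods helps with lazy loading, here is a minimal sketch. It is not the code from the commit: {{SimpleLookup}} is a simplified stand-in for the {{Lookup}} interface above (it omits {{getIgnoreCase}} and uses a plain predicate instead of {{LikePattern}}), and {{LazyLookup}}, {{fetchOne}} and {{fetchNames}} are hypothetical names. Every call is delegated to the backing source, so nothing is enumerated or cached up front.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Simplified stand-in for the proposed Lookup interface (assumption). */
interface SimpleLookup<T> {
  /** Returns the entity with the given name, or null if not found. */
  T get(String name);

  /** Returns the names accepted by the filter. */
  Set<String> getNames(Predicate<String> filter);
}

/** Hypothetical lazy implementation: every call goes to the backing source. */
final class LazyLookup<T> implements SimpleLookup<T> {
  private final Function<String, T> fetchOne;
  private final Supplier<Set<String>> fetchNames;

  LazyLookup(Function<String, T> fetchOne, Supplier<Set<String>> fetchNames) {
    this.fetchOne = fetchOne;
    this.fetchNames = fetchNames;
  }

  @Override public T get(String name) {
    // Fetches a single entity; the list of names is never requested here.
    return fetchOne.apply(name);
  }

  @Override public Set<String> getNames(Predicate<String> filter) {
    // Names are listed only when a caller explicitly asks for them.
    Set<String> result = new TreeSet<>();
    for (String name : fetchNames.get()) {
      if (filter.test(name)) {
        result.add(name);
      }
    }
    return result;
  }
}
```

In this sketch a call to {{get}} never triggers a listing of names, which is the property that matters when a schema holds hundreds of thousands of tables.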
Most of the changes can be found in [this
commit|https://github.com/sap-contributions/calcite/commit/edfbf29209b02b08d895da5fa1f61044bd207f3b#diff-a6ee53ffb06bb5a6d6a732372eb341a0bbaa374f6757b50f94bdc05ec71567d4].
With these changes we have been able to manage the huge number of schemas and
tables.
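As a rough illustration of how a caller could resolve a qualified table name through nested lookups, here is a sketch under stated assumptions. {{NameLookup}} and {{SchemaNode}} are simplified stand-ins for the proposed {{Lookup}} and {{Schema}} interfaces, tables are represented as plain strings, and {{PathResolver}} is a hypothetical helper that does not exist in Calcite.

```java
/** Simplified stand-in for the proposed Lookup (only get is needed here). */
interface NameLookup<T> {
  T get(String name);
}

/** Simplified stand-in for the modified Schema interface (assumption). */
interface SchemaNode {
  /** Tables are plain strings in this sketch. */
  NameLookup<String> tables();

  NameLookup<? extends SchemaNode> subSchemas();
}

/** Hypothetical helper that walks a schema path one name at a time. */
final class PathResolver {
  private PathResolver() {}

  /** Resolves schemaPath[0].schemaPath[1]...tableName, or returns null. */
  static String resolveTable(SchemaNode root, String[] schemaPath,
      String tableName) {
    SchemaNode current = root;
    for (String schemaName : schemaPath) {
      // One lookup by name per path segment; siblings are never enumerated.
      current = current.subSchemas().get(schemaName);
      if (current == null) {
        return null;
      }
    }
    return current.tables().get(tableName);
  }
}
```

Resolving a path of depth n costs n + 1 lookups by name in this sketch, regardless of how many schemas or tables exist at each level.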
Here are my questions related to my wish/proposal:
- Does it make sense to provide a PR which includes these modifications,
although they might break compatibility with older versions?
- Does anyone have an idea for introducing lazy loading with a smaller
change?