Ulrich Kramer created CALCITE-6728:
--------------------------------------

             Summary: Lazy loading of database tables and schemas
                 Key: CALCITE-6728
                 URL: https://issues.apache.org/jira/browse/CALCITE-6728
             Project: Calcite
          Issue Type: Wish
          Components: core
    Affects Versions: 1.38.0
            Reporter: Ulrich Kramer


We are facing the problem that our database contains 500000 schemas, each with 
up to 500000 tables.

Caching all these schemas and tables is not an option:
1. It would require a lot of memory.
2. Cache entries would have to be evicted very often, since it's likely that, 
every second, one of these tables is changed.

We worked around this issue by introducing some kind of lazy loading. We did 
this in the following way:

Instead of having pairs of methods to look up tables and schemas (e.g. 
{{getTable}} and {{getTableNames}}), we introduced a new {{Lookup}} 
interface, which bundles these two methods
{code:java}
public interface Lookup<T> {
  /**
   * Returns an entity with a given name, or null if not found.
   * The name is matched case-sensitively.
   *
   * @param name Name
   * @return Entity, or null
   */
  @Nullable T get(String name);

  /**
   * Returns a named entity with a given name, ignoring case, or null if not
   * found.
   *
   * @param name Name
   * @return Entity, or null
   */
  @Nullable Named<T> getIgnoreCase(String name);

  /**
   * Returns the names of the entities matching the given pattern.
   *
   * @param pattern Pattern the names must match
   * @return Names of the entities
   */
  Set<String> getNames(LikePattern pattern);
}
{code}
and modified the {{Schema}} interface accordingly
{code:java}
public interface Schema {

  /**
   * Returns a lookup object to find tables.
   *
   * @return Lookup
   */
  Lookup<Table> tables();

  /**
   * Returns a lookup object to find sub-schemas.
   *
   * @return Lookup
   */
  Lookup<? extends Schema> subSchemas();
  ...
}
{code}
Most of the changes can be found in [this 
commit|https://github.com/sap-contributions/calcite/commit/edfbf29209b02b08d895da5fa1f61044bd207f3b#diff-a6ee53ffb06bb5a6d6a732372eb341a0bbaa374f6757b50f94bdc05ec71567d4].

With these changes we have been able to manage the huge number of schemas and 
tables.
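
To illustrate the idea, here is a minimal sketch of what a lazy implementation 
of such a lookup could look like. It is not the code from the commit: the class 
{{LazyLookupSketch}} and everything inside it are simplified, hypothetical 
stand-ins (a plain name prefix replaces {{LikePattern}}, and an in-memory map 
plays the role of the database catalog). The point is only that the lookup 
holds no entities itself and asks the backend on every call.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Illustrative sketch only; all types are simplified stand-ins, not Calcite classes. */
class LazyLookupSketch {

  /** Pairs an entity with the exact-case name under which it was found. */
  static final class Named<T> {
    final String name;
    final T entity;

    Named(String name, T entity) {
      this.name = name;
      this.entity = entity;
    }
  }

  /** Reduced form of the proposed Lookup; a name prefix stands in for LikePattern. */
  interface Lookup<T> {
    T get(String name);

    Named<T> getIgnoreCase(String name);

    Set<String> getNames(String prefix);
  }

  /** Holds no entities itself: every call is forwarded to the backing catalog. */
  static final class LazyLookup<T> implements Lookup<T> {
    private final Function<String, T> loadByName;          // returns null if absent
    private final Function<String, Set<String>> listNames; // names starting with a prefix

    LazyLookup(Function<String, T> loadByName, Function<String, Set<String>> listNames) {
      this.loadByName = loadByName;
      this.listNames = listNames;
    }

    @Override public T get(String name) {
      return loadByName.apply(name);
    }

    @Override public Named<T> getIgnoreCase(String name) {
      // Scans names only, then loads the single match. A real backend would
      // more likely push the case-insensitive match down into its own query.
      for (String candidate : listNames.apply("")) {
        if (candidate.equalsIgnoreCase(name)) {
          T entity = loadByName.apply(candidate);
          return entity == null ? null : new Named<>(candidate, entity);
        }
      }
      return null;
    }

    @Override public Set<String> getNames(String prefix) {
      return listNames.apply(prefix);
    }
  }

  public static void main(String[] args) {
    // An in-memory map plays the role of the remote database catalog.
    Map<String, String> catalog = new TreeMap<>();
    catalog.put("Orders", "table Orders");
    catalog.put("customers", "table customers");

    Lookup<String> tables = new LazyLookup<>(
        catalog::get,
        prefix -> catalog.keySet().stream()
            .filter(n -> n.startsWith(prefix))
            .collect(Collectors.toSet()));

    System.out.println(tables.get("Orders"));                // table Orders
    System.out.println(tables.get("orders"));                // null (case-sensitive)
    System.out.println(tables.getIgnoreCase("ORDERS").name); // Orders
    System.out.println(tables.getNames("cust"));             // [customers]
  }
}
```

Because nothing is stored in the lookup, there is nothing to evict when a table 
changes; the trade-off is one backend round trip per call, so a small 
short-lived cache could still be layered on top where that is acceptable.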

Here are my questions related to my wish/proposal:
 - Does it make sense to provide a PR which includes these modifications, 
although they might break compatibility with older versions?
 - Does anyone have an idea for introducing lazy loading with a smaller 
change?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
