[
https://issues.apache.org/jira/browse/IMPALA-13491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18048356#comment-18048356
]
Arnab Karmakar commented on IMPALA-13491:
-----------------------------------------
Hi [~stigahuang],
As you suggested, let's first discuss the design and the key points before
starting the implementation.
The Key Components:
1. We'll need a configuration flag in catalog-server.cc, e.g.
catalog_max_parallel_load_operations, that controls the maximum number of
concurrent table load/refresh operations. We can set a default value, with 0
meaning unlimited (for backward compatibility).
2. A Semaphore in CatalogServiceCatalog.java, similar to
partialObjectFetchAccess_, e.g.:
{code:java}
// Controls concurrent access to doGetPartialCatalogObject() call. Limits the number
// of parallel requests to --catalog_max_parallel_partial_fetch_rpc.
private final Semaphore partialObjectFetchAccess_ =
    new Semaphore(MAX_PARALLEL_PARTIAL_FETCH_RPC_COUNT, /*fair =*/ true);
{code}
3. We'll control the number of parallel load/refresh operations by acquiring
and releasing the Semaphore at the relevant points:
a. Single table refresh (CatalogServiceCatalog.reloadTableMetadata())
b. Global Invalidate metadata (CatalogServiceCatalog.reset())
c. Background table loading (CatalogServiceCatalog.getOrLoadTable())
4. We'll need some helper methods for acquiring and releasing the Semaphore,
plus a BackendConfig accessor for the new flag.
5. Maybe also a timeout mechanism, so that operations waiting on the Semaphore
don't hang indefinitely.
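To make points 2-5 concrete, here is a minimal standalone sketch of what the
helper could look like. Everything in it is an assumption for discussion: the
class name LoadThrottle, its methods and the timeout parameter are hypothetical
and do not exist in Impala today. It only shows the "0 means unlimited" rule, a
fair Semaphore, and a bounded wait.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not existing Impala code. In catalogd the two
// constructor arguments would come from BackendConfig accessors.
final class LoadThrottle {
  // null means "unlimited", i.e. the flag was set to 0.
  private final Semaphore permits_;
  // <= 0 means "wait forever".
  private final long timeoutMs_;

  LoadThrottle(int maxParallelLoads, long timeoutMs) {
    if (maxParallelLoads < 0) {
      throw new IllegalArgumentException("maxParallelLoads must be >= 0");
    }
    permits_ = maxParallelLoads == 0
        ? null : new Semaphore(maxParallelLoads, /*fair =*/ true);
    timeoutMs_ = timeoutMs;
  }

  // Returns true if the caller may proceed, false if the wait timed out.
  boolean acquire() throws InterruptedException {
    if (permits_ == null) return true;
    if (timeoutMs_ <= 0) {
      permits_.acquire();
      return true;
    }
    return permits_.tryAcquire(timeoutMs_, TimeUnit.MILLISECONDS);
  }

  // Must be called exactly once for every acquire() that returned true.
  void release() {
    if (permits_ != null) permits_.release();
  }

  int availablePermits() {
    return permits_ == null ? Integer.MAX_VALUE : permits_.availablePermits();
  }

  // Runs op while holding a permit; the permit is released even if op throws.
  <T> T runThrottled(Callable<T> op) throws Exception {
    if (!acquire()) {
      throw new TimeoutException("Timed out waiting for a load/refresh permit");
    }
    try {
      return op.call();
    } finally {
      release();
    }
  }
}
```

The call sites in point 3 would then wrap their existing load logic in
runThrottled(...). One thing to check during design: if one throttled path can
call into another (I have not verified this), nested acquisition could
self-deadlock once all permits are taken.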
Please let me know your thoughts, and any pointers would be appreciated.
> Add config on catalogd for controlling the number of concurrent
> loading/refresh commands
> ----------------------------------------------------------------------------------------
>
> Key: IMPALA-13491
> URL: https://issues.apache.org/jira/browse/IMPALA-13491
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Manish Maheshwari
> Assignee: Arnab Karmakar
> Priority: Critical
>
> When running table loading or refresh commands, catalogd requires working
> memory in proportion to the number of tables being refreshed. While we have a
> table-level lock, we don't have a config to control concurrent load/refresh
> operations.
> For customers that run refresh in parallel from multiple threads, the
> number of load/refresh commands can cause an OOM on the catalog due to
> running out of working memory.