[
https://issues.apache.org/jira/browse/PHOENIX-7724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Viraj Jasani updated PHOENIX-7724:
----------------------------------
Description:
Phoenix implements its own version of major and minor compactions using
CompactionScanner, so that retention is applied to row versions rather than at
the column family level, which avoids data integrity issues.
CompactionScanner is initialized with the PTable object for the given table. We
use conn.getTableNoCache() to retrieve the PTable for both minor and major
compactions, which results in an RPC call to SYSTEM.CATALOG regardless of
whether the regionserver already has the PTable object in its cache.
With a large amount of data and a higher number of regions, compactions occur
more frequently, so at some point they tend to generate a large number of RPC
calls to SYSTEM.CATALOG.
To avoid this, the purpose of this Jira is to use conn.getTable(), which uses
the client-side cache if the PTable is already available and so prevents the
RPC call to SYSTEM.CATALOG (syscat).
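The difference between the two lookup styles can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions: PTableCacheSketch, TableMeta, MetadataClient, fetchFromCatalog and simulateCompactions are hypothetical names invented for illustration and are not Phoenix classes. Only the method names getTable and getTableNoCache mirror the calls mentioned above, and their bodies here are simplified stand-ins, not the Phoenix implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model only: every class here is hypothetical, none is a Phoenix class. */
final class PTableCacheSketch {

    /** Stand-in for table metadata (the real Phoenix type is PTable). */
    static final class TableMeta {
        final String name;
        TableMeta(String name) { this.name = name; }
    }

    /** Stand-in for a connection that resolves table metadata. */
    static final class MetadataClient {
        private final Map<String, TableMeta> cache = new ConcurrentHashMap<>();
        private final AtomicInteger catalogRpcs = new AtomicInteger();

        /** Simulates the remote lookup against the catalog table. */
        private TableMeta fetchFromCatalog(String name) {
            catalogRpcs.incrementAndGet();
            return new TableMeta(name);
        }

        /** Always performs the simulated RPC, then refreshes the cache. */
        TableMeta getTableNoCache(String name) {
            TableMeta fresh = fetchFromCatalog(name);
            cache.put(name, fresh);
            return fresh;
        }

        /** Serves from the cache when possible; only a miss costs an RPC. */
        TableMeta getTable(String name) {
            return cache.computeIfAbsent(name, this::fetchFromCatalog);
        }

        int catalogRpcCount() { return catalogRpcs.get(); }
    }

    /** Runs n simulated compactions of one table and returns the RPC count. */
    static int simulateCompactions(int n, boolean useCache) {
        MetadataClient conn = new MetadataClient();
        for (int i = 0; i < n; i++) {
            if (useCache) {
                conn.getTable("TEST1");
            } else {
                conn.getTableNoCache("TEST1");
            }
        }
        return conn.catalogRpcCount();
    }

    public static void main(String[] args) {
        System.out.println("no-cache RPCs: " + simulateCompactions(100, false)); // 100
        System.out.println("cached RPCs:   " + simulateCompactions(100, true));  // 1
    }
}
```

In this model the number of catalog lookups grows with the number of compactions in the no-cache case, but stays at one per table in the cached case. The sketch deliberately ignores cache staleness and invalidation, which a real metadata cache has to handle.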
Sample error logs when compaction cannot reach SYSTEM.CATALOG:
{code:java}
2025-11-03 19:43:00,226 ERROR [gionserver/regionserver-1:60020-shortCompactions-1] coprocessor.UngroupedAggregateRegionObserver - Unable to modify compaction scanner to retain deleted cells for a table with disabled Index; TEST1
org.apache.phoenix.exception.PhoenixIOException: SYSTEM.CATALOG is disabled.
    at org.apache.phoenix.util.ClientUtil.parseServerException(ClientUtil.java:72)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:2412)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:2358)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.getTable(ConnectionQueryServicesImpl.java:2658)
    at org.apache.phoenix.query.DelegateConnectionQueryServices.getTable(DelegateConnectionQueryServices.java:152)
    at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:605)
    at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:481)
    at org.apache.phoenix.jdbc.PhoenixConnection.getTableNoCache(PhoenixConnection.java:654)
    at org.apache.phoenix.jdbc.PhoenixConnection.getTableNoCache(PhoenixConnection.java:663)
    at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver$4.run(UngroupedAggregateRegionObserver.java:642)
    at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver$4.run(UngroupedAggregateRegionObserver.java:625)
{code}
> CompactionScanner PTable usage should use getTable()
> ----------------------------------------------------
>
> Key: PHOENIX-7724
> URL: https://issues.apache.org/jira/browse/PHOENIX-7724
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Viraj Jasani
> Priority: Major