Emil Kleszcz created PHOENIX-7764:
-------------------------------------

             Summary: Phoenix UngroupedAggregateRegionObserver causes extremely 
slow HBase major compactions by forcing statistics recomputation
                 Key: PHOENIX-7764
                 URL: https://issues.apache.org/jira/browse/PHOENIX-7764
             Project: Phoenix
          Issue Type: Improvement
    Affects Versions: 5.2.1
            Reporter: Emil Kleszcz


On HBase 2.5.10 with Phoenix 5.2.1, major compactions become _orders of 
magnitude slower_ when the Phoenix coprocessor
_org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver_ is attached 
to a table (Phoenix attaches it by default).

Compactions that normally complete in minutes instead run for tens of hours, 
even when compacting only a few GB per column family.
Thread dumps and logs show that Phoenix wraps the HBase compaction with its 
own scanner chain and recomputes Phoenix statistics (guideposts) during 
compaction; this statistics work dominates the runtime.

This makes large Phoenix tables effectively unmaintainable under heavy delete 
or split workloads.
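
For context on the mechanism: an HBase {{RegionObserver}} can substitute its own scanner for the compaction scanner in the {{preCompact}} hook, which puts any per-cell work it does on the compaction's critical path. A minimal illustrative sketch of that hook follows (this shows the general mechanism, not Phoenix's actual implementation; the class name and comments are mine):
{code:java}
import java.io.IOException;
import java.util.List;
import java.util.Optional;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessor;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.coprocessor.RegionObserver;
import org.apache.hadoop.hbase.regionserver.InternalScanner;
import org.apache.hadoop.hbase.regionserver.ScanType;
import org.apache.hadoop.hbase.regionserver.ScannerContext;
import org.apache.hadoop.hbase.regionserver.Store;
import org.apache.hadoop.hbase.regionserver.compactions.CompactionLifeCycleTracker;
import org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest;

// Illustrative only: a coprocessor that wraps the compaction scanner.
// Every cell the compaction rewrites flows through the wrapper, so any
// per-cell work added here lengthens the compaction itself.
public class WrappingCompactionObserver implements RegionCoprocessor, RegionObserver {

  @Override
  public Optional<RegionObserver> getRegionObserver() {
    return Optional.of(this);
  }

  @Override
  public InternalScanner preCompact(ObserverContext<RegionCoprocessorEnvironment> c,
      Store store, InternalScanner scanner, ScanType scanType,
      CompactionLifeCycleTracker tracker, CompactionRequest request) {
    return new InternalScanner() {
      @Override
      public boolean next(List<Cell> result, ScannerContext scannerContext)
          throws IOException {
        boolean more = scanner.next(result, scannerContext);
        // per-cell statistics work would happen here
        return more;
      }

      @Override
      public void close() throws IOException {
        scanner.close();
      }
    };
  }
}
{code}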

*Environment*
 * HBase: 2.5.10
 * Phoenix: 5.2.1
 * Hadoop: 3.3.6
 * JDK: 11.0.24
 * Table: multi-CF (A/B/C/D), billions of rows, heavy deletes

*Observed behavior*

Major compactions on CF A routinely take 20–30 hours for ~4–6 GB of compressed 
region data (depending on the number of tombstones, number of cells, and cell 
sizes):
{code:java}
Completed major compaction ... store A ... into size=3.9 G
This selection was in queue for 58hrs, and took 27hrs, 14mins to execute.{code}
At the same time, compactions on other CFs of similar or larger size complete 
in minutes.

*Evidence: Phoenix on compaction hot path*

1. *Thread dumps during compaction*
All long-running compaction threads are executing Phoenix code:
{code:java}
org.apache.phoenix.coprocessor.CompactionScanner$PhoenixLevelRowCompactor.compactRegionLevel
org.apache.phoenix.schema.stats.StatisticsScanner.next
org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction{code}
2. *RegionServer logs*
{code:java}
Starting CompactionScanner ... store A major compaction
Closing CompactionScanner ... retained N of N cells phoenix level only
{code}
This shows Phoenix intercepting the HBase compaction and running a 
Phoenix-level scan.

3. *HFile inspection*

Large store files show hundreds of millions of delete markers and billions of 
entries.
Phoenix statistics recomputation during compaction requires scanning and 
processing all rows, which dominates runtime.
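
For completeness, tombstone density can also be spot-checked from the client with a raw scan (raw scans return delete markers instead of applying them). A hedged sketch; the table name and CF are placeholders, and the row range should be restricted before running this against a table of this size:
{code:java}
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Counts delete markers vs. total cells in one CF via a raw scan.
// "MY_TABLE" and CF "A" are placeholders.
public class CountTombstones {
  public static void main(String[] args) throws Exception {
    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
         Table table = conn.getTable(TableName.valueOf("MY_TABLE"))) {
      Scan scan = new Scan()
          .setRaw(true)             // return delete markers as cells
          .readAllVersions()
          .addFamily(Bytes.toBytes("A"));
      long cells = 0, deleteMarkers = 0;
      try (ResultScanner rs = table.getScanner(scan)) {
        for (Result r : rs) {
          for (Cell cell : r.rawCells()) {
            cells++;
            if (cell.getType() != Cell.Type.Put) {
              deleteMarkers++;
            }
          }
        }
      }
      System.out.printf("cells=%d, deleteMarkers=%d%n", cells, deleteMarkers);
    }
  }
}
{code}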

*Controlled experiment*
 * Removing only _UngroupedAggregateRegionObserver_ from the table (see the Admin API sketch after this list):
 ** CF A major compactions complete in minutes (comparable to other CFs).
 ** Normal point lookups, scans, joins still work.
 ** Phoenix statistics collection still enabled globally.

 * Side effect:
 ** Ungrouped aggregate queries ({_}COUNT( * ){_}, {_}MIN/MAX{_}, _SUM_ without 
{_}GROUP BY{_}) fail, because Phoenix does not fall back to client-side 
aggregation and still plans {_}SERVER AGGREGATE INTO SINGLE ROW{_} (see the 
EXPLAIN sketch below).
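
One way to run the removal step is through the HBase Admin API; a minimal sketch (the table name is a placeholder, and the same change can be made from the HBase shell):
{code:java}
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptor;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;

// Drops UngroupedAggregateRegionObserver from one table's descriptor.
// "MY_TABLE" is a placeholder; modifyTable reopens the table's regions.
public class DropPhoenixCompactionObserver {
  public static void main(String[] args) throws Exception {
    TableName table = TableName.valueOf("MY_TABLE");
    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
         Admin admin = conn.getAdmin()) {
      TableDescriptor modified = TableDescriptorBuilder
          .newBuilder(admin.getDescriptor(table))
          .removeCoprocessor(
              "org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver")
          .build();
      admin.modifyTable(modified);
    }
  }
}
{code}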

This confirms:
 * The coprocessor is the source of extreme compaction slowdown.
 * Phoenix tightly couples aggregate execution and compaction-time statistics 
recomputation.
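
The coupling is visible in the query plan. A minimal JDBC sketch for checking it (the ZooKeeper quorum and table name are placeholders):
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Prints the Phoenix plan for an ungrouped aggregate. With the
// coprocessor attached, the plan contains
// "SERVER AGGREGATE INTO SINGLE ROW"; with it removed, the query fails
// at runtime instead of falling back to client-side aggregation.
public class ExplainUngroupedAggregate {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("EXPLAIN SELECT COUNT(*) FROM MY_TABLE")) {
      while (rs.next()) {
        System.out.println(rs.getString(1));
      }
    }
  }
}
{code}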

*Problem*
 * Phoenix performs expensive statistics work during HBase major compaction, a 
critical maintenance operation.
 * This work is opaque, unavoidable, and not configurable.
 * Large Phoenix tables with deletes/splits can remain under compaction for 
weeks, causing:
 ** prolonged compaction backlogs,
 ** blocked balancing,
 ** unpredictable query latency spikes.

*Expected*
One of the following (any would be acceptable):
# A configuration to disable Phoenix statistics recomputation during compaction.
# A way to decouple {{UngroupedAggregateRegionObserver}} from compaction-time 
scanning.
# Clear documentation that Phoenix substantially increases HBase major 
compaction cost, with guidance for operating large tables.
# A fix so Phoenix falls back to client-side aggregation when the coprocessor 
is absent (so operators can safely remove it).

At minimum, confirmation of whether this behavior is expected and unavoidable 
in Phoenix 5.2.x on HBase 2.5.x would be appreciated.


