Kadir Ozdemir created PHOENIX-7425:
--------------------------------------
Summary: Partitioned CDC Index for eliminating salting
Key: PHOENIX-7425
URL: https://issues.apache.org/jira/browse/PHOENIX-7425
Project: Phoenix
Issue Type: Improvement
Reporter: Kadir Ozdemir
The CDC future (PHOENX-7001) uses a global uncovered time
(PHOENIX_ROW_TIMESTAM()) based index. Such an index will likely create hot
spotting during writes. This is because the same index region will keep getting
updated as the row key of the index table isĀ PHOENIX_ROW_TIMESTAMP() + data
table row key.
The same hot spotting can happen during reads as a small subset of index
regions can be used for a given time range. For example, the most recent
changes will be retrieved through one or two index regions.
To address these hot spotting issues, PHOENX-7001 suggests salting the index.
There are three main issues with salting.
The first one is that the number of salt buckets is static and needs to be
determined when the index is created.
The second is that salting does not work well with batch writes as it results
in breaking a batch of writes into separate mini batches, one for each salt
bucket. This leads to using more client threads and server RPC handlers, one
for each salt bucket.
The last issue is that the salt buckets are not visible to applications and
thus they cannot take advantage of the parallelism that comes with salting
during reads. For example, there is no way for applications to use multiple
threads, one thread for each salt bucket, for their queries.
To address all these issues that come with salting, this Jira introduces a
built-in function for CDC indexes called PARTITION_ID(). PARTITION_ID() will be
the prefix of an index row key (= PARTITION_ID() + PHOENIX_ROW_TIMESTAMP() +
data table row key). PARTITION_ID() will identify the data table region of the
data table row key. PARTITION_ID() can be the data table region start key and
region timestamp. The timestamp is used to get different partition ids during
region splits.
The information required to form the partition id is readily available for
observer coprocessors. IndexRegionObserver can generate the value for
PARTITION_ID() while generating index mutations.
Like PHOENIX_ROW_TIMESTAMP(), PARTITION_ID() can be used in CDC index queries.
By including PARTITION_ID() in the row key of an index table, we essentially
create the effect of local index such that all index mutations for a given data
table region are written to one index region determined by the PARITION_ID().
However, here we will not have the local index problem with region splits where
copying index rows during data table region splits is required.
It is worthwhile to note that even if we attempt to use local index as CDC
index, applications cannot directly query individual local index regions, which
will be available with global indexes with PARTITION_ID(). The PARTITION_ID()
creates a new class of global indexes that can be called partitioned global
indexes. These will likely be the new local indexes for Phoenix.
The partitioned CDC indexes will eliminate the need for salting CDC indexes.
Adding partition id will increase the row key size of the CDC index. This will
not be an issue for storage footprint as the partition id will be the row key
prefix and it will be compressed using row kew prefix encoding or block
compression.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)