[
https://issues.apache.org/jira/browse/PHOENIX-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lars Hofhansl updated PHOENIX-5577:
-----------------------------------
Description:
One of the strengths of local indexes is that they are the only indexes that
work when not all columns needed for a query are copied into the index,
allowing them to be *much* smaller. However, the missing columns are merged
in one-by-one, per row.
See RegionScannerFactory.getWrappedScanner(...) -> new
RegionScanner(...).nextRaw(...) -> IndexUtil.wrapResultUsingOffset(...)
For index scans this issues a Get back to the same region for each single
scanned row. While the Get is local, it still needs to set up a scanner and seek
to the right key each time. This is pretty inefficient. Local indexes could be
much, much faster at read time for larger scans. This should use a SkipScan
instead for a batch of scanned keys.
(This is mitigated somewhat by setting the block encoding to ROW_INDEX_V1, but
it is still less than ideal.)
was:
See RegionScannerFactory.getWrappedScanner(...) -> new
RegionScanner(...).nextRaw(...) -> IndexUtil.wrapResultUsingOffset(...)
For index scans this issues a Get back to the same region for each single
scanned row. While the Get is local, it still needs to setup a scanner and seek
to the right key each time. This is pretty inefficient. Local indexes could be
much, much faster at read time for larger scans. This should use a SkipScan
instead for a batch of scanned keys.
(This is mitigated some by setting the block encoding to ROW_INDEX_V1, but
still less than ideal.)
> Uncovered columns are retrieved one-by-one in local index scans.
> ----------------------------------------------------------------
>
> Key: PHOENIX-5577
> URL: https://issues.apache.org/jira/browse/PHOENIX-5577
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Lars Hofhansl
> Priority: Major
> Labels: performance
>
> One of the strengths of local indexes is that they are the only indexes that
> work when not all columns needed for a query are copied into the index,
> allowing them to be *much* smaller. However, the missing columns are merged
> in one-by-one, per row.
> See RegionScannerFactory.getWrappedScanner(...) -> new
> RegionScanner(...).nextRaw(...) -> IndexUtil.wrapResultUsingOffset(...)
> For index scans this issues a Get back to the same region for each single
> scanned row. While the Get is local, it still needs to set up a scanner and
> seek to the right key each time. This is pretty inefficient. Local indexes
> could be much, much faster at read time for larger scans. This should use a
> SkipScan instead for a batch of scanned keys.
> (This is mitigated somewhat by setting the block encoding to ROW_INDEX_V1, but
> it is still less than ideal.)
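
To make the difference between the two access patterns concrete, here is a toy
sketch. It is not Phoenix or HBase code: a TreeMap stands in for the sorted data
rows of one region, and the class, method, and field names
(UncoveredColumnLookup, fetchOneByOne, fetchBatch, scannerSetups) are all
hypothetical. It only counts how often a "scanner" would have to be set up in
each approach.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only. None of these names are Phoenix or HBase APIs.
class UncoveredColumnLookup {
    // Sorted data rows of one region: data row key -> uncovered column value.
    final TreeMap<String, String> dataRows = new TreeMap<>();
    // How many times a "scanner" was set up.
    int scannerSetups = 0;

    // Pattern described in the issue: one point lookup per scanned index row.
    List<String> fetchOneByOne(List<String> dataKeys) {
        List<String> values = new ArrayList<>();
        for (String key : dataKeys) {
            scannerSetups++;               // every Get pays the setup cost
            values.add(dataRows.get(key)); // and seeks to its key
        }
        return values;
    }

    // Batched pattern: collect the keys, sort them, set up one scanner and
    // seek forward from key to key (roughly the idea behind a skip scan).
    Map<String, String> fetchBatch(List<String> dataKeys) {
        scannerSetups++;                   // one setup for the whole batch
        Map<String, String> values = new TreeMap<>();
        for (String key : new TreeSet<>(dataKeys)) {
            Map.Entry<String, String> hit = dataRows.ceilingEntry(key);
            if (hit != null && hit.getKey().equals(key)) {
                values.put(key, hit.getValue());
            }
        }
        return values;
    }
}
```

In this model a batch of N scanned index rows costs N scanner setups one-by-one
and a single setup when batched. The batched result is keyed by data row key so
the caller can merge the values back into the index rows in their original
order. How much this helps in a real region would have to be measured.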