Emilio Setiadarma created NIFI-12825:
----------------------------------------
             Summary: Implement processor to get row key ranges for HBase regions
                 Key: NIFI-12825
                 URL: https://issues.apache.org/jira/browse/NIFI-12825
             Project: Apache NiFi
          Issue Type: New Feature
            Reporter: Emilio Setiadarma
            Assignee: Emilio Setiadarma


A common way to parallelize scan operations against HBase is to scan by row key range. In the HBase architecture, a table is split into regions, each holding a range of row keys. These row key ranges are mutually exclusive, and together they cover every row key in the table.

The current manual approach to parallelizing HBase scans by row key range is to open the HBase shell and run the "list_regions" command to obtain the ranges. This approach has downsides, the most important being that row key ranges are not static: an HBase region may split, creating two regions that divide the original row key range in the middle.

Providing a way for NiFi to obtain the row key range of each HBase region would make it easier to build a flow that scans HBase in parallel by row key range. Once the row key ranges are known, they can be fed into a scanning processor (e.g. ScanHBase).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
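
As an illustration of the row key range properties described in this issue, here is a minimal Python sketch. It is not the proposed processor and does not call HBase: the region boundaries are invented, and the names `contains`, `split_region`, and `scan_tasks` are hypothetical. It assumes the usual HBase convention that a range's start key is inclusive, its end key is exclusive, and an empty key marks the open end of the table. (In the HBase Java client, the real boundaries would presumably come from something like `RegionLocator.getStartEndKeys()`.)

```python
from typing import List, Tuple

# A region's key range as (start_key, end_key): start inclusive, end exclusive.
# Assumed convention: an empty start key means "from the beginning of the
# table" and an empty end key means "to the end of the table".
KeyRange = Tuple[bytes, bytes]


def contains(key_range: KeyRange, row_key: bytes) -> bool:
    """Return True if row_key falls inside [start, end)."""
    start, end = key_range
    return row_key >= start and (end == b"" or row_key < end)


def split_region(ranges: List[KeyRange], index: int, split_key: bytes) -> List[KeyRange]:
    """Model a region split: replace one range with two that meet at split_key."""
    start, end = ranges[index]
    if not contains((start, end), split_key) or split_key == start:
        raise ValueError("split key must fall strictly inside the region")
    return ranges[:index] + [(start, split_key), (split_key, end)] + ranges[index + 1:]


def scan_tasks(ranges: List[KeyRange]) -> List[dict]:
    """Turn each range into the start/end row pair one parallel scan would use."""
    return [{"start_row": start, "end_row": end} for start, end in ranges]


# Hypothetical table with three regions (keys invented for illustration).
regions = [(b"", b"g"), (b"g", b"p"), (b"p", b"")]

# The ranges are mutually exclusive and cover every row key:
# each key below belongs to exactly one region.
for key in (b"apple", b"g", b"melon", b"zebra"):
    assert sum(contains(r, key) for r in regions) == 1

# After a split, a range list captured earlier no longer matches the table.
after_split = split_region(regions, 1, b"k")
```

Each entry returned by `scan_tasks` corresponds to one independent scan, which is the kind of information a flow could hand to a scanning processor. Because a split changes the list, ranges copied by hand from the shell can go stale, which is the motivation for fetching them at flow run time.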