[ https://issues.apache.org/jira/browse/HBASE-20769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524915#comment-16524915 ]
Hudson commented on HBASE-20769:
--------------------------------

Results for branch branch-2
	[build #913 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/913/]: (x) *{color:red}-1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/913//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/913//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/913//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}


> getSplits() has a out of bounds problem in TableSnapshotInputFormatImpl
> -----------------------------------------------------------------------
>
>                 Key: HBASE-20769
>                 URL: https://issues.apache.org/jira/browse/HBASE-20769
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.3.0, 1.4.0, 2.0.0
>            Reporter: Jingyun Tian
>            Assignee: Jingyun Tian
>            Priority: Major
>             Fix For: 2.0.0
>
>         Attachments: HBASE-20769.master.001.patch,
> HBASE-20769.master.002.patch, HBASE-20769.master.003.patch,
> HBASE-20769.master.004.patch
>
>
> When numSplits > 1, getSplits may create a split whose start row is smaller
> than the user-specified scan's start row, or whose stop row is larger than
> the user-specified scan's stop row.
> {code}
> byte[][] sp = sa.split(hri.getStartKey(), hri.getEndKey(), numSplits, true);
> for (int i = 0; i < sp.length - 1; i++) {
>   if (PrivateCellUtil.overlappingKeys(scan.getStartRow(), scan.getStopRow(), sp[i],
>       sp[i + 1])) {
>     List<String> hosts =
>         calculateLocationsForInputSplit(conf, htd, hri, tableDir, localityEnabled);
>     Scan boundedScan = new Scan(scan);
>     // Problem: the split boundaries are used as-is, even when they lie outside the scan range.
>     boundedScan.setStartRow(sp[i]);
>     boundedScan.setStopRow(sp[i + 1]);
>     splits.add(new InputSplit(htd, hri, hosts, boundedScan, restoreDir));
>   }
> }
> {code}
> Since the split keys are computed from the region's key range, when
> sp[i] < scan.getStartRow() or sp[i + 1] > scan.getStopRow(), the bounded scan
> may cover rows outside the user-defined scan range.
> The fix should be simple:
> {code}
> // Use the later of the two start rows and the earlier of the two stop rows.
> boundedScan.setStartRow(
>     Bytes.compareTo(scan.getStartRow(), sp[i]) > 0 ? scan.getStartRow() : sp[i]);
> boundedScan.setStopRow(
>     Bytes.compareTo(scan.getStopRow(), sp[i + 1]) < 0 ? scan.getStopRow() : sp[i + 1]);
> {code}
> I will also try to add UTs to help discover this problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
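
The clamping idea in the proposed fix can be tried outside HBase with plain byte arrays. The following is a minimal standalone sketch, not the patch itself: the class SplitClamp and the helpers compare, maxStart, minStop and clamp are hypothetical names introduced for illustration. It also assumes the HBase convention that an empty stop row means "up to the end of the table" and treats it as unbounded; the snippet quoted in the issue does not show that handling, so whether it is needed depends on the callers.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of HBase.
final class SplitClamp {

  // Lexicographic comparison of unsigned bytes, the ordering used for row keys.
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) {
        return d;
      }
    }
    return a.length - b.length;
  }

  // Later of two start rows. An empty start row sorts first, so it never wins
  // against a non-empty one.
  static byte[] maxStart(byte[] a, byte[] b) {
    return compare(a, b) >= 0 ? a : b;
  }

  // Earlier of two stop rows. Assumption: an empty stop row means "unbounded",
  // so the other bound is used whenever one side is empty.
  static byte[] minStop(byte[] a, byte[] b) {
    if (a.length == 0) {
      return b;
    }
    if (b.length == 0) {
      return a;
    }
    return compare(a, b) <= 0 ? a : b;
  }

  // Intersects the user's scan range with one split range: {start, stop}.
  static byte[][] clamp(byte[] scanStart, byte[] scanStop,
                        byte[] splitStart, byte[] splitStop) {
    return new byte[][] {
      maxStart(scanStart, splitStart),
      minStop(scanStop, splitStop)
    };
  }

  static byte[] key(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // Scan covers [b, m); the split covers [a, f) and starts before the scan.
    byte[][] r = clamp(key("b"), key("m"), key("a"), key("f"));
    System.out.println(new String(r[0], StandardCharsets.UTF_8)
        + " .. " + new String(r[1], StandardCharsets.UTF_8)); // prints "b .. f"
  }
}
```

Without the clamp, the split [a, f) would be scanned in full and return rows in [a, b) that the user never asked for, which is the out-of-bounds behaviour described in the issue.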