Github user ramkrish86 commented on the pull request:
https://github.com/apache/phoenix/pull/12#issuecomment-55643833
protected List<KeyRange> genKeyRanges(List<HRegionLocation> regions) {
    if (regions.isEmpty()) { return Collections.emptyList(); }
    Scan scan = context.getScan();
    PTable table = this.tableRef.getTable();
    byte[] defaultCF = SchemaUtil.getEmptyColumnFamily(table);
    List<byte[]> gps = null;
    try {
        if (table.getColumnFamilies().isEmpty()) {
            // For sure we can get the defaultCF from the table
            gps = table.getTableStats().getGuidePosts().get(defaultCF);
        } else {
            if (scan.getFamilyMap().size() > 0) {
                if (scan.getFamilyMap().containsKey(defaultCF)) {
                    // Favor using default CF if it's used in scan
                    gps = table.getColumnFamily(defaultCF).getGuidePosts();
                } else {
                    // Otherwise, just use first CF in use by scan
                    gps = table.getColumnFamily(
                            scan.getFamilyMap().keySet().iterator().next()).getGuidePosts();
                }
            } else {
                gps = table.getColumnFamily(defaultCF).getGuidePosts();
            }
        }
    } catch (Exception cfne) {
        logger.error("Error while getting guideposts for the cf " + Bytes.toString(defaultCF));
    }
    List<KeyRange> guidePosts = Lists.newArrayListWithCapacity(regions.size());
    List<KeyRange> regionStartEndKey = Lists.newArrayListWithExpectedSize(regions.size());
    for (HRegionLocation region : regions) {
        regionStartEndKey.add(KeyRange.getKeyRange(region.getRegionInfo().getStartKey(),
                region.getRegionInfo().getEndKey()));
    }
    if (gps != null) {
        byte[] startKey = regions.get(0).getRegionInfo().getStartKey();
        int regionSize = regions.size();
        int regionIndex = 0;
        int guideIndex = 0;
        int gpsSize = gps.size();
        while ((regionIndex <= regionSize - 1) && (guideIndex <= gpsSize - 1)) {
            byte[] guidePost = gps.get(guideIndex);
            PhoenixArray array = (PhoenixArray) PDataType.VARBINARY_ARRAY.toObject(guidePost);
            byte[] regionEndKey = regions.get(regionIndex).getRegionInfo().getEndKey();
            if (array != null && array.getDimensions() != 0) {
                boolean intersects = false;
                for (int j = 0; j < array.getDimensions(); j++) {
                    byte[] currentGuidePost = array.toBytes(j);
                    if (Bytes.compareTo(currentGuidePost, regionEndKey) <= 0) {
                        KeyRange keyRange = KeyRange.getKeyRange(startKey, currentGuidePost);
                        // Contains check may be too costly
                        if (keyRange != KeyRange.EMPTY_RANGE) {
                            guidePosts.add(keyRange);
                        }
                        startKey = currentGuidePost;
                        if (!intersects) {
                            guideIndex++;
                            intersects = true;
                        }
                    }
                }
            }
            guidePosts.add(KeyRange.getKeyRange(startKey, regionEndKey));
            regionIndex++;
            if (regionIndex <= regionSize - 1) {
                startKey = regions.get(regionIndex).getRegionInfo().getStartKey();
            }
        }
        if (guidePosts.size() > 0) {
            List<KeyRange> intersect = KeyRange.intersect(guidePosts, regionStartEndKey);
            return intersect;
        } else {
            return regionStartEndKey;
        }
    } else {
        return regionStartEndKey;
    }
}
Please see the final intersect. I will do this for now and make sure that we
get the splits right. Without it the split code does not work correctly,
because from the split hook we are not able to do any updates on the split
table. We could better target that in another JIRA.
The final intersect is needed because, although we get new guideposts based on
the new regions, the other guideposts may also match another region's end key,
so we end up with overlapping ranges. To resolve that, we can intersect the
result with each region's start and end key. @JamesRTaylor, what do you think?
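
To illustrate what the final intersect is meant to achieve, here is a minimal standalone sketch. It is not Phoenix's KeyRange.intersect: it assumes plain int keys in place of byte[] row keys, half-open ranges, and that each input list is sorted and non-overlapping within itself. The names RangeIntersectSketch, Range and intersect are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RangeIntersectSketch {
    /** Half-open range [start, end) over ints; a stand-in for a row-key range. */
    static final class Range {
        final int start;
        final int end;
        Range(int start, int end) { this.start = start; this.end = end; }
        @Override public String toString() { return "[" + start + "," + end + ")"; }
    }

    /**
     * Intersects two range lists with a two-pointer walk.
     * Assumes each list is sorted by start and has no overlaps within itself.
     */
    static List<Range> intersect(List<Range> a, List<Range> b) {
        List<Range> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.size() && j < b.size()) {
            Range x = a.get(i);
            Range y = b.get(j);
            int lo = Math.max(x.start, y.start);
            int hi = Math.min(x.end, y.end);
            if (lo < hi) {
                out.add(new Range(lo, hi)); // keep only the non-empty overlap
            }
            // Advance whichever range ends first; it cannot overlap anything later.
            if (x.end <= y.end) { i++; } else { j++; }
        }
        return out;
    }

    public static void main(String[] args) {
        // Guidepost-derived ranges; the middle one straddles a region boundary at 10.
        List<Range> guidePosts = Arrays.asList(
                new Range(0, 4), new Range(4, 12), new Range(12, 20));
        // Region start/end keys.
        List<Range> regions = Arrays.asList(new Range(0, 10), new Range(10, 20));
        System.out.println(intersect(guidePosts, regions));
        // prints [[0,4), [4,10), [10,12), [12,20)]
    }
}
```

In this sketch the range [4,12) crosses the region boundary at 10, and the intersect splits it into [4,10) and [10,12), so no resulting range spans two regions.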