[jira] [Commented] (HBASE-14206) MultiRowRangeFilter returns records whose rowKeys are out of allowed ranges
[ https://issues.apache.org/jira/browse/HBASE-14206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681831#comment-14681831 ]

Jiajia Li commented on HBASE-14206:
-----------------------------------

I ran the test against trunk, with one minor change to the test. I replaced
{code}
filter.filterRowKey(badKey, 0, 1);
{code}
with
{code}
filter.filterRowKey(KeyValueUtil.createFirstOnRow(badKey));
{code}
I think the fix is OK.

> MultiRowRangeFilter returns records whose rowKeys are out of allowed ranges
> ---------------------------------------------------------------------------
>
> Key: HBASE-14206
> URL: https://issues.apache.org/jira/browse/HBASE-14206
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.1.0
> Environment: linux, java7
> Reporter: Anton Nazaruk
> Assignee: Ted Yu
> Priority: Critical
> Labels: filter
> Attachments: 14206-test.patch
>
>
> I haven't found a way to attach a test program to the JIRA issue, so I put it below:
> {code}
> import static org.junit.Assert.assertEquals;
>
> import java.io.IOException;
> import java.util.Arrays;
>
> import org.apache.hadoop.hbase.filter.Filter;
> import org.apache.hadoop.hbase.filter.MultiRowRangeFilter;
> import org.junit.Test;
>
> public class MultiRowRangeFilterTest {
>
>   // Two allowed ranges and one key that lies in neither of them.
>   byte[] key1Start = new byte[] {-3};
>   byte[] key1End = new byte[] {-2};
>   byte[] key2Start = new byte[] {5};
>   byte[] key2End = new byte[] {6};
>   byte[] badKey = new byte[] {-10};
>
>   @Test
>   public void testRanges() throws IOException {
>     MultiRowRangeFilter filter = new MultiRowRangeFilter(Arrays.asList(
>         new MultiRowRangeFilter.RowRange(key1Start, true, key1End, false),
>         new MultiRowRangeFilter.RowRange(key2Start, true, key2End, false)
>     ));
>     filter.filterRowKey(badKey, 0, 1);
>     /*
>      * FAILS -- includes BAD key!
>      * Expected :SEEK_NEXT_USING_HINT
>      * Actual   :INCLUDE
>      */
>     assertEquals(Filter.ReturnCode.SEEK_NEXT_USING_HINT, filter.filterKeyValue(null));
>   }
> }
> {code}
> It seems to happen on 2.0.0-SNAPSHOT too, but I wasn't able to link one with
> included class.
> I spent some time with the algorithm and found that a quick fix can be
> applied to the "getNextRangeIndex(byte[] rowKey)" method (hbase-client:1.1.0):
> {code}
> if (insertionPosition == 0 &&
>     !rangeList.get(insertionPosition).contains(rowKey)) {
>   return ROW_BEFORE_FIRST_RANGE;
> }
> // FIX START
> if (!this.initialized) {
>   this.initialized = true;
> }
> // FIX END
> return insertionPosition;
> {code}
> Thanks, I hope it helps.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
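A note for readers following the test case in this message: HBase orders row keys as unsigned bytes, so the single-byte key {-10} (0xF6) does not sort before {5}. It falls between the two ranges in the test. The sketch below illustrates this with the JDK's `Arrays.compareUnsigned` (Java 9+) rather than HBase's own comparator; the class `UnsignedKeyOrder` and its methods are hypothetical names used only for illustration.

```java
import java.util.Arrays;

// Hypothetical helper, not part of HBase: shows how the keys from the test
// sort when bytes are compared as unsigned values.
class UnsignedKeyOrder {
    // True if key lies in [start, stop) under unsigned lexicographic order.
    static boolean inRange(byte[] key, byte[] start, byte[] stop) {
        return Arrays.compareUnsigned(key, start) >= 0
            && Arrays.compareUnsigned(key, stop) < 0;
    }

    // True if a sorts strictly before b under unsigned lexicographic order.
    static boolean sortsBefore(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b) < 0;
    }

    public static void main(String[] args) {
        byte[] key1Start = {-3}, key1End = {-2};
        byte[] key2Start = {5}, key2End = {6};
        byte[] badKey = {-10};

        // badKey sits after the range [5, 6) and before the range [-3, -2).
        System.out.println(sortsBefore(key2End, badKey));
        System.out.println(sortsBefore(badKey, key1Start));
        // It belongs to neither range.
        System.out.println(inRange(badKey, key1Start, key1End));
        System.out.println(inRange(badKey, key2Start, key2End));
    }
}
```

Under this ordering the sorted range list is [5, 6) followed by [-3, -2), and the bad key lands in the gap between them rather than before the first range.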
[jira] [Commented] (HBASE-13704) Hbase throws OutOfOrderScannerNextException exception when MultiRowRangeFilter is used.
[ https://issues.apache.org/jira/browse/HBASE-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551719#comment-14551719 ]

Jiajia Li commented on HBASE-13704:
-----------------------------------

The patch looks good to me.

> Hbase throws OutOfOrderScannerNextException exception when MultiRowRangeFilter is used.
> ---------------------------------------------------------------------------
>
> Key: HBASE-13704
> URL: https://issues.apache.org/jira/browse/HBASE-13704
> Project: HBase
> Issue Type: Bug
> Components: Client
> Affects Versions: 1.1.0
> Reporter: Aleksandr Maksymenko
> Assignee: Aleksandr Maksymenko
> Fix For: 2.0.0, 1.2.0, 1.1.1
> Attachments: 13704-v1.txt
>
>
> When MultiRowRangeFilter is used with ranges so close to each other that
> there are no rows between them, OutOfOrderScannerNextException is thrown.
> In the filterRowKey method, when the filter switches to the next range,
> currentReturnCode is set to SEEK_NEXT_USING_HINT (MultiRowRangeFilter: 118 in
> v1.1.0). But if the new range already contains this row, we should include
> the row rather than seek to another one.
> Replacing line 118 with this code seems to work fine:
> {code}
> if (range.contains(buffer, offset, length)) {
>   currentReturnCode = ReturnCode.INCLUDE;
> } else {
>   currentReturnCode = ReturnCode.SEEK_NEXT_USING_HINT;
> }
> {code}
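The decision described in this report can be sketched in isolation. The following is a simplified model under stated assumptions, not the HBase implementation: row keys are Strings, ranges are half-open [start, stop), and the names `RangeSwitchSketch`, `Range` and `onRangeSwitch` are hypothetical.

```java
// Simplified model, not HBase code: String row keys and half-open ranges.
class RangeSwitchSketch {
    enum ReturnCode { INCLUDE, SEEK_NEXT_USING_HINT }

    static final class Range {
        final String start; // inclusive
        final String stop;  // exclusive

        Range(String start, String stop) {
            this.start = start;
            this.stop = stop;
        }

        boolean contains(String row) {
            return row.compareTo(start) >= 0 && row.compareTo(stop) < 0;
        }
    }

    // Decision taken right after the filter has advanced to the range `next`:
    // include the row if the new range already covers it, otherwise seek.
    static ReturnCode onRangeSwitch(Range next, String row) {
        return next.contains(row) ? ReturnCode.INCLUDE : ReturnCode.SEEK_NEXT_USING_HINT;
    }

    public static void main(String[] args) {
        // Range [c, e) directly follows a first range [a, c): row "c" is already inside it.
        System.out.println(onRangeSwitch(new Range("c", "e"), "c"));
        // Range [d, f) leaves a gap after [a, c): row "c" is not inside it yet.
        System.out.println(onRangeSwitch(new Range("d", "f"), "c"));
    }
}
```

In the adjacent-range case the current row is the first row of the new range, which is the situation the reporter describes as triggering the exception when the filter asks for a seek instead.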
[jira] [Commented] (HBASE-13704) Hbase throws OutOfOrderScannerNextException exception when MultiRowRangeFilter is used.
[ https://issues.apache.org/jira/browse/HBASE-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549736#comment-14549736 ]

Jiajia Li commented on HBASE-13704:
-----------------------------------

I think that if the range already contains the row, execution goes from Line 91 to Line 121. Can you describe your test in more detail? Maybe I got it wrong. Thanks.

> Hbase throws OutOfOrderScannerNextException exception when MultiRowRangeFilter is used.
> ---------------------------------------------------------------------------
>
> Key: HBASE-13704
> URL: https://issues.apache.org/jira/browse/HBASE-13704
> Project: HBase
> Issue Type: Bug
> Components: Client
> Affects Versions: 1.1.0
> Reporter: Aleksandr Maksymenko
>
> When MultiRowRangeFilter is used with ranges so close to each other that
> there are no rows between them, OutOfOrderScannerNextException is thrown.
> In the filterRowKey method, when the filter switches to the next range,
> currentReturnCode is set to SEEK_NEXT_USING_HINT (MultiRowRangeFilter: 118 in
> v1.1.0). But if the new range already contains this row, we should include
> the row rather than seek to another one.
> Replacing line 118 with this code seems to work fine:
> {code}
> if (range.contains(buffer, offset, length)) {
>   currentReturnCode = ReturnCode.INCLUDE;
> } else {
>   currentReturnCode = ReturnCode.SEEK_NEXT_USING_HINT;
> }
> {code}
[jira] [Commented] (HBASE-13705) MultiRowRangeFilter seems to be working incorrect if RowRange.startRowInclusive = false
[ https://issues.apache.org/jira/browse/HBASE-13705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549730#comment-14549730 ]

Jiajia Li commented on HBASE-13705:
-----------------------------------

For the second issue, suppose we have a range [10,12) and only one rowkey, "12", in the HBase table. The steps of the scan are:
1. Line 91: Check if the current range contains row "12".
2. Line 94: Search for the next RowRange in method getNextRangeIndex.
3. Line 223: index = -2
4. Line 225: insertionPosition = 1
5. Line 95: The returned index (1) equals rangeList.size() (1), so false is returned and row "12" is not included.

[~masyaman], I think this test is OK; please advise if I didn't get your point.

> MultiRowRangeFilter seems to be working incorrect if RowRange.startRowInclusive = false
> ---------------------------------------------------------------------------
>
> Key: HBASE-13705
> URL: https://issues.apache.org/jira/browse/HBASE-13705
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.1.0
> Reporter: Aleksandr Maksymenko
>
> I found the issue during code review, so I don't have tests and I haven't
> even tested this case manually. So I'll try to describe it in words.
> Pre-condition: we're using a scan with MultiRowRangeFilter with some RowRanges
> that have startRowInclusive = false. This means that we want to include all rows
> that are strictly greater than startRow (and less than stopRow, but that
> doesn't matter for now).
> What happens in MultiRowRangeFilter.filterRowKey (the worst case is described):
> 1. Line 91: Check if the current range contains the row. Let's follow the case
> when it doesn't.
> 2. Line 94: Search for the next RowRange in method getNextRangeIndex.
> 3. Line 238: We've found a RowRange; check if startRowInclusive == false and
> set EXCLUSIVE = true. This variable indicates whether the next row should be excluded.
> 4. Line 105: Check if EXCLUSIVE == true; if so, skip this row.
> The problem: we've skipped the first row we got in this range, but we never
> checked whether this row is RowRange.startRow. In a distributed system we may not
> get RowRange.startRow on the current instance, so we may exclude some other
> row. Moreover, RowRange.startRow may not exist in the DB at all, in which case
> we will exclude a row that is (possibly) close to RowRange.startRow but not
> equal to it.
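The concern in this report is the difference between skipping whichever row arrives first and skipping only a row that equals the start key. A minimal sketch of the second behaviour follows, assuming String row keys and an exclusive stop row. `ExclusiveStartSketch` and `contains` are hypothetical names, and this is not a claim about what the shipped filter does, which is what the thread is debating.

```java
// Simplified model, not HBase code: an explicit bounds check that honours
// the startInclusive flag instead of unconditionally skipping the first row.
class ExclusiveStartSketch {
    static boolean contains(String row, String start, boolean startInclusive, String stop) {
        int cmpStart = row.compareTo(start);
        // With an exclusive start, only a row equal to `start` is rejected here.
        boolean afterStart = startInclusive ? cmpStart >= 0 : cmpStart > 0;
        return afterStart && row.compareTo(stop) < 0;
    }

    public static void main(String[] args) {
        // Range (10, 20): the start row itself is excluded...
        System.out.println(contains("10", "10", false, "20"));
        // ...but "11" is kept even if it is the first row seen in the range,
        // for example because "10" does not exist in the table.
        System.out.println(contains("11", "10", false, "20"));
    }
}
```

Comparing against the start key directly means the outcome does not depend on which row happens to be encountered first on a given region.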
[jira] [Updated] (HBASE-13012) Add a shell to trigger the mob file compactor by commands
[ https://issues.apache.org/jira/browse/HBASE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiajia Li updated HBASE-13012:
------------------------------
Attachment: hbase-13012-V4.diff

Updated the patch (hbase-13012-V4) according to Jingcheng's comments.

> Add a shell to trigger the mob file compactor by commands
> ---------------------------------------------------------
>
> Key: HBASE-13012
> URL: https://issues.apache.org/jira/browse/HBASE-13012
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver, Scanners
> Reporter: Jingcheng Du
> Assignee: Jiajia Li
> Fix For: hbase-11339
> Attachments: hbase-13012-V1.patch, hbase-13012-V2.diff, hbase-13012-V3.diff, hbase-13012-V4.diff
>
>
> Currently the MobFileCompactor is run periodically by HMaster; we need to add
> a shell command to trigger the compactor.
[jira] [Updated] (HBASE-13012) Add a shell to trigger the mob file compactor by commands
[ https://issues.apache.org/jira/browse/HBASE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiajia Li updated HBASE-13012:
------------------------------
Attachment: hbase-13012-V3.diff

Thanks [~jingcheng...@intel.com], I've uploaded the patch (hbase-13012-V3.diff) according to your comments.
Hi [~j...@cloudera.com], [~ram_krish], [~anoopsamjohn], can you review this patch? Thanks.

> Add a shell to trigger the mob file compactor by commands
> ---------------------------------------------------------
>
> Key: HBASE-13012
> URL: https://issues.apache.org/jira/browse/HBASE-13012
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver, Scanners
> Reporter: Jingcheng Du
> Assignee: Jiajia Li
> Fix For: hbase-11339
> Attachments: hbase-13012-V1.patch, hbase-13012-V2.diff, hbase-13012-V3.diff
>
>
> Currently the MobFileCompactor is run periodically by HMaster; we need to add
> a shell command to trigger the compactor.
[jira] [Updated] (HBASE-13012) Add a shell to trigger the mob file compactor by commands
[ https://issues.apache.org/jira/browse/HBASE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiajia Li updated HBASE-13012:
------------------------------
Attachment: hbase-13012-V2.diff

Updated the patch (hbase-13012-V2.diff) according to Jingcheng's comments.

> Add a shell to trigger the mob file compactor by commands
> ---------------------------------------------------------
>
> Key: HBASE-13012
> URL: https://issues.apache.org/jira/browse/HBASE-13012
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver, Scanners
> Reporter: Jingcheng Du
> Assignee: Jiajia Li
> Fix For: hbase-11339
> Attachments: hbase-13012-V1.patch, hbase-13012-V2.diff
>
>
> Currently the MobFileCompactor is run periodically by HMaster; we need to add
> a shell command to trigger the compactor.
[jira] [Updated] (HBASE-13012) Add a shell to trigger the mob file compactor by commands
[ https://issues.apache.org/jira/browse/HBASE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-13012: -- Attachment: hbase-13012-V1.patch Upload the hbase-13012-V1.patch, RB: https://reviews.apache.org/r/31053/ Hi, [~j...@cloudera.com], [~anoopsamjohn], [~ram_krish], [~jingcheng...@intel.com], please look at it, thanks. > Add a shell to trigger the mob file compactor by commands > - > > Key: HBASE-13012 > URL: https://issues.apache.org/jira/browse/HBASE-13012 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Reporter: Jingcheng Du >Assignee: Jiajia Li > Fix For: hbase-11339 > > Attachments: hbase-13012-V1.patch > > > Currently the MobFileCompactor is run by HMaster periodically, we need to add > a shell to trigger the compactor by commands. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12332) [mob] use filelink instead of retry when resolving mobfiles
[ https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313322#comment-14313322 ] Jiajia Li commented on HBASE-12332: --- Hi, [~j...@cloudera.com], what is the plan for this jira? > [mob] use filelink instead of retry when resolving mobfiles > --- > > Key: HBASE-12332 > URL: https://issues.apache.org/jira/browse/HBASE-12332 > Project: HBase > Issue Type: Sub-task > Components: mob >Affects Versions: hbase-11339 >Reporter: Jonathan Hsieh > Fix For: hbase-11339 > > Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch, > HBASE-12332-V3.patch, HBASE-12332-V5.patch, hbase-12332.link.v4.patch, > hbase-12332.patch > > > in the snapshot code, hmobstore was modified to traverse an hfile link to a > mob. Ideally this should use the transparent filelink code to read the data. > Also there will likely be some issues with the mob file cache with these > links. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12332) [mob] use filelink instead of retry when resolving mobfiles
[ https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285092#comment-14285092 ]

Jiajia Li commented on HBASE-12332:
-----------------------------------

I updated to V5 because, when I tested the case of reading an opened hfile after deleting the original table, a FileNotFoundException was raised by the namenode, but this FNFE is wrapped in an IOException in StoreFileScanner.seek. So I added a check on e.getCause() to avoid an HBase client retry when reading the cell fails.

> [mob] use filelink instead of retry when resolving mobfiles
> -----------------------------------------------------------
>
> Key: HBASE-12332
> URL: https://issues.apache.org/jira/browse/HBASE-12332
> Project: HBase
> Issue Type: Sub-task
> Components: mob
> Affects Versions: hbase-11339
> Reporter: Jonathan Hsieh
> Fix For: hbase-11339
> Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch, HBASE-12332-V3.patch, HBASE-12332-V5.patch, hbase-12332.link.v4.patch, hbase-12332.patch
>
>
> In the snapshot code, hmobstore was modified to traverse an hfile link to a
> mob. Ideally this should use the transparent filelink code to read the data.
> Also there will likely be some issues with the mob file cache with these
> links.
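The check described in this comment, treating an IOException that wraps a FileNotFoundException the same as a direct one, can be sketched with the JDK alone. The class and method names below are hypothetical, and the sketch says nothing about where the real patch places the check.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical helper, not taken from the patch: detects a missing file
// whether the FileNotFoundException is thrown directly or wrapped one level deep.
class FnfeUnwrapSketch {
    static boolean isFileNotFound(IOException e) {
        return e instanceof FileNotFoundException
            || e.getCause() instanceof FileNotFoundException;
    }

    public static void main(String[] args) {
        IOException wrapped = new IOException("seek failed", new FileNotFoundException("hfile gone"));
        IOException unrelated = new IOException("connection reset");
        System.out.println(isFileNotFound(wrapped));
        System.out.println(isFileNotFound(unrelated));
    }
}
```

A caller could use such a predicate to fall through to another file location instead of surfacing a retriable error, which appears to be the intent described above.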
[jira] [Updated] (HBASE-12332) [mob] use filelink instead of retry when resolving mobfiles
[ https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiajia Li updated HBASE-12332:
------------------------------
Attachment: HBASE-12332-V5.patch

Hi [~j...@cloudera.com], HBASE-12332-V5 is based on HBASE-12332-V3 and adds the check (e.getCause() instanceof FileNotFoundException) when reading a cell. Can you look at it? Thanks

> [mob] use filelink instead of retry when resolving mobfiles
> -----------------------------------------------------------
>
> Key: HBASE-12332
> URL: https://issues.apache.org/jira/browse/HBASE-12332
> Project: HBase
> Issue Type: Sub-task
> Components: mob
> Affects Versions: hbase-11339
> Reporter: Jonathan Hsieh
> Fix For: hbase-11339
> Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch, HBASE-12332-V3.patch, HBASE-12332-V5.patch, hbase-12332.link.v4.patch, hbase-12332.patch
>
>
> In the snapshot code, hmobstore was modified to traverse an hfile link to a
> mob. Ideally this should use the transparent filelink code to read the data.
> Also there will likely be some issues with the mob file cache with these
> links.
[jira] [Updated] (HBASE-11144) Filter to support scanning multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiajia Li updated HBASE-11144:
------------------------------
Release Note: MultiRowRangeFilter is a filter to support scanning multiple row key ranges. If the number of ranges is small, using multiple scans can do the same thing and works well. But when the number of ranges is quite big (e.g. millions), MultiRowRangeFilter is a better fit. In this filter, the ranges are sorted and merged, so users do not have to worry about whether the ranges they supply are contiguous. And if users access the data through something like REST, Thrift or Pig, the filter might be the practical solution. (was: MultiRowRangeFilter is a filter to support scanning multiple row key ranges.)

> Filter to support scanning multiple row key ranges
> --------------------------------------------------
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
> Issue Type: Improvement
> Components: Filters
> Reporter: Jiajia Li
> Assignee: Jiajia Li
> Fix For: 2.0.0, 1.1.0
> Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, HBASE_11144_V14.patch, HBASE_11144_V15.patch, HBASE_11144_V16.patch, HBASE_11144_V17.patch, HBASE_11144_V18.patch, HBASE_11144_V5.patch, HBASE_11144_V6.patch, HBASE_11144_V7.patch, HBASE_11144_V9.patch, MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>
> HBase is quite efficient when scanning only one small row key range. If a user
> needs to specify multiple row key ranges in one scan, the typical solutions
> are: 1. a FilterList, which is a list of row key Filters; 2. using a
> SQL layer over HBase, such as Hive or Phoenix, to join two tables.
> However, both solutions are inefficient. Neither can use the range
> info to fast-forward during the scan, which makes them quite time-consuming. If
> the number of ranges is quite big (e.g. millions), a join is a proper solution
> even though it is slow. However, there are cases where the user wants to specify a
> small number of ranges to scan (e.g. <1000 ranges). Neither solution
> provides satisfactory performance in such cases.
> We provide this filter (MultiRowRangeFilter) to support this use case (scanning
> multiple row key ranges). It constructs the row key ranges from a
> user-specified list and fast-forwards during the scan, so the scan is
> quite efficient.
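The release note says the ranges are sorted and merged before use. Below is a minimal sketch of that kind of preprocessing. It assumes String row keys, half-open ranges [start, stop), and that touching ranges are merged as well as overlapping ones. `RangeMergeSketch` and its members are hypothetical names, and the real filter works on byte[] keys with inclusive/exclusive flags, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Simplified model, not HBase code: sort ranges by start key, then merge
// any range that overlaps or touches the previous one.
class RangeMergeSketch {
    static final class Range {
        final String start; // inclusive
        final String stop;  // exclusive

        Range(String start, String stop) {
            this.start = start;
            this.stop = stop;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + stop + ")";
        }
    }

    static List<Range> sortAndMerge(List<Range> input) {
        List<Range> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparing((Range r) -> r.start));

        List<Range> merged = new ArrayList<>();
        for (Range r : sorted) {
            if (!merged.isEmpty()) {
                Range last = merged.get(merged.size() - 1);
                if (r.start.compareTo(last.stop) <= 0) {
                    // Overlapping or touching: extend the previous range if needed.
                    String stop = r.stop.compareTo(last.stop) > 0 ? r.stop : last.stop;
                    merged.set(merged.size() - 1, new Range(last.start, stop));
                    continue;
                }
            }
            merged.add(r);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Range> ranges = Arrays.asList(
            new Range("k", "m"), new Range("a", "c"), new Range("b", "e"));
        System.out.println(sortAndMerge(ranges));
    }
}
```

Once the list is sorted and disjoint, a binary search over the start keys is enough to find the range a row belongs to or the next range to seek to, which is what makes fast-forwarding cheap.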
[jira] [Commented] (HBASE-11144) Filter to support scanning multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276522#comment-14276522 ]

Jiajia Li commented on HBASE-11144:
-----------------------------------

I've only tested FilterList against MultiRowRangeFilter; the FilterList uses RowFilter.

> Filter to support scanning multiple row key ranges
> --------------------------------------------------
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
> Issue Type: Improvement
> Components: Filters
> Reporter: Jiajia Li
> Assignee: Jiajia Li
> Fix For: 2.0.0, 1.1.0
> Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, HBASE_11144_V14.patch, HBASE_11144_V15.patch, HBASE_11144_V16.patch, HBASE_11144_V17.patch, HBASE_11144_V18.patch, HBASE_11144_V5.patch, HBASE_11144_V6.patch, HBASE_11144_V7.patch, HBASE_11144_V9.patch, MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>
> HBase is quite efficient when scanning only one small row key range. If a user
> needs to specify multiple row key ranges in one scan, the typical solutions
> are: 1. a FilterList, which is a list of row key Filters; 2. using a
> SQL layer over HBase, such as Hive or Phoenix, to join two tables.
> However, both solutions are inefficient. Neither can use the range
> info to fast-forward during the scan, which makes them quite time-consuming. If
> the number of ranges is quite big (e.g. millions), a join is a proper solution
> even though it is slow. However, there are cases where the user wants to specify a
> small number of ranges to scan (e.g. <1000 ranges). Neither solution
> provides satisfactory performance in such cases.
> We provide this filter (MultiRowRangeFilter) to support this use case (scanning
> multiple row key ranges). It constructs the row key ranges from a
> user-specified list and fast-forwards during the scan, so the scan is
> quite efficient.
[jira] [Updated] (HBASE-11144) Filter to support scanning multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-11144: -- Release Note: MultiRowRangeFilter is a filter to support scanning multiple row key ranges. > Filter to support scanning multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Fix For: 2.0.0, 1.1.0 > > Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, > HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, > HBASE_11144_V14.patch, HBASE_11144_V15.patch, HBASE_11144_V16.patch, > HBASE_11144_V17.patch, HBASE_11144_V18.patch, HBASE_11144_V5.patch, > HBASE_11144_V6.patch, HBASE_11144_V7.patch, HBASE_11144_V9.patch, > MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, > MultiRowRangeFilter3.patch, hbase_11144_V8.patch > > > HBase is quite efficient when scanning only one small row key range. If user > needs to specify multiple row key ranges in one scan, the typical solutions > are: 1. through FilterList which is a list of row key Filters, 2. using the > SQL layer over HBase to join with two table, such as hive, phoenix etc. > However, both solutions are inefficient. Both of them can’t utilize the range > info to perform fast forwarding during scan which is quite time consuming. If > the number of ranges are quite big (e.g. millions), join is a proper solution > though it is slow. However, there are cases that user wants to specify a > small number of ranges to scan (e.g. <1000 ranges). Both solutions can’t > provide satisfactory performance in such case. > We provide this filter (MultiRowRangeFilter) to support such use case (scan > multiple row key ranges), which can construct the row key ranges from user > specified list and perform fast-forwarding during scan. Thus, the scan will > be quite efficient. 
[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-11144: -- Attachment: HBASE_11144_V18.patch fix the check style error. > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, > HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, > HBASE_11144_V14.patch, HBASE_11144_V15.patch, HBASE_11144_V16.patch, > HBASE_11144_V17.patch, HBASE_11144_V18.patch, HBASE_11144_V5.patch, > HBASE_11144_V6.patch, HBASE_11144_V7.patch, HBASE_11144_V9.patch, > MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, > MultiRowRangeFilter3.patch, hbase_11144_V8.patch > > > HBase is quite efficient when scanning only one small row key range. If user > needs to specify multiple row key ranges in one scan, the typical solutions > are: 1. through FilterList which is a list of row key Filters, 2. using the > SQL layer over HBase to join with two table, such as hive, phoenix etc. > However, both solutions are inefficient. Both of them can’t utilize the range > info to perform fast forwarding during scan which is quite time consuming. If > the number of ranges are quite big (e.g. millions), join is a proper solution > though it is slow. However, there are cases that user wants to specify a > small number of ranges to scan (e.g. <1000 ranges). Both solutions can’t > provide satisfactory performance in such case. > We provide this filter (MultiRowRangeFilter) to support such use case (scan > multiple row key ranges), which can construct the row key ranges from user > specified list and perform fast-forwarding during scan. Thus, the scan will > be quite efficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-11144: -- Attachment: HBASE_11144_V17.patch upload V17, fix the findbugs. > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, > HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, > HBASE_11144_V14.patch, HBASE_11144_V15.patch, HBASE_11144_V16.patch, > HBASE_11144_V17.patch, HBASE_11144_V5.patch, HBASE_11144_V6.patch, > HBASE_11144_V7.patch, HBASE_11144_V9.patch, MultiRowRangeFilter.patch, > MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch, hbase_11144_V8.patch > > > HBase is quite efficient when scanning only one small row key range. If user > needs to specify multiple row key ranges in one scan, the typical solutions > are: 1. through FilterList which is a list of row key Filters, 2. using the > SQL layer over HBase to join with two table, such as hive, phoenix etc. > However, both solutions are inefficient. Both of them can’t utilize the range > info to perform fast forwarding during scan which is quite time consuming. If > the number of ranges are quite big (e.g. millions), join is a proper solution > though it is slow. However, there are cases that user wants to specify a > small number of ranges to scan (e.g. <1000 ranges). Both solutions can’t > provide satisfactory performance in such case. > We provide this filter (MultiRowRangeFilter) to support such use case (scan > multiple row key ranges), which can construct the row key ranges from user > specified list and perform fast-forwarding during scan. Thus, the scan will > be quite efficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiajia Li updated HBASE-11144:
------------------------------
Attachment: HBASE_11144_V16.patch

Uploaded patch V16 (fixes the test error).

> Filter to support scan multiple row key ranges
> ----------------------------------------------
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
> Issue Type: Improvement
> Components: Filters
> Reporter: Jiajia Li
> Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, HBASE_11144_V14.patch, HBASE_11144_V15.patch, HBASE_11144_V16.patch, HBASE_11144_V5.patch, HBASE_11144_V6.patch, HBASE_11144_V7.patch, HBASE_11144_V9.patch, MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>
> HBase is quite efficient when scanning only one small row key range. If a user
> needs to specify multiple row key ranges in one scan, the typical solutions
> are: 1. a FilterList, which is a list of row key Filters; 2. using a
> SQL layer over HBase, such as Hive or Phoenix, to join two tables.
> However, both solutions are inefficient. Neither can use the range
> info to fast-forward during the scan, which makes them quite time-consuming. If
> the number of ranges is quite big (e.g. millions), a join is a proper solution
> even though it is slow. However, there are cases where the user wants to specify a
> small number of ranges to scan (e.g. <1000 ranges). Neither solution
> provides satisfactory performance in such cases.
> We provide this filter (MultiRowRangeFilter) to support this use case (scanning
> multiple row key ranges). It constructs the row key ranges from a
> user-specified list and fast-forwards during the scan, so the scan is
> quite efficient.
[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-11144: -- Attachment: HBASE_11144_V15.patch hi [~stack], thanks for your review, V15 add the inclusive/exclusive in the rowrange and rename the RowKeyRange to RowRange, can you take some time to look at it? Thanks > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, > HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, > HBASE_11144_V14.patch, HBASE_11144_V15.patch, HBASE_11144_V5.patch, > HBASE_11144_V6.patch, HBASE_11144_V7.patch, HBASE_11144_V9.patch, > MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, > MultiRowRangeFilter3.patch, hbase_11144_V8.patch > > > HBase is quite efficient when scanning only one small row key range. If user > needs to specify multiple row key ranges in one scan, the typical solutions > are: 1. through FilterList which is a list of row key Filters, 2. using the > SQL layer over HBase to join with two table, such as hive, phoenix etc. > However, both solutions are inefficient. Both of them can’t utilize the range > info to perform fast forwarding during scan which is quite time consuming. If > the number of ranges are quite big (e.g. millions), join is a proper solution > though it is slow. However, there are cases that user wants to specify a > small number of ranges to scan (e.g. <1000 ranges). Both solutions can’t > provide satisfactory performance in such case. > We provide this filter (MultiRowRangeFilter) to support such use case (scan > multiple row key ranges), which can construct the row key ranges from user > specified list and perform fast-forwarding during scan. Thus, the scan will > be quite efficient. 
[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-11144: -- Attachment: HBASE_11144_V14.patch fix the hadoop QA issue. > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, > HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, > HBASE_11144_V14.patch, HBASE_11144_V5.patch, HBASE_11144_V6.patch, > HBASE_11144_V7.patch, HBASE_11144_V9.patch, MultiRowRangeFilter.patch, > MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch, hbase_11144_V8.patch > > > HBase is quite efficient when scanning only one small row key range. If user > needs to specify multiple row key ranges in one scan, the typical solutions > are: 1. through FilterList which is a list of row key Filters, 2. using the > SQL layer over HBase to join with two table, such as hive, phoenix etc. > However, both solutions are inefficient. Both of them can’t utilize the range > info to perform fast forwarding during scan which is quite time consuming. If > the number of ranges are quite big (e.g. millions), join is a proper solution > though it is slow. However, there are cases that user wants to specify a > small number of ranges to scan (e.g. <1000 ranges). Both solutions can’t > provide satisfactory performance in such case. > We provide this filter (MultiRowRangeFilter) to support such use case (scan > multiple row key ranges), which can construct the row key ranges from user > specified list and perform fast-forwarding during scan. Thus, the scan will > be quite efficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-11144: -- Attachment: HBASE_11144_V13.patch Fixed the checkstyle error and added the InterfaceAudience annotation to the RowKeyRange class. > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, > HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, > HBASE_11144_V5.patch, HBASE_11144_V6.patch, HBASE_11144_V7.patch, > HBASE_11144_V9.patch, MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, > MultiRowRangeFilter3.patch, hbase_11144_V8.patch
[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-11144: -- Attachment: HBASE_11144_V12.patch Thanks [~tedyu] for your comments. I've updated the patch (V12). > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, > HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V5.patch, > HBASE_11144_V6.patch, HBASE_11144_V7.patch, HBASE_11144_V9.patch, > MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, > MultiRowRangeFilter3.patch, hbase_11144_V8.patch
[jira] [Commented] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268867#comment-14268867 ] Jiajia Li commented on HBASE-11144: --- [~ram_krish], thanks for your review. [~tedyu], [~anoopsamjohn], can you take some time to look at the latest patch? The review link is https://reviews.apache.org/r/21370/ > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, > HBASE_11144_V11.patch, HBASE_11144_V5.patch, HBASE_11144_V6.patch, > HBASE_11144_V7.patch, HBASE_11144_V9.patch, MultiRowRangeFilter.patch, > MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch, hbase_11144_V8.patch
[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-11144: -- Attachment: HBASE_11144_V11.patch > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, > HBASE_11144_V11.patch, HBASE_11144_V5.patch, HBASE_11144_V6.patch, > HBASE_11144_V7.patch, HBASE_11144_V9.patch, MultiRowRangeFilter.patch, > MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch, hbase_11144_V8.patch
[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-11144: -- Attachment: HBASE_11144_V10.patch Uploaded the new patch (HBASE_11144_V10.patch) to fix the checkstyle error. > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, > HBASE_11144_V5.patch, HBASE_11144_V6.patch, HBASE_11144_V7.patch, > HBASE_11144_V9.patch, MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, > MultiRowRangeFilter3.patch, hbase_11144_V8.patch
[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-11144: -- Attachment: HBASE_11144_V9.patch Uploaded HBASE_11144_V9.patch, according to Ram's comment. > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, > HBASE_11144_V6.patch, HBASE_11144_V7.patch, HBASE_11144_V9.patch, > MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, > MultiRowRangeFilter3.patch, hbase_11144_V8.patch
[jira] [Commented] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267194#comment-14267194 ] Jiajia Li commented on HBASE-11144: --- Hi [~ram_krish], can you look at the comment on https://reviews.apache.org/r/21370/ and give your advice? Thanks. > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, > HBASE_11144_V6.patch, HBASE_11144_V7.patch, MultiRowRangeFilter.patch, > MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch, hbase_11144_V8.patch
[jira] [Commented] (HBASE-12332) [mob] use filelink instead of retry when resolving mobfiles
[ https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265551#comment-14265551 ] Jiajia Li commented on HBASE-12332: --- [~jmhsieh], can you tell me what problem you have met? Thanks. > [mob] use filelink instead of retry when resolving mobfiles > --- > > Key: HBASE-12332 > URL: https://issues.apache.org/jira/browse/HBASE-12332 > Project: HBase > Issue Type: Sub-task > Components: mob >Affects Versions: hbase-11339 >Reporter: Jonathan Hsieh > Fix For: hbase-11339 > > Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch, > HBASE-12332-V3.patch, hbase-12332.link.v4.patch, hbase-12332.patch > > > In the snapshot code, hmobstore was modified to traverse an hfile link to a > mob. Ideally this should use the transparent filelink code to read the data. > Also, there will likely be some issues with the mob file cache with these > links.
[jira] [Commented] (HBASE-12332) [mob] use filelink instead of retry when resolving mobfiles
[ https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263725#comment-14263725 ] Jiajia Li commented on HBASE-12332: --- Hi [~j...@cloudera.com], will we now use the filelink when resolving mob files? If so, I will try to add the UT in HBASE-12670. > [mob] use filelink instead of retry when resolving mobfiles > --- > > Key: HBASE-12332 > URL: https://issues.apache.org/jira/browse/HBASE-12332 > Project: HBase > Issue Type: Sub-task > Components: mob >Affects Versions: hbase-11339 >Reporter: Jonathan Hsieh > Fix For: hbase-11339 > > Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch, > HBASE-12332-V3.patch, hbase-12332.link.v4.patch, hbase-12332.patch
[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-11144: -- Attachment: hbase_11144_V8.patch Thanks [~ram_krish]. In this patch I've removed the other check and added the filter to ScannerModel. Can you take some time to look at it? Thanks. > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, > HBASE_11144_V6.patch, HBASE_11144_V7.patch, MultiRowRangeFilter.patch, > MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch, hbase_11144_V8.patch
[jira] [Commented] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14259782#comment-14259782 ] Jiajia Li commented on HBASE-11144: --- Hi [~tedyu], [~ram_krish], can you review the code (https://reviews.apache.org/r/21370/)? Thanks. > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, > HBASE_11144_V6.patch, HBASE_11144_V7.patch, MultiRowRangeFilter.patch, > MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch
[jira] [Assigned] (HBASE-12670) Add unit tests that exercise the added hfilelink link mob paths
[ https://issues.apache.org/jira/browse/HBASE-12670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li reassigned HBASE-12670: - Assignee: Jiajia Li > Add unit tests that exercise the added hfilelink link mob paths > --- > > Key: HBASE-12670 > URL: https://issues.apache.org/jira/browse/HBASE-12670 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Reporter: Jonathan Hsieh >Assignee: Jiajia Li > Fix For: hbase-11339 > > > HBASE-12646 introduced the mob path to HFileLink, but we didn't add unit tests > for it.
[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-11144: -- Attachment: HBASE_11144_V7.patch Uploaded the new patch HBASE_11144_V7, refining the code according to Ram's comment. > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, > HBASE_11144_V6.patch, HBASE_11144_V7.patch, MultiRowRangeFilter.patch, > MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch
[jira] [Updated] (HBASE-12331) Shorten the mob snapshot unit tests
[ https://issues.apache.org/jira/browse/HBASE-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-12331: -- Attachment: HBASE-12331-V2.patch Uploaded HBASE-12331-V2.patch (removes the TestMob*WithRegionReplicas tests and TestMobSecureExportSnapshot). > Shorten the mob snapshot unit tests > --- > > Key: HBASE-12331 > URL: https://issues.apache.org/jira/browse/HBASE-12331 > Project: HBase > Issue Type: Sub-task > Components: mob >Affects Versions: hbase-11339 >Reporter: Jonathan Hsieh > Fix For: hbase-11339 > > Attachments: HBASE-12331-V1.diff, HBASE-12331-V2.patch > > > The mob snapshot patch introduced a whole lot of tests that take a long time > to run and would be better as integration tests. > {code} > --- > T E S T S > --- > Running > org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClientWithRegionReplicas > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 394.803 sec - > in > org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClientWithRegionReplicas > Running org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClient > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 212.377 sec - > in org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClient > Running > org.apache.hadoop.hbase.client.TestMobSnapshotFromClientWithRegionReplicas > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 49.463 sec - > in org.apache.hadoop.hbase.client.TestMobSnapshotFromClientWithRegionReplicas > Running org.apache.hadoop.hbase.client.TestMobSnapshotFromClient > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 46.724 sec - > in org.apache.hadoop.hbase.client.TestMobSnapshotFromClient > Running org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClient > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 204.03 sec - > in org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClient > Running > 
org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClientWithRegionReplicas > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 214.052 sec - > in > org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClientWithRegionReplicas > Running org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence > Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 105.139 sec - > in org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence > Running org.apache.hadoop.hbase.regionserver.TestMobStoreScanner > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 24.42 sec - > in org.apache.hadoop.hbase.regionserver.TestMobStoreScanner > Running org.apache.hadoop.hbase.regionserver.TestDeleteMobTable > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.136 sec - > in org.apache.hadoop.hbase.regionserver.TestDeleteMobTable > Running org.apache.hadoop.hbase.regionserver.TestHMobStore > Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.09 sec - in > org.apache.hadoop.hbase.regionserver.TestHMobStore > Running org.apache.hadoop.hbase.regionserver.TestMobCompaction > Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.629 sec - > in org.apache.hadoop.hbase.regionserver.TestMobCompaction > Running org.apache.hadoop.hbase.mob.TestCachedMobFile > Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.301 sec - > in org.apache.hadoop.hbase.mob.TestCachedMobFile > Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepJob > Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.752 sec - > in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepJob > Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepReducer > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.276 sec - > in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepReducer > Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepMapper > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, 
Time elapsed: 13.46 sec - > in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepMapper > Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweeper > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 173.05 sec - > in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweeper > Running org.apache.hadoop.hbase.mob.TestMobDataBlockEncoding > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.86 sec - > in org.apache.hadoop.hbase.mob.TestMobDataBlockEncoding > Running org.apache.hadoop.hbase.mob.TestExpiredMobFileCleaner > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.029 sec - > in org.apache.hadoop.hbase.mob.TestExpiredMobFileCleaner > Running org.apache.hadoop.hbase.mob.TestMobFile > Tests run:
[jira] [Updated] (HBASE-12332) [mob] use filelink instead of retry when resolving mobfiles
[ https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-12332: -- Attachment: HBASE-12332-V3.patch Hi [~j...@cloudera.com], HBASE-12332-V3.patch is based on HBASE-12332-V2.patch and changes synchronized to IdLock.Entry; I think this is an easier way to resolve the mobfiles. Using HFileLink may reduce the performance of reading cells, because every cell read would create an HFileLink. Can you review this patch and give your advice? Thanks. > [mob] use filelink instead of retry when resolving mobfiles > --- > > Key: HBASE-12332 > URL: https://issues.apache.org/jira/browse/HBASE-12332 > Project: HBase > Issue Type: Sub-task > Components: mob >Affects Versions: hbase-11339 >Reporter: Jonathan Hsieh > Fix For: hbase-11339 > > Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch, > HBASE-12332-V3.patch, hbase-12332.patch
[jira] [Commented] (HBASE-12332) [mob] use filelink instead of retry when resolving mobfiles
[ https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256639#comment-14256639 ] Jiajia Li commented on HBASE-12332: --- Hi [~j...@cloudera.com], in this patch I found that each mob cell read creates an HFileLink and a StoreFileInfo. Will this reduce read efficiency? I think reading through the possible locations would be easier and more efficient. What do you think? > [mob] use filelink instead of retry when resolving mobfiles > --- > > Key: HBASE-12332 > URL: https://issues.apache.org/jira/browse/HBASE-12332 > Project: HBase > Issue Type: Sub-task > Components: mob >Affects Versions: hbase-11339 >Reporter: Jonathan Hsieh > Fix For: hbase-11339 > > Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch, > hbase-12332.patch
[jira] [Commented] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256535#comment-14256535 ] Jiajia Li commented on HBASE-11144: --- Thanks, I've updated the review board: https://reviews.apache.org/r/21370/ > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, > HBASE_11144_V6.patch, MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, > MultiRowRangeFilter3.patch
[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-11144: -- Attachment: HBASE_11144_V6.patch Uploaded the new patch HBASE_11144_V6.patch, which removes the warning about the missing interface audience. > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, > HBASE_11144_V6.patch, MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, > MultiRowRangeFilter3.patch
[jira] [Commented] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir
[ https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256488#comment-14256488 ] Jiajia Li commented on HBASE-12673: --- Hi [~j...@cloudera.com], sorry, I don't understand what you mean by "it is starting to be used in other places and needs to be done". Do you mean that it is used in the SnapshotInfo tool? A MOB cell is read by getting the file name from the reference cell and then opening the file from one of the possible locations. The reference cell in HBase holds only a file name, so I don't think HFileLink can be used when reading a MOB cell. > Add a UT to read mob file when the mob hfile moving from the mob dir to the > archive dir > --- > > Key: HBASE-12673 > URL: https://issues.apache.org/jira/browse/HBASE-12673 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: hbase-11339 >Reporter: Jiajia Li >Assignee: Jiajia Li > Fix For: hbase-11339 > > Attachments: HBASE-12673-V2.patch, HBASE-12673.patch > > > Add a unit test that scans the cloned table while the original table is deleted, with the following steps: > 1) create a table with mobs, > 2) snapshot it, > 3) clone it as a different table, > 4) have a read workload on the snapshot, > 5) delete the original table
[jira] [Commented] (HBASE-12331) Shorten the mob snapshot unit tests
[ https://issues.apache.org/jira/browse/HBASE-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256456#comment-14256456 ] Jiajia Li commented on HBASE-12331: --- Thanks Jon, I will drop these tests and upload a new patch. > Shorten the mob snapshot unit tests > --- > > Key: HBASE-12331 > URL: https://issues.apache.org/jira/browse/HBASE-12331 > Project: HBase > Issue Type: Sub-task > Components: mob >Affects Versions: hbase-11339 >Reporter: Jonathan Hsieh > Fix For: hbase-11339 > > Attachments: HBASE-12331-V1.diff > > > The mob snapshot patch introduced a whole lot of tests that take a long time > to run and would be better as integration tests. > {code} > --- > T E S T S > --- > Running > org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClientWithRegionReplicas > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 394.803 sec - > in > org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClientWithRegionReplicas > Running org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClient > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 212.377 sec - > in org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClient > Running > org.apache.hadoop.hbase.client.TestMobSnapshotFromClientWithRegionReplicas > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 49.463 sec - > in org.apache.hadoop.hbase.client.TestMobSnapshotFromClientWithRegionReplicas > Running org.apache.hadoop.hbase.client.TestMobSnapshotFromClient > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 46.724 sec - > in org.apache.hadoop.hbase.client.TestMobSnapshotFromClient > Running org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClient > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 204.03 sec - > in org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClient > Running > org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClientWithRegionReplicas > Tests run: 5, 
Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 214.052 sec - > in > org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClientWithRegionReplicas > Running org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence > Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 105.139 sec - > in org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence > Running org.apache.hadoop.hbase.regionserver.TestMobStoreScanner > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 24.42 sec - > in org.apache.hadoop.hbase.regionserver.TestMobStoreScanner > Running org.apache.hadoop.hbase.regionserver.TestDeleteMobTable > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.136 sec - > in org.apache.hadoop.hbase.regionserver.TestDeleteMobTable > Running org.apache.hadoop.hbase.regionserver.TestHMobStore > Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.09 sec - in > org.apache.hadoop.hbase.regionserver.TestHMobStore > Running org.apache.hadoop.hbase.regionserver.TestMobCompaction > Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.629 sec - > in org.apache.hadoop.hbase.regionserver.TestMobCompaction > Running org.apache.hadoop.hbase.mob.TestCachedMobFile > Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.301 sec - > in org.apache.hadoop.hbase.mob.TestCachedMobFile > Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepJob > Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.752 sec - > in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepJob > Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepReducer > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.276 sec - > in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepReducer > Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepMapper > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.46 sec - > in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepMapper > Running 
org.apache.hadoop.hbase.mob.mapreduce.TestMobSweeper > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 173.05 sec - > in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweeper > Running org.apache.hadoop.hbase.mob.TestMobDataBlockEncoding > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.86 sec - > in org.apache.hadoop.hbase.mob.TestMobDataBlockEncoding > Running org.apache.hadoop.hbase.mob.TestExpiredMobFileCleaner > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.029 sec - > in org.apache.hadoop.hbase.mob.TestExpiredMobFileCleaner > Running org.apache.hadoop.hbase.mob.TestMobFile > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time ela
[jira] [Updated] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir
[ https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-12673: -- Attachment: HBASE-12673-V2.patch Hi [~j...@cloudera.com], I've uploaded a new patch. This UT tries to read an already opened hfile after the original table is deleted, and an IOException (not a FileNotFoundException) is thrown, so I will change the patch in https://issues.apache.org/jira/browse/HBASE-12332 > Add a UT to read mob file when the mob hfile moving from the mob dir to the > archive dir > --- > > Key: HBASE-12673 > URL: https://issues.apache.org/jira/browse/HBASE-12673 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: hbase-11339 >Reporter: Jiajia Li >Assignee: Jiajia Li > Fix For: hbase-11339 > > Attachments: HBASE-12673-V2.patch, HBASE-12673.patch
[jira] [Commented] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14255534#comment-14255534 ] Jiajia Li commented on HBASE-11144: --- Hi [~ram_krish], [~anoopsamjohn], [~j...@cloudera.com], if you are interested, could you take some time to look at this patch and give some advice? Thanks > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, > MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, > MultiRowRangeFilter3.patch
[jira] [Commented] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14255524#comment-14255524 ] Jiajia Li commented on HBASE-11144: --- Hi [~tedyu], thanks for your review, I've uploaded the new patch. > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, > MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, > MultiRowRangeFilter3.patch
[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-11144: -- Status: Patch Available (was: Open) > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, > MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, > MultiRowRangeFilter3.patch
[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-11144: -- Status: Open (was: Patch Available) > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, > MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, > MultiRowRangeFilter3.patch
[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-11144: -- Attachment: HBASE_11144_V5.patch Uploaded HBASE_11144_V5.patch against trunk. > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, > MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, > MultiRowRangeFilter3.patch
[jira] [Commented] (HBASE-12332) [mob] use filelink instad of retry when resolving an hfilelink.
[ https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253042#comment-14253042 ] Jiajia Li commented on HBASE-12332: --- Uploaded HBASE-12332-V2.patch, which catches more exceptions when reading the cell. [~j...@cloudera.com] > [mob] use filelink instad of retry when resolving an hfilelink. > --- > > Key: HBASE-12332 > URL: https://issues.apache.org/jira/browse/HBASE-12332 > Project: HBase > Issue Type: Sub-task > Components: mob >Affects Versions: hbase-11339 >Reporter: Jonathan Hsieh > Fix For: hbase-11339 > > Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch > > > In the snapshot code, HMobStore was modified to traverse an hfile link to a mob. Ideally this should use the transparent filelink code to read the data. Also, there will likely be some issues with the mob file cache with these links.
[jira] [Commented] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir
[ https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253040#comment-14253040 ] Jiajia Li commented on HBASE-12673: --- Thanks, [~j...@cloudera.com]. I have changed the patch in https://issues.apache.org/jira/browse/HBASE-12332 so that it catches NullPointerException and AssertionError. Could you take a look at it? > Add a UT to read mob file when the mob hfile moving from the mob dir to the > archive dir > --- > > Key: HBASE-12673 > URL: https://issues.apache.org/jira/browse/HBASE-12673 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: hbase-11339 >Reporter: Jiajia Li >Assignee: Jiajia Li > Fix For: hbase-11339 > > Attachments: HBASE-12673.patch
[jira] [Updated] (HBASE-12332) [mob] use filelink instad of retry when resolving an hfilelink.
[ https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-12332: -- Attachment: HBASE-12332-V2.patch > [mob] use filelink instad of retry when resolving an hfilelink. > --- > > Key: HBASE-12332 > URL: https://issues.apache.org/jira/browse/HBASE-12332 > Project: HBase > Issue Type: Sub-task > Components: mob >Affects Versions: hbase-11339 >Reporter: Jonathan Hsieh > Fix For: hbase-11339 > > Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch
[jira] [Commented] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir
[ https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251411#comment-14251411 ] Jiajia Li commented on HBASE-12673: --- Hi [~j...@cloudera.com], could you look at the comment above and give some advice? Thanks > Add a UT to read mob file when the mob hfile moving from the mob dir to the > archive dir > --- > > Key: HBASE-12673 > URL: https://issues.apache.org/jira/browse/HBASE-12673 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: hbase-11339 >Reporter: Jiajia Li >Assignee: Jiajia Li > Fix For: hbase-11339 > > Attachments: HBASE-12673.patch
[jira] [Commented] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir
[ https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249836#comment-14249836 ] Jiajia Li commented on HBASE-12673: --- Hi [~j...@cloudera.com], I have tested closing a deleted mob file, and it does not throw any other exception. It is hard to use FileLink in the MOB code when reading a cell, because the mob reference cell contains only the file name, which does not match the HFileLink pattern, so I don't think HFileLink can be used here. Do you have any ideas? Could we also catch the other exceptions that are caught in HFileLink? We are now trying to reproduce this case at a finer granularity in order to catch more exceptions. > Add a UT to read mob file when the mob hfile moving from the mob dir to the > archive dir > --- > > Key: HBASE-12673 > URL: https://issues.apache.org/jira/browse/HBASE-12673 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: hbase-11339 >Reporter: Jiajia Li >Assignee: Jiajia Li > Fix For: hbase-11339 > > Attachments: HBASE-12673.patch
[jira] [Commented] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir
[ https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247995#comment-14247995 ] Jiajia Li commented on HBASE-12673: --- Hi [~j...@cloudera.com], as you know, when HBase reads a mob cell there are two steps. # Read the ref cell from HBase and get the cell value, which is the mob file name. # HBase has two possible locations from which to read the mob cell: mobWorkingDir/fileName and archiveDir/fileName. When the mob file is not in mobWorkingDir, HBase tries the second location. But currently we only retry after a FileNotFoundException (https://github.com/apache/hbase/blob/hbase-11339/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HMobStore.java#L312). Do you mean that deleting the original table in the middle of a read operation will throw other IOExceptions? I don't know how HFileLink would guarantee this case in the MOB code. Could you please give a more detailed description? Thanks > Add a UT to read mob file when the mob hfile moving from the mob dir to the > archive dir > --- > > Key: HBASE-12673 > URL: https://issues.apache.org/jira/browse/HBASE-12673 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: hbase-11339 >Reporter: Jiajia Li >Assignee: Jiajia Li > Fix For: hbase-11339 > > Attachments: HBASE-12673.patch
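The two-location read described in the comment above can be sketched as follows. This is a simplified illustration on the local file system using java.nio, not the HMobStore code: the class name, `readFromFirstAvailable`, and the directory and file names are hypothetical, and java.nio reports a missing file as NoSuchFileException where HDFS would raise FileNotFoundException, so both are treated as "try the next location". Any other IOException propagates to the caller without a retry, which is the behavior the comment is asking about.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class MobReadFallbackSketch {
    /**
     * Reads fileName from the first candidate directory that contains it.
     * Only "file is missing" errors trigger a retry at the next location;
     * every other IOException is rethrown to the caller unchanged.
     */
    static byte[] readFromFirstAvailable(List<Path> candidateDirs, String fileName)
            throws IOException {
        for (Path dir : candidateDirs) {
            try {
                return Files.readAllBytes(dir.resolve(fileName));
            } catch (NoSuchFileException | FileNotFoundException e) {
                // Not at this location (for example, already moved to the archive);
                // fall through and try the next candidate directory.
            }
        }
        throw new FileNotFoundException(fileName + " not found in " + candidateDirs);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-ins for mobWorkingDir and archiveDir.
        Path working = Files.createTempDirectory("mobWorkingDir");
        Path archive = Files.createTempDirectory("archiveDir");
        // The file exists only in the archive, as after the original table is deleted.
        Files.write(archive.resolve("mobfile-0001"), new byte[] {1, 2, 3});

        byte[] data = readFromFirstAvailable(Arrays.asList(working, archive), "mobfile-0001");
        System.out.println("read " + data.length + " bytes from the fallback location");
    }
}
```

The sketch only covers a file that is missing before it is opened. A file that is moved while a reader already has it open can fail in other ways, which is why the thread discusses catching more exception types.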
[jira] [Commented] (HBASE-12332) [mob] use filelink instad of retry when resolving an hfilelink.
[ https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246357#comment-14246357 ] Jiajia Li commented on HBASE-12332: --- Hi [~j...@cloudera.com], the case you mentioned is added in https://issues.apache.org/jira/browse/HBASE-12673. Could you look at it? Thanks~ > [mob] use filelink instad of retry when resolving an hfilelink. > --- > > Key: HBASE-12332 > URL: https://issues.apache.org/jira/browse/HBASE-12332 > Project: HBase > Issue Type: Sub-task > Components: mob >Affects Versions: hbase-11339 >Reporter: Jonathan Hsieh > Fix For: hbase-11339 > > Attachments: HBASE-12332-V1.diff
[jira] [Commented] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir
[ https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246344#comment-14246344 ] Jiajia Li commented on HBASE-12673: --- Hi [~j...@cloudera.com], could you review this patch? > Add a UT to read mob file when the mob hfile moving from the mob dir to the > archive dir > --- > > Key: HBASE-12673 > URL: https://issues.apache.org/jira/browse/HBASE-12673 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: hbase-11339 >Reporter: Jiajia Li >Assignee: Jiajia Li > Fix For: hbase-11339 > > Attachments: HBASE-12673.patch
[jira] [Updated] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir
[ https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-12673: -- Description: Add a unit test that scans the cloned table while the original table is deleted, with the following steps: 1) create a table with mobs, 2) snapshot it, 3) clone it as a different table, 4) have a read workload on the snapshot, 5) delete the original table was: 1) create a table with mobs, 2) snapshot it, 3) clone/restore it as a different table, 4) have a read workload on the snapshot, 5) delete the original table > Add a UT to read mob file when the mob hfile moving from the mob dir to the > archive dir > --- > > Key: HBASE-12673 > URL: https://issues.apache.org/jira/browse/HBASE-12673 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: hbase-11339 >Reporter: Jiajia Li >Assignee: Jiajia Li > Fix For: hbase-11339 > > Attachments: HBASE-12673.patch
[jira] [Updated] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir
[ https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-12673: -- Status: Patch Available (was: Open) > Add a UT to read mob file when the mob hfile moving from the mob dir to the > archive dir > --- > > Key: HBASE-12673 > URL: https://issues.apache.org/jira/browse/HBASE-12673 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: hbase-11339 >Reporter: Jiajia Li >Assignee: Jiajia Li > Fix For: hbase-11339 > > Attachments: HBASE-12673.patch
[jira] [Updated] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir
[ https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-12673: -- Attachment: HBASE-12673.patch Uploaded the patch. > Add a UT to read mob file when the mob hfile moving from the mob dir to the > archive dir > --- > > Key: HBASE-12673 > URL: https://issues.apache.org/jira/browse/HBASE-12673 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: hbase-11339 >Reporter: Jiajia Li >Assignee: Jiajia Li > Fix For: hbase-11339 > > Attachments: HBASE-12673.patch
[jira] [Updated] (HBASE-12540) TestRegionServerMetrics#testMobMetrics test failure
[ https://issues.apache.org/jira/browse/HBASE-12540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-12540: -- Attachment: (was: HBASE-12540-V3.patch) > TestRegionServerMetrics#testMobMetrics test failure > --- > > Key: HBASE-12540 > URL: https://issues.apache.org/jira/browse/HBASE-12540 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: hbase-11339 >Reporter: stack >Assignee: Jingcheng Du > Fix For: hbase-11339 > > Attachments: HBASE-12540-V2.patch, HBASE-12540.diff, > hbase-12540-v3.patch, log.txt > > > Got this on an internal rig run. Maybe you want to take a looksee > [~jingchengdu]? > {code} > Error Message > Metrics Counters should be equal expected:<5> but was:<2> > Stacktrace > java.lang.AssertionError: Metrics Counters should be equal expected:<5> but > was:<2> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.hbase.test.MetricsAssertHelperImpl.assertCounter(MetricsAssertHelperImpl.java:185) > at > org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics.testMobMetrics(TestRegionServerMetrics.java:448) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > 
at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at org.junit.runners.Suite.runChild(Suite.java:127) > at org.junit.runners.Suite.runChild(Suite.java:26) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12540) TestRegionServerMetrics#testMobMetrics test failure
[ https://issues.apache.org/jira/browse/HBASE-12540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-12540: -- Attachment: HBASE-12540-V3.patch [~j...@cloudera.com], the V2 patch was missing the fix, thanks > TestRegionServerMetrics#testMobMetrics test failure > --- > > Key: HBASE-12540 > URL: https://issues.apache.org/jira/browse/HBASE-12540 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: hbase-11339 >Reporter: stack >Assignee: Jingcheng Du > Fix For: hbase-11339 > > Attachments: HBASE-12540-V2.patch, HBASE-12540-V3.patch, > HBASE-12540.diff, hbase-12540-v3.patch, log.txt > > > Got this on an internal rig run. Maybe you want to take a looksee > [~jingchengdu]? > {code} > Error Message > Metrics Counters should be equal expected:<5> but was:<2> > Stacktrace > java.lang.AssertionError: Metrics Counters should be equal expected:<5> but > was:<2> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.hbase.test.MetricsAssertHelperImpl.assertCounter(MetricsAssertHelperImpl.java:185) > at > org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics.testMobMetrics(TestRegionServerMetrics.java:448) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at org.junit.runners.Suite.runChild(Suite.java:127) > at org.junit.runners.Suite.runChild(Suite.java:26) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12540) TestRegionServerMetrics#testMobMetrics test failure
[ https://issues.apache.org/jira/browse/HBASE-12540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242088#comment-14242088 ] Jiajia Li commented on HBASE-12540: --- hi, [~jmhsieh], sorry, some changes were missing from the patch; I will upload a new patch soon~ > TestRegionServerMetrics#testMobMetrics test failure > --- > > Key: HBASE-12540 > URL: https://issues.apache.org/jira/browse/HBASE-12540 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: hbase-11339 >Reporter: stack >Assignee: Jingcheng Du > Fix For: hbase-11339 > > Attachments: HBASE-12540-V2.patch, HBASE-12540.diff, log.txt > > > Got this on an internal rig run. Maybe you want to take a looksee > [~jingchengdu]? > {code} > Error Message > Metrics Counters should be equal expected:<5> but was:<2> > Stacktrace > java.lang.AssertionError: Metrics Counters should be equal expected:<5> but > was:<2> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.hbase.test.MetricsAssertHelperImpl.assertCounter(MetricsAssertHelperImpl.java:185) > at > org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics.testMobMetrics(TestRegionServerMetrics.java:448) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at org.junit.runners.Suite.runChild(Suite.java:127) > at org.junit.runners.Suite.runChild(Suite.java:26) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir
Jiajia Li created HBASE-12673: - Summary: Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir Key: HBASE-12673 URL: https://issues.apache.org/jira/browse/HBASE-12673 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: hbase-11339 Reporter: Jiajia Li Assignee: Jiajia Li Fix For: hbase-11339 1) create a table with mobs, 2) snapshot it, 3) clone/restore it as a different table, 4) run a read workload on the snapshot, 5) delete the original table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12540) TestRegionServerMetrics#testMobMetrics test failure
[ https://issues.apache.org/jira/browse/HBASE-12540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-12540: -- Attachment: HBASE-12540-V2.patch thanks Jon, I uploaded HBASE-12540-V2.patch (it renames the variable compactionThreshold to numHfiles). > TestRegionServerMetrics#testMobMetrics test failure > --- > > Key: HBASE-12540 > URL: https://issues.apache.org/jira/browse/HBASE-12540 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: hbase-11339 >Reporter: stack >Assignee: Jingcheng Du > Fix For: hbase-11339 > > Attachments: HBASE-12540-V2.patch, HBASE-12540.diff, log.txt > > > Got this on an internal rig run. Maybe you want to take a looksee > [~jingchengdu]? > {code} > Error Message > Metrics Counters should be equal expected:<5> but was:<2> > Stacktrace > java.lang.AssertionError: Metrics Counters should be equal expected:<5> but > was:<2> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.hbase.test.MetricsAssertHelperImpl.assertCounter(MetricsAssertHelperImpl.java:185) > at > org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics.testMobMetrics(TestRegionServerMetrics.java:448) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at org.junit.runners.Suite.runChild(Suite.java:127) > at org.junit.runners.Suite.runChild(Suite.java:26) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12332) [mob] use filelink instad of retry when resolving an hfilelink.
[ https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240558#comment-14240558 ] Jiajia Li commented on HBASE-12332: --- hi, [~j...@cloudera.com], do you have any idea? > [mob] use filelink instad of retry when resolving an hfilelink. > --- > > Key: HBASE-12332 > URL: https://issues.apache.org/jira/browse/HBASE-12332 > Project: HBase > Issue Type: Sub-task > Components: mob >Affects Versions: hbase-11339 >Reporter: Jonathan Hsieh > Fix For: hbase-11339 > > Attachments: HBASE-12332-V1.diff > > > in the snapshot code, hmobstore was modified to traverse an hfile link to a > mob. Ideally this should use the transparent filelink code to read the data. > Also there will likely be some issues with the mob file cache with these > links. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239058#comment-14239058 ] Jiajia Li commented on HBASE-11144: --- hi, [~saurabh.wl], this patch hasn't been reviewed by the committers yet, so it may not be released in the next version of HBase, but feel free to apply this patch for your own use~ > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, MultiRowRangeFilter.patch, > MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch > > > HBase is quite efficient when scanning only one small row key range. If a user > needs to specify multiple row key ranges in one scan, the typical solutions > are: 1. a FilterList, which is a list of row key Filters, 2. using a > SQL layer over HBase, such as Hive or Phoenix, to join two tables. > However, both solutions are inefficient. Neither can use the range > info to fast-forward during the scan, so the scan is quite time consuming. If > the number of ranges is quite big (e.g. millions), a join is a proper solution > though it is slow. However, there are cases where a user wants to specify a > small number of ranges to scan (e.g. <1000 ranges). Neither solution > provides satisfactory performance in such cases. > We provide this filter (MultiRowRangeFilter) to support this use case (scanning > multiple row key ranges): it constructs the row key ranges from a user-specified > list and fast-forwards during the scan. Thus, the scan will > be quite efficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HBASE-11144) Filter to support scan multiple row key ranges
[ https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li reassigned HBASE-11144: - Assignee: Jiajia Li > Filter to support scan multiple row key ranges > -- > > Key: HBASE-11144 > URL: https://issues.apache.org/jira/browse/HBASE-11144 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HBASE_11144_4.patch, MultiRowRangeFilter.patch, > MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch > > > HBase is quite efficient when scanning only one small row key range. If a user > needs to specify multiple row key ranges in one scan, the typical solutions > are: 1. a FilterList, which is a list of row key Filters, 2. using a > SQL layer over HBase, such as Hive or Phoenix, to join two tables. > However, both solutions are inefficient. Neither can use the range > info to fast-forward during the scan, so the scan is quite time consuming. If > the number of ranges is quite big (e.g. millions), a join is a proper solution > though it is slow. However, there are cases where a user wants to specify a > small number of ranges to scan (e.g. <1000 ranges). Neither solution > provides satisfactory performance in such cases. > We provide this filter (MultiRowRangeFilter) to support this use case (scanning > multiple row key ranges): it constructs the row key ranges from a user-specified > list and fast-forwards during the scan. Thus, the scan will > be quite efficient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
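The fast-forwarding idea in the HBASE-11144 description can be sketched with plain Java. This is only an illustration under stated assumptions, not the code from the attached patches: `RowRangeSeekSketch`, `Range`, `locate` and `demoRanges` are hypothetical names, ranges are assumed to be sorted, non-overlapping and half-open, and row keys are compared as unsigned bytes. A binary search over the range start keys tells the caller either that the row is inside a range, or which range start to seek to next.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the MultiRowRangeFilter implementation from the patch.
final class RowRangeSeekSketch {
    /** Half-open row key range [start, stop). */
    static final class Range {
        final byte[] start; // inclusive
        final byte[] stop;  // exclusive
        Range(byte[] start, byte[] stop) { this.start = start; this.stop = stop; }
        boolean contains(byte[] row) {
            return compare(row, start) >= 0 && compare(row, stop) < 0;
        }
    }

    /** Lexicographic comparison that treats each byte as unsigned. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    /**
     * Returns the index of the range containing row, or -(next + 1) where next
     * is the index of the first range that starts after row (ranges.size()
     * when row lies past every range).
     */
    static int locate(List<Range> sortedRanges, byte[] row) {
        int lo = 0, hi = sortedRanges.size() - 1;
        while (lo <= hi) { // find the last range whose start <= row
            int mid = (lo + hi) >>> 1;
            if (compare(sortedRanges.get(mid).start, row) <= 0) lo = mid + 1;
            else hi = mid - 1;
        }
        if (hi >= 0 && sortedRanges.get(hi).contains(row)) return hi;
        return -(hi + 1) - 1; // the next candidate range is hi + 1
    }

    /** Two example ranges, already sorted in unsigned byte order. */
    static List<Range> demoRanges() {
        return Arrays.asList(
            new Range(new byte[] {5}, new byte[] {6}),
            new Range(new byte[] {(byte) 0xFD}, new byte[] {(byte) 0xFE}));
    }

    public static void main(String[] args) {
        List<Range> ranges = demoRanges();
        int pos = locate(ranges, new byte[] {(byte) 0xF6});
        if (pos >= 0) {
            System.out.println("include row");
        } else {
            int next = -pos - 1;
            System.out.println(next < ranges.size()
                ? "seek to start of range " + next : "scan finished");
        }
    }
}
```

With a lookup like this a scan can jump straight to the next range start instead of testing every row in between, which is the saving the description attributes to the filter.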
[jira] [Commented] (HBASE-12332) [mob] use filelink instad of retry when resolving an hfilelink.
[ https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238899#comment-14238899 ] Jiajia Li commented on HBASE-12332: --- hi, [~jmhsieh], when reading, we don't directly open scanners on all the existing mob files, which would make it easy to recognize a file link by matching the name pattern; instead we follow the steps below. # Read the file name from HBase (this is just a file name, not a file link pattern; we don't know the file link name from this cell). # Read the mob cell from the candidate paths (mobworkingDir/filename, mobArchive/filename, and, for a cloned snapshot, srcTableMobWorkingDir/filename, srcTableArchive/filename). With this read path, it is not possible to tell whether the current mob file in the working directory is a file link from its name, which is just a mob file name (not a file link pattern). In the latest patch, the number of possible read paths has been reduced from 4 to 2 by comparing the source table tag for the cloned snapshot. This means reading from a cloned snapshot is as fast as reading normal mob cells. Please advise. Thanks~ > [mob] use filelink instad of retry when resolving an hfilelink. > --- > > Key: HBASE-12332 > URL: https://issues.apache.org/jira/browse/HBASE-12332 > Project: HBase > Issue Type: Sub-task > Components: mob >Affects Versions: hbase-11339 >Reporter: Jonathan Hsieh > Fix For: hbase-11339 > > Attachments: HBASE-12332-V1.diff > > > in the snapshot code, hmobstore was modified to traverse an hfile link to a > mob. Ideally this should use the transparent filelink code to read the data. > Also there will likely be some issues with the mob file cache with these > links. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
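The two-step read path in the comment above can be sketched roughly as follows. Everything here is an assumption made for illustration: `MobFileLookupSketch`, `candidatePaths` and `firstExisting` are hypothetical names, the `root/table/fileName` directory layout is invented, and reading "comparing the source table tag" as "use the tagged source table's directories when a tag is present, otherwise the current table's" is one interpretation of the comment, not the patch itself.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch of the candidate-path lookup; not HBase code.
final class MobFileLookupSketch {
    /**
     * Builds the two candidate locations for a mob file name read from a cell.
     * If the cell carries a source table tag (cloned snapshot case), only that
     * table's working and archive directories are tried.
     */
    static List<Path> candidatePaths(Path mobWorkingRoot, Path mobArchiveRoot,
                                     String currentTable,
                                     Optional<String> sourceTableTag,
                                     String fileName) {
        String table = sourceTableTag.orElse(currentTable);
        return Arrays.asList(
            mobWorkingRoot.resolve(table).resolve(fileName),
            mobArchiveRoot.resolve(table).resolve(fileName));
    }

    /** Returns the first candidate for which the existence check succeeds. */
    static Optional<Path> firstExisting(List<Path> candidates, Predicate<Path> exists) {
        for (Path p : candidates) {
            if (exists.test(p)) return Optional.of(p);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Path> candidates = candidatePaths(Paths.get("mob"), Paths.get("archive"),
            "cloned_table", Optional.of("source_table"), "mobfile1");
        // Files::exists checks the local file system here, purely for demonstration.
        System.out.println(firstExisting(candidates, Files::exists));
    }
}
```

Passing the existence check as a `Predicate` keeps the lookup order testable without touching a real file system.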
[jira] [Commented] (HBASE-12332) [mob] use filelink instad of retry when resolving an hfilelink.
[ https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231069#comment-14231069 ] Jiajia Li commented on HBASE-12332: --- hi, [~jmhsieh], can you give some advice on this patch? > [mob] use filelink instad of retry when resolving an hfilelink. > --- > > Key: HBASE-12332 > URL: https://issues.apache.org/jira/browse/HBASE-12332 > Project: HBase > Issue Type: Sub-task > Components: mob >Affects Versions: hbase-11339 >Reporter: Jonathan Hsieh > Fix For: hbase-11339 > > Attachments: HBASE-12332-V1.diff > > > in the snapshot code, hmobstore was modified to traverse an hfile link to a > mob. Ideally this should use the transparent filelink code to read the data. > Also there will likely be some issues with the mob file cache with these > links. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12591) Ignore the count of mob compaction metrics when there is issue
[ https://issues.apache.org/jira/browse/HBASE-12591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-12591: -- Attachment: HBASE-12591-V2.patch Re-uploaded the patch, this time generated without the --no-prefix option. > Ignore the count of mob compaction metrics when there is issue > -- > > Key: HBASE-12591 > URL: https://issues.apache.org/jira/browse/HBASE-12591 > Project: HBase > Issue Type: Sub-task > Components: regionserver >Affects Versions: hbase-11339 >Reporter: Jiajia Li >Assignee: Jiajia Li >Priority: Minor > Fix For: hbase-11339 > > Attachments: HBASE-12591-V2.patch, HBASE-12591.patch > > > In mob compaction, the metrics "mobCompactedFromMobCellsCount" and > "mobCompactedFromMobCellsSize" should not be counted when there is an issue > retrieving the mob cell from the mob file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12591) Ignore the count of mob compaction metrics when there is issue
[ https://issues.apache.org/jira/browse/HBASE-12591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229320#comment-14229320 ] Jiajia Li commented on HBASE-12591: --- hi, [~anoopsamjohn], is this patch ok? > Ignore the count of mob compaction metrics when there is issue > -- > > Key: HBASE-12591 > URL: https://issues.apache.org/jira/browse/HBASE-12591 > Project: HBase > Issue Type: Sub-task > Components: regionserver >Affects Versions: hbase-11339 >Reporter: Jiajia Li >Assignee: Jiajia Li >Priority: Minor > Fix For: hbase-11339 > > Attachments: HBASE-12591.patch > > > In mob compaction, the metrics "mobCompactedFromMobCellsCount" and > "mobCompactedFromMobCellsSize" should not be counted when there is an issue > retrieving the mob cell from the mob file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12591) Ignore the count of mob compaction metrics when there is issue
[ https://issues.apache.org/jira/browse/HBASE-12591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li updated HBASE-12591: -- Attachment: HBASE-12591.patch Uploaded the patch. > Ignore the count of mob compaction metrics when there is issue > -- > > Key: HBASE-12591 > URL: https://issues.apache.org/jira/browse/HBASE-12591 > Project: HBase > Issue Type: Sub-task > Components: regionserver >Affects Versions: hbase-11339 >Reporter: Jiajia Li >Assignee: Jiajia Li >Priority: Minor > Fix For: hbase-11339 > > Attachments: HBASE-12591.patch > > > In mob compaction, the metrics "mobCompactedFromMobCellsCount" and > "mobCompactedFromMobCellsSize" should not be counted when there is an issue > retrieving the mob cell from the mob file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-12591) Ignore the count of mob compaction metrics when there is issue
Jiajia Li created HBASE-12591: - Summary: Ignore the count of mob compaction metrics when there is issue Key: HBASE-12591 URL: https://issues.apache.org/jira/browse/HBASE-12591 Project: HBase Issue Type: Sub-task Components: regionserver Affects Versions: hbase-11339 Reporter: Jiajia Li Assignee: Jiajia Li Priority: Minor Fix For: hbase-11339 In mob compaction, the metrics "mobCompactedFromMobCellsCount" and "mobCompactedFromMobCellsSize" should not be counted when there is an issue retrieving the mob cell from the mob file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
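The behaviour HBASE-12591 asks for can be illustrated with a small sketch. It is not the patch: the class name, the `compact` method and the `resolver` function (which stands in for reading a mob cell from its mob file and returns an empty `Optional` on failure) are all hypothetical; only the two metric names come from the issue text.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch; only the two metric names are taken from the issue.
final class MobCompactionMetricsSketch {
    long mobCompactedFromMobCellsCount = 0;
    long mobCompactedFromMobCellsSize = 0;

    /**
     * Walks the mob references seen during compaction. The resolver is an
     * assumed stand-in for reading the mob cell from the mob file; an empty
     * result means the retrieval failed.
     */
    void compact(List<String> mobRefs, Function<String, Optional<byte[]>> resolver) {
        for (String ref : mobRefs) {
            Optional<byte[]> cell = resolver.apply(ref);
            if (!cell.isPresent()) {
                continue; // retrieval failed: leave both metrics untouched
            }
            mobCompactedFromMobCellsCount++;
            mobCompactedFromMobCellsSize += cell.get().length;
        }
    }

    public static void main(String[] args) {
        MobCompactionMetricsSketch m = new MobCompactionMetricsSketch();
        // "b" simulates a mob cell that cannot be read back from its file.
        m.compact(Arrays.asList("a", "b", "c"),
            ref -> ref.equals("b") ? Optional.<byte[]>empty() : Optional.of(new byte[4]));
        System.out.println(m.mobCompactedFromMobCellsCount + " cells, "
            + m.mobCompactedFromMobCellsSize + " bytes");
    }
}
```

The counters are updated only after the cell has been read successfully, so a failed lookup cannot inflate either metric.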
[jira] [Commented] (HBASE-12546) Validate schema options that require server side class availability
[ https://issues.apache.org/jira/browse/HBASE-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225610#comment-14225610 ] Jiajia Li commented on HBASE-12546: --- hi, Andy, [~apurtell], I found the jira https://issues.apache.org/jira/browse/HBASE-12573 (Backport HBASE-10591 Sanity check table configuration in createTable). Is this what you want to do? > Validate schema options that require server side class availability > --- > > Key: HBASE-12546 > URL: https://issues.apache.org/jira/browse/HBASE-12546 > Project: HBase > Issue Type: Bug >Reporter: Andrew Purtell >Assignee: Jiajia Li > Fix For: 2.0.0, 0.98.9, 0.99.2 > > > When processing table create and modification requests we should check the > supplied schema options for settings that require mentioned classes to be > available from the regionserver classpath (split policies, etc.). If we can't > find the class on the classpath when processing the admin request RPC, fail > the operation immediately and return an exception rather than allow problems > later, such as aborts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12546) Validate schema options that require server side class availability
[ https://issues.apache.org/jira/browse/HBASE-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224264#comment-14224264 ] Jiajia Li commented on HBASE-12546: --- hi, [~apurtell], I found that in hbase trunk (https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java#L1169) there is an hcd check, for example a check of the regionsplitpolicy (https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java#L1225), but in the 0.98 branch the check is not found (https://github.com/apache/hbase/blob/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java#L1747). So does the sanityCheckTableDescriptor() function do the schema option validation? Please give me some advice, thanks~ > Validate schema options that require server side class availability > --- > > Key: HBASE-12546 > URL: https://issues.apache.org/jira/browse/HBASE-12546 > Project: HBase > Issue Type: Bug >Reporter: Andrew Purtell >Assignee: Jiajia Li > Fix For: 2.0.0, 0.98.9, 0.99.2 > > > When processing table create and modification requests we should check the > supplied schema options for settings that require mentioned classes to be > available from the regionserver classpath (split policies, etc.). If we can't > find the class on the classpath when processing the admin request RPC, fail > the operation immediately and return an exception rather than allow problems > later, such as aborts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HBASE-12546) Validate schema options that require server side class availability
[ https://issues.apache.org/jira/browse/HBASE-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiajia Li reassigned HBASE-12546: - Assignee: Jiajia Li > Validate schema options that require server side class availability > --- > > Key: HBASE-12546 > URL: https://issues.apache.org/jira/browse/HBASE-12546 > Project: HBase > Issue Type: Bug >Reporter: Andrew Purtell >Assignee: Jiajia Li > Fix For: 2.0.0, 0.98.9, 0.99.2 > > > When processing table create and modification requests we should check the > supplied schema options for settings that require mentioned classes to be > available from the regionserver classpath (split policies, etc.). If we can't > find the class on the classpath when processing the admin request RPC, fail > the operation immediately and return an exception rather than allow problems > later, such as aborts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
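A fail-fast check of the kind HBASE-12546 describes might look like the sketch below. It is an illustration only: `SchemaClassCheckSketch`, `isLoadable`, `requireLoadable` and the option name `SPLIT_POLICY` are hypothetical and are not claimed to match sanityCheckTableDescriptor() or any HBase API. The only library call relied on is the standard `Class.forName(String, boolean, ClassLoader)`, used with `initialize` set to false so that the class is located without running its static initializers.

```java
// Hypothetical sketch of a fail-fast class availability check; not HBase code.
final class SchemaClassCheckSketch {
    /** Returns true if the named class can be found on this class loader's classpath. */
    static boolean isLoadable(String className) {
        try {
            // initialize = false: find the class without running static initializers
            Class.forName(className, false, SchemaClassCheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Rejects a schema option immediately when its class is not available. */
    static void requireLoadable(String optionName, String className) {
        if (!isLoadable(className)) {
            throw new IllegalArgumentException("Schema option " + optionName
                + " refers to class " + className + ", which is not on the classpath");
        }
    }

    public static void main(String[] args) {
        requireLoadable("SPLIT_POLICY", "java.util.ArrayList"); // passes
        try {
            requireLoadable("SPLIT_POLICY", "com.example.NoSuchSplitPolicy");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Running the check while the admin request is being processed turns a later regionserver abort into an immediate, descriptive error for the caller, which is the outcome the issue asks for.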
[jira] [Commented] (HBASE-12543) Incorrect log info in the store compaction of mob
[ https://issues.apache.org/jira/browse/HBASE-12543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220409#comment-14220409 ] Jiajia Li commented on HBASE-12543: --- hi, [~anoopsamjohn], [~jmhsieh] can you look at this patch? thanks~ > Incorrect log info in the store compaction of mob > - > > Key: HBASE-12543 > URL: https://issues.apache.org/jira/browse/HBASE-12543 > Project: HBase > Issue Type: Sub-task > Components: regionserver >Affects Versions: hbase-11339 >Reporter: Jiajia Li >Assignee: Jiajia Li >Priority: Minor > Fix For: hbase-11339 > > Attachments: HBASE-12543.diff > > > Incorrect log info in the store compaction of mob -- This message was sent by Atlassian JIRA (v6.3.4#6332)