from: "Jiajia Li (JIRA)"

[jira] [Commented] (HBASE-14206) MultiRowRangeFilter returns records whose rowKeys are out of allowed ranges

2015-08-11 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681831#comment-14681831
 ] 

Jiajia Li commented on HBASE-14206:
---

I ran the test against trunk, with one minor change to the test:
{code}
filter.filterRowKey(badKey, 0, 1);
{code}
to
{code}
filter.filterRowKey(KeyValueUtil.createFirstOnRow(badKey));
{code}
I think the fix is ok.

> MultiRowRangeFilter returns records whose rowKeys are out of allowed ranges
> ---
>
> Key: HBASE-14206
> URL: https://issues.apache.org/jira/browse/HBASE-14206
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: linux, java7
>Reporter: Anton Nazaruk
>Assignee: Ted Yu
>Priority: Critical
>  Labels: filter
> Attachments: 14206-test.patch
>
>
> I haven't found a way to attach a test program to the JIRA issue, so I've put it below:
> {code}
> import static org.junit.Assert.assertEquals;
>
> import java.io.IOException;
> import java.util.Arrays;
>
> import org.apache.hadoop.hbase.filter.Filter;
> import org.apache.hadoop.hbase.filter.MultiRowRangeFilter;
> import org.junit.Test;
>
> public class MultiRowRangeFilterTest {
>
>     byte[] key1Start = new byte[] {-3};
>     byte[] key1End = new byte[] {-2};
>     byte[] key2Start = new byte[] {5};
>     byte[] key2End = new byte[] {6};
>     byte[] badKey = new byte[] {-10};
>
>     @Test
>     public void testRanges() throws IOException {
>         // Two ranges, each with an inclusive start and an exclusive end.
>         MultiRowRangeFilter filter = new MultiRowRangeFilter(Arrays.asList(
>             new MultiRowRangeFilter.RowRange(key1Start, true, key1End, false),
>             new MultiRowRangeFilter.RowRange(key2Start, true, key2End, false)
>         ));
>         filter.filterRowKey(badKey, 0, 1);
>         /*
>          * FAILS -- includes BAD key!
>          * Expected :SEEK_NEXT_USING_HINT
>          * Actual   :INCLUDE
>          */
>         assertEquals(Filter.ReturnCode.SEEK_NEXT_USING_HINT,
>             filter.filterKeyValue(null));
>     }
> }
> {code}
> It seems to happen on 2.0.0-SNAPSHOT too, but I wasn't able to link against a 
> build that includes the class.
> After spending some time with the algorithm, I found that a quick fix can be 
> applied to the "getNextRangeIndex(byte[] rowKey)" method (hbase-client:1.1.0):
> {code}
> if (insertionPosition == 0 && !rangeList.get(insertionPosition).contains(rowKey)) {
>     return ROW_BEFORE_FIRST_RANGE;
> }
> // FIX START
> if (!this.initialized) {
>     this.initialized = true;
> }
> // FIX END
> return insertionPosition;
> {code} 
> Thanks, hope it will help.
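
For context on the test above: HBase orders row keys as unsigned bytes, so the negative Java byte values sort after the positive ones. The following is a minimal sketch, not HBase code; `UnsignedKeyOrderSketch`, `compareUnsigned` and `inRange` are hypothetical names, and the comparison is a plain-JDK stand-in for HBase's own byte comparison. It shows that badKey falls in the gap between the two ranges, which is why a seek hint is the expected result.

```java
/** Hypothetical helper; a plain-JDK stand-in for unsigned row-key comparison. */
final class UnsignedKeyOrderSketch {

    /** Lexicographic comparison that treats each byte as a value in 0..255. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return Integer.compare(a.length, b.length);
    }

    /** True if key lies in [start, stop) under unsigned ordering. */
    static boolean inRange(byte[] key, byte[] start, byte[] stop) {
        return compareUnsigned(key, start) >= 0 && compareUnsigned(key, stop) < 0;
    }

    public static void main(String[] args) {
        byte[] key1Start = {-3};   // 0xFD
        byte[] key1End = {-2};     // 0xFE
        byte[] key2Start = {5};
        byte[] key2End = {6};
        byte[] badKey = {-10};     // 0xF6

        System.out.println(inRange(badKey, key1Start, key1End));   // false
        System.out.println(inRange(badKey, key2Start, key2End));   // false
        // badKey sorts after the end of [5, 6) and before the start of [0xFD, 0xFE).
        System.out.println(compareUnsigned(key2End, badKey) < 0
            && compareUnsigned(badKey, key1Start) < 0);             // true
    }
}
```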



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-13704) Hbase throws OutOfOrderScannerNextException exception when MultiRowRangeFilter is used.

2015-05-19 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551719#comment-14551719
 ] 

Jiajia Li commented on HBASE-13704:
---

The patch works well for me.

> Hbase throws OutOfOrderScannerNextException exception when 
> MultiRowRangeFilter is used.
> ---
>
> Key: HBASE-13704
> URL: https://issues.apache.org/jira/browse/HBASE-13704
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 1.1.0
>Reporter: Aleksandr Maksymenko
>Assignee: Aleksandr Maksymenko
> Fix For: 2.0.0, 1.2.0, 1.1.1
>
> Attachments: 13704-v1.txt
>
>
> When using MultiRowRangeFilter with ranges so close to each other that 
> there are no rows between them, an OutOfOrderScannerNextException is 
> thrown.
> In the filterRowKey method, when the range is switched to the next range, 
> currentReturnCode is set to SEEK_NEXT_USING_HINT (MultiRowRangeFilter: 118 in 
> v1.1.0). But if the new range already contains this row, we should include 
> the row rather than seek to another one.
> Replacing line 118 with this code seems to work fine:
> {code}
> if (range.contains(buffer, offset, length)) {
>     currentReturnCode = ReturnCode.INCLUDE;
> } else {
>     currentReturnCode = ReturnCode.SEEK_NEXT_USING_HINT;
> }
> {code}
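
The proposed change can be illustrated outside HBase. This is a simplified sketch under stated assumptions: `Range`, `ReturnCode` and `codeAfterSwitch` are hypothetical stand-ins that use int keys, not the real MultiRowRangeFilter types. It shows the decision only: after switching to the next range, include the row if that range already contains it, otherwise seek.

```java
final class AdjacentRangeSketch {

    enum ReturnCode { INCLUDE, SEEK_NEXT_USING_HINT }

    /** Hypothetical half-open range [start, stop) over int keys. */
    static final class Range {
        final int start;
        final int stop;

        Range(int start, int stop) {
            this.start = start;
            this.stop = stop;
        }

        boolean contains(int row) {
            return row >= start && row < stop;
        }
    }

    /** Decision made right after the filter has moved on to nextRange. */
    static ReturnCode codeAfterSwitch(Range nextRange, int row) {
        if (nextRange.contains(row)) {
            return ReturnCode.INCLUDE;            // row is already inside the new range
        }
        return ReturnCode.SEEK_NEXT_USING_HINT;   // row lies in a gap before the new range
    }

    public static void main(String[] args) {
        // Second of two adjacent ranges, [10, 20) and [20, 30).
        Range second = new Range(20, 30);
        System.out.println(codeAfterSwitch(second, 20));   // INCLUDE
        System.out.println(codeAfterSwitch(second, 17));   // SEEK_NEXT_USING_HINT
    }
}
```

With adjacent ranges there is no row in the gap to seek over, which is the situation the report describes.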




[jira] [Commented] (HBASE-13704) Hbase throws OutOfOrderScannerNextException exception when MultiRowRangeFilter is used.

2015-05-18 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549736#comment-14549736
 ] 

Jiajia Li commented on HBASE-13704:
---

I think that if the range already contains the row, execution goes from line 91 
to line 121. Can you describe your test in more detail? Maybe I got it wrong. 
Thanks.

> Hbase throws OutOfOrderScannerNextException exception when 
> MultiRowRangeFilter is used.
> ---
>
> Key: HBASE-13704
> URL: https://issues.apache.org/jira/browse/HBASE-13704
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 1.1.0
>Reporter: Aleksandr Maksymenko
>




[jira] [Commented] (HBASE-13705) MultiRowRangeFilter seems to be working incorrect if RowRange.startRowInclusive = false

2015-05-18 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549730#comment-14549730
 ] 

Jiajia Li commented on HBASE-13705:
---

For the second issue, suppose we have a range [10,12) and only one row key, "12", 
in the HBase table. The steps of the scan are:
1. Line 91: Check if the current range contains row "12".
2. Line 94: Search for the next RowRange in method getNextRangeIndex.
3. Line 223: index = -2
4. Line 225: insertionPosition = 1
5. Line 95: The returned index (1) equals rangeList.size() (1), so false is 
returned and row "12" is not included.
[~masyaman], I think this test is OK; please advise if I've missed your 
point.
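
Steps 3 and 4 follow the usual JDK binary-search convention, where a miss returns -(insertion point) - 1. The sketch below is an illustration under assumptions: it searches a list of range start keys held as strings, and `RangeIndexSketch` and `insertionPosition` are hypothetical names, not the filter's actual code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class RangeIndexSketch {

    /** Position at which rowKey would be inserted into the sorted start keys. */
    static int insertionPosition(List<String> sortedStarts, String rowKey) {
        int index = Collections.binarySearch(sortedStarts, rowKey);
        // A negative result encodes the insertion point as -(insertionPoint) - 1.
        return index >= 0 ? index : -index - 1;
    }

    public static void main(String[] args) {
        List<String> starts = Arrays.asList("10");            // single range [10,12)
        int index = Collections.binarySearch(starts, "12");
        System.out.println(index);                            // -2
        int position = insertionPosition(starts, "12");
        System.out.println(position);                         // 1
        // A position equal to the list size means the key is past the last range.
        System.out.println(position >= starts.size());        // true
    }
}
```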

> MultiRowRangeFilter seems to be working incorrect if 
> RowRange.startRowInclusive = false
> ---
>
> Key: HBASE-13705
> URL: https://issues.apache.org/jira/browse/HBASE-13705
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Aleksandr Maksymenko
>
> I found this issue during code review, so I don't have tests, and I haven't 
> even tested this case manually. I'll try to describe it in words.
> Precondition: we're using a scan with a MultiRowRangeFilter that has some 
> RowRanges with startRowInclusive = false. This means that we want to include 
> all rows that are strictly greater than startRow (and less than stopRow, but 
> that doesn't matter for now). 
> What happens in MultiRowRangeFilter.filterRowKey (the worst case is described):
> 1. Line 91: Check if the current range contains the row. Let's follow the case 
> when it doesn't.
> 2. Line 94: Search for the next RowRange in method getNextRangeIndex.
> 3. Line 238: We've found a RowRange; check if startRowInclusive == false and 
> set EXCLUSIVE = true. This variable indicates whether the next row should be 
> excluded.
> 4. Line 105: Check if EXCLUSIVE == true; if so, skip this row.
> The problem: we've skipped the first row we got in this range, but we never 
> checked whether this row is RowRange.startRow. In a distributed system we may 
> not get RowRange.startRow on the current instance, so we may exclude some 
> other row. Moreover, RowRange.startRow may not exist in the DB at all, in 
> which case we will exclude a row that is (possibly) close to 
> RowRange.startRow but not equal to it.
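
The concern can be made concrete with a small sketch. Assumptions: keys are ints, and the hypothetical methods `skipFirstRow` and `compareToStart` stand in for the two strategies; neither is the filter's actual code. Skipping whatever row arrives first drops a valid row when startRow itself is absent, whereas comparing each row with startRow does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ExclusiveStartSketch {

    /** Strategy described in the report: drop the first row seen in the range. */
    static List<Integer> skipFirstRow(List<Integer> rows, int startRow, int stopRow) {
        List<Integer> out = new ArrayList<>();
        boolean exclusive = true;               // set because startRowInclusive == false
        for (int row : rows) {
            if (row < startRow || row >= stopRow) {
                continue;                       // outside the range entirely
            }
            if (exclusive) {
                exclusive = false;              // skipped without checking the key
                continue;
            }
            out.add(row);
        }
        return out;
    }

    /** Alternative: exclude a row only if it is not strictly greater than startRow. */
    static List<Integer> compareToStart(List<Integer> rows, int startRow, int stopRow) {
        List<Integer> out = new ArrayList<>();
        for (int row : rows) {
            if (row > startRow && row < stopRow) {
                out.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Range (10, 20) with an exclusive start; row 10 does not exist in the data.
        List<Integer> rows = Arrays.asList(11, 12, 13);
        System.out.println(skipFirstRow(rows, 10, 20));     // [12, 13]
        System.out.println(compareToStart(rows, 10, 20));   // [11, 12, 13]
    }
}
```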




[jira] [Updated] (HBASE-13012) Add a shell to trigger the mob file compactor by commands

2015-02-16 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-13012:
--
Attachment: hbase-13012-V4.diff

Updated the patch (hbase-13012-V4) according to Jingcheng's comments.

> Add a shell to trigger the mob file compactor by commands
> -
>
> Key: HBASE-13012
> URL: https://issues.apache.org/jira/browse/HBASE-13012
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Jingcheng Du
>Assignee: Jiajia Li
> Fix For: hbase-11339
>
> Attachments: hbase-13012-V1.patch, hbase-13012-V2.diff, 
> hbase-13012-V3.diff, hbase-13012-V4.diff
>
>
> Currently the MobFileCompactor is run periodically by HMaster; we need to add 
> a shell command to trigger the compactor.




[jira] [Updated] (HBASE-13012) Add a shell to trigger the mob file compactor by commands

2015-02-16 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-13012:
--
Attachment: hbase-13012-V3.diff

Thanks [~jingcheng...@intel.com], I've uploaded the patch (hbase-13012-V3.diff) 
according to your comments.
Hi [~j...@cloudera.com], [~ram_krish], [~anoopsamjohn], can you review this 
patch? Thanks.

> Add a shell to trigger the mob file compactor by commands
> -
>
> Key: HBASE-13012
> URL: https://issues.apache.org/jira/browse/HBASE-13012
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Jingcheng Du
>Assignee: Jiajia Li
> Fix For: hbase-11339
>
> Attachments: hbase-13012-V1.patch, hbase-13012-V2.diff, 
> hbase-13012-V3.diff
>
>




[jira] [Updated] (HBASE-13012) Add a shell to trigger the mob file compactor by commands

2015-02-15 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-13012:
--
Attachment: hbase-13012-V2.diff

Updated the patch (hbase-13012-V2.diff) according to Jingcheng's comments.

> Add a shell to trigger the mob file compactor by commands
> -
>
> Key: HBASE-13012
> URL: https://issues.apache.org/jira/browse/HBASE-13012
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Jingcheng Du
>Assignee: Jiajia Li
> Fix For: hbase-11339
>
> Attachments: hbase-13012-V1.patch, hbase-13012-V2.diff
>
>




[jira] [Updated] (HBASE-13012) Add a shell to trigger the mob file compactor by commands

2015-02-15 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-13012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-13012:
--
Attachment: hbase-13012-V1.patch

Uploaded hbase-13012-V1.patch; RB: https://reviews.apache.org/r/31053/
Hi [~j...@cloudera.com], [~anoopsamjohn], [~ram_krish], 
[~jingcheng...@intel.com], please take a look. Thanks.

> Add a shell to trigger the mob file compactor by commands
> -
>
> Key: HBASE-13012
> URL: https://issues.apache.org/jira/browse/HBASE-13012
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Jingcheng Du
>Assignee: Jiajia Li
> Fix For: hbase-11339
>
> Attachments: hbase-13012-V1.patch
>
>




[jira] [Commented] (HBASE-12332) [mob] use filelink instead of retry when resolving mobfiles

2015-02-09 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313322#comment-14313322
 ] 

Jiajia Li commented on HBASE-12332:
---

Hi, [~j...@cloudera.com], what is the plan for this jira?

> [mob] use filelink instead of retry when resolving mobfiles
> ---
>
> Key: HBASE-12332
> URL: https://issues.apache.org/jira/browse/HBASE-12332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jonathan Hsieh
> Fix For: hbase-11339
>
> Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch, 
> HBASE-12332-V3.patch, HBASE-12332-V5.patch, hbase-12332.link.v4.patch, 
> hbase-12332.patch
>
>
> In the snapshot code, hmobstore was modified to traverse an hfile link to a 
> mob. Ideally this should use the transparent filelink code to read the data.
> There will also likely be some issues with the mob file cache with these 
> links.




[jira] [Commented] (HBASE-12332) [mob] use filelink instead of retry when resolving mobfiles

2015-01-20 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285092#comment-14285092
 ] 

Jiajia Li commented on HBASE-12332:
---

I updated to V5 because, when I tested the case of reading an opened hfile 
after deleting the original table, a FileNotFoundException is raised by the 
namenode, but this FNFE is wrapped in an IOException by StoreFileScanner.seek. 
So I added a check on e.getCause() to avoid HBase client retries when reading 
a cell fails.
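
The check described here can be sketched with plain JDK types. This is an illustration under assumptions: `CauseCheckSketch` and `isCausedByMissingFile` are hypothetical names, and the wrapping IOException is constructed by hand rather than coming from StoreFileScanner.seek.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

final class CauseCheckSketch {

    /** True if e is a FileNotFoundException or directly wraps one as its cause. */
    static boolean isCausedByMissingFile(IOException e) {
        return e instanceof FileNotFoundException
            || e.getCause() instanceof FileNotFoundException;
    }

    public static void main(String[] args) {
        IOException wrapped =
            new IOException("seek failed", new FileNotFoundException("hfile removed"));
        IOException unrelated = new IOException("connection reset");

        System.out.println(isCausedByMissingFile(wrapped));     // true
        System.out.println(isCausedByMissingFile(unrelated));   // false
    }
}
```

A caller could use such a check to treat a missing file as a non-retriable condition while still retrying other I/O failures.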

> [mob] use filelink instead of retry when resolving mobfiles
> ---
>
> Key: HBASE-12332
> URL: https://issues.apache.org/jira/browse/HBASE-12332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jonathan Hsieh
> Fix For: hbase-11339
>
> Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch, 
> HBASE-12332-V3.patch, HBASE-12332-V5.patch, hbase-12332.link.v4.patch, 
> hbase-12332.patch
>
>




[jira] [Updated] (HBASE-12332) [mob] use filelink instead of retry when resolving mobfiles

2015-01-20 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-12332:
--
Attachment: HBASE-12332-V5.patch

Hi [~j...@cloudera.com], HBASE-12332-V5 is based on HBASE-12332-V3 and adds a 
check for (e.getCause() instanceof FileNotFoundException) when reading a cell. 
Can you take a look? Thanks.

> [mob] use filelink instead of retry when resolving mobfiles
> ---
>
> Key: HBASE-12332
> URL: https://issues.apache.org/jira/browse/HBASE-12332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jonathan Hsieh
> Fix For: hbase-11339
>
> Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch, 
> HBASE-12332-V3.patch, HBASE-12332-V5.patch, hbase-12332.link.v4.patch, 
> hbase-12332.patch
>
>




[jira] [Updated] (HBASE-11144) Filter to support scanning multiple row key ranges

2015-01-14 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Release Note: MultiRowRangeFilter is a filter to support scanning multiple 
row key ranges. If the number of ranges is small, multiple scans can do the 
same thing and work well. But when the number of ranges is large (e.g. 
millions), MultiRowRangeFilter is the better choice. In this filter the ranges 
are sorted and merged, so users do not have to handle non-continuous ranges 
themselves. And if users access the data through something like REST, Thrift 
or Pig, the filter might be the practical solution.  
(was: MultiRowRangeFilter is a filter to support scanning multiple row key 
ranges.)
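
Sorting and merging can be pictured as ordinary interval merging. The sketch below is a simplified illustration, not the filter's implementation: it uses half-open int ranges stored in two-element arrays, and `RangeMergeSketch` and `mergeRanges` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class RangeMergeSketch {

    /** Sorts half-open ranges {start, stop} by start and merges overlapping or touching ones. */
    static List<int[]> mergeRanges(List<int[]> ranges) {
        List<int[]> sorted = new ArrayList<>(ranges);
        sorted.sort(Comparator.comparingInt((int[] r) -> r[0]));

        List<int[]> merged = new ArrayList<>();
        for (int[] r : sorted) {
            if (!merged.isEmpty() && r[0] <= merged.get(merged.size() - 1)[1]) {
                // Overlaps or touches the previous range: extend its stop if needed.
                int[] last = merged.get(merged.size() - 1);
                last[1] = Math.max(last[1], r[1]);
            } else {
                merged.add(new int[] {r[0], r[1]});   // copy so the input is not modified
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<int[]> input = Arrays.asList(
            new int[] {30, 40}, new int[] {10, 20}, new int[] {15, 25});
        for (int[] r : mergeRanges(input)) {
            System.out.println("[" + r[0] + ", " + r[1] + ")");
        }
        // prints "[10, 25)" then "[30, 40)"
    }
}
```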

> Filter to support scanning multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Fix For: 2.0.0, 1.1.0
>
> Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, 
> HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, 
> HBASE_11144_V14.patch, HBASE_11144_V15.patch, HBASE_11144_V16.patch, 
> HBASE_11144_V17.patch, HBASE_11144_V18.patch, HBASE_11144_V5.patch, 
> HBASE_11144_V6.patch, HBASE_11144_V7.patch, HBASE_11144_V9.patch, 
> MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>
> HBase is quite efficient when scanning only one small row key range. If a 
> user needs to specify multiple row key ranges in one scan, the typical 
> solutions are: 1. a FilterList, which is a list of row key Filters; 2. using 
> a SQL layer over HBase, such as Hive or Phoenix, to join two tables. 
> However, both solutions are inefficient. Neither can use the range info to 
> fast-forward during the scan, which makes the scan quite time consuming. If 
> the number of ranges is quite big (e.g. millions), a join is a proper 
> solution, though it is slow. However, there are cases where the user wants 
> to specify a small number of ranges to scan (e.g. <1000 ranges). Neither 
> solution provides satisfactory performance in such cases. 
> We provide this filter (MultiRowRangeFilter) to support this use case 
> (scanning multiple row key ranges). It constructs the row key ranges from a 
> user-specified list and fast-forwards during the scan, so the scan will be 
> quite efficient. 




[jira] [Commented] (HBASE-11144) Filter to support scanning multiple row key ranges

2015-01-13 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276522#comment-14276522
 ] 

Jiajia Li commented on HBASE-11144:
---

I've only tested FilterList against MultiRowRangeFilter; the FilterList uses 
RowFilter.

> Filter to support scanning multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Fix For: 2.0.0, 1.1.0
>
> Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, 
> HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, 
> HBASE_11144_V14.patch, HBASE_11144_V15.patch, HBASE_11144_V16.patch, 
> HBASE_11144_V17.patch, HBASE_11144_V18.patch, HBASE_11144_V5.patch, 
> HBASE_11144_V6.patch, HBASE_11144_V7.patch, HBASE_11144_V9.patch, 
> MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>




[jira] [Updated] (HBASE-11144) Filter to support scanning multiple row key ranges

2015-01-13 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Release Note: MultiRowRangeFilter is a filter to support scanning multiple 
row key ranges.

> Filter to support scanning multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Fix For: 2.0.0, 1.1.0
>
> Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, 
> HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, 
> HBASE_11144_V14.patch, HBASE_11144_V15.patch, HBASE_11144_V16.patch, 
> HBASE_11144_V17.patch, HBASE_11144_V18.patch, HBASE_11144_V5.patch, 
> HBASE_11144_V6.patch, HBASE_11144_V7.patch, HBASE_11144_V9.patch, 
> MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>




[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2015-01-13 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Attachment: HBASE_11144_V18.patch

Fixed the checkstyle error.

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, 
> HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, 
> HBASE_11144_V14.patch, HBASE_11144_V15.patch, HBASE_11144_V16.patch, 
> HBASE_11144_V17.patch, HBASE_11144_V18.patch, HBASE_11144_V5.patch, 
> HBASE_11144_V6.patch, HBASE_11144_V7.patch, HBASE_11144_V9.patch, 
> MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>




[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2015-01-13 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Attachment: HBASE_11144_V17.patch

Uploaded V17 to fix the findbugs issue.

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, 
> HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, 
> HBASE_11144_V14.patch, HBASE_11144_V15.patch, HBASE_11144_V16.patch, 
> HBASE_11144_V17.patch, HBASE_11144_V5.patch, HBASE_11144_V6.patch, 
> HBASE_11144_V7.patch, HBASE_11144_V9.patch, MultiRowRangeFilter.patch, 
> MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>




[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2015-01-12 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Attachment: HBASE_11144_V16.patch

Uploaded patch V16 (fixes the test error).

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, 
> HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, 
> HBASE_11144_V14.patch, HBASE_11144_V15.patch, HBASE_11144_V16.patch, 
> HBASE_11144_V5.patch, HBASE_11144_V6.patch, HBASE_11144_V7.patch, 
> HBASE_11144_V9.patch, MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>




[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2015-01-12 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Attachment: HBASE_11144_V15.patch

Hi [~stack], thanks for your review. V15 adds inclusive/exclusive bounds to the 
row range and renames RowKeyRange to RowRange. Can you take some time to look 
at it? Thanks.

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, 
> HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, 
> HBASE_11144_V14.patch, HBASE_11144_V15.patch, HBASE_11144_V5.patch, 
> HBASE_11144_V6.patch, HBASE_11144_V7.patch, HBASE_11144_V9.patch, 
> MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>




[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2015-01-09 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Attachment: HBASE_11144_V14.patch

Fixed the Hadoop QA issue.

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, 
> HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, 
> HBASE_11144_V14.patch, HBASE_11144_V5.patch, HBASE_11144_V6.patch, 
> HBASE_11144_V7.patch, HBASE_11144_V9.patch, MultiRowRangeFilter.patch, 
> MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>

[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2015-01-08 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Attachment: HBASE_11144_V13.patch

Fixed the checkstyle error and added the InterfaceAudience annotation to the RowKeyRange class.

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, 
> HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V13.patch, 
> HBASE_11144_V5.patch, HBASE_11144_V6.patch, HBASE_11144_V7.patch, 
> HBASE_11144_V9.patch, MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>

[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2015-01-08 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Attachment: HBASE_11144_V12.patch

Thanks [~tedyu] for your comments; I've updated the patch (V12). 

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, 
> HBASE_11144_V11.patch, HBASE_11144_V12.patch, HBASE_11144_V5.patch, 
> HBASE_11144_V6.patch, HBASE_11144_V7.patch, HBASE_11144_V9.patch, 
> MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>

[jira] [Commented] (HBASE-11144) Filter to support scan multiple row key ranges

2015-01-07 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268867#comment-14268867
 ] 

Jiajia Li commented on HBASE-11144:
---

 [~ram_krish] thanks for your review
 [~tedyu] [~anoopsamjohn], can you take some time to look at the latest patch? 
The review link is https://reviews.apache.org/r/21370/

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, 
> HBASE_11144_V11.patch, HBASE_11144_V5.patch, HBASE_11144_V6.patch, 
> HBASE_11144_V7.patch, HBASE_11144_V9.patch, MultiRowRangeFilter.patch, 
> MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>

[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2015-01-07 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Attachment: HBASE_11144_V11.patch

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, 
> HBASE_11144_V11.patch, HBASE_11144_V5.patch, HBASE_11144_V6.patch, 
> HBASE_11144_V7.patch, HBASE_11144_V9.patch, MultiRowRangeFilter.patch, 
> MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>

[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2015-01-07 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Attachment: HBASE_11144_V10.patch

Uploaded the new patch (HBASE_11144_V10.patch) to fix the checkstyle error.

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V10.patch, 
> HBASE_11144_V5.patch, HBASE_11144_V6.patch, HBASE_11144_V7.patch, 
> HBASE_11144_V9.patch, MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>

[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2015-01-06 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Attachment: HBASE_11144_V9.patch

Uploaded HBASE_11144_V9.patch, according to Ram's comment.

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, 
> HBASE_11144_V6.patch, HBASE_11144_V7.patch, HBASE_11144_V9.patch, 
> MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>

[jira] [Commented] (HBASE-11144) Filter to support scan multiple row key ranges

2015-01-06 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267194#comment-14267194
 ] 

Jiajia Li commented on HBASE-11144:
---

Hi [~ram_krish], can you look at the comment on 
https://reviews.apache.org/r/21370/ and give your advice? Thanks.

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, 
> HBASE_11144_V6.patch, HBASE_11144_V7.patch, MultiRowRangeFilter.patch, 
> MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>

[jira] [Commented] (HBASE-12332) [mob] use filelink instead of retry when resolving mobfiles

2015-01-05 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265551#comment-14265551
 ] 

Jiajia Li commented on HBASE-12332:
---

[~jmhsieh], can you tell me what problem you have met? Thanks.

> [mob] use filelink instead of retry when resolving mobfiles
> ---
>
> Key: HBASE-12332
> URL: https://issues.apache.org/jira/browse/HBASE-12332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jonathan Hsieh
> Fix For: hbase-11339
>
> Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch, 
> HBASE-12332-V3.patch, hbase-12332.link.v4.patch, hbase-12332.patch
>
>
> In the snapshot code, HMobStore was modified to traverse an hfile link to a 
> mob. Ideally this should use the transparent filelink code to read the data.
> Also, there will likely be some issues with the mob file cache with these 
> links.




[jira] [Commented] (HBASE-12332) [mob] use filelink instead of retry when resolving mobfiles

2015-01-03 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263725#comment-14263725
 ] 

Jiajia Li commented on HBASE-12332:
---

Hi [~j...@cloudera.com], will we now use the filelink when resolving mob files? 
If so, I will try to add the unit tests in HBASE-12670.

> [mob] use filelink instead of retry when resolving mobfiles
> ---
>
> Key: HBASE-12332
> URL: https://issues.apache.org/jira/browse/HBASE-12332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jonathan Hsieh
> Fix For: hbase-11339
>
> Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch, 
> HBASE-12332-V3.patch, hbase-12332.link.v4.patch, hbase-12332.patch
>
>

[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2014-12-29 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Attachment: hbase_11144_V8.patch

Thanks [~ram_krish]. In this patch I've removed the other check and added the 
filter to ScannerModel. Can you take some time to look at it? Thanks.

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, 
> HBASE_11144_V6.patch, HBASE_11144_V7.patch, MultiRowRangeFilter.patch, 
> MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch, hbase_11144_V8.patch
>
>

[jira] [Commented] (HBASE-11144) Filter to support scan multiple row key ranges

2014-12-28 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14259782#comment-14259782
 ] 

Jiajia Li commented on HBASE-11144:
---

Hi [~tedyu], [~ram_krish], can you review the code 
(https://reviews.apache.org/r/21370/)? Thanks.

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, 
> HBASE_11144_V6.patch, HBASE_11144_V7.patch, MultiRowRangeFilter.patch, 
> MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch
>
>

[jira] [Assigned] (HBASE-12670) Add unit tests that exercise the added hfilelink link mob paths

2014-12-25 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li reassigned HBASE-12670:
-

Assignee: Jiajia Li

> Add unit tests that exercise the added hfilelink link mob paths
> ---
>
> Key: HBASE-12670
> URL: https://issues.apache.org/jira/browse/HBASE-12670
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Jonathan Hsieh
>Assignee: Jiajia Li
> Fix For: hbase-11339
>
>
> HBASE-12646 introduced the mob path to HFileLink -- we didn't add unit tests 
> for it, however.




[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2014-12-25 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Attachment: HBASE_11144_V7.patch

Uploaded the new patch HBASE_11144_V7, which refines the code according to Ram's comment.

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, 
> HBASE_11144_V6.patch, HBASE_11144_V7.patch, MultiRowRangeFilter.patch, 
> MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch
>
>

[jira] [Updated] (HBASE-12331) Shorten the mob snapshot unit tests

2014-12-23 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-12331:
--
Attachment: HBASE-12331-V2.patch

Uploaded HBASE-12331-V2.patch (removes the TestMob*WithRegionReplicas tests and 
TestMobSecureExportSnapshot).

> Shorten the mob snapshot unit tests
> ---
>
> Key: HBASE-12331
> URL: https://issues.apache.org/jira/browse/HBASE-12331
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jonathan Hsieh
> Fix For: hbase-11339
>
> Attachments: HBASE-12331-V1.diff, HBASE-12331-V2.patch
>
>
> The mob snapshot patch introduced a whole lot of tests that take a long time 
> to run and would be better as integration tests.
> {code}
> ---
>  T E S T S
> ---
> Running 
> org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClientWithRegionReplicas
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 394.803 sec - 
> in 
> org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClientWithRegionReplicas
> Running org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClient
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 212.377 sec - 
> in org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClient
> Running 
> org.apache.hadoop.hbase.client.TestMobSnapshotFromClientWithRegionReplicas
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 49.463 sec - 
> in org.apache.hadoop.hbase.client.TestMobSnapshotFromClientWithRegionReplicas
> Running org.apache.hadoop.hbase.client.TestMobSnapshotFromClient
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 46.724 sec - 
> in org.apache.hadoop.hbase.client.TestMobSnapshotFromClient
> Running org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClient
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 204.03 sec - 
> in org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClient
> Running 
> org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClientWithRegionReplicas
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 214.052 sec - 
> in 
> org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClientWithRegionReplicas
> Running org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence
> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 105.139 sec - 
> in org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence
> Running org.apache.hadoop.hbase.regionserver.TestMobStoreScanner
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 24.42 sec - 
> in org.apache.hadoop.hbase.regionserver.TestMobStoreScanner
> Running org.apache.hadoop.hbase.regionserver.TestDeleteMobTable
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.136 sec - 
> in org.apache.hadoop.hbase.regionserver.TestDeleteMobTable
> Running org.apache.hadoop.hbase.regionserver.TestHMobStore
> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.09 sec - in 
> org.apache.hadoop.hbase.regionserver.TestHMobStore
> Running org.apache.hadoop.hbase.regionserver.TestMobCompaction
> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.629 sec - 
> in org.apache.hadoop.hbase.regionserver.TestMobCompaction
> Running org.apache.hadoop.hbase.mob.TestCachedMobFile
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.301 sec - 
> in org.apache.hadoop.hbase.mob.TestCachedMobFile
> Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepJob
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.752 sec - 
> in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepJob
> Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepReducer
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.276 sec - 
> in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepReducer
> Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepMapper
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.46 sec - 
> in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepMapper
> Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweeper
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 173.05 sec - 
> in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweeper
> Running org.apache.hadoop.hbase.mob.TestMobDataBlockEncoding
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.86 sec - 
> in org.apache.hadoop.hbase.mob.TestMobDataBlockEncoding
> Running org.apache.hadoop.hbase.mob.TestExpiredMobFileCleaner
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.029 sec - 
> in org.apache.hadoop.hbase.mob.TestExpiredMobFileCleaner
> Running org.apache.hadoop.hbase.mob.TestMobFile
> Tests run:

[jira] [Updated] (HBASE-12332) [mob] use filelink instead of retry when resolving mobfiles

2014-12-23 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-12332:
--
Attachment: HBASE-12332-V3.patch

Hi [~j...@cloudera.com], HBASE-12332-V3.patch is based on HBASE-12332-V2.patch 
and replaces the synchronized locking with IdLock.Entry. I think this is an 
easier way to resolve the mob files. Using HFileLink may reduce the performance 
of reading cells, because every cell read would create an HFileLink. Can 
you review this patch and give your advice? Thanks.
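
To make the locking change concrete, here is a minimal Java sketch of guarding file resolution with one lock per file name instead of a single coarse lock. It is not the code from HBASE-12332-V3.patch and does not use HBase's IdLock; `PerFileLockSketch`, `resolve` and the `opener` callback are hypothetical names, the "reader" is just a string, and the sketch assumes the opener never returns null.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Illustrative sketch only; names and types are assumptions, not HBase APIs. */
final class PerFileLockSketch {
    // One lock per file name. Locks are never evicted here, which a real
    // implementation would have to handle.
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    // Already-opened "readers", keyed by file name.
    private final ConcurrentHashMap<String, String> opened = new ConcurrentHashMap<>();

    /** Returns the cached reader for fileName, opening it at most once. */
    String resolve(String fileName, Function<String, String> opener) {
        String reader = opened.get(fileName);
        if (reader != null) {
            return reader; // fast path, no locking
        }
        ReentrantLock lock = locks.computeIfAbsent(fileName, k -> new ReentrantLock());
        lock.lock();
        try {
            // Re-check under the lock: another thread may have opened it meanwhile.
            reader = opened.get(fileName);
            if (reader == null) {
                reader = opener.apply(fileName);
                opened.put(fileName, reader);
            }
            return reader;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        PerFileLockSketch sketch = new PerFileLockSketch();
        AtomicInteger opens = new AtomicInteger();
        Function<String, String> opener = name -> {
            opens.incrementAndGet();
            return "reader:" + name;
        };
        sketch.resolve("mobfile-1", opener);
        sketch.resolve("mobfile-1", opener);
        sketch.resolve("mobfile-2", opener);
        System.out.println("opens = " + opens.get());
    }
}
```

Threads resolving different files do not block each other, while two threads resolving the same file open it only once. That is the general trade-off behind moving from a single synchronized section to per-id locks.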

> [mob] use filelink instead of retry when resolving mobfiles
> ---
>
> Key: HBASE-12332
> URL: https://issues.apache.org/jira/browse/HBASE-12332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jonathan Hsieh
> Fix For: hbase-11339
>
> Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch, 
> HBASE-12332-V3.patch, hbase-12332.patch
>
>

[jira] [Commented] (HBASE-12332) [mob] use filelink instead of retry when resolving mobfiles

2014-12-22 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256639#comment-14256639
 ] 

Jiajia Li commented on HBASE-12332:
---

Hi [~j...@cloudera.com], in this patch I found that each mob cell read operation 
creates an HFileLink and a StoreFileInfo. Will this reduce the efficiency of 
reading? I think reading through the possible locations would be easier and more 
efficient. What do you think?

> [mob] use filelink instead of retry when resolving mobfiles
> ---
>
> Key: HBASE-12332
> URL: https://issues.apache.org/jira/browse/HBASE-12332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jonathan Hsieh
> Fix For: hbase-11339
>
> Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch, 
> hbase-12332.patch
>
>

[jira] [Commented] (HBASE-11144) Filter to support scan multiple row key ranges

2014-12-22 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256535#comment-14256535
 ] 

Jiajia Li commented on HBASE-11144:
---

Thanks, I've updated the review board: https://reviews.apache.org/r/21370/

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, 
> HBASE_11144_V6.patch, MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch
>
>
> HBase is quite efficient when scanning a single small row key range. If a
> user needs to specify multiple row key ranges in one scan, the typical
> solutions are: 1. a FilterList, which is a list of row key filters; 2. using
> a SQL layer over HBase, such as Hive or Phoenix, to join two tables.
> However, both solutions are inefficient: neither can use the range info to
> fast-forward during the scan, so the scan is quite time consuming. If the
> number of ranges is quite big (e.g. millions), a join is a proper solution
> though it is slow. However, there are cases where the user wants to specify
> a small number of ranges to scan (e.g. <1000 ranges). Neither solution
> provides satisfactory performance in such a case.
> We provide this filter (MultiRowRangeFilter) to support this use case
> (scanning multiple row key ranges). It constructs the row key ranges from a
> user-specified list and fast-forwards during the scan, so the scan will be
> quite efficient.
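
The fast-forwarding idea in the description above can be illustrated with a minimal, self-contained sketch. This is not the code from the attached patches: the class `MultiRangeSketch`, its `Range`, `Decision` and `Action` types, and the `decide` method are hypothetical names chosen for illustration. The sketch assumes the ranges are half-open, already sorted by start key, and non-overlapping. For each row key it binary-searches the ranges and either includes the row, returns the start of the next range as a seek hint, or reports that the scan is past the last range.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the MultiRowRangeFilter implementation.
final class MultiRangeSketch {
    enum Action { INCLUDE, SEEK_TO_HINT, DONE }

    /** Half-open row key range [start, stop). */
    static final class Range {
        final byte[] start;
        final byte[] stop;
        Range(String start, String stop) {
            this.start = start.getBytes(StandardCharsets.UTF_8);
            this.stop = stop.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Outcome for one row key; hint is set only for SEEK_TO_HINT. */
    static final class Decision {
        final Action action;
        final byte[] hint;
        Decision(Action action, byte[] hint) {
            this.action = action;
            this.hint = hint;
        }
    }

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /** Decides what to do with rowKey, given sorted, non-overlapping ranges. */
    static Decision decide(List<Range> sortedRanges, byte[] rowKey) {
        // Binary search for the last range whose start key is <= rowKey.
        int lo = 0;
        int hi = sortedRanges.size() - 1;
        int idx = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (compare(sortedRanges.get(mid).start, rowKey) <= 0) {
                idx = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        // Inside that range: the row is wanted.
        if (idx >= 0 && compare(rowKey, sortedRanges.get(idx).stop) < 0) {
            return new Decision(Action.INCLUDE, null);
        }
        // Otherwise jump straight to the start of the next range, if there is one.
        int next = idx + 1;
        if (next < sortedRanges.size()) {
            return new Decision(Action.SEEK_TO_HINT, sortedRanges.get(next).start);
        }
        return new Decision(Action.DONE, null);
    }

    public static void main(String[] args) {
        List<Range> ranges = Arrays.asList(new Range("b", "d"), new Range("m", "p"));
        for (String row : new String[] {"a", "c", "d", "n", "z"}) {
            Decision d = decide(ranges, row.getBytes(StandardCharsets.UTF_8));
            String hint = d.hint == null ? "" : " -> " + new String(d.hint, StandardCharsets.UTF_8);
            System.out.println(row + ": " + d.action + hint);
        }
    }
}
```

With a seek hint the scanner can skip every row between two ranges instead of testing each one, which is the advantage over a FilterList of per-range filters. The binary search keeps the per-row cost logarithmic in the number of ranges.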




[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2014-12-22 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Attachment: HBASE_11144_V6.patch

Uploaded a new patch, HBASE_11144_V6.patch, which removes the warning about the 
missing interface audience.

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, 
> HBASE_11144_V6.patch, MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch
>
>
> HBase is quite efficient when scanning a single small row key range. If a
> user needs to specify multiple row key ranges in one scan, the typical
> solutions are: 1. a FilterList, which is a list of row key filters; 2. using
> a SQL layer over HBase, such as Hive or Phoenix, to join two tables.
> However, both solutions are inefficient: neither can use the range info to
> fast-forward during the scan, so the scan is quite time consuming. If the
> number of ranges is quite big (e.g. millions), a join is a proper solution
> though it is slow. However, there are cases where the user wants to specify
> a small number of ranges to scan (e.g. <1000 ranges). Neither solution
> provides satisfactory performance in such a case.
> We provide this filter (MultiRowRangeFilter) to support this use case
> (scanning multiple row key ranges). It constructs the row key ranges from a
> user-specified list and fast-forwards during the scan, so the scan will be
> quite efficient.




[jira] [Commented] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir

2014-12-22 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256488#comment-14256488
 ] 

Jiajia Li commented on HBASE-12673:
---

Hi [~j...@cloudera.com], sorry, I don't understand what you mean by "it is 
starting to be used in other places and needs to be done". Do you mean it is 
used in the SnapshotInfo tool? A MOB cell is read by getting the file name from 
the reference cell and then opening the file from one of the possible 
locations. The reference cell in HBase holds only a file name, so I think 
HFileLink can't be used when reading a mob cell.

> Add a UT to read mob file when the mob hfile moving from the mob dir to the 
> archive dir
> ---
>
> Key: HBASE-12673
> URL: https://issues.apache.org/jira/browse/HBASE-12673
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Affects Versions: hbase-11339
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Fix For: hbase-11339
>
> Attachments: HBASE-12673-V2.patch, HBASE-12673.patch
>
>
> Add a unit test that scans the cloned table while the original table is 
> being deleted. The steps are as follows:
> 1) create a table with mobs, 
> 2) snapshot it, 
> 3) clone it as a different table
> 4) have a read workload on the snapshot
> 5) delete the original table




[jira] [Commented] (HBASE-12331) Shorten the mob snapshot unit tests

2014-12-22 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256456#comment-14256456
 ] 

Jiajia Li commented on HBASE-12331:
---

Thanks Jon, I will drop these tests and upload a new patch.

> Shorten the mob snapshot unit tests
> ---
>
> Key: HBASE-12331
> URL: https://issues.apache.org/jira/browse/HBASE-12331
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jonathan Hsieh
> Fix For: hbase-11339
>
> Attachments: HBASE-12331-V1.diff
>
>
> The mob snapshot patch introduced a whole lot of tests that take a long time 
> to run and would be better as integration tests.
> {code}
> ---
>  T E S T S
> ---
> Running 
> org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClientWithRegionReplicas
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 394.803 sec - 
> in 
> org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClientWithRegionReplicas
> Running org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClient
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 212.377 sec - 
> in org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClient
> Running 
> org.apache.hadoop.hbase.client.TestMobSnapshotFromClientWithRegionReplicas
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 49.463 sec - 
> in org.apache.hadoop.hbase.client.TestMobSnapshotFromClientWithRegionReplicas
> Running org.apache.hadoop.hbase.client.TestMobSnapshotFromClient
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 46.724 sec - 
> in org.apache.hadoop.hbase.client.TestMobSnapshotFromClient
> Running org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClient
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 204.03 sec - 
> in org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClient
> Running 
> org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClientWithRegionReplicas
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 214.052 sec - 
> in 
> org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClientWithRegionReplicas
> Running org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence
> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 105.139 sec - 
> in org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence
> Running org.apache.hadoop.hbase.regionserver.TestMobStoreScanner
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 24.42 sec - 
> in org.apache.hadoop.hbase.regionserver.TestMobStoreScanner
> Running org.apache.hadoop.hbase.regionserver.TestDeleteMobTable
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.136 sec - 
> in org.apache.hadoop.hbase.regionserver.TestDeleteMobTable
> Running org.apache.hadoop.hbase.regionserver.TestHMobStore
> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.09 sec - in 
> org.apache.hadoop.hbase.regionserver.TestHMobStore
> Running org.apache.hadoop.hbase.regionserver.TestMobCompaction
> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.629 sec - 
> in org.apache.hadoop.hbase.regionserver.TestMobCompaction
> Running org.apache.hadoop.hbase.mob.TestCachedMobFile
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.301 sec - 
> in org.apache.hadoop.hbase.mob.TestCachedMobFile
> Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepJob
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.752 sec - 
> in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepJob
> Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepReducer
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.276 sec - 
> in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepReducer
> Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepMapper
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.46 sec - 
> in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepMapper
> Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweeper
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 173.05 sec - 
> in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweeper
> Running org.apache.hadoop.hbase.mob.TestMobDataBlockEncoding
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.86 sec - 
> in org.apache.hadoop.hbase.mob.TestMobDataBlockEncoding
> Running org.apache.hadoop.hbase.mob.TestExpiredMobFileCleaner
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.029 sec - 
> in org.apache.hadoop.hbase.mob.TestExpiredMobFileCleaner
> Running org.apache.hadoop.hbase.mob.TestMobFile
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time ela

[jira] [Updated] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir

2014-12-22 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-12673:
--
Attachment: HBASE-12673-V2.patch

Hi [~j...@cloudera.com], I've uploaded a new patch. This UT tries to read an 
already opened hfile after the original table is deleted, and an IOException 
(not a FileNotFoundException) is thrown, so I will change the patch in 
https://issues.apache.org/jira/browse/HBASE-12332

> Add a UT to read mob file when the mob hfile moving from the mob dir to the 
> archive dir
> ---
>
> Key: HBASE-12673
> URL: https://issues.apache.org/jira/browse/HBASE-12673
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Affects Versions: hbase-11339
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Fix For: hbase-11339
>
> Attachments: HBASE-12673-V2.patch, HBASE-12673.patch
>
>
> Add a unit test that scans the cloned table while the original table is 
> being deleted. The steps are as follows:
> 1) create a table with mobs, 
> 2) snapshot it, 
> 3) clone it as a different table
> 4) have a read workload on the snapshot
> 5) delete the original table




[jira] [Commented] (HBASE-11144) Filter to support scan multiple row key ranges

2014-12-21 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14255534#comment-14255534
 ] 

Jiajia Li commented on HBASE-11144:
---

Hi [~ram_krish], [~anoopsamjohn], [~j...@cloudera.com], if you are interested, 
could you take some time to look at this patch and give some advice? Thanks

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, 
> MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch
>
>
> HBase is quite efficient when scanning a single small row key range. If a
> user needs to specify multiple row key ranges in one scan, the typical
> solutions are: 1. a FilterList, which is a list of row key filters; 2. using
> a SQL layer over HBase, such as Hive or Phoenix, to join two tables.
> However, both solutions are inefficient: neither can use the range info to
> fast-forward during the scan, so the scan is quite time consuming. If the
> number of ranges is quite big (e.g. millions), a join is a proper solution
> though it is slow. However, there are cases where the user wants to specify
> a small number of ranges to scan (e.g. <1000 ranges). Neither solution
> provides satisfactory performance in such a case.
> We provide this filter (MultiRowRangeFilter) to support this use case
> (scanning multiple row key ranges). It constructs the row key ranges from a
> user-specified list and fast-forwards during the scan, so the scan will be
> quite efficient.




[jira] [Commented] (HBASE-11144) Filter to support scan multiple row key ranges

2014-12-21 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14255524#comment-14255524
 ] 

Jiajia Li commented on HBASE-11144:
---

Hi [~tedyu], thanks for your review. I've uploaded the new patch.

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, 
> MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch
>
>
> HBase is quite efficient when scanning a single small row key range. If a
> user needs to specify multiple row key ranges in one scan, the typical
> solutions are: 1. a FilterList, which is a list of row key filters; 2. using
> a SQL layer over HBase, such as Hive or Phoenix, to join two tables.
> However, both solutions are inefficient: neither can use the range info to
> fast-forward during the scan, so the scan is quite time consuming. If the
> number of ranges is quite big (e.g. millions), a join is a proper solution
> though it is slow. However, there are cases where the user wants to specify
> a small number of ranges to scan (e.g. <1000 ranges). Neither solution
> provides satisfactory performance in such a case.
> We provide this filter (MultiRowRangeFilter) to support this use case
> (scanning multiple row key ranges). It constructs the row key ranges from a
> user-specified list and fast-forwards during the scan, so the scan will be
> quite efficient.




[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2014-12-21 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Status: Patch Available  (was: Open)

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, 
> MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch
>
>
> HBase is quite efficient when scanning a single small row key range. If a
> user needs to specify multiple row key ranges in one scan, the typical
> solutions are: 1. a FilterList, which is a list of row key filters; 2. using
> a SQL layer over HBase, such as Hive or Phoenix, to join two tables.
> However, both solutions are inefficient: neither can use the range info to
> fast-forward during the scan, so the scan is quite time consuming. If the
> number of ranges is quite big (e.g. millions), a join is a proper solution
> though it is slow. However, there are cases where the user wants to specify
> a small number of ranges to scan (e.g. <1000 ranges). Neither solution
> provides satisfactory performance in such a case.
> We provide this filter (MultiRowRangeFilter) to support this use case
> (scanning multiple row key ranges). It constructs the row key ranges from a
> user-specified list and fast-forwards during the scan, so the scan will be
> quite efficient.




[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2014-12-21 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Status: Open  (was: Patch Available)

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, 
> MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch
>
>
> HBase is quite efficient when scanning a single small row key range. If a
> user needs to specify multiple row key ranges in one scan, the typical
> solutions are: 1. a FilterList, which is a list of row key filters; 2. using
> a SQL layer over HBase, such as Hive or Phoenix, to join two tables.
> However, both solutions are inefficient: neither can use the range info to
> fast-forward during the scan, so the scan is quite time consuming. If the
> number of ranges is quite big (e.g. millions), a join is a proper solution
> though it is slow. However, there are cases where the user wants to specify
> a small number of ranges to scan (e.g. <1000 ranges). Neither solution
> provides satisfactory performance in such a case.
> We provide this filter (MultiRowRangeFilter) to support this use case
> (scanning multiple row key ranges). It constructs the row key ranges from a
> user-specified list and fast-forwards during the scan, so the scan will be
> quite efficient.




[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2014-12-21 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-11144:
--
Attachment: HBASE_11144_V5.patch

Uploaded HBASE_11144_V5.patch against trunk.

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, HBASE_11144_V5.patch, 
> MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch, 
> MultiRowRangeFilter3.patch
>
>
> HBase is quite efficient when scanning a single small row key range. If a
> user needs to specify multiple row key ranges in one scan, the typical
> solutions are: 1. a FilterList, which is a list of row key filters; 2. using
> a SQL layer over HBase, such as Hive or Phoenix, to join two tables.
> However, both solutions are inefficient: neither can use the range info to
> fast-forward during the scan, so the scan is quite time consuming. If the
> number of ranges is quite big (e.g. millions), a join is a proper solution
> though it is slow. However, there are cases where the user wants to specify
> a small number of ranges to scan (e.g. <1000 ranges). Neither solution
> provides satisfactory performance in such a case.
> We provide this filter (MultiRowRangeFilter) to support this use case
> (scanning multiple row key ranges). It constructs the row key ranges from a
> user-specified list and fast-forwards during the scan, so the scan will be
> quite efficient.




[jira] [Commented] (HBASE-12332) [mob] use filelink instead of retry when resolving an hfilelink.

2014-12-18 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253042#comment-14253042
 ] 

Jiajia Li commented on HBASE-12332:
---

Uploaded HBASE-12332-V2.patch, which catches more exceptions when reading the 
cell. [~j...@cloudera.com]

> [mob] use filelink instead of retry when resolving an hfilelink.
> ---
>
> Key: HBASE-12332
> URL: https://issues.apache.org/jira/browse/HBASE-12332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jonathan Hsieh
> Fix For: hbase-11339
>
> Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch
>
>
> in the snapshot code, hmobstore was modified to traverse an hfile link to a 
> mob.   Ideally this should use the transparent filelink code to read the data.
> Also there will likely be some issues with the mob file cache with these 
> links.




[jira] [Commented] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir

2014-12-18 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253040#comment-14253040
 ] 

Jiajia Li commented on HBASE-12673:
---

Thanks, [~j...@cloudera.com]. I have changed the patch in 
https://issues.apache.org/jira/browse/HBASE-12332 so that it catches 
NullPointerException and AssertionError. Can you take a look at it?

> Add a UT to read mob file when the mob hfile moving from the mob dir to the 
> archive dir
> ---
>
> Key: HBASE-12673
> URL: https://issues.apache.org/jira/browse/HBASE-12673
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Affects Versions: hbase-11339
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Fix For: hbase-11339
>
> Attachments: HBASE-12673.patch
>
>
> Add a unit test that scans the cloned table while the original table is 
> being deleted. The steps are as follows:
> 1) create a table with mobs, 
> 2) snapshot it, 
> 3) clone it as a different table
> 4) have a read workload on the snapshot
> 5) delete the original table




[jira] [Updated] (HBASE-12332) [mob] use filelink instead of retry when resolving an hfilelink.

2014-12-18 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-12332:
--
Attachment: HBASE-12332-V2.patch

> [mob] use filelink instead of retry when resolving an hfilelink.
> ---
>
> Key: HBASE-12332
> URL: https://issues.apache.org/jira/browse/HBASE-12332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jonathan Hsieh
> Fix For: hbase-11339
>
> Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch
>
>
> in the snapshot code, hmobstore was modified to traverse an hfile link to a 
> mob.   Ideally this should use the transparent filelink code to read the data.
> Also there will likely be some issues with the mob file cache with these 
> links.




[jira] [Commented] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir

2014-12-18 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251411#comment-14251411
 ] 

Jiajia Li commented on HBASE-12673:
---

Hi [~j...@cloudera.com], can you look at the comment above and give some advice? 
Thanks

> Add a UT to read mob file when the mob hfile moving from the mob dir to the 
> archive dir
> ---
>
> Key: HBASE-12673
> URL: https://issues.apache.org/jira/browse/HBASE-12673
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Affects Versions: hbase-11339
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Fix For: hbase-11339
>
> Attachments: HBASE-12673.patch
>
>
> Add a unit test that scans the cloned table while the original table is 
> being deleted. The steps are as follows:
> 1) create a table with mobs, 
> 2) snapshot it, 
> 3) clone it as a different table
> 4) have a read workload on the snapshot
> 5) delete the original table




[jira] [Commented] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir

2014-12-17 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249836#comment-14249836
 ] 

Jiajia Li commented on HBASE-12673:
---

Hi [~j...@cloudera.com], I have tested closing a deleted mob file, and it does 
not raise any other exception.
It's hard to use FileLink in the MOB code when reading a cell, because the mob 
reference cell contains only the file name, which does not follow the hfilelink 
pattern. So I think HFileLink can't be used here. Do you have any ideas?
Can we also catch the other exceptions that HFileLink catches? We are now 
trying to reproduce this case in a more fine-grained way to catch more 
exceptions.

> Add a UT to read mob file when the mob hfile moving from the mob dir to the 
> archive dir
> ---
>
> Key: HBASE-12673
> URL: https://issues.apache.org/jira/browse/HBASE-12673
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Affects Versions: hbase-11339
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Fix For: hbase-11339
>
> Attachments: HBASE-12673.patch
>
>
> Add a unit test that scans the cloned table while the original table is 
> being deleted. The steps are as follows:
> 1) create a table with mobs, 
> 2) snapshot it, 
> 3) clone it as a different table
> 4) have a read workload on the snapshot
> 5) delete the original table




[jira] [Commented] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir

2014-12-16 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247995#comment-14247995
 ] 

Jiajia Li commented on HBASE-12673:
---

Hi [~j...@cloudera.com], as you know, HBase reads a mob cell in two steps.
# Read the ref cell from HBase and get the cell value, which is the mob file 
name.
# HBase has two possible locations to read the mob cell from: 
mobWorkingDir/fileName and archiveDir/fileName. When the mob file is not in 
mobWorkingDir, HBase tries the second location. But currently we only retry 
after a FileNotFoundException 
(https://github.com/apache/hbase/blob/hbase-11339/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HMobStore.java#L312).
Do you mean that deleting the original table in the middle of a read operation 
will throw other IOExceptions? 
I don't understand how HFileLink guarantees this case for MOB. Can you please 
give a more detailed description? Thanks
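
The two-location lookup described above can be sketched as follows. This is not the HMobStore code: it uses the local filesystem through java.nio as a stand-in for HDFS, and the class `MobLocationFallbackSketch` and method `readFromFirstAvailable` are hypothetical names. One difference to note is that java.nio reports a missing file as NoSuchFileException, so the sketch treats that the same as FileNotFoundException. As in the behaviour described in the comment, only a "file not found" failure moves on to the next location; any other IOException propagates to the caller.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; a local-filesystem stand-in, not HBase code.
final class MobLocationFallbackSketch {

    /**
     * Tries each candidate directory in order and returns the content of the
     * first copy of fileName that exists. Only "not found" errors trigger the
     * fallback; other IOExceptions are rethrown unchanged.
     */
    static byte[] readFromFirstAvailable(List<Path> candidateDirs, String fileName)
            throws IOException {
        IOException lastMissing = null;
        for (Path dir : candidateDirs) {
            try {
                return Files.readAllBytes(dir.resolve(fileName));
            } catch (NoSuchFileException | FileNotFoundException e) {
                // The file is not in this location; remember why and try the next one.
                lastMissing = e;
            }
        }
        FileNotFoundException notFound =
            new FileNotFoundException(fileName + " not found in " + candidateDirs);
        if (lastMissing != null) {
            notFound.initCause(lastMissing);
        }
        throw notFound;
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins for mobWorkingDir and archiveDir.
        Path workingDir = Files.createTempDirectory("mobWorkingDir");
        Path archiveDir = Files.createTempDirectory("archiveDir");
        // The file exists only in the archive location, as after a table deletion.
        Files.write(archiveDir.resolve("mobfile"), new byte[] {1, 2, 3});

        byte[] value = readFromFirstAvailable(Arrays.asList(workingDir, archiveDir), "mobfile");
        System.out.println("read " + value.length + " bytes via fallback");
    }
}
```

The open question in the comment maps directly onto the catch clause: if a concurrent table deletion surfaces as some other IOException, this structure would not fall back to the archive location, and the set of caught exceptions would have to be widened.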

> Add a UT to read mob file when the mob hfile moving from the mob dir to the 
> archive dir
> ---
>
> Key: HBASE-12673
> URL: https://issues.apache.org/jira/browse/HBASE-12673
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Affects Versions: hbase-11339
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Fix For: hbase-11339
>
> Attachments: HBASE-12673.patch
>
>
> Add a unit test that scans the cloned table while the original table is 
> being deleted. The steps are as follows:
> 1) create a table with mobs, 
> 2) snapshot it, 
> 3) clone it as a different table
> 4) have a read workload on the snapshot
> 5) delete the original table




[jira] [Commented] (HBASE-12332) [mob] use filelink instead of retry when resolving an hfilelink.

2014-12-14 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246357#comment-14246357
 ] 

Jiajia Li commented on HBASE-12332:
---

hi, [~j...@cloudera.com], the case you mentioned is added in 
https://issues.apache.org/jira/browse/HBASE-12673, can you look at it? Thanks~

> [mob] use filelink instead of retry when resolving an hfilelink.
> ---
>
> Key: HBASE-12332
> URL: https://issues.apache.org/jira/browse/HBASE-12332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jonathan Hsieh
> Fix For: hbase-11339
>
> Attachments: HBASE-12332-V1.diff
>
>
> in the snapshot code, hmobstore was modified to traverse an hfile link to a 
> mob.   Ideally this should use the transparent filelink code to read the data.
> Also there will likely be some issues with the mob file cache with these 
> links.




[jira] [Commented] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir

2014-12-14 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246344#comment-14246344
 ] 

Jiajia Li commented on HBASE-12673:
---

hi, [~j...@cloudera.com], can you review this patch?

> Add a UT to read mob file when the mob hfile moving from the mob dir to the 
> archive dir
> ---
>
> Key: HBASE-12673
> URL: https://issues.apache.org/jira/browse/HBASE-12673
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Affects Versions: hbase-11339
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Fix For: hbase-11339
>
> Attachments: HBASE-12673.patch
>
>
> Add a unit test that scans the cloned table while the original table is 
> being deleted. The steps are as follows:
> 1) create a table with mobs, 
> 2) snapshot it, 
> 3) clone it as a different table
> 4) have a read workload on the snapshot
> 5) delete the original table




[jira] [Updated] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir

2014-12-12 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-12673:
--
Description: 
Add a unit test that scans the cloned table while the original table is being 
deleted. The steps are as follows:
1) create a table with mobs, 
2) snapshot it, 
3) clone it as a different table
4) have a read workload on the snapshot
5) delete the original table


  was:
1) create a table with mobs, 
2) snapshot it, 
3) clone/restore it as a different table
4) have a read workload on the snapshot
5) delete the original table


> Add a UT to read mob file when the mob hfile moving from the mob dir to the 
> archive dir
> ---
>
> Key: HBASE-12673
> URL: https://issues.apache.org/jira/browse/HBASE-12673
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Affects Versions: hbase-11339
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Fix For: hbase-11339
>
> Attachments: HBASE-12673.patch
>
>
> Add a unit test that scans the cloned table while the original table is 
> being deleted. The steps are as follows:
> 1) create a table with mobs, 
> 2) snapshot it, 
> 3) clone it as a different table
> 4) have a read workload on the snapshot
> 5) delete the original table




[jira] [Updated] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir

2014-12-11 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-12673:
--
Status: Patch Available  (was: Open)

> Add a UT to read mob file when the mob hfile moving from the mob dir to the 
> archive dir
> ---
>
> Key: HBASE-12673
> URL: https://issues.apache.org/jira/browse/HBASE-12673
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Affects Versions: hbase-11339
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Fix For: hbase-11339
>
> Attachments: HBASE-12673.patch
>
>
> 1) create a table with mobs, 
> 2) snapshot it, 
> 3) clone/restore it as a different table
> 4) have a read workload on the snapshot
> 5) delete the original table




[jira] [Updated] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir

2014-12-11 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-12673:
--
Attachment: HBASE-12673.patch

Uploaded the patch.

> Add a UT to read mob file when the mob hfile moving from the mob dir to the 
> archive dir
> ---
>
> Key: HBASE-12673
> URL: https://issues.apache.org/jira/browse/HBASE-12673
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Affects Versions: hbase-11339
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Fix For: hbase-11339
>
> Attachments: HBASE-12673.patch
>
>
> 1) create a table with mobs, 
> 2) snapshot it, 
> 3) clone/restore it as a different table
> 4) have a read workload on the snapshot
> 5) delete the original table




[jira] [Updated] (HBASE-12540) TestRegionServerMetrics#testMobMetrics test failure

2014-12-10 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-12540:
--
Attachment: (was: HBASE-12540-V3.patch)

> TestRegionServerMetrics#testMobMetrics test failure
> ---
>
> Key: HBASE-12540
> URL: https://issues.apache.org/jira/browse/HBASE-12540
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: hbase-11339
>Reporter: stack
>Assignee: Jingcheng Du
> Fix For: hbase-11339
>
> Attachments: HBASE-12540-V2.patch, HBASE-12540.diff, 
> hbase-12540-v3.patch, log.txt
>
>
> Got this on an internal rig run.  Maybe you want to take a looksee 
> [~jingchengdu]?
> {code}
> Error Message
> Metrics Counters should be equal expected:<5> but was:<2>
> Stacktrace
> java.lang.AssertionError: Metrics Counters should be equal expected:<5> but 
> was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hbase.test.MetricsAssertHelperImpl.assertCounter(MetricsAssertHelperImpl.java:185)
>   at 
> org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics.testMobMetrics(TestRegionServerMetrics.java:448)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at org.junit.runners.Suite.runChild(Suite.java:127)
>   at org.junit.runners.Suite.runChild(Suite.java:26)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> {code}




[jira] [Updated] (HBASE-12540) TestRegionServerMetrics#testMobMetrics test failure

2014-12-10 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-12540:
--
Attachment: HBASE-12540-V3.patch

[~j...@cloudera.com], the V2 patch was missing the fix, thanks.

> TestRegionServerMetrics#testMobMetrics test failure
> ---
>
> Key: HBASE-12540
> URL: https://issues.apache.org/jira/browse/HBASE-12540
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: hbase-11339
>Reporter: stack
>Assignee: Jingcheng Du
> Fix For: hbase-11339
>
> Attachments: HBASE-12540-V2.patch, HBASE-12540-V3.patch, 
> HBASE-12540.diff, hbase-12540-v3.patch, log.txt
>
>
> Got this on an internal rig run.  Maybe you want to take a looksee 
> [~jingchengdu]?
> {code}
> Error Message
> Metrics Counters should be equal expected:<5> but was:<2>
> Stacktrace
> java.lang.AssertionError: Metrics Counters should be equal expected:<5> but 
> was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hbase.test.MetricsAssertHelperImpl.assertCounter(MetricsAssertHelperImpl.java:185)
>   at 
> org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics.testMobMetrics(TestRegionServerMetrics.java:448)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at org.junit.runners.Suite.runChild(Suite.java:127)
>   at org.junit.runners.Suite.runChild(Suite.java:26)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> {code}




[jira] [Commented] (HBASE-12540) TestRegionServerMetrics#testMobMetrics test failure

2014-12-10 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242088#comment-14242088
 ] 

Jiajia Li commented on HBASE-12540:
---

Hi [~jmhsieh], sorry, part of the change was missing from the patch. I will 
upload a new patch soon.

> TestRegionServerMetrics#testMobMetrics test failure
> ---
>
> Key: HBASE-12540
> URL: https://issues.apache.org/jira/browse/HBASE-12540
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: hbase-11339
>Reporter: stack
>Assignee: Jingcheng Du
> Fix For: hbase-11339
>
> Attachments: HBASE-12540-V2.patch, HBASE-12540.diff, log.txt
>
>
> Got this on an internal rig run.  Maybe you want to take a looksee 
> [~jingchengdu]?
> {code}
> Error Message
> Metrics Counters should be equal expected:<5> but was:<2>
> Stacktrace
> java.lang.AssertionError: Metrics Counters should be equal expected:<5> but 
> was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hbase.test.MetricsAssertHelperImpl.assertCounter(MetricsAssertHelperImpl.java:185)
>   at 
> org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics.testMobMetrics(TestRegionServerMetrics.java:448)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at org.junit.runners.Suite.runChild(Suite.java:127)
>   at org.junit.runners.Suite.runChild(Suite.java:26)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> {code}




[jira] [Created] (HBASE-12673) Add a UT to read mob file when the mob hfile moving from the mob dir to the archive dir

2014-12-10 Thread Jiajia Li (JIRA)

Jiajia Li created HBASE-12673:
-

 Summary: Add a UT to read mob file when the mob hfile moving from 
the mob dir to the archive dir
 Key: HBASE-12673
 URL: https://issues.apache.org/jira/browse/HBASE-12673
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Affects Versions: hbase-11339
Reporter: Jiajia Li
Assignee: Jiajia Li
 Fix For: hbase-11339


1) create a table with mobs, 
2) snapshot it, 
3) clone/restore it as a different table
4) have a read workload on the snapshot
5) delete the original table




[jira] [Updated] (HBASE-12540) TestRegionServerMetrics#testMobMetrics test failure

2014-12-10 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-12540:
--
Attachment: HBASE-12540-V2.patch

Thanks Jon, I have uploaded HBASE-12540-V2.patch (it renames the variable 
compactionThreshold to numHfiles).

> TestRegionServerMetrics#testMobMetrics test failure
> ---
>
> Key: HBASE-12540
> URL: https://issues.apache.org/jira/browse/HBASE-12540
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: hbase-11339
>Reporter: stack
>Assignee: Jingcheng Du
> Fix For: hbase-11339
>
> Attachments: HBASE-12540-V2.patch, HBASE-12540.diff, log.txt
>
>
> Got this on an internal rig run.  Maybe you want to take a looksee 
> [~jingchengdu]?
> {code}
> Error Message
> Metrics Counters should be equal expected:<5> but was:<2>
> Stacktrace
> java.lang.AssertionError: Metrics Counters should be equal expected:<5> but 
> was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hbase.test.MetricsAssertHelperImpl.assertCounter(MetricsAssertHelperImpl.java:185)
>   at 
> org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics.testMobMetrics(TestRegionServerMetrics.java:448)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at org.junit.runners.Suite.runChild(Suite.java:127)
>   at org.junit.runners.Suite.runChild(Suite.java:26)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> {code}




[jira] [Commented] (HBASE-12332) [mob] use filelink instad of retry when resolving an hfilelink.

2014-12-09 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240558#comment-14240558
 ] 

Jiajia Li commented on HBASE-12332:
---

hi, [~j...@cloudera.com], do you have any idea?

> [mob] use filelink instad of retry when resolving an hfilelink.
> ---
>
> Key: HBASE-12332
> URL: https://issues.apache.org/jira/browse/HBASE-12332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jonathan Hsieh
> Fix For: hbase-11339
>
> Attachments: HBASE-12332-V1.diff
>
>
> in the snapshot code, hmobstore was modified to traverse an hfile link to a 
> mob.   Ideally this should use the transparent filelink code to read the data.
> Also there will likely be some issues with the mob file cache with these 
> links.




[jira] [Commented] (HBASE-11144) Filter to support scan multiple row key ranges

2014-12-08 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239058#comment-14239058
 ] 

Jiajia Li commented on HBASE-11144:
---

Hi [~saurabh.wl], this patch has not been reviewed by the committers yet, so it 
may not be released in the next version of HBase, but feel free to apply the 
patch in your own deployment.

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, MultiRowRangeFilter.patch, 
> MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch
>
>
> HBase is quite efficient when scanning only one small row key range. If a user 
> needs to specify multiple row key ranges in one scan, the typical solutions 
> are: 1. use a FilterList, which is a list of row key Filters; 2. use a 
> SQL layer over HBase, such as Hive or Phoenix, to join two tables. 
> However, both solutions are inefficient: neither can use the range 
> info to fast-forward during the scan, so the scan is quite time consuming. If 
> the number of ranges is very large (e.g. millions), a join is a proper 
> solution though it is slow. However, there are cases where the user wants to 
> specify a small number of ranges to scan (e.g. <1000 ranges). Neither solution 
> provides satisfactory performance in such cases. 
> We provide this filter (MultiRowRangeFilter) to support this use case 
> (scanning multiple row key ranges). It constructs the row key ranges from a 
> user-specified list and fast-forwards during the scan, so the scan is 
> efficient. 
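
The fast-forwarding idea in the description above can be sketched in plain Java. This is only an illustration under stated assumptions, not the code from the attached patches: RangeSeekSketch, Range and nextSeekHint are hypothetical names, the ranges are assumed to be sorted, non-overlapping, start-inclusive and stop-exclusive, and row keys are compared as unsigned bytes.

```java
import java.util.List;

/** Hypothetical sketch of range-based fast-forwarding; not the patch's code. */
class RangeSeekSketch {

    /** A row-key range: start inclusive, stop exclusive (an assumption). */
    static final class Range {
        final byte[] start;
        final byte[] stop;

        Range(byte[] start, byte[] stop) {
            this.start = start;
            this.stop = stop;
        }
    }

    /** Lexicographic comparison that treats bytes as unsigned values. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /**
     * Decides what a scan should do at the given row.
     * Returns the row itself if it lies inside a range (include it),
     * the start key of the next range if the row falls in a gap (seek there),
     * or null if the row is past every range (nothing left to scan).
     * Assumes ranges are sorted by start key and do not overlap.
     */
    static byte[] nextSeekHint(List<Range> ranges, byte[] row) {
        int lo = 0;
        int hi = ranges.size() - 1;
        // Binary search for the first range whose stop key is greater than row.
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (compare(ranges.get(mid).stop, row) <= 0) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (lo == ranges.size()) {
            return null; // row is past the last range
        }
        Range candidate = ranges.get(lo);
        return compare(row, candidate.start) >= 0 ? row : candidate.start;
    }
}
```

In filter terms, returning the row would roughly correspond to including it, and returning a later start key would correspond to handing the scanner a seek hint so it can skip the gap instead of reading every row in it.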




[jira] [Assigned] (HBASE-11144) Filter to support scan multiple row key ranges

2014-12-08 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li reassigned HBASE-11144:
-

Assignee: Jiajia Li

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HBASE_11144_4.patch, MultiRowRangeFilter.patch, 
> MultiRowRangeFilter2.patch, MultiRowRangeFilter3.patch
>
>
> HBase is quite efficient when scanning only one small row key range. If a user 
> needs to specify multiple row key ranges in one scan, the typical solutions 
> are: 1. use a FilterList, which is a list of row key Filters; 2. use a 
> SQL layer over HBase, such as Hive or Phoenix, to join two tables. 
> However, both solutions are inefficient: neither can use the range 
> info to fast-forward during the scan, so the scan is quite time consuming. If 
> the number of ranges is very large (e.g. millions), a join is a proper 
> solution though it is slow. However, there are cases where the user wants to 
> specify a small number of ranges to scan (e.g. <1000 ranges). Neither solution 
> provides satisfactory performance in such cases. 
> We provide this filter (MultiRowRangeFilter) to support this use case 
> (scanning multiple row key ranges). It constructs the row key ranges from a 
> user-specified list and fast-forwards during the scan, so the scan is 
> efficient. 




[jira] [Commented] (HBASE-12332) [mob] use filelink instad of retry when resolving an hfilelink.

2014-12-08 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238899#comment-14238899
 ] 

Jiajia Li commented on HBASE-12332:
---

Hi [~jmhsieh], 
when reading, we do not open scanners directly on all the existing mob files 
(where a file link would be easy to recognize by matching the name pattern). 
Instead we follow the steps below.
# Read the file name from HBase (this is just a file name, not a file link 
pattern; the cell does not tell us the file link name).
# Read the mob cell from the candidate paths (mobworkingDir/filename, 
mobArchive/filename, srcTableMobWorkingDir/filename, 
srcTableArchive/filename; the latter two are for cloned snapshots).
With this read path, it is not possible to tell from the name whether the 
current mob file in the working directory is a file link, because the name is 
just a mob file name (not a file link pattern).
In the latest patch, the number of possible read paths has been reduced from 
4 to 2 by comparing the source table tag for the cloned snapshot. This means 
that reading from a cloned snapshot is as fast as reading normal mob cells.
Please advise. Thanks.
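
The candidate-path lookup described above can be sketched roughly as follows. This is an illustration only, not the code in HBASE-12332-V1.diff: MobPathSketch, candidatePaths and resolve are hypothetical names, directories are plain strings, and file existence is supplied by a caller-provided predicate rather than a real file system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of trying candidate locations for a mob file name. */
class MobPathSketch {

    /** Builds the ordered candidate locations: one per directory, in the given order. */
    static List<String> candidatePaths(String fileName, String... dirs) {
        List<String> paths = new ArrayList<>();
        for (String dir : dirs) {
            paths.add(dir + "/" + fileName);
        }
        return paths;
    }

    /** Returns the first candidate that exists, or null if none does. */
    static String resolve(List<String> candidates, Predicate<String> exists) {
        for (String path : candidates) {
            if (exists.test(path)) {
                return path;
            }
        }
        return null;
    }
}
```

With four directories every miss costs up to four existence checks; passing only the two directories that apply to a given cell (which the comment says is decided by comparing the source table tag) halves that worst case.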

> [mob] use filelink instad of retry when resolving an hfilelink.
> ---
>
> Key: HBASE-12332
> URL: https://issues.apache.org/jira/browse/HBASE-12332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jonathan Hsieh
> Fix For: hbase-11339
>
> Attachments: HBASE-12332-V1.diff
>
>
> in the snapshot code, hmobstore was modified to traverse an hfile link to a 
> mob.   Ideally this should use the transparent filelink code to read the data.
> Also there will likely be some issues with the mob file cache with these 
> links.




[jira] [Commented] (HBASE-12332) [mob] use filelink instad of retry when resolving an hfilelink.

2014-12-01 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231069#comment-14231069
 ] 

Jiajia Li commented on HBASE-12332:
---

Hi [~jmhsieh], can you give some advice on this patch?

> [mob] use filelink instad of retry when resolving an hfilelink.
> ---
>
> Key: HBASE-12332
> URL: https://issues.apache.org/jira/browse/HBASE-12332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jonathan Hsieh
> Fix For: hbase-11339
>
> Attachments: HBASE-12332-V1.diff
>
>
> in the snapshot code, hmobstore was modified to traverse an hfile link to a 
> mob.   Ideally this should use the transparent filelink code to read the data.
> Also there will likely be some issues with the mob file cache with these 
> links.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HBASE-12591) Ignore the count of mob compaction metrics when there is issue

2014-12-01 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-12591:
--
Attachment: HBASE-12591-V2.patch

Re-uploaded the patch, this time generated without the --no-prefix option.

> Ignore the count of mob compaction metrics when there is issue
> --
>
> Key: HBASE-12591
> URL: https://issues.apache.org/jira/browse/HBASE-12591
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: hbase-11339
>Reporter: Jiajia Li
>Assignee: Jiajia Li
>Priority: Minor
> Fix For: hbase-11339
>
> Attachments: HBASE-12591-V2.patch, HBASE-12591.patch
>
>
> In mob compaction, the metrics "mobCompactedFromMobCellsCount" and 
> "mobCompactedFromMobCellsSize" should not be counted when an error occurs 
> while retrieving the mob cell from the mob file.




[jira] [Commented] (HBASE-12591) Ignore the count of mob compaction metrics when there is issue

2014-11-30 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229320#comment-14229320
 ] 

Jiajia Li commented on HBASE-12591:
---

hi, [~anoopsamjohn], is this patch ok?

> Ignore the count of mob compaction metrics when there is issue
> --
>
> Key: HBASE-12591
> URL: https://issues.apache.org/jira/browse/HBASE-12591
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: hbase-11339
>Reporter: Jiajia Li
>Assignee: Jiajia Li
>Priority: Minor
> Fix For: hbase-11339
>
> Attachments: HBASE-12591.patch
>
>
> In mob compaction, the metrics "mobCompactedFromMobCellsCount" and 
> "mobCompactedFromMobCellsSize" should not be counted when an error occurs 
> while retrieving the mob cell from the mob file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HBASE-12591) Ignore the count of mob compaction metrics when there is issue

2014-11-26 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HBASE-12591:
--
Attachment: HBASE-12591.patch

Uploaded the patch.

> Ignore the count of mob compaction metrics when there is issue
> --
>
> Key: HBASE-12591
> URL: https://issues.apache.org/jira/browse/HBASE-12591
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: hbase-11339
>Reporter: Jiajia Li
>Assignee: Jiajia Li
>Priority: Minor
> Fix For: hbase-11339
>
> Attachments: HBASE-12591.patch
>
>
> In mob compaction, the metrics "mobCompactedFromMobCellsCount" and 
> "mobCompactedFromMobCellsSize" should not be counted when an error occurs 
> while retrieving the mob cell from the mob file.




[jira] [Created] (HBASE-12591) Ignore the count of mob compaction metrics when there is issue

2014-11-26 Thread Jiajia Li (JIRA)

Jiajia Li created HBASE-12591:
-

 Summary: Ignore the count of mob compaction metrics when there is 
issue
 Key: HBASE-12591
 URL: https://issues.apache.org/jira/browse/HBASE-12591
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver
Affects Versions: hbase-11339
Reporter: Jiajia Li
Assignee: Jiajia Li
Priority: Minor
 Fix For: hbase-11339


In mob compaction, the metrics "mobCompactedFromMobCellsCount" and 
"mobCompactedFromMobCellsSize" should not be counted when an error occurs 
while retrieving the mob cell from the mob file.
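
A minimal sketch of the intended behavior, assuming a simplified model rather than the real compaction code: the two field names mirror the metric names above, while MobCompactionMetricsSketch, MobCellReader and compactOne are hypothetical names introduced only for illustration.

```java
import java.io.IOException;

/** Hypothetical sketch: update the metrics only when the mob cell was retrieved. */
class MobCompactionMetricsSketch {

    /** Hypothetical stand-in for reading a mob cell's value from a mob file. */
    interface MobCellReader {
        byte[] read(String fileName) throws IOException;
    }

    long mobCompactedFromMobCellsCount;
    long mobCompactedFromMobCellsSize;

    /**
     * Tries to retrieve one mob cell and updates both metrics only if the
     * read succeeded. Returns the value, or null if retrieval failed.
     */
    byte[] compactOne(MobCellReader reader, String fileName) {
        byte[] value;
        try {
            value = reader.read(fileName);
        } catch (IOException e) {
            // Retrieval failed: leave both counters untouched.
            return null;
        }
        if (value == null) {
            // Nothing was retrieved, so there is nothing to count.
            return null;
        }
        mobCompactedFromMobCellsCount++;
        mobCompactedFromMobCellsSize += value.length;
        return value;
    }
}
```

The point of the sketch is the ordering: the counters are touched only after the read has returned a value, so a failed retrieval cannot inflate either metric.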




[jira] [Commented] (HBASE-12546) Validate schema options that require server side class availability

2014-11-25 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225610#comment-14225610
 ] 

Jiajia Li commented on HBASE-12546:
---

Hi Andy ([~apurtell]), I found the JIRA 
https://issues.apache.org/jira/browse/HBASE-12573 (Backport HBASE-10591 Sanity 
check table configuration in createTable). Is this what you want to do?

> Validate schema options that require server side class availability
> ---
>
> Key: HBASE-12546
> URL: https://issues.apache.org/jira/browse/HBASE-12546
> Project: HBase
>  Issue Type: Bug
>Reporter: Andrew Purtell
>Assignee: Jiajia Li
> Fix For: 2.0.0, 0.98.9, 0.99.2
>
>
> When processing table create and modification requests we should check the 
> supplied schema options for settings that require mentioned classes to be 
> available from the regionserver classpath (split policies, etc.). If we can't 
> find the class on the classpath when processing the admin request RPC, fail 
> the operation immediately and return an exception rather than allow problems 
> later, such as aborts. 
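
The fail-fast check described above could look roughly like this in plain Java. It is a sketch under assumptions, not HBase's implementation: SchemaClassCheckSketch and requireClassAvailable are hypothetical names, and the option name passed in is only used for the error message.

```java
/** Hypothetical sketch of validating that a configured class can be loaded. */
class SchemaClassCheckSketch {

    /**
     * Throws IllegalArgumentException if the class named by a schema option
     * cannot be found on the classpath, so the request fails immediately
     * instead of causing problems later.
     */
    static void requireClassAvailable(String optionName, String className) {
        try {
            // initialize=false: only check that the class can be found;
            // do not run its static initializers.
            Class.forName(className, false, SchemaClassCheckSketch.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException(
                "Class " + className + " for option " + optionName
                    + " is not available on the classpath", e);
        }
    }
}
```

Running such a check while the admin request is being processed turns a missing class into an exception returned to the caller, which is the behavior the description asks for.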




[jira] [Commented] (HBASE-12546) Validate schema options that require server side class availability

2014-11-25 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224264#comment-14224264
 ] 

Jiajia Li commented on HBASE-12546:
---

Hi [~apurtell], I found that in HBase trunk 
(https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java#L1169)
 there is an hcd check, for example a check of the region split policy 
(https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java#L1225),
 but in the 0.98 branch the check is not present 
(https://github.com/apache/hbase/blob/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java#L1747).
 So does the sanityCheckTableDescriptor() function already do the schema option 
validation? Please give me some advice, thanks.

> Validate schema options that require server side class availability
> ---
>
> Key: HBASE-12546
> URL: https://issues.apache.org/jira/browse/HBASE-12546
> Project: HBase
>  Issue Type: Bug
>Reporter: Andrew Purtell
>Assignee: Jiajia Li
> Fix For: 2.0.0, 0.98.9, 0.99.2
>
>
> When processing table create and modification requests we should check the 
> supplied schema options for settings that require mentioned classes to be 
> available from the regionserver classpath (split policies, etc.). If we can't 
> find the class on the classpath when processing the admin request RPC, fail 
> the operation immediately and return an exception rather than allow problems 
> later, such as aborts. 




[jira] [Assigned] (HBASE-12546) Validate schema options that require server side class availability

2014-11-24 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li reassigned HBASE-12546:
-

Assignee: Jiajia Li

> Validate schema options that require server side class availability
> ---
>
> Key: HBASE-12546
> URL: https://issues.apache.org/jira/browse/HBASE-12546
> Project: HBase
>  Issue Type: Bug
>Reporter: Andrew Purtell
>Assignee: Jiajia Li
> Fix For: 2.0.0, 0.98.9, 0.99.2
>
>
> When processing table create and modification requests we should check the 
> supplied schema options for settings that require mentioned classes to be 
> available from the regionserver classpath (split policies, etc.). If we can't 
> find the class on the classpath when processing the admin request RPC, fail 
> the operation immediately and return an exception rather than allow problems 
> later, such as aborts. 




[jira] [Commented] (HBASE-12543) Incorrect log info in the store compaction of mob

2014-11-20 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220409#comment-14220409
 ] 

Jiajia Li commented on HBASE-12543:
---

hi, [~anoopsamjohn], [~jmhsieh] can you look at this patch? thanks~

> Incorrect log info in the store compaction of mob
> -
>
> Key: HBASE-12543
> URL: https://issues.apache.org/jira/browse/HBASE-12543
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: hbase-11339
>Reporter: Jiajia Li
>Assignee: Jiajia Li
>Priority: Minor
> Fix For: hbase-11339
>
> Attachments: HBASE-12543.diff
>
>
> Incorrect log info in the store compaction of mob



