[jira] [Commented] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1

2016-12-13 Thread Yu Sun (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15747155#comment-15747155 ]

Yu Sun commented on HBASE-17138:


[~tedyu] As you have listed, the backport needs a lot of effort, and I think it 
will take weeks to complete. I am afraid I don't have enough time to do this 
myself, which would delay the backport. So if anyone else is interested in 
doing the backport, I would be glad to provide any help needed.

> Backport read-path offheap (HBASE-11425) to branch-1
> 
>
> Key: HBASE-17138
> URL: https://issues.apache.org/jira/browse/HBASE-17138
> Project: HBase
>  Issue Type: Improvement
>Reporter: Yu Li
>Assignee: Yu Sun
> Attachments: 
> 0001-fix-EHB-511-Resolve-client-compatibility-issue-introduced-by-offheap-change.patch,
>  0001-to-EHB-446-offheap-hfile-format-should-keep-compatible-v3.patch, 
> 0001-to-EHB-456-Cell-should-be-compatible-with-branch-1.1.2.patch
>
>
> From the 
> [thread|http://mail-archives.apache.org/mod_mbox/hbase-user/201611.mbox/%3CCAM7-19%2Bn7cEiY4H9iLQ3N9V0NXppOPduZwk-hhgNLEaJfiV3kA%40mail.gmail.com%3E]
>  sharing our experience and performance data of read-path offheap usage in 
> Alibaba search, we can see people are positive about having HBASE-11425 in 
> branch-1, so I'd like to create a JIRA and move the discussion and decision 
> making here.
> Echoing some comments from the mail thread:
> Bryan:
> Is the backported patch available anywhere? If it ends up not getting 
> officially backported to branch-1 due to 2.0 around the corner, some of us 
> who build our own deploy may want to integrate into our builds
> Andrew:
> Yes, please, the patches will be useful to the community even if we decide 
> not to backport into an official 1.x release.
> Enis:
> I don't see any reason why we cannot backport to branch-1.
> Ted:
> Opening a JIRA would be fine. This makes it easier for people to obtain the 
> patch(es)
> Nick:
> From the DISCUSS thread re: EOL of 1.1, it seems we'll continue to
> support 1.x releases for some time... I would guess these will be
> maintained until 2.2 at least. Therefore, offheap patches that have seen
> production exposure seem like a reasonable candidate for backport, perhaps in 
> a 1.4 or 1.5 release timeframe.
> Anoop:
> Because of some compatibility issues, we decide that this will be done in 2.0 
> only..  Ya as Andy said, it would be great to share the 1.x backported 
> patches.
> The following are all the JIRA ids we have backported:
> HBASE-10930 Change Filters and GetClosestRowBeforeTracker to work with Cells 
> (Ram)
> HBASE-13373 Squash HFileReaderV3 together with HFileReaderV2 and 
> AbstractHFileReader; ditto for Scanners and BlockReader, etc.
> HBASE-13429 Remove deprecated seek/reseek methods from HFileScanner.
> HBASE-13450 - Purge RawBytescomparator from the writers and readers for 
> HBASE-10800 (Ram)
> HBASE-13501 - Deprecate/Remove getComparator() in HRegionInfo.
> HBASE-12048 Remove deprecated APIs from Filter.
> HBASE-10800 - Use CellComparator instead of KVComparator (Ram)
> HBASE-13679 Change ColumnTracker and SQM to deal with Cell instead of byte[], 
> int, int.
> HBASE-13642 Deprecate RegionObserver#postScannerFilterRow CP hook with 
> byte[],int,int args in favor of taking Cell arg.
> HBASE-13641 Deperecate Filter#filterRowKey(byte[] buffer, int offset, int 
> length) in favor of filterRowKey(Cell firstRowCell).
> HBASE-13827 Delayed scanner close in KeyValueHeap and StoreScanner.
> HBASE-13871 Change RegionScannerImpl to deal with Cell instead of byte[], 
> int, int.
> HBASE-11911 Break up tests into more fine grained categories (Alex Newman)
> HBASE-12059 Create hbase-annotations module
> HBASE-12106 Move test annotations to test artifact (Enis Soztutar)
> HBASE-13916 Create MultiByteBuffer an aggregation of ByteBuffers.
> HBASE-15679 Assertion on wrong variable in 
> TestReplicationThrottler#testThrottling
> HBASE-13931 Move Unsafe based operations to UnsafeAccess.
> HBASE-12345 Unsafe based ByteBuffer Comparator.
> HBASE-13998 Remove CellComparator#compareRows(byte[] left, int loffset, int 
> llength, byte[] right, int roffset, int rlength).
> HBASE-13998 Remove CellComparator#compareRows()- Addendum to fix javadoc warn
> HBASE-13579 Avoid isCellTTLExpired() for NO-TAG cases (partially backport 
> this patch)
> HBASE-13448 New Cell implementation with cached component offsets/lengths.
> HBASE-13387 Add ByteBufferedCell an extension to Cell.
> HBASE-13387 Add ByteBufferedCell an extension to Cell - addendum.
> HBASE-12650 Move ServerName to hbase-common module (partially backport this 
> patch)
> HBASE-12296 Filters should work with ByteBufferedCell.
> HBASE-14120 ByteBufferUtils#compareTo small optimization.
> HBASE-13510 - Purge ByteBloomFilter (Ram)
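
For context on what "read-path offheap" means operationally: the feature keeps the read-path block cache in off-heap memory via the BucketCache. A minimal hbase-site.xml sketch follows; the values are illustrative and not taken from this thread:

{code}
<!-- Hedged example: enable an off-heap BucketCache for the read path.
     The 4096 MB size is illustrative; the region server JVM must also
     be granted at least that much direct memory, e.g. via
     -XX:MaxDirectMemorySize or HBASE_OFFHEAPSIZE in hbase-env.sh. -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>4096</value>
</property>
{code}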

[jira] [Commented] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1

2016-12-08 Thread Yu Sun (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731493#comment-15731493 ]

Yu Sun commented on HBASE-17138:


Yes, I think so. Almost all of the patches need to be changed due to merge 
conflicts, so this will take some time if we decide to backport.


[jira] [Commented] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1

2016-12-08 Thread Yu Sun (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731485#comment-15731485 ]

Yu Sun commented on HBASE-17138:


done


[jira] [Updated] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1

2016-12-07 Thread Yu Sun (JIRA)

 [ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Sun updated HBASE-17138:
---
Attachment: 
0001-fix-EHB-511-Resolve-client-compatibility-issue-introduced-by-offheap-change.patch

0001-to-EHB-456-Cell-should-be-compatible-with-branch-1.1.2.patch

0001-to-EHB-446-offheap-hfile-format-should-keep-compatible-v3.patch


[jira] [Commented] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1

2016-12-07 Thread Yu Sun (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731381#comment-15731381 ]

Yu Sun commented on HBASE-17138:


Yes, we have three patches to resolve these issues, not listed above:
1. We still use the Cell.getFamily(), Cell.getQualifier() and Cell.getRow() 
APIs in our existing code; this issue was mainly introduced by HBASE-14047 
(see the sketch after this comment).
2. We need to keep the offheap HFile format compatible with the old 
branch-1.1.2 HFile format (without HBASE-16189) for fallback purposes, so we 
changed the offheap HFile format to match our branch-1.1.2 and adjusted some 
code during the backport to handle this.
3. Issues introduced by HBASE-12084, HBASE-13641 and HBASE-12048.

So I think we should at least remove HBASE-14047, HBASE-12084, HBASE-13641 and 
HBASE-12048 from the backport list.
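
For the first item, a minimal sketch, assuming only the public Cell/CellUtil API, of how such call sites can be adapted; CellAccessExample is a hypothetical illustration, not part of the attached patches:

{code}
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;

// Hypothetical helper showing the mechanical rewrite from the old
// Cell getters (removed by HBASE-14047) to the CellUtil copy helpers.
public final class CellAccessExample {
  private CellAccessExample() {}

  static byte[] rowOf(Cell cell) {
    return CellUtil.cloneRow(cell);        // was: cell.getRow()
  }

  static byte[] familyOf(Cell cell) {
    return CellUtil.cloneFamily(cell);     // was: cell.getFamily()
  }

  static byte[] qualifierOf(Cell cell) {
    return CellUtil.cloneQualifier(cell);  // was: cell.getQualifier()
  }
}
{code}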


[jira] [Updated] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1

2016-12-06 Thread Yu Sun (JIRA)

 [ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Sun updated HBASE-17138:
---
Description: 
From the 
[thread|http://mail-archives.apache.org/mod_mbox/hbase-user/201611.mbox/%3CCAM7-19%2Bn7cEiY4H9iLQ3N9V0NXppOPduZwk-hhgNLEaJfiV3kA%40mail.gmail.com%3E]
 sharing our experience and performance data of read-path offheap usage in 
Alibaba search, we can see people are positive about having HBASE-11425 in 
branch-1, so I'd like to create a JIRA and move the discussion and decision 
making here.

Echoing some comments from the mail thread:

Bryan:
Is the backported patch available anywhere? If it ends up not getting 
officially backported to branch-1 due to 2.0 around the corner, some of us who 
build our own deploy may want to integrate into our builds

Andrew:
Yes, please, the patches will be useful to the community even if we decide not 
to backport into an official 1.x release.

Enis:
I don't see any reason why we cannot backport to branch-1.

Ted:
Opening a JIRA would be fine. This makes it easier for people to obtain the 
patch(es)

Nick:
From the DISCUSS thread re: EOL of 1.1, it seems we'll continue to
support 1.x releases for some time... I would guess these will be
maintained until 2.2 at least. Therefore, offheap patches that have seen
production exposure seem like a reasonable candidate for backport, perhaps in a 
1.4 or 1.5 release timeframe.

Anoop:
Because of some compatibility issues, we decide that this will be done in 2.0 
only..  Ya as Andy said, it would be great to share the 1.x backported patches.



The following are all the JIRA ids we have backported:
HBASE-10930 Change Filters and GetClosestRowBeforeTracker to work with Cells 
(Ram)
HBASE-13373 Squash HFileReaderV3 together with HFileReaderV2 and 
AbstractHFileReader; ditto for Scanners and BlockReader, etc.
HBASE-13429 Remove deprecated seek/reseek methods from HFileScanner.
HBASE-13450 - Purge RawBytescomparator from the writers and readers for 
HBASE-10800 (Ram)
HBASE-13501 - Deprecate/Remove getComparator() in HRegionInfo.
HBASE-12048 Remove deprecated APIs from Filter.
HBASE-10800 - Use CellComparator instead of KVComparator (Ram)
HBASE-13679 Change ColumnTracker and SQM to deal with Cell instead of byte[], 
int, int.
HBASE-13642 Deprecate RegionObserver#postScannerFilterRow CP hook with 
byte[],int,int args in favor of taking Cell arg.
HBASE-13641 Deperecate Filter#filterRowKey(byte[] buffer, int offset, int 
length) in favor of filterRowKey(Cell firstRowCell).
HBASE-13827 Delayed scanner close in KeyValueHeap and StoreScanner.
HBASE-13871 Change RegionScannerImpl to deal with Cell instead of byte[], int, 
int.

HBASE-11911 Break up tests into more fine grained categories (Alex Newman)
HBASE-12059 Create hbase-annotations module
HBASE-12106 Move test annotations to test artifact (Enis Soztutar)

HBASE-13916 Create MultiByteBuffer an aggregation of ByteBuffers.
HBASE-15679 Assertion on wrong variable in 
TestReplicationThrottler#testThrottling
HBASE-13931 Move Unsafe based operations to UnsafeAccess.
HBASE-12345 Unsafe based ByteBuffer Comparator.
HBASE-13998 Remove CellComparator#compareRows(byte[] left, int loffset, int 
llength, byte[] right, int roffset, int rlength).
HBASE-13998 Remove CellComparator#compareRows()- Addendum to fix javadoc warn
HBASE-13579 Avoid isCellTTLExpired() for NO-TAG cases (partially backport this 
patch)
HBASE-13448 New Cell implementation with cached component offsets/lengths.
HBASE-13387 Add ByteBufferedCell an extension to Cell.
HBASE-13387 Add ByteBufferedCell an extension to Cell - addendum.
HBASE-12650 Move ServerName to hbase-common module (partially backport this 
patch)
HBASE-12296 Filters should work with ByteBufferedCell.
HBASE-14120 ByteBufferUtils#compareTo small optimization.
HBASE-13510 - Purge ByteBloomFilter (Ram)
HBASE-13451 - Make the HFileBlockIndex blockKeys to Cells so that it could be 
easy to use in the CellComparators (Ram)
HBASE-13614 - Avoid temp KeyOnlyKeyValue temp objects creations in read hot 
path (Ram)
HBASE-13939 - Make HFileReaderImpl.getFirstKeyInBlock() to return a Cell (Ram)
HBASE-13307 Making methods under ScannerV2#next inlineable, faster
HBASE-14020 Unsafe based optimized write in ByteBufferOutputStream.
HBASE-13977 - Convert getKey and related APIs to Cell (Ram)
HBASE-11927 Use Native Hadoop Library for HFile checksum. (Apekshit)
HBASE-12213 HFileBlock backed by Array of ByteBuffers (Ram)
HBASE-12084 Remove deprecated APIs from Result.
HBASE-12084 Remove deprecated APIs from Result - shell addendum
HBASE-13754 Allow non KeyValue Cell types also to oswrite.
HBASE-14047 - Cleanup deprecated APIs from Cell class (Ashish Singhi)
HBASE-13817 ByteBufferOuputStream - add writeInt support.
HBASE-12374 Change DBEs to work with new BB based cell.
HBASE-14116 Change ByteBuff.getXXXStrictlyForward to relative position based 
reads
HBASE-14073 TestRemoteTable

[jira] [Commented] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1

2016-12-06 Thread Yu Sun (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15725028#comment-15725028 ]

Yu Sun commented on HBASE-17138:


Done. You and [~anoop.hbase], please see the Description.



[jira] [Commented] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1

2016-12-05 Thread Yu Sun (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15721697#comment-15721697 ]

Yu Sun commented on HBASE-17138:


Yes, [~anoop.hbase] is right: there are some building-block JIRAs not under 
HBASE-11425. To resolve the merge conflicts I backported about 77 patches in 
total to our customized branch. Should I list all the JIRA ids I have 
backported as sub-tasks here for you and [~ram_krish] to check?



[jira] [Updated] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-10 Thread Yu Sun (JIRA)

 [ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Sun updated HBASE-17020:
---
Attachment: HBASE-17020-branch-0.98.patch

Submitted patch for 0.98.

> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-17020-branch-0.98.patch, HBASE-17020-v1.patch, 
> HBASE-17020-v2.patch, HBASE-17020-v2.patch, HBASE-17020-v3-branch1.1.patch
>
>
> In CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry + 2)) - keyRelOffset;
> {code}
> The value the local variable keyLen gets here is actually the total entry 
> length, i.e. SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length, as the writer 
> code shows:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
>     long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length;
> {code}
> When the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}
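
A minimal sketch of the corrected computation, assuming SECONDARY_INDEX_ENTRY_OVERHEAD = Bytes.SIZEOF_INT + Bytes.SIZEOF_LONG as defined in HFileBlockIndex; this mirrors the intent of the attached patches rather than quoting them:

{code}
// Consecutive secondary-index offset marks differ by the full entry
// size (fixed overhead + key bytes), so the overhead must be
// subtracted to recover the bare key length. The original code
// skipped the subtraction, overrunning the buffer on the last entry.
int entrySize = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry + 2))
    - keyRelOffset;
int keyLen = entrySize - SECONDARY_INDEX_ENTRY_OVERHEAD;
{code}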





[jira] [Updated] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-10 Thread Yu Sun (JIRA)

 [ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Sun updated HBASE-17020:
---
Attachment: HBASE-17020-v3-branch1.1.patch

Attached patch for branch-1.0, 1.1, 1.2 and 1.3.



[jira] [Issue Comment Deleted] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-10 Thread Yu Sun (JIRA)

 [ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Sun updated HBASE-17020:
---
Comment: was deleted

(was: ok, i will prepare patch for other branches)



[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-10 Thread Yu Sun (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15653621#comment-15653621 ]

Yu Sun commented on HBASE-17020:


OK, I will prepare patches for the other branches.



[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-10 Thread Yu Sun (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15653620#comment-15653620 ]

Yu Sun commented on HBASE-17020:


OK, I will prepare patches for the other branches.

> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-17020-v1.patch, HBASE-17020-v2.patch, 
> HBASE-17020-v2.patch
>
>
> in CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
> (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry 
> + 2)) - keyRelOffset;
> {code}
> the local varible keyLen get this should be total length of: 
> SECONDARY_INDEX_ENTRY_OVERHEAD  + firstKey.length;
> the code is:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
> long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
>   + firstKey.length;
> {code}
> when the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region 
> pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at 
> org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at 
> org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-08 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15647552#comment-15647552
 ] 

Yu Sun commented on HBASE-17020:


Thanks, Ram.

> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-17020-v1.patch, HBASE-17020-v2.patch
>
>
> in CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
> (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry 
> + 2)) - keyRelOffset;
> {code}
> the value the local variable keyLen gets here is actually the total length 
> SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length, not just firstKey.length;
> the code that builds each entry is:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
> long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
>   + firstKey.length;
> {code}
> when the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region 
> pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at 
> org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at 
> org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-08 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15647145#comment-15647145
 ] 

Yu Sun edited comment on HBASE-17020 at 11/8/16 10:15 AM:
--

Attached version v2; added a UT to reproduce the ArrayIndexOutOfBoundsException.


was (Author: haoran):
attach version v2, add a ut to reproduct ArrayIndexOutOfBoundsException

> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-17020-v1.patch, HBASE-17020-v2.patch
>
>
> in CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
> (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry 
> + 2)) - keyRelOffset;
> {code}
> the value the local variable keyLen gets here is actually the total length 
> SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length, not just firstKey.length;
> the code that builds each entry is:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
> long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
>   + firstKey.length;
> {code}
> when the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region 
> pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at 
> org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at 
> org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-08 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-17020:
---
Attachment: HBASE-17020-v2.patch

Attached version v2; added a UT to reproduce the ArrayIndexOutOfBoundsException.

> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-17020-v1.patch, HBASE-17020-v2.patch
>
>
> in CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
> (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry 
> + 2)) - keyRelOffset;
> {code}
> the value the local variable keyLen gets here is actually the total length 
> SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length, not just firstKey.length;
> the code that builds each entry is:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
> long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
>   + firstKey.length;
> {code}
> when the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region 
> pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at 
> org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at 
> org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-04 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15636854#comment-15636854
 ] 

Yu Sun commented on HBASE-17020:


No, old versions also have the same issue; I think this bug has existed for several years.

> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-17020-v1.patch
>
>
> in CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
> (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry 
> + 2)) - keyRelOffset;
> {code}
> the value the local variable keyLen gets here is actually the total length 
> SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length, not just firstKey.length;
> the code that builds each entry is:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
> long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
>   + firstKey.length;
> {code}
> when the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region 
> pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at 
> org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at 
> org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-04 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-17020:
---
Comment: was deleted

(was: Yes, we encountered the exception in our cluster. I have tried a UT to 
reproduce this exception, but found it hard: although in 
CellBasedKeyBlockIndexReader.midkey()
{code}
byte[] bytes = b.toBytes(keyOffset, keyLen);
{code}
keyOffset + keyLen exceeds b.limit() or b.capacity() (by 
SECONDARY_INDEX_ENTRY_OVERHEAD == 12 bytes), it is still smaller than 
b.array().length, because HBase always reads the next block header, which is 
33 bytes in size, into the same buffer.
So in ByteBufferUtils.copyFromBufferToArray
{code}
  public static void copyFromBufferToArray(byte[] out, ByteBuffer in,
      int sourceOffset, int destinationOffset, int length) {
    if (in.hasArray()) {
      System.arraycopy(in.array(), sourceOffset + in.arrayOffset(), out,
          destinationOffset, length);
    } else if (UNSAFE_AVAIL) {
      UnsafeAccess.copy(in, sourceOffset, out, destinationOffset, length);
{code}
only in.array().length effectively bounds the copy, so in most cases the 
ArrayIndexOutOfBoundsException does not occur.

I will try to improve the UT this weekend.
)

> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-17020-v1.patch
>
>
> in CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
> (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry 
> + 2)) - keyRelOffset;
> {code}
> the value the local variable keyLen gets here is actually the total length 
> SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length, not just firstKey.length;
> the code that builds each entry is:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
> long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
>   + firstKey.length;
> {code}
> when the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region 
> pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at 
> org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at 
> org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-04 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15636784#comment-15636784
 ] 

Yu Sun commented on HBASE-17020:


Yes, we encountered the exception in our cluster. I have tried a UT to 
reproduce this exception, but found it hard: although in 
CellBasedKeyBlockIndexReader.midkey()
{code}
byte[] bytes = b.toBytes(keyOffset, keyLen);
{code}
keyOffset + keyLen exceeds b.limit() or b.capacity() (by 
SECONDARY_INDEX_ENTRY_OVERHEAD == 12 bytes), it is still smaller than 
b.array().length, because HBase always reads the next block header, which is 
33 bytes in size, into the same buffer.
So in ByteBufferUtils.copyFromBufferToArray
{code}
  public static void copyFromBufferToArray(byte[] out, ByteBuffer in,
      int sourceOffset, int destinationOffset, int length) {
    if (in.hasArray()) {
      System.arraycopy(in.array(), sourceOffset + in.arrayOffset(), out,
          destinationOffset, length);
    } else if (UNSAFE_AVAIL) {
      UnsafeAccess.copy(in, sourceOffset, out, destinationOffset, length);
{code}
only in.array().length effectively bounds the copy, so in most cases the 
ArrayIndexOutOfBoundsException does not occur.

I will try to improve the UT this weekend.
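
To illustrate why the over-read is usually masked, here is a standalone 
sketch (plain Java, not HBase code; the 33-byte tail stands in for the next 
block header that HBase reads into the same backing array):
{code}
import java.nio.ByteBuffer;

public class MaskedOverReadDemo {
  public static void main(String[] args) {
    // 64 bytes of index payload plus a 33-byte "next block header"
    // in one backing array, mimicking the HBase read pattern.
    byte[] backing = new byte[64 + 33];
    ByteBuffer buf = ByteBuffer.wrap(backing, 0, 64).slice();

    int keyOffset = 56;
    int keyLen = 20; // keyOffset + keyLen = 76 > limit() == 64

    byte[] out = new byte[keyLen];
    // Copies straight from the backing array: no exception, although the
    // read runs 12 bytes past the buffer's limit and capacity, because
    // index 76 is still within backing.length == 97.
    System.arraycopy(buf.array(), buf.arrayOffset() + keyOffset, out, 0, keyLen);
    System.out.println("copied " + out.length + " bytes past limit " + buf.limit());
  }
}
{code}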


> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-17020-v1.patch
>
>
> in CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
> (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry 
> + 2)) - keyRelOffset;
> {code}
> the value the local variable keyLen gets here is actually the total length 
> SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length, not just firstKey.length;
> the code that builds each entry is:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
> long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
>   + firstKey.length;
> {code}
> when the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region 
> pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at 
> org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at 
> org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-04 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15636786#comment-15636786
 ] 

Yu Sun commented on HBASE-17020:


Yes, we encountered the exception in our cluster. I have tried a UT to 
reproduce this exception, but found it hard: although in 
CellBasedKeyBlockIndexReader.midkey()
{code}
byte[] bytes = b.toBytes(keyOffset, keyLen);
{code}
keyOffset + keyLen exceeds b.limit() or b.capacity() (by 
SECONDARY_INDEX_ENTRY_OVERHEAD == 12 bytes), it is still smaller than 
b.array().length, because HBase always reads the next block header, which is 
33 bytes in size, into the same buffer.
So in ByteBufferUtils.copyFromBufferToArray
{code}
  public static void copyFromBufferToArray(byte[] out, ByteBuffer in,
      int sourceOffset, int destinationOffset, int length) {
    if (in.hasArray()) {
      System.arraycopy(in.array(), sourceOffset + in.arrayOffset(), out,
          destinationOffset, length);
    } else if (UNSAFE_AVAIL) {
      UnsafeAccess.copy(in, sourceOffset, out, destinationOffset, length);
{code}
only in.array().length effectively bounds the copy, so in most cases the 
ArrayIndexOutOfBoundsException does not occur.

I will try to improve the UT this weekend.


> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-17020-v1.patch
>
>
> in CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
> (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry 
> + 2)) - keyRelOffset;
> {code}
> the value the local variable keyLen gets here is actually the total length 
> SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length, not just firstKey.length;
> the code that builds each entry is:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
> long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
>   + firstKey.length;
> {code}
> when the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region 
> pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at 
> org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at 
> org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-04 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-17020:
---
Attachment: HBASE-17020-v1.patch

Patch version v1 attached; is any UT required?

> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-17020-v1.patch
>
>
> in CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
> (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry 
> + 2)) - keyRelOffset;
> {code}
> the value the local variable keyLen gets here is actually the total length 
> SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length, not just firstKey.length;
> the code that builds each entry is:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
> long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
>   + firstKey.length;
> {code}
> when the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region 
> pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at 
> org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at 
> org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17020) keylen in midkey() dont computed correctly

2016-11-04 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-17020:
---
Summary: keylen in midkey() dont computed correctly  (was: keylen in 
midkey() should be computed correctly)

> keylen in midkey() dont computed correctly
> --
>
> Key: HBASE-17020
> URL: https://issues.apache.org/jira/browse/HBASE-17020
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Reporter: Yu Sun
>Assignee: Yu Sun
>
> in CellBasedKeyBlockIndexReader.midkey():
> {code}
>   ByteBuff b = midLeafBlock.getBufferWithoutHeader();
>   int numDataBlocks = b.getIntAfterPosition(0);
>   int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
> (midKeyEntry + 1));
>   int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry 
> + 2)) - keyRelOffset;
> {code}
> the value the local variable keyLen gets here is actually the total length 
> SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length, not just firstKey.length;
> the code that builds each entry is:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
> long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
>   + firstKey.length;
> {code}
> when the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] 
> regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region 
> pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at 
> org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at 
> org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at 
> org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17020) keylen in midkey() should be computed correctly

2016-11-04 Thread Yu Sun (JIRA)
Yu Sun created HBASE-17020:
--

 Summary: keylen in midkey() should be computed correctly
 Key: HBASE-17020
 URL: https://issues.apache.org/jira/browse/HBASE-17020
 Project: HBase
  Issue Type: Bug
  Components: HFile
Reporter: Yu Sun
Assignee: Yu Sun


in CellBasedKeyBlockIndexReader.midkey():
{code}
  ByteBuff b = midLeafBlock.getBufferWithoutHeader();
  int numDataBlocks = b.getIntAfterPosition(0);
  int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * 
(midKeyEntry + 1));
  int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry + 
2)) - keyRelOffset;
{code}
the value the local variable keyLen gets here is actually the total length 
SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length, not just firstKey.length;

the code that builds each entry is:
{code}
void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
long curTotalNumSubEntries) {
  // Record the offset for the secondary index
  secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
  curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD
  + firstKey.length;
{code}

when the midkey is the last entry of a leaf-level index block, this may throw:
{quote}
2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] regionserver.MemStoreFlusher: 
Cache flusher failed for entry [flush region 
pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
java.lang.ArrayIndexOutOfBoundsException
at 
org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
at 
org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
at 
org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
at 
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
at 
org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
at 
org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
at 
org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
at 
org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
at 
org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
at 
org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
at 
org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
at java.lang.Thread.run(Thread.java:756)
{quote}
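
A minimal sketch of the direction a fix could take (illustrative only, not 
necessarily the committed patch): since consecutive secondary-index offsets 
differ by SECONDARY_INDEX_ENTRY_OVERHEAD plus the key length, the per-entry 
overhead has to be subtracted back out of the offset difference:
{code}
// Same names as the snippet above; only the keyLen computation changes.
int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry + 1));
int nextRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry + 2));
// Drop the per-entry overhead so keyLen covers only the key bytes.
int keyLen = nextRelOffset - keyRelOffset - SECONDARY_INDEX_ENTRY_OVERHEAD;
{code}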





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16704) Scan will broken while work with KeyValueCodecWithTags

2016-09-24 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16704:
---
Assignee: Anoop Sam John

> Scan will broken while work with KeyValueCodecWithTags
> --
>
> Key: HBASE-16704
> URL: https://issues.apache.org/jira/browse/HBASE-16704
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Yu Sun
>Assignee: Anoop Sam John
>
> Scan will always break if we set LIMIT to more than 1 with the regionserver 
> hbase.client.rpc.codec set to 
> org.apache.hadoop.hbase.codec.KeyValueCodecWithTags.
> How to reproduce:
> 1. 1 master + 1 RS, with the codec set to KeyValueCodecWithTags.
> 2. Create a table table_1024B_30g with 1 CF and only 1 qualifier, then load 
> some data with YCSB.
> 3. scan 'table_1024B_30g', {LIMIT => 2, STARTROW => 'user5499'}; STARTROW can 
> be set to any valid start row.
> 4. The scan fails.
> This should be a bug in KeyValueCodecWithTags; after some investigation, I 
> found that some keys are not serialized correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16704) Scan will broken while work with KeyValueCodecWithTags

2016-09-24 Thread Yu Sun (JIRA)
Yu Sun created HBASE-16704:
--

 Summary: Scan will broken while work with KeyValueCodecWithTags
 Key: HBASE-16704
 URL: https://issues.apache.org/jira/browse/HBASE-16704
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Yu Sun


Scan will always break if we set LIMIT to more than 1 with the regionserver 
hbase.client.rpc.codec set to 
org.apache.hadoop.hbase.codec.KeyValueCodecWithTags.

How to reproduce:
1. 1 master + 1 RS, with the codec set to KeyValueCodecWithTags.
2. Create a table table_1024B_30g with 1 CF and only 1 qualifier, then load 
some data with YCSB.
3. scan 'table_1024B_30g', {LIMIT => 2, STARTROW => 'user5499'}; STARTROW can 
be set to any valid start row.
4. The scan fails.

This should be a bug in KeyValueCodecWithTags; after some investigation, I 
found that some keys are not serialized correctly.
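
For reproducing from a Java client instead of the shell, a sketch along these 
lines should hit the same failure (table and start row as above; 
hbase.client.rpc.codec is the client-side codec key):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class CodecScanRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Same codec the report configures on the cluster.
    conf.set("hbase.client.rpc.codec",
        "org.apache.hadoop.hbase.codec.KeyValueCodecWithTags");
    try (Connection conn = ConnectionFactory.createConnection(conf);
        Table table = conn.getTable(TableName.valueOf("table_1024B_30g"))) {
      Scan scan = new Scan().withStartRow(Bytes.toBytes("user5499"));
      try (ResultScanner scanner = table.getScanner(scan)) {
        // LIMIT => 2 in the shell: reading the second row is where it breaks.
        for (int i = 0; i < 2; i++) {
          Result r = scanner.next();
          System.out.println(r == null ? "(end)" : Bytes.toString(r.getRow()));
        }
      }
    }
  }
}
{code}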



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16609) Fake cells EmptyByteBufferedCell created in read path not implementing SettableSequenceId

2016-09-11 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16609:
---
Description: 
I backported the 2.0 offheap read path to hbase-1.1.2, and when testing I 
encountered a problem similar to HBASE-15379. Here is the stack trace:
{noformat}
java.io.IOException: java.lang.UnsupportedOperationException: Cell is not of 
type org.apache.hadoop.hbase.SettableSequenceId
at org.apache.hadoop.hbase.CellUtil.setSequenceId(CellUtil.java:915)
at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.setCurrentCell(StoreFileScanner.java:203)
at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.requestSeek(StoreFileScanner.java:338)
at 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
at 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
at 
org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:821)
at 
org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:809)
at 
org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:636)
at 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:153)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5611)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5750)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5551)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5528)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5515)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2125)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2068)
at 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32201
)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:790)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:102)
at 
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
{noformat}
This occurs in the read path when offheap is used, mostly because the 
ByteBuffer-backed Cells don't implement the SettableSequenceId interface.


  was:
I backported the 2.0 offheap read path to hbase-1.1.2, and when testing I 
encountered a problem similar to HBASE-14099. Here is the stack trace:
{noformat}
java.io.IOException: java.lang.UnsupportedOperationException: Cell is not of 
type org.apache.hadoop.hbase.SettableSequenceId
at org.apache.hadoop.hbase.CellUtil.setSequenceId(CellUtil.java:915)
at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.setCurrentCell(StoreFileScanner.java:203)
at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.requestSeek(StoreFileScanner.java:338)
at 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
at 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
at 
org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:821)
at 
org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:809)
at 
org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:636)
at 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:153)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5611)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5750)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5551)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5528)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5515)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2125)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2068)
at 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32201
)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:790)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:102)
at 
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
{noformat}
This occurs in the read path when offheap is used, mostly due to 
ByteBuffer-backed Cells

[jira] [Commented] (HBASE-16609) Fake cells EmptyByteBufferedCell created in read path not implementing SettableSequenceId

2016-09-11 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483138#comment-15483138
 ] 

Yu Sun commented on HBASE-16609:


Sorry, I made a mistake; it should be this JIRA: HBASE-15379.

> Fake cells EmptyByteBufferedCell  created in read path not implementing 
> SettableSequenceId 
> ---
>
> Key: HBASE-16609
> URL: https://issues.apache.org/jira/browse/HBASE-16609
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Yu Sun
>Assignee: Yu Sun
> Fix For: 2.0.0
>
> Attachments: HBASE-16609-v1.patch
>
>
> I backported the 2.0 offheap read path to hbase-1.1.2, and when testing I 
> encountered a problem similar to HBASE-14099. Here is the stack trace:
> {noformat}
> java.io.IOException: java.lang.UnsupportedOperationException: Cell is not of 
> type org.apache.hadoop.hbase.SettableSequenceId
> at org.apache.hadoop.hbase.CellUtil.setSequenceId(CellUtil.java:915)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.setCurrentCell(StoreFileScanner.java:203)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.requestSeek(StoreFileScanner.java:338)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:821)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:809)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:636)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:153)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5611)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5750)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5551)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5528)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5515)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2125)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2068)
> at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32201
> )
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:790)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:102)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> {noformat}
> This occurs in the read path when offheap is used, mostly because the 
> ByteBuffer-backed Cells don't implement the SettableSequenceId interface.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16609) Fake cells EmptyByteBufferedCell created in read path not implementing SettableSequenceId

2016-09-10 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16609:
---
Attachment: HBASE-16609-v1.patch

Attached patch v1.

> Fake cells EmptyByteBufferedCell  created in read path not implementing 
> SettableSequenceId 
> ---
>
> Key: HBASE-16609
> URL: https://issues.apache.org/jira/browse/HBASE-16609
> Project: HBase
>  Issue Type: Bug
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16609-v1.patch
>
>
> I backported the 2.0 offheap read path to hbase-1.1.2, and when testing I 
> encountered a problem similar to HBASE-14099. Here is the stack trace:
> {noformat}
> java.io.IOException: java.lang.UnsupportedOperationException: Cell is not of 
> type org.apache.hadoop.hbase.SettableSequenceId
> at org.apache.hadoop.hbase.CellUtil.setSequenceId(CellUtil.java:915)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.setCurrentCell(StoreFileScanner.java:203)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.requestSeek(StoreFileScanner.java:338)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:821)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:809)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:636)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:153)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5611)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5750)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5551)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5528)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5515)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2125)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2068)
> at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32201
> )
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:790)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:102)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> {noformat}
> This occurs in the read path when offheap is used, mostly because the 
> ByteBuffer-backed Cells don't implement the SettableSequenceId interface.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16609) Fake cells EmptyByteBufferedCell created in read path not implementing SettableSequenceId

2016-09-10 Thread Yu Sun (JIRA)
Yu Sun created HBASE-16609:
--

 Summary: Fake cells EmptyByteBufferedCell  created in read path 
not implementing SettableSequenceId 
 Key: HBASE-16609
 URL: https://issues.apache.org/jira/browse/HBASE-16609
 Project: HBase
  Issue Type: Bug
Reporter: Yu Sun
Assignee: Yu Sun


I backported the 2.0 offheap read path to hbase-1.1.2, and when testing I 
encountered a problem similar to HBASE-14099. Here is the stack trace:
{noformat}
java.io.IOException: java.lang.UnsupportedOperationException: Cell is not of 
type org.apache.hadoop.hbase.SettableSequenceId
at org.apache.hadoop.hbase.CellUtil.setSequenceId(CellUtil.java:915)
at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.setCurrentCell(StoreFileScanner.java:203)
at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.requestSeek(StoreFileScanner.java:338)
at 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
at 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
at 
org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:821)
at 
org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:809)
at 
org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:636)
at 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:153)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5611)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5750)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5551)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5528)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5515)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2125)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2068)
at 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32201
)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:790)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:102)
at 
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
{noformat}
This occurs in the read path when offheap is used, mostly because the 
ByteBuffer-backed Cells don't implement the SettableSequenceId interface.
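
A standalone sketch of the failure mode (plain Java, no HBase classes; the 
interface name mirrors SettableSequenceId for illustration only):
{code}
public class SeqIdDemo {
  // Stand-in for org.apache.hadoop.hbase.SettableSequenceId.
  interface SettableSeqId {
    void setSequenceId(long seqId);
  }

  static class PlainCell {} // like a fake cell missing the interface

  static class SettableCell extends PlainCell implements SettableSeqId {
    long seqId;
    public void setSequenceId(long seqId) { this.seqId = seqId; }
  }

  // Mirrors what CellUtil.setSequenceId() effectively does.
  static void setSequenceId(PlainCell cell, long seqId) {
    if (!(cell instanceof SettableSeqId)) {
      // Fake cells without the interface land here, as in the stack trace.
      throw new UnsupportedOperationException("Cell is not of a settable type");
    }
    ((SettableSeqId) cell).setSequenceId(seqId);
  }

  public static void main(String[] args) {
    setSequenceId(new SettableCell(), 42L); // fine
    setSequenceId(new PlainCell(), 42L);    // throws, like the report
  }
}
{code}
The fix direction is accordingly to have the fake ByteBuffer-backed cells 
implement SettableSequenceId as well.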




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-08-04 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407761#comment-15407761
 ] 

Yu Sun commented on HBASE-16287:


ok

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch, 
> HBASE-16287-v3.patch, HBASE-16287-v4.patch, HBASE-16287-v5.patch, 
> HBASE-16287-v6.patch, HBASE-16287-v7.patch, HBASE-16287-v8.patch, 
> HBASE-16287-v9.patch
>
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC 
> We only use the block cache, and set hfile.block.cache.size = 0.3 in 
> hbase-site.xml, so under this configuration the LRU block cache size will 
> be (32g-1g)*0.3 = 9.3g. But in some scenarios some of the RSes run into 
> continuous full GC for hours and, most importantly, after the full GC most 
> of the objects in the old generation are not collected. So we dumped the 
> heap, analysed it with MAT, and observed an obvious memory leak in 
> LruBlockCache, occupying about 16g of memory. We then set the LruBlockCache 
> class log level to TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> we can see the block cache size has exceeded acceptableSize by far too much, 
> which makes the full GC problem even worse. 
> After some investigation, I found that in this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean 
> inMemory,
>   final boolean cacheDataInL1) {
> {code}
> no matter how full the block cache already is, the block is simply put into 
> it; if the eviction thread is not fast enough, the block cache size will 
> increase significantly.
> So I think we should have a check here: for example, if the block cache size 
> exceeds 1.2 * acceptableSize(), just return and don't put the block in until 
> the block cache size is under the watermark. If this is reasonable, I can 
> make a small patch for this.
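
A sketch of the proposed guard (names follow the report: size as the cache's 
current-size counter and acceptableSize() as the soft limit; the 1.2 factor is 
the reporter's example, not a committed value):
{code}
  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
      final boolean cacheDataInL1) {
    // Proposed check: once the cache has overshot its acceptable size by
    // 20%, reject new blocks until eviction brings it back under the
    // watermark, instead of letting the cache grow without bound.
    if (size.get() > 1.2f * acceptableSize()) {
      return;
    }
    // ... existing caching logic unchanged ...
  }
{code}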



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-08-04 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407762#comment-15407762
 ] 

Yu Sun commented on HBASE-16287:


ok

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch, 
> HBASE-16287-v3.patch, HBASE-16287-v4.patch, HBASE-16287-v5.patch, 
> HBASE-16287-v6.patch, HBASE-16287-v7.patch, HBASE-16287-v8.patch, 
> HBASE-16287-v9.patch
>
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC 
> We only use the block cache, and set hfile.block.cache.size = 0.3 in 
> hbase-site.xml, so under this configuration the LRU block cache size will 
> be (32g-1g)*0.3 = 9.3g. But in some scenarios some of the RSes run into 
> continuous full GC for hours and, most importantly, after the full GC most 
> of the objects in the old generation are not collected. So we dumped the 
> heap, analysed it with MAT, and observed an obvious memory leak in 
> LruBlockCache, occupying about 16g of memory. We then set the LruBlockCache 
> class log level to TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> we can see blockcache size has exceeded acceptableSize too many, which will 
> cause the FullGC more seriously. 
> Afterfter some investigations, I found in this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean 
> inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter the blockcache size has been used, just put the block into it. but 
> if the evict thread is not fast enough, blockcache size will increament 
> significantly.
> So here I think we should have a check, for example, if the blockcache size > 
> 1.2 * acceptableSize(), just return and dont put into it until the blockcache 
> size if under watrmark. if this is reasonable, I can make a small patch for 
> this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-08-04 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407598#comment-15407598
 ] 

Yu Sun commented on HBASE-16287:


Thanks for the comments; v9 has fixed this.

BTW, I'd like to fix the evictionInProgress problem if [~chenheng] and 
[~anoop.hbase] don't have time (smile)(smile)

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch, 
> HBASE-16287-v3.patch, HBASE-16287-v4.patch, HBASE-16287-v5.patch, 
> HBASE-16287-v6.patch, HBASE-16287-v7.patch, HBASE-16287-v8.patch, 
> HBASE-16287-v9.patch
>
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will be
> (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run
> into continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. So we dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache,
> which occupied about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GCs even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter how full the block cache already is, the block is simply put into
> it; if the eviction thread is not fast enough, the block cache size grows
> significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the block cache size is under the watermark. If this is reasonable, I can
> make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-08-04 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16287:
---
Attachment: HBASE-16287-v9.patch

Resubmit v9 for a QA run. The main change:
 calculate acceptableSize just once

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch, 
> HBASE-16287-v3.patch, HBASE-16287-v4.patch, HBASE-16287-v5.patch, 
> HBASE-16287-v6.patch, HBASE-16287-v7.patch, HBASE-16287-v8.patch, 
> HBASE-16287-v9.patch
>
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will be
> (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run
> into continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. So we dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache,
> which occupied about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GCs even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter how full the block cache already is, the block is simply put into
> it; if the eviction thread is not fast enough, the block cache size grows
> significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the block cache size is under the watermark. If this is reasonable, I can
> make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-08-03 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16287:
---
Attachment: HBASE-16287-v8.patch

v8 is based on v6 plus [~yuzhih...@gmail.com]'s advice.

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch, 
> HBASE-16287-v3.patch, HBASE-16287-v4.patch, HBASE-16287-v5.patch, 
> HBASE-16287-v6.patch, HBASE-16287-v7.patch, HBASE-16287-v8.patch
>
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will be
> (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run
> into continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. So we dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache,
> which occupied about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GCs even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter how full the block cache already is, the block is simply put into
> it; if the eviction thread is not fast enough, the block cache size grows
> significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the block cache size is under the watermark. If this is reasonable, I can
> make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-08-03 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407101#comment-15407101
 ] 

Yu Sun commented on HBASE-16287:


[~chenheng] Thanks for the reminder; yes, you are right, maxSize may be changed 
at runtime.

I will resubmit another version based on v6 with Ted Yu's advice, is that OK?

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch, 
> HBASE-16287-v3.patch, HBASE-16287-v4.patch, HBASE-16287-v5.patch, 
> HBASE-16287-v6.patch, HBASE-16287-v7.patch
>
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will be
> (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run
> into continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. So we dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache,
> which occupied about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GCs even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter how full the block cache already is, the block is simply put into
> it; if the eviction thread is not fast enough, the block cache size grows
> significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the block cache size is under the watermark. If this is reasonable, I can
> make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-08-03 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16287:
---
Attachment: HBASE-16287-v7.patch

Resubmit version v7 for QA.
v7 contains the following two changes (a minimal sketch follows the list):
1. calculate hardLimitSize in advance
2. call stats.failInsert() when a put into the cache fails
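
For illustration, a minimal, self-contained sketch of the hard-limit guard 
these two changes implement (an assumption-laden sketch, not the committed 
patch: the class and method names below are invented, and only the 1.2f hard 
limit factor and stats.failInsert() come from this issue):
{code}
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: mirrors the guard added around LruBlockCache.cacheBlock(),
// rejecting inserts once the cache passes the hard capacity limit.
class HardCapacityGuard {
  private final AtomicLong currentSize = new AtomicLong();
  // Computed once up front (change 1 above) instead of per call:
  // hardCapacityLimitFactor (e.g. 1.2f) * acceptableFactor * maxSize.
  private final long hardLimitSize;

  HardCapacityGuard(long maxSize, float acceptableFactor, float hardCapacityLimitFactor) {
    this.hardLimitSize = (long) (hardCapacityLimitFactor * acceptableFactor * maxSize);
  }

  /** Returns true if a block of blockSize bytes may be cached right now. */
  boolean tryReserve(long blockSize) {
    if (currentSize.get() + blockSize > hardLimitSize) {
      return false; // caller records this via stats.failInsert() (change 2 above)
    }
    currentSize.addAndGet(blockSize);
    return true;
  }
}
{code}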

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch, 
> HBASE-16287-v3.patch, HBASE-16287-v4.patch, HBASE-16287-v5.patch, 
> HBASE-16287-v6.patch, HBASE-16287-v7.patch
>
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will be
> (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run
> into continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. So we dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache,
> which occupied about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GCs even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter how full the block cache already is, the block is simply put into
> it; if the eviction thread is not fast enough, the block cache size grows
> significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the block cache size is under the watermark. If this is reasonable, I can
> make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-08-03 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406969#comment-15406969
 ] 

Yu Sun commented on HBASE-16287:


Thanks for [~yuzhih...@gmail.com]'s comments; yes, I will update the patch to 
fix this.

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch, 
> HBASE-16287-v3.patch, HBASE-16287-v4.patch, HBASE-16287-v5.patch, 
> HBASE-16287-v6.patch
>
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will be
> (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run
> into continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. So we dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache,
> which occupied about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GCs even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter how full the block cache already is, the block is simply put into
> it; if the eviction thread is not fast enough, the block cache size grows
> significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the block cache size is under the watermark. If this is reasonable, I can
> make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-08-03 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406103#comment-15406103
 ] 

Yu Sun commented on HBASE-16287:


Yes, I have tried this patch on some regionservers of our real cluster, and it looks good.

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch, 
> HBASE-16287-v3.patch, HBASE-16287-v4.patch, HBASE-16287-v5.patch, 
> HBASE-16287-v6.patch
>
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will be
> (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run
> into continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. So we dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache,
> which occupied about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GCs even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter how full the block cache already is, the block is simply put into
> it; if the eviction thread is not fast enough, the block cache size grows
> significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the block cache size is under the watermark. If this is reasonable, I can
> make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-08-03 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16287:
---
Attachment: HBASE-16287-v6.patch

retry

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch, 
> HBASE-16287-v3.patch, HBASE-16287-v4.patch, HBASE-16287-v5.patch, 
> HBASE-16287-v6.patch
>
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will be
> (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run
> into continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. So we dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache,
> which occupied about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GCs even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter how full the block cache already is, the block is simply put into
> it; if the eviction thread is not fast enough, the block cache size grows
> significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the block cache size is under the watermark. If this is reasonable, I can
> make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-08-02 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16287:
---
Attachment: HBASE-16287-v5.patch

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch, 
> HBASE-16287-v3.patch, HBASE-16287-v4.patch, HBASE-16287-v5.patch
>
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will be
> (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run
> into continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. So we dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache,
> which occupied about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GCs even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter how full the block cache already is, the block is simply put into
> it; if the eviction thread is not fast enough, the block cache size grows
> significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the block cache size is under the watermark. If this is reasonable, I can
> make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-07-31 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16287:
---
Attachment: HBASE-16287-v4.patch

The failed UT seems to have no relation to the changes I made, and I also can't 
reproduce the UT failure on my local machine, so I resubmit the patch for a 
further check.

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch, 
> HBASE-16287-v3.patch, HBASE-16287-v4.patch
>
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will be
> (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run
> into continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. So we dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache,
> which occupied about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GCs even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter how full the block cache already is, the block is simply put into
> it; if the eviction thread is not fast enough, the block cache size grows
> significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the block cache size is under the watermark. If this is reasonable, I can
> make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-07-29 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16287:
---
Attachment: HBASE-16287-v3.patch

Submit patch v3, rebased to master.

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch, 
> HBASE-16287-v3.patch
>
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will be
> (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run
> into continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. So we dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache,
> which occupied about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GCs even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter how full the block cache already is, the block is simply put into
> it; if the eviction thread is not fast enough, the block cache size grows
> significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the block cache size is under the watermark. If this is reasonable, I can
> make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16300) LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly

2016-07-28 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398770#comment-15398770
 ] 

Yu Sun commented on HBASE-16300:


thanks [~stack] and [~carp84]

> LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size 
> correctly
> 
>
> Key: HBASE-16300
> URL: https://issues.apache.org/jira/browse/HBASE-16300
> Project: HBase
>  Issue Type: Bug
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16300-v1.patch
>
>
> In current master {{LruBlockCache}}, CACHE_FIXED_OVERHEAD is calculated as
> follows:
> {code}
>   public final static long CACHE_FIXED_OVERHEAD = ClassSize.align(
>   (3 * Bytes.SIZEOF_LONG) + (10 * ClassSize.REFERENCE) +
>   (5 * Bytes.SIZEOF_FLOAT) + (2 * Bytes.SIZEOF_BOOLEAN)
>   + ClassSize.OBJECT);
> {code}
> After some investigation, I think there is something wrong here: in {{class
> LruBlockCache}}, excluding static variables (which belong to the class), there
> are 4 long variables (maxBlockSize, maxSize, blockSize and overhead), 9
> reference variables and 2 boolean variables, so the code above does not
> calculate the LruBlockCache instance size correctly.
> The current related UT does not fail, mostly because the result is 8-byte aligned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-07-28 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398623#comment-15398623
 ] 

Yu Sun commented on HBASE-16287:


Will attach patch v3 after {{HBASE-16300}} is committed; otherwise TestHeapSize 
will fail when a new field of type float is added to LruBlockCache.

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch
>
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will be
> (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run
> into continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. So we dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache,
> which occupied about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GCs even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter how full the block cache already is, the block is simply put into
> it; if the eviction thread is not fast enough, the block cache size grows
> significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the block cache size is under the watermark. If this is reasonable, I can
> make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16300) LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly

2016-07-28 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398566#comment-15398566
 ] 

Yu Sun commented on HBASE-16300:


This small bug was likely introduced by {{HBASE-14793}}, which added a new long 
field {{private final long maxBlockSize;}} to LruBlockCache, but in the update:
{code}
   public final static long CACHE_FIXED_OVERHEAD = ClassSize.align(
-  (3 * Bytes.SIZEOF_LONG) + (9 * ClassSize.REFERENCE) +
-  (5 * Bytes.SIZEOF_FLOAT) + Bytes.SIZEOF_BOOLEAN
+  (3 * Bytes.SIZEOF_LONG) + (10 * ClassSize.REFERENCE) +
+  (5 * Bytes.SIZEOF_FLOAT) + (2 * Bytes.SIZEOF_BOOLEAN)
   + ClassSize.OBJECT);
{code}

the new long field was counted as a reference.
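
To see why the miscount goes unnoticed, here is the arithmetic under assumed 
sizes (ClassSize.REFERENCE = 8 on 64-bit, long = 8, float = 4, boolean = 1, 
object header = 16; the class below is just a throwaway check, not HBase code):
{code}
public class OverheadCheck {
  // 8-byte alignment, like ClassSize.align
  static long align(long n) { return (n + 7) & ~7L; }

  public static void main(String[] args) {
    long wrong   = align(3 * 8 + 10 * 8 + 5 * 4 + 2 * 1 + 16); // 3 longs, 10 refs
    long correct = align(4 * 8 +  9 * 8 + 5 * 4 + 2 * 1 + 16); // 4 longs, 9 refs
    System.out.println(wrong + " vs " + correct); // 144 vs 144
  }
}
{code}
With 8-byte references, a long miscounted as a reference contributes the same 8 
bytes, so both formulas agree and TestHeapSize cannot catch the error.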

> LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size 
> correctly
> 
>
> Key: HBASE-16300
> URL: https://issues.apache.org/jira/browse/HBASE-16300
> Project: HBase
>  Issue Type: Bug
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16300-v1.patch
>
>
> In current master {{LruBlockCache}}, CACHE_FIXED_OVERHEAD is calculated as
> follows:
> {code}
>   public final static long CACHE_FIXED_OVERHEAD = ClassSize.align(
>   (3 * Bytes.SIZEOF_LONG) + (10 * ClassSize.REFERENCE) +
>   (5 * Bytes.SIZEOF_FLOAT) + (2 * Bytes.SIZEOF_BOOLEAN)
>   + ClassSize.OBJECT);
> {code}
> After some investigation, I think there is something wrong here: in {{class
> LruBlockCache}}, excluding static variables (which belong to the class), there
> are 4 long variables (maxBlockSize, maxSize, blockSize and overhead), 9
> reference variables and 2 boolean variables, so the code above does not
> calculate the LruBlockCache instance size correctly.
> The current related UT does not fail, mostly because the result is 8-byte aligned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16300) LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly

2016-07-28 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398529#comment-15398529
 ] 

Yu Sun commented on HBASE-16300:


"org.apache.hadoop.hbase.io.TestHeapSize.testSizes" will cover this change,so 
no need new ut I think.

> LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size 
> correctly
> 
>
> Key: HBASE-16300
> URL: https://issues.apache.org/jira/browse/HBASE-16300
> Project: HBase
>  Issue Type: Bug
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16300-v1.patch
>
>
> In current master {{LruBlockCache}}, CACHE_FIXED_OVERHEAD is calculated as
> follows:
> {code}
>   public final static long CACHE_FIXED_OVERHEAD = ClassSize.align(
>   (3 * Bytes.SIZEOF_LONG) + (10 * ClassSize.REFERENCE) +
>   (5 * Bytes.SIZEOF_FLOAT) + (2 * Bytes.SIZEOF_BOOLEAN)
>   + ClassSize.OBJECT);
> {code}
> After some investigation, I think there is something wrong here: in {{class
> LruBlockCache}}, excluding static variables (which belong to the class), there
> are 4 long variables (maxBlockSize, maxSize, blockSize and overhead), 9
> reference variables and 2 boolean variables, so the code above does not
> calculate the LruBlockCache instance size correctly.
> The current related UT does not fail, mostly because the result is 8-byte aligned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16300) LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly

2016-07-28 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398031#comment-15398031
 ] 

Yu Sun commented on HBASE-16300:


After some investigation: the current UTs, such as 
{{org.apache.hadoop.hbase.io.TestHeapSize}}, pass mostly due to alignment.

> LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size 
> correctly
> 
>
> Key: HBASE-16300
> URL: https://issues.apache.org/jira/browse/HBASE-16300
> Project: HBase
>  Issue Type: Bug
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16300-v1.patch
>
>
> In current master {{LruBlockCache}}, CACHE_FIXED_OVERHEAD is calculated as
> follows:
> {code}
>   public final static long CACHE_FIXED_OVERHEAD = ClassSize.align(
>   (3 * Bytes.SIZEOF_LONG) + (10 * ClassSize.REFERENCE) +
>   (5 * Bytes.SIZEOF_FLOAT) + (2 * Bytes.SIZEOF_BOOLEAN)
>   + ClassSize.OBJECT);
> {code}
> After some investigation, I think there is something wrong here: in {{class
> LruBlockCache}}, excluding static variables (which belong to the class), there
> are 4 long variables (maxBlockSize, maxSize, blockSize and overhead), 9
> reference variables and 2 boolean variables, so the code above does not
> calculate the LruBlockCache instance size correctly.
> The current related UT does not fail, mostly because the result is 8-byte aligned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16300) LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly

2016-07-28 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16300:
---
Attachment: HBASE-16300-v1.patch

attach patch v1

> LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size 
> correctly
> 
>
> Key: HBASE-16300
> URL: https://issues.apache.org/jira/browse/HBASE-16300
> Project: HBase
>  Issue Type: Bug
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16300-v1.patch
>
>
> In current master {{LruBlockCache}}, CACHE_FIXED_OVERHEAD is calculated as
> follows:
> {code}
>   public final static long CACHE_FIXED_OVERHEAD = ClassSize.align(
>   (3 * Bytes.SIZEOF_LONG) + (10 * ClassSize.REFERENCE) +
>   (5 * Bytes.SIZEOF_FLOAT) + (2 * Bytes.SIZEOF_BOOLEAN)
>   + ClassSize.OBJECT);
> {code}
> After some investigation, I think there is something wrong here: in {{class
> LruBlockCache}}, excluding static variables (which belong to the class), there
> are 4 long variables (maxBlockSize, maxSize, blockSize and overhead), 9
> reference variables and 2 boolean variables, so the code above does not
> calculate the LruBlockCache instance size correctly.
> The current related UT does not fail, mostly because the result is 8-byte aligned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16300) LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly

2016-07-28 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16300:
---
Description: 
In current master {{LruBlockCache}}, CACHE_FIXED_OVERHEAD is calculated as 
follows:
{code}
  public final static long CACHE_FIXED_OVERHEAD = ClassSize.align(
  (3 * Bytes.SIZEOF_LONG) + (10 * ClassSize.REFERENCE) +
  (5 * Bytes.SIZEOF_FLOAT) + (2 * Bytes.SIZEOF_BOOLEAN)
  + ClassSize.OBJECT);
{code}

After some investigation, I think there is something wrong here: in {{class 
LruBlockCache}}, excluding static variables (which belong to the class), there 
are 4 long variables (maxBlockSize, maxSize, blockSize and overhead), 9 
reference variables and 2 boolean variables, so the code above does not 
calculate the LruBlockCache instance size correctly.

The current related UT does not fail, mostly because the result is 8-byte aligned.

  was:
In current master {{LruBlockCache}}, CACHE_FIXED_OVERHEAD is calculated as 
follows:
{code}
  public final static long CACHE_FIXED_OVERHEAD = ClassSize.align(
  (3 * Bytes.SIZEOF_LONG) + (10 * ClassSize.REFERENCE) +
  (5 * Bytes.SIZEOF_FLOAT) + (2 * Bytes.SIZEOF_BOOLEAN)
  + ClassSize.OBJECT);
{code}

After some investigation, I think there is something wrong here: in {{class 
LruBlockCache}}, excluding static variables (which belong to the class), there 
are 4 long variables (maxBlockSize, maxSize, blockSize and overhead), 9 
reference variables and 2 boolean variables, so the code above does not 
calculate the LruBlockCache instance size correctly.


> LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size 
> correctly
> 
>
> Key: HBASE-16300
> URL: https://issues.apache.org/jira/browse/HBASE-16300
> Project: HBase
>  Issue Type: Bug
>Reporter: Yu Sun
>Assignee: Yu Sun
>
> In current master {{LruBlockCache}}, CACHE_FIXED_OVERHEAD is calculated as
> follows:
> {code}
>   public final static long CACHE_FIXED_OVERHEAD = ClassSize.align(
>   (3 * Bytes.SIZEOF_LONG) + (10 * ClassSize.REFERENCE) +
>   (5 * Bytes.SIZEOF_FLOAT) + (2 * Bytes.SIZEOF_BOOLEAN)
>   + ClassSize.OBJECT);
> {code}
> After some investigation, I think there is something wrong here: in {{class
> LruBlockCache}}, excluding static variables (which belong to the class), there
> are 4 long variables (maxBlockSize, maxSize, blockSize and overhead), 9
> reference variables and 2 boolean variables, so the code above does not
> calculate the LruBlockCache instance size correctly.
> The current related UT does not fail, mostly because the result is 8-byte aligned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16300) LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly

2016-07-28 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16300:
---
Status: Patch Available  (was: Open)

> LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size 
> correctly
> 
>
> Key: HBASE-16300
> URL: https://issues.apache.org/jira/browse/HBASE-16300
> Project: HBase
>  Issue Type: Bug
>Reporter: Yu Sun
>Assignee: Yu Sun
>
> In current master {{LruBlockCache}}, CACHE_FIXED_OVERHEAD is calculated as
> follows:
> {code}
>   public final static long CACHE_FIXED_OVERHEAD = ClassSize.align(
>   (3 * Bytes.SIZEOF_LONG) + (10 * ClassSize.REFERENCE) +
>   (5 * Bytes.SIZEOF_FLOAT) + (2 * Bytes.SIZEOF_BOOLEAN)
>   + ClassSize.OBJECT);
> {code}
> After some investigation, I think there is something wrong here: in {{class
> LruBlockCache}}, excluding static variables (which belong to the class), there
> are 4 long variables (maxBlockSize, maxSize, blockSize and overhead), 9
> reference variables and 2 boolean variables, so the code above does not
> calculate the LruBlockCache instance size correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16300) LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly

2016-07-28 Thread Yu Sun (JIRA)
Yu Sun created HBASE-16300:
--

 Summary: LruBlockCache.CACHE_FIXED_OVERHEAD should calculate 
LruBlockCache size correctly
 Key: HBASE-16300
 URL: https://issues.apache.org/jira/browse/HBASE-16300
 Project: HBase
  Issue Type: Bug
Reporter: Yu Sun
Assignee: Yu Sun


In current master {{LruBlockCache}}, CACHE_FIXED_OVERHEAD is calculated as 
follows:
{code}
  public final static long CACHE_FIXED_OVERHEAD = ClassSize.align(
  (3 * Bytes.SIZEOF_LONG) + (10 * ClassSize.REFERENCE) +
  (5 * Bytes.SIZEOF_FLOAT) + (2 * Bytes.SIZEOF_BOOLEAN)
  + ClassSize.OBJECT);
{code}

After some investigation, I think there is something wrong here: in {{class 
LruBlockCache}}, excluding static variables (which belong to the class), there 
are 4 long variables (maxBlockSize, maxSize, blockSize and overhead), 9 
reference variables and 2 boolean variables, so the code above does not 
calculate the LruBlockCache instance size correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-07-28 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16287:
---
Attachment: HBASE-16287-v2.patch

Attach patch v2 to fix the failed UT.

This patch also contains a fix for 
org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testStoreFileCacheOnWrite(), 
which failed due to this patch. It simply sets 
LruBlockCache.LRU_HARD_CAPACITY_LIMIT_FACTOR_CONFIG_NAME to 2.0f; if we don't 
apply this change, TestCacheOnWrite.testStoreFileCacheOnWrite() will fail with 
the following output log:
{quote}
2016-07-28 23:02:49,801 INFO  [main] hfile.CacheConfig(285): blockCache=LruBlockCache{blockCount=0, currentSize=159452224, freeSize=-25234496, maxSize=134217728, heapSize=159452224, minSize=127506840, minFactor=0.95, multiSize=63753420, multiFactor=0.5, singleSize=31876710, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=true, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=true, prefetchOnOpen=false
2016-07-28 23:02:49,807 DEBUG [main] hfile.HFile$WriterFactory(345): Unable to set drop behind on /home/hongxi.sy/hbase/hbase-server/target/test-data/b1c99d85-27e3-4796-a66b-324feb06c620/test_cache_on_write/9174b12e141143acb9d4be7b6e7165a9
{quote}
From the log above we can see: currentSize > 1.2f * maxSize * 
DEFAULT_ACCEPTABLE_FACTOR, that is, 159452224 > 159450660.864, so the block 
being read is not put into the LRU cache and the assert fails. Here I just 
increase the hard limit factor to make the LRU cache large enough for all the 
blocks of the file being read.
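
The comparison, spelled out (a quick throwaway check, not HBase code; the 0.99 
acceptable factor is an assumption implied by the quoted 
159450660.864 = 1.2 * 134217728 * 0.99):
{code}
public class HardLimitMath {
  public static void main(String[] args) {
    long maxSize = 134217728L;                   // maxSize from the log above
    long currentSize = 159452224L;               // currentSize from the log above
    double hardLimit = 1.2 * maxSize * 0.99;     // 1.59450660864E8
    System.out.println(currentSize > hardLimit); // true: the block is rejected
  }
}
{code}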

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch
>
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will be
> (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run
> into continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. So we dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache,
> which occupied about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GCs even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter how full the block cache already is, the block is simply put into
> it; if the eviction thread is not fast enough, the block cache size grows
> significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the block cache size is under the watermark. If this is reasonable, I can
> make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-07-27 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16287:
---
Attachment: HBASE-16287-v1.patch

Attach patch v1.

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch
>
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will be
> (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run
> into continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. So we dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache,
> which occupied about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GCs even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter how full the block cache already is, the block is simply put into
> it; if the eviction thread is not fast enough, the block cache size grows
> significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the block cache size is under the watermark. If this is reasonable, I can
> make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-07-27 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16287:
---
Assignee: Yu Sun
  Status: Patch Available  (was: Open)

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>Assignee: Yu Sun
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will be
> (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run
> into continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. So we dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache,
> which occupied about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GCs even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter how full the block cache already is, the block is simply put into
> it; if the eviction thread is not fast enough, the block cache size grows
> significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the block cache size is under the watermark. If this is reasonable, I can
> make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16287) BlockCache size should not exceed acceptableSize too many

2016-07-27 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395252#comment-15395252
 ] 

Yu Sun commented on HBASE-16287:


[~chenheng] thanks, I will upload the patch later today.

> BlockCache size should not exceed acceptableSize too many
> -
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will be
> (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run
> into continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. So we dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache,
> which occupied about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GCs even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>   final boolean cacheDataInL1) {
> {code}
> No matter how full the block cache already is, the block is simply put into
> it; if the eviction thread is not fast enough, the block cache size grows
> significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the block cache size is under the watermark. If this is reasonable, I can
> make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many

2016-07-27 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16287:
---
Summary: LruBlockCache size should not exceed acceptableSize too many  
(was: BlockCache size should not exceed acceptableSize too many)

> LruBlockCache size should not exceed acceptableSize too many
> 
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will
> be (32g-1g)*0.3 = 9.3g. But in some scenarios some of the regionservers hit
> continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. We dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache
> occupying about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GC problem even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>       final boolean cacheDataInL1) {
> {code}
> No matter how much of the block cache is already used, it just puts the
> block in; if the eviction thread is not fast enough, the block cache size
> will increase significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the size is back under the watermark. If this is reasonable, I can make a
> small patch for it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16287) BlockCache size should not exceed acceptableSize too many

2016-07-27 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395250#comment-15395250
 ] 

Yu Sun commented on HBASE-16287:


[~anoop.hbase] Sorry for my late reply, and thanks for your comments. Yes, you 
can open another JIRA for the L2 cache, and I will change this JIRA's title soon.

> BlockCache size should not exceed acceptableSize too many
> -
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will
> be (32g-1g)*0.3 = 9.3g. But in some scenarios some of the regionservers hit
> continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. We dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache
> occupying about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, 
> evictedPerRun=20051.93359375{quote}
> We can see the block cache size has exceeded acceptableSize by far too much,
> which makes the full GC problem even worse.
> After some investigation, I found this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>       final boolean cacheDataInL1) {
> {code}
> No matter how much of the block cache is already used, it just puts the
> block in; if the eviction thread is not fast enough, the block cache size
> will increase significantly.
> So I think we should have a check here: for example, if the block cache size
> exceeds 1.2 * acceptableSize(), just return and don't cache the block until
> the size is back under the watermark. If this is reasonable, I can make a
> small patch for it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-16287) BlockCache size should not exceed acceptableSize too many

2016-07-27 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395233#comment-15395233
 ] 

Yu Sun edited comment on HBASE-16287 at 7/27/16 8:33 AM:
-

{quote}
Why -1g? We calc the BC size by conf xmx value * BC percentage.
{quote}

Under the JVM configuration -Xmn4g -XX:SurvivorRatio=2, each survivor space 
will be 4g/(2+1+1) = 1g, and at any time (except between a young GC and some 
full GCs, the non-CMS ones) at least one of the two survivor spaces is empty 
and contains no objects. So if we ask the JVM for the max heap size, it will 
just return Xmx minus one survivor space.

{code:borderStyle=solid}
  public static synchronized BlockCache instantiateBlockCache(Configuration conf) {
    if (GLOBAL_BLOCK_CACHE_INSTANCE != null) return GLOBAL_BLOCK_CACHE_INSTANCE;
    if (blockCacheDisabled) return null;
    MemoryUsage mu = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    LruBlockCache l1 = getL1(conf, mu);
    // ...
{code}

{code:borderStyle=solid}
  static long getLruCacheSize(final Configuration conf, final MemoryUsage mu) {
    float cachePercentage = conf.getFloat(HConstants.HFILE_BLOCK_CACHE_SIZE_KEY,
        HConstants.HFILE_BLOCK_CACHE_SIZE_DEFAULT);
    if (cachePercentage <= 0.0001f) {
      blockCacheDisabled = true;
      return -1;
    }
    if (cachePercentage > 1.0) {
      throw new IllegalArgumentException(HConstants.HFILE_BLOCK_CACHE_SIZE_KEY +
          " must be between 0.0 and 1.0, and not > 1.0");
    }

    // Calculate the amount of heap to give the heap.
    return (long) (mu.getMax() * cachePercentage);
  }

{code}
The code above is how HBase computes the block cache size; the key point is 
how mu.getMax() is calculated.
mu itself is returned by the following JNI call:
http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/file/58e586f18da6/src/share/native/sun/management/MemoryImpl.c
{code:borderStyle=solid}
JNIEXPORT jobject JNICALL Java_sun_management_MemoryImpl_getMemoryManagers0
  (JNIEnv *env, jclass dummy) {
return jmm_interface->GetMemoryManagers(env, NULL);
}
{code}
GetMemoryManagers(env, NULL) is implemented in the JVM in:
http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/b9b4bc1e05e2/src/share/vm/services/management.cpp
and part of that implementation is listed below:

{code:borderStyle=solid}
// Returns a java/lang/management/MemoryUsage object representing
// the memory usage for the heap or non-heap memory.
JVM_ENTRY(jobject, jmm_GetMemoryUsage(JNIEnv* env, jboolean heap))
  ResourceMark rm(THREAD);

  // Calculate the memory usage
  size_t total_init = 0;
  size_t total_used = 0;
  size_t total_committed = 0;
  size_t total_max = 0;
  bool   has_undefined_init_size = false;
  bool   has_undefined_max_size = false;

  ..
  ..

  MemoryUsage usage((heap ? InitialHeapSize : total_init),
                    total_used,
                    total_committed,
                    (heap ? Universe::heap()->max_capacity() : total_max));

  Handle obj = MemoryService::create_MemoryUsage_obj(usage, CHECK_NULL);
  return JNIHandles::make_local(env, obj());
JVM_END
{code}

According to the constructor of MemoryUsage, the _maxSize field is initialized 
from Universe::heap()->max_capacity(), which is also implemented in the JVM; 
take the CMS GC for example (PS and G1 are almost the same):
http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/b9b4bc1e05e2/src/share/vm/memory/genCollectedHeap.cpp
{code:borderStyle=solid}
size_t GenCollectedHeap::max_capacity() const {
  size_t res = 0;
  for (int i = 0; i < _n_gens; i++) {
res += _gens[i]->max_capacity();
  }
  return res;
}
{code}

In the above code, _n_gens is 2, representing the two generations (young and 
old), and max_capacity() is a virtual call. For the young generation under the 
CMS GC, max_capacity() is implemented in:
http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/b9b4bc1e05e2/src/share/vm/memory/defNewGeneration.cpp
{code:borderStyle=solid}
size_t DefNewGeneration::max_capacity() const {
  const size_t alignment =
      GenCollectedHeap::heap()->collector_policy()->min_alignment();
  const size_t reserved_bytes = reserved().byte_size();
  return reserved_bytes - compute_survivor_size(reserved_bytes, alignment);
}
{code}

reserved_bytes is just the Xmn we set, so here we can see the JVM calculates 
the young generation's max_capacity as Xmn minus one survivor space.
In fact, under the CMS GC the adaptive size policy is explicitly disabled in 
the JVM, so the two survivor spaces are always the same size.
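
A quick way to confirm this behaviour is to ask the MemoryMXBean directly (a 
standalone sketch, not part of any patch; HeapMaxDemo is a made-up class name, 
and the exact value printed depends on the collector and flags):

{code:borderStyle=solid}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapMaxDemo {
  public static void main(String[] args) {
    // Run with e.g. -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
    MemoryUsage mu = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    // With SurvivorRatio=2 and Xmn=4g, each survivor is 4g/(2+1+1) = 1g, so
    // getMax() should report roughly 31g here: Xmx minus one survivor space.
    System.out.println("max heap = " + (mu.getMax() >> 20) + " MB");
  }
}
{code}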


was (Author: haoran):
{quote}
Why -1g? We calc the BC size by conf xmx value * BC percentage.
{quote}

Under the JVM configuration -Xmn4g -XX:SurvivorRatio=2, each survivor space 
will be 4g/(2+1+1) = 1g, and at any time (except between a young GC and some 
full GCs, the non-CMS ones) at least one of the two survivor spaces is empty 
and contains no objects. So if we ask the JVM for the max heap size, it will 
just return Xmx minus one survivor space.

{code:borderStyle=solid}
  public static sync

[jira] [Commented] (HBASE-16287) BlockCache size should not exceed acceptableSize too many

2016-07-27 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395233#comment-15395233
 ] 

Yu Sun commented on HBASE-16287:


{quote}
Why -1g? We calc the BC size by conf xmx value * BC percentage.
{quote}

Under the JVM configuration -Xmn4g -XX:SurvivorRatio=2, each survivor space 
will be 4g/(2+1+1) = 1g, and at any time (except between a young GC and some 
full GCs, the non-CMS ones) at least one of the two survivor spaces is empty 
and contains no objects. So if we ask the JVM for the max heap size, it will 
just return Xmx minus one survivor space.

{code:borderStyle=solid}
  public static synchronized BlockCache instantiateBlockCache(Configuration conf) {
    if (GLOBAL_BLOCK_CACHE_INSTANCE != null) return GLOBAL_BLOCK_CACHE_INSTANCE;
    if (blockCacheDisabled) return null;
    MemoryUsage mu = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    LruBlockCache l1 = getL1(conf, mu);
    // ...
{code}

{code:borderStyle=solid}
  static long getLruCacheSize(final Configuration conf, final MemoryUsage mu) {
    float cachePercentage = conf.getFloat(HConstants.HFILE_BLOCK_CACHE_SIZE_KEY,
        HConstants.HFILE_BLOCK_CACHE_SIZE_DEFAULT);
    if (cachePercentage <= 0.0001f) {
      blockCacheDisabled = true;
      return -1;
    }
    if (cachePercentage > 1.0) {
      throw new IllegalArgumentException(HConstants.HFILE_BLOCK_CACHE_SIZE_KEY +
          " must be between 0.0 and 1.0, and not > 1.0");
    }

    // Calculate the amount of heap to give the heap.
    return (long) (mu.getMax() * cachePercentage);
  }

{code}
The code above is how HBase computes the block cache size; the key point is 
how mu.getMax() is calculated.
mu itself is returned by the following JNI call:
http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/file/58e586f18da6/src/share/native/sun/management/MemoryImpl.c
{code:borderStyle=solid}
JNIEXPORT jobject JNICALL Java_sun_management_MemoryImpl_getMemoryManagers0
  (JNIEnv *env, jclass dummy) {
return jmm_interface->GetMemoryManagers(env, NULL);
}
{code}
GetMemoryManagers(env, NULL) is implemented in the JVM in:
http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/b9b4bc1e05e2/src/share/vm/services/management.cpp
and part of that implementation is listed below:

{code:borderStyle=solid}
// Returns a java/lang/management/MemoryUsage object representing
// the memory usage for the heap or non-heap memory.
JVM_ENTRY(jobject, jmm_GetMemoryUsage(JNIEnv* env, jboolean heap))
  ResourceMark rm(THREAD);

  // Calculate the memory usage
  size_t total_init = 0;
  size_t total_used = 0;
  size_t total_committed = 0;
  size_t total_max = 0;
  bool   has_undefined_init_size = false;
  bool   has_undefined_max_size = false;

  ..
  ..

  MemoryUsage usage((heap ? InitialHeapSize : total_init),
                    total_used,
                    total_committed,
                    (heap ? Universe::heap()->max_capacity() : total_max));

  Handle obj = MemoryService::create_MemoryUsage_obj(usage, CHECK_NULL);
  return JNIHandles::make_local(env, obj());
JVM_END
{code}

According to the constructor of MemoryUsage, the _maxSize field is initialized 
from Universe::heap()->max_capacity(), which is also implemented in the JVM; 
take the CMS GC for example (PS and G1 are almost the same):
http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/b9b4bc1e05e2/src/share/vm/memory/genCollectedHeap.cpp
{code:borderStyle=solid}
size_t GenCollectedHeap::max_capacity() const {
  size_t res = 0;
  for (int i = 0; i < _n_gens; i++) {
res += _gens[i]->max_capacity();
  }
  return res;
}
{code}

In the above code, _n_gens is 2, representing the two generations (young and 
old), and max_capacity() is a virtual call. For the young generation under the 
CMS GC, max_capacity() is implemented in:
http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/b9b4bc1e05e2/src/share/vm/memory/defNewGeneration.cpp
{code:borderStyle=solid}
size_t DefNewGeneration::max_capacity() const {
  const size_t alignment =
      GenCollectedHeap::heap()->collector_policy()->min_alignment();
  const size_t reserved_bytes = reserved().byte_size();
  return reserved_bytes - compute_survivor_size(reserved_bytes, alignment);
}
{code}

reserved_bytes is just the Xmn we set, so here we can see the JVM calculates 
the young generation's max_capacity as Xmn minus one survivor space.
In fact, under the CMS GC the adaptive size policy is explicitly disabled in 
the JVM, so the two survivor spaces are always the same size.

> BlockCache size should not exceed acceptableSize too many
> -
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> also we only use blockca

[jira] [Updated] (HBASE-16287) BlockCache size should not exceed acceptableSize too many

2016-07-26 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16287:
---
Description: 
Our regionserver has a configuration as below:
  -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
We only use the block cache, and set hfile.block.cache.size = 0.3 in
hbase-site.xml, so under this configuration the LRU block cache size will be
(32g-1g)*0.3 = 9.3g. But in some scenarios some of the regionservers hit
continuous full GCs for hours and, most importantly, after a full GC most of
the objects in the old generation are not collected. We dumped the heap,
analysed it with MAT, and observed an obvious memory leak in LruBlockCache
occupying about 16g of memory. We then set the LruBlockCache log level to
TRACE and observed this in the log:

{quote}
2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] hfile.LruBlockCache: 
totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, blockCount=628182, 
accesses=101799469125, hits=93517800259, hitRatio=91.86%, , 
cachingAccesses=99462650031, cachingHits=93468334621, cachingHitsRatio=93.97%, 
evictions=238199, evicted=4776350518, evictedPerRun=20051.93359375{quote}

We can see the block cache size has exceeded acceptableSize by far too much,
which makes the full GC problem even worse.
After some investigation, I found this function:

{code:borderStyle=solid}
  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
      final boolean cacheDataInL1) {
{code}

No matter how much of the block cache is already used, it just puts the block
in; if the eviction thread is not fast enough, the block cache size will
increase significantly.
So I think we should have a check here: for example, if the block cache size
exceeds 1.2 * acceptableSize(), just return and don't cache the block until
the size is back under the watermark. If this is reasonable, I can make a
small patch for it.

  was:
Our regionserver has a configuration as below:
  -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
We only use the block cache, and set hfile.block.cache.size = 0.3 in
hbase-site.xml, so under this configuration the LRU block cache size will be
(32g-1g)*0.3 = 9.3g. But in some scenarios some of the regionservers hit
continuous full GCs for hours and, most importantly, after a full GC most of
the objects in the old generation are not collected. We dumped the heap,
analysed it with MAT, and observed an obvious memory leak in LruBlockCache
occupying about 16g of memory. We then set the LruBlockCache log level to
TRACE and observed this in the log:

2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] hfile.LruBlockCache: 
totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, blockCount=628182, 
accesses=101799469125, hits=93517800259, hitRatio=91.86%, , 
cachingAccesses=99462650031, cachingHits=93468334621, cachingHitsRatio=93.97%, 
evictions=238199, evicted=4776350518, evictedPerRun=20051.93359375

We can see the block cache size has exceeded acceptableSize by far too much,
which makes the full GC problem even worse.
After some investigation, I found this function:

{code:borderStyle=solid}
  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
      final boolean cacheDataInL1) {
{code}

No matter how much of the block cache is already used, it just puts the block
in; if the eviction thread is not fast enough, the block cache size will
increase significantly.
So I think we should have a check here: for example, if the block cache size
exceeds 1.2 * acceptableSize(), just return and don't cache the block until
the size is back under the watermark. If this is reasonable, I can make a
small patch for it.


> BlockCache size should not exceed acceptableSize too many
> -
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will
> be (32g-1g)*0.3 = 9.3g. But in some scenarios some of the regionservers hit
> continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. We dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache
> occupying about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=994626

[jira] [Updated] (HBASE-16287) BlockCache size should not exceed acceptableSize too many

2016-07-26 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun updated HBASE-16287:
---
Description: 
Our regionserver has a configuration as below:
  -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
We only use the block cache, and set hfile.block.cache.size = 0.3 in
hbase-site.xml, so under this configuration the LRU block cache size will be
(32g-1g)*0.3 = 9.3g. But in some scenarios some of the regionservers hit
continuous full GCs for hours and, most importantly, after a full GC most of
the objects in the old generation are not collected. We dumped the heap,
analysed it with MAT, and observed an obvious memory leak in LruBlockCache
occupying about 16g of memory. We then set the LruBlockCache log level to
TRACE and observed this in the log:

2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] hfile.LruBlockCache: 
totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, blockCount=628182, 
accesses=101799469125, hits=93517800259, hitRatio=91.86%, , 
cachingAccesses=99462650031, cachingHits=93468334621, cachingHitsRatio=93.97%, 
evictions=238199, evicted=4776350518, evictedPerRun=20051.93359375

We can see the block cache size has exceeded acceptableSize by far too much,
which makes the full GC problem even worse.
After some investigation, I found this function:

{code:borderStyle=solid}
  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
      final boolean cacheDataInL1) {
{code}

No matter how much of the block cache is already used, it just puts the block
in; if the eviction thread is not fast enough, the block cache size will
increase significantly.
So I think we should have a check here: for example, if the block cache size
exceeds 1.2 * acceptableSize(), just return and don't cache the block until
the size is back under the watermark. If this is reasonable, I can make a
small patch for it.

  was:
Our regionserver has a configuration as below:
  -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
We only use the block cache, and set hfile.block.cache.size = 0.3 in
hbase-site.xml, so under this configuration the LRU block cache size will be
(32g-1g)*0.3 = 9.3g. But in some scenarios some of the regionservers hit
continuous full GCs for hours and, most importantly, after a full GC most of
the objects in the old generation are not collected. We dumped the heap,
analysed it with MAT, and observed an obvious memory leak in LruBlockCache
occupying about 16g of memory. We then set the LruBlockCache log level to
TRACE and observed this in the log:

2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] hfile.LruBlockCache: 
totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, blockCount=628182, 
accesses=101799469125, hits=93517800259, hitRatio=91.86%, , 
cachingAccesses=99462650031, cachingHits=93468334621, cachingHitsRatio=93.97%, 
evictions=238199, evicted=4776350518, evictedPerRun=20051.93359375

We can see the block cache size has exceeded acceptableSize by far too much,
which makes the full GC problem even worse.
After some investigation, I found this function:

  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
      final boolean cacheDataInL1) {

No matter how much of the block cache is already used, it just puts the block
in; if the eviction thread is not fast enough, the block cache size will
increase significantly.
So I think we should have a check here: for example, if the block cache size
exceeds 1.2 * acceptableSize(), just return and don't cache the block until
the size is back under the watermark. If this is reasonable, I can make a
small patch for it.


> BlockCache size should not exceed acceptableSize too many
> -
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Reporter: Yu Sun
>
> Our regionserver has a configuration as below:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, and set hfile.block.cache.size = 0.3 in
> hbase-site.xml, so under this configuration the LRU block cache size will
> be (32g-1g)*0.3 = 9.3g. But in some scenarios some of the regionservers hit
> continuous full GCs for hours and, most importantly, after a full GC most
> of the objects in the old generation are not collected. We dumped the heap,
> analysed it with MAT, and observed an obvious memory leak in LruBlockCache
> occupying about 16g of memory. We then set the LruBlockCache log level to
> TRACE and observed this in the log:
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] 
> hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, 
> blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, 
> , cachingAccesses=99462650031, cachingHits=93468334621, 
> cachingHitsRatio=93.97%

[jira] [Created] (HBASE-16287) BlockCache size should not exceed acceptableSize too many

2016-07-26 Thread Yu Sun (JIRA)
Yu Sun created HBASE-16287:
--

 Summary: BlockCache size should not exceed acceptableSize too many
 Key: HBASE-16287
 URL: https://issues.apache.org/jira/browse/HBASE-16287
 Project: HBase
  Issue Type: Improvement
  Components: BlockCache
Reporter: Yu Sun


Our regionserver has a configuration as below:
  -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
We only use the block cache, and set hfile.block.cache.size = 0.3 in
hbase-site.xml, so under this configuration the LRU block cache size will be
(32g-1g)*0.3 = 9.3g. But in some scenarios some of the regionservers hit
continuous full GCs for hours and, most importantly, after a full GC most of
the objects in the old generation are not collected. We dumped the heap,
analysed it with MAT, and observed an obvious memory leak in LruBlockCache
occupying about 16g of memory. We then set the LruBlockCache log level to
TRACE and observed this in the log:

2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] hfile.LruBlockCache: 
totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, blockCount=628182, 
accesses=101799469125, hits=93517800259, hitRatio=91.86%, , 
cachingAccesses=99462650031, cachingHits=93468334621, cachingHitsRatio=93.97%, 
evictions=238199, evicted=4776350518, evictedPerRun=20051.93359375

We can see the block cache size has exceeded acceptableSize by far too much,
which makes the full GC problem even worse.
After some investigation, I found this function:

  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
      final boolean cacheDataInL1) {

No matter how much of the block cache is already used, it just puts the block
in; if the eviction thread is not fast enough, the block cache size will
increase significantly.
So I think we should have a check here: for example, if the block cache size
exceeds 1.2 * acceptableSize(), just return and don't cache the block until
the size is back under the watermark. If this is reasonable, I can make a
small patch for it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-15325) ResultScanner allowing partial result will miss the rest of the row if the region is moved between two rpc requests

2016-03-19 Thread Yu Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Sun reassigned HBASE-15325:
--

Assignee: Yu Sun  (was: Phil Yang)

> ResultScanner allowing partial result will miss the rest of the row if the 
> region is moved between two rpc requests
> ---
>
> Key: HBASE-15325
> URL: https://issues.apache.org/jira/browse/HBASE-15325
> Project: HBase
>  Issue Type: Bug
>  Components: dataloss, Scanners
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Phil Yang
>Assignee: Yu Sun
>Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 1.4.0
>
> Attachments: 15325-test.txt, HBASE-15325-v1.txt, 
> HBASE-15325-v10.patch, HBASE-15325-v11.patch, HBASE-15325-v2.txt, 
> HBASE-15325-v3.txt, HBASE-15325-v5.txt, HBASE-15325-v6.1.txt, 
> HBASE-15325-v6.2.txt, HBASE-15325-v6.3.txt, HBASE-15325-v6.4.txt, 
> HBASE-15325-v6.5.txt, HBASE-15325-v6.txt, HBASE-15325-v7.patch, 
> HBASE-15325-v8.patch, HBASE-15325-v9.patch
>
>
> HBASE-11544 allows a scan RPC to return part of a row, reducing memory usage 
> for a single RPC request, and the client can setAllowPartial or setBatch to 
> get several cells of a row instead of the whole row.
> However, the scanner's state is saved on the server, and we need it to fetch 
> the next part when a previous result was partial. If the region is moved to 
> another RS, the client will get a NotServingRegionException and open a new 
> scanner against the new RS, which is treated as a new scan starting from the 
> end of this row. So the remaining cells of the last result's row will be 
> missing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15325) ResultScanner allowing partial result will miss the rest of the row if the region is moved between two rpc requests

2016-03-15 Thread Yu Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195127#comment-15195127
 ] 

Yu Sun commented on HBASE-15325:


In the current implementation (i.e. before applying your patch), when a DNRIOE 
occurs, HBase will first call:

// An exception was thrown which makes any partial results that we were 
// collecting invalid. The scanner will need to be reset to the beginning 
// of a row.
clearPartialResults();

to clear the partialResults list, and the next scan will start from 
Bytes.add(lastResult.getRow(), new byte[1]), which excludes the current row. 
So I think we will miss the data of this whole row, not just some cells, 
right?
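
For illustration, a minimal sketch of the restart-row computation under 
discussion (the row key is hypothetical; Bytes.add and Bytes.toBytes are the 
HBase utility methods referenced above):

{code:borderStyle=solid}
import org.apache.hadoop.hbase.util.Bytes;

// Sketch of the resume logic described above: after partial results are
// cleared, the client restarts from the last returned row plus one zero byte.
byte[] lastRow = Bytes.toBytes("row-0042");           // hypothetical row key
byte[] restartRow = Bytes.add(lastRow, new byte[1]);  // "row-0042\x00"
// A scanner reopened at restartRow starts strictly after "row-0042", so any
// cells of that row not yet returned would be skipped entirely.
{code}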

> ResultScanner allowing partial result will miss the rest of the row if the 
> region is moved between two rpc requests
> ---
>
> Key: HBASE-15325
> URL: https://issues.apache.org/jira/browse/HBASE-15325
> Project: HBase
>  Issue Type: Bug
>  Components: dataloss, Scanners
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Phil Yang
>Assignee: Phil Yang
>Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 1.4.0
>
> Attachments: 15325-test.txt, HBASE-15325-v1.txt, 
> HBASE-15325-v10.patch, HBASE-15325-v2.txt, HBASE-15325-v3.txt, 
> HBASE-15325-v5.txt, HBASE-15325-v6.1.txt, HBASE-15325-v6.2.txt, 
> HBASE-15325-v6.3.txt, HBASE-15325-v6.4.txt, HBASE-15325-v6.5.txt, 
> HBASE-15325-v6.txt, HBASE-15325-v7.patch, HBASE-15325-v8.patch, 
> HBASE-15325-v9.patch
>
>
> HBASE-11544 allows a scan RPC to return part of a row, reducing memory usage 
> for a single RPC request, and the client can setAllowPartial or setBatch to 
> get several cells of a row instead of the whole row.
> However, the scanner's state is saved on the server, and we need it to fetch 
> the next part when a previous result was partial. If the region is moved to 
> another RS, the client will get a NotServingRegionException and open a new 
> scanner against the new RS, which is treated as a new scan starting from the 
> end of this row. So the remaining cells of the last result's row will be 
> missing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)