[jira] [Commented] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1
[ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15747155#comment-15747155 ]

Yu Sun commented on HBASE-17138:

[~tedyu] As you have listed, the backport needs a lot of effort, and I think it will take weeks to complete. I'm afraid I don't have enough time to do this, which would delay the backport. So if anyone else is interested in doing the backport, I'd be glad to provide any help needed.

> Backport read-path offheap (HBASE-11425) to branch-1
>
> Key: HBASE-17138
> URL: https://issues.apache.org/jira/browse/HBASE-17138
> Project: HBase
> Issue Type: Improvement
> Reporter: Yu Li
> Assignee: Yu Sun
> Attachments:
>   0001-fix-EHB-511-Resolve-client-compatibility-issue-introduced-by-offheap-change.patch,
>   0001-to-EHB-446-offheap-hfile-format-should-keep-compatible-v3.patch,
>   0001-to-EHB-456-Cell-should-be-compatible-with-branch-1.1.2.patch
>
> From the [thread|http://mail-archives.apache.org/mod_mbox/hbase-user/201611.mbox/%3CCAM7-19%2Bn7cEiY4H9iLQ3N9V0NXppOPduZwk-hhgNLEaJfiV3kA%40mail.gmail.com%3E] sharing our experience and performance data of read-path offheap usage in Alibaba search, we could see people are positive about having HBASE-11425 in branch-1, so I'd like to create a JIRA and move the discussion and decision making here.
>
> Echoing some comments from the mail thread:
>
> Bryan: Is the backported patch available anywhere? If it ends up not getting officially backported to branch-1 due to 2.0 around the corner, some of us who build our own deploy may want to integrate it into our builds.
>
> Andrew: Yes, please, the patches will be useful to the community even if we decide not to backport into an official 1.x release.
>
> Enis: I don't see any reason why we cannot backport to branch-1.
>
> Ted: Opening a JIRA would be fine. This makes it easier for people to obtain the patch(es).
>
> Nick: From the DISCUSS thread re: EOL of 1.1, it seems we'll continue to support 1.x releases for some time... I would guess these will be maintained until 2.2 at least. Therefore, offheap patches that have seen production exposure seem like a reasonable candidate for backport, perhaps in a 1.4 or 1.5 release timeframe.
>
> Anoop: Because of some compatibility issues, we decided that this will be done in 2.0 only. Ya, as Andy said, it would be great to share the 1.x backported patches.
>
> The following are all the JIRA ids we have backported:
> HBASE-10930 Change Filters and GetClosestRowBeforeTracker to work with Cells (Ram)
> HBASE-13373 Squash HFileReaderV3 together with HFileReaderV2 and AbstractHFileReader; ditto for Scanners and BlockReader, etc.
> HBASE-13429 Remove deprecated seek/reseek methods from HFileScanner.
> HBASE-13450 Purge RawBytesComparator from the writers and readers for HBASE-10800 (Ram)
> HBASE-13501 Deprecate/Remove getComparator() in HRegionInfo.
> HBASE-12048 Remove deprecated APIs from Filter.
> HBASE-10800 Use CellComparator instead of KVComparator (Ram)
> HBASE-13679 Change ColumnTracker and SQM to deal with Cell instead of byte[], int, int.
> HBASE-13642 Deprecate RegionObserver#postScannerFilterRow CP hook with byte[],int,int args in favor of taking Cell arg.
> HBASE-13641 Deprecate Filter#filterRowKey(byte[] buffer, int offset, int length) in favor of filterRowKey(Cell firstRowCell).
> HBASE-13827 Delayed scanner close in KeyValueHeap and StoreScanner.
> HBASE-13871 Change RegionScannerImpl to deal with Cell instead of byte[], int, int.
> HBASE-11911 Break up tests into more fine grained categories (Alex Newman)
> HBASE-12059 Create hbase-annotations module
> HBASE-12106 Move test annotations to test artifact (Enis Soztutar)
> HBASE-13916 Create MultiByteBuffer an aggregation of ByteBuffers.
> HBASE-15679 Assertion on wrong variable in TestReplicationThrottler#testThrottling
> HBASE-13931 Move Unsafe based operations to UnsafeAccess.
> HBASE-12345 Unsafe based ByteBuffer Comparator.
> HBASE-13998 Remove CellComparator#compareRows(byte[] left, int loffset, int llength, byte[] right, int roffset, int rlength).
> HBASE-13998 Remove CellComparator#compareRows() - Addendum to fix javadoc warning
> HBASE-13579 Avoid isCellTTLExpired() for NO-TAG cases (partially backport this patch)
> HBASE-13448 New Cell implementation with cached component offsets/lengths.
> HBASE-13387 Add ByteBufferedCell an extension to Cell.
> HBASE-13387 Add ByteBufferedCell an extension to Cell - addendum.
> HBASE-12650 Move ServerName to hbase-common module (partially backport this patch)
> HBASE-12296 Filters should work with ByteBufferedCell.
> HBASE-14120 ByteBufferUtils#compareTo small optimization.
> HBASE-13510 Purge ByteBloomFilter (Ram)
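Several entries in the list above (HBASE-12345 Unsafe based ByteBuffer Comparator, HBASE-13387 ByteBufferedCell, HBASE-14120 ByteBufferUtils#compareTo) exist so the read path can compare keys against cells that live in off-heap ByteBuffers without first copying them onto the heap. A minimal self-contained sketch of that idea follows; it is illustrative stand-in code, not HBase's actual ByteBufferUtils implementation:

```java
import java.nio.ByteBuffer;

public class OffheapCompare {
    // Lexicographically compare `length` bytes of a (possibly direct,
    // off-heap) buffer starting at `offset` against an on-heap key,
    // reading byte-by-byte instead of materializing a byte[] copy.
    public static int compareTo(ByteBuffer buf, int offset, int length, byte[] key) {
        int n = Math.min(length, key.length);
        for (int i = 0; i < n; i++) {
            int diff = (buf.get(offset + i) & 0xff) - (key[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return length - key.length;
    }

    public static void main(String[] args) {
        // Simulate a block cached off-heap, with a row key stored at offset 4.
        ByteBuffer block = ByteBuffer.allocateDirect(16);
        byte[] row = "row-42".getBytes();
        for (int i = 0; i < row.length; i++) {
            block.put(4 + i, row[i]);
        }
        System.out.println(compareTo(block, 4, row.length, "row-42".getBytes())); // prints 0
    }
}
```

The byte-by-byte loop only shows the no-copy contract; the production comparators (HBASE-12345) use Unsafe-based wider reads for speed.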
[jira] [Commented] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1
[ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731493#comment-15731493 ]

Yu Sun commented on HBASE-17138:

Yes, I think so; almost all of the patches need to be changed due to merge conflicts, so this will take some time if we decide to backport.
[jira] [Commented] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1
[ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731485#comment-15731485 ]

Yu Sun commented on HBASE-17138:

Done.
[jira] [Updated] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1
[ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Sun updated HBASE-17138:
---
Attachment: 0001-fix-EHB-511-Resolve-client-compatibility-issue-introduced-by-offheap-change.patch
            0001-to-EHB-456-Cell-should-be-compatible-with-branch-1.1.2.patch
            0001-to-EHB-446-offheap-hfile-format-should-keep-compatible-v3.patch
[jira] [Commented] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1
[ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731381#comment-15731381 ]

Yu Sun commented on HBASE-17138:

Yes, we have three patches that resolve the issue, but they are not listed above:

1. We still use the Cell.getFamily(), Cell.getQualifier() and Cell.getRow() APIs in our existing code; this issue is mainly introduced by HBASE-14047.
2. We need to ensure the offheap HFile format and the old branch-1.1.2 (without HBASE-16189) HFile format stay compatible with each other, for fallback purposes. So we changed the offheap HFile format to match our branch-1.1.2 and adjusted some code during the backport to handle this.
3. Issues introduced by HBASE-12084, HBASE-13641 and HBASE-12048.

So I think we should at least remove HBASE-14047, HBASE-12084, HBASE-13641 and HBASE-12048 from the backport list.
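For context on point 1 of the comment above: HBASE-14047 removed Cell's copying getters (getRow(), getFamily(), getQualifier()) in favor of the array/offset/length accessors, so callers copy bytes out only when they actually need a standalone byte[]. A self-contained sketch of the replacement pattern follows; SimpleCell and cloneRow are simplified stand-ins, not HBase's real Cell/CellUtil classes:

```java
import java.util.Arrays;

public class CellMigration {
    // Simplified stand-in for a Cell backed by a shared block buffer:
    // the row is a slice of `bytes`, exposed through offset/length
    // accessors (the style that replaced the removed copying getters).
    public static final class SimpleCell {
        private final byte[] bytes;
        private final int rowOffset;
        private final int rowLength;

        public SimpleCell(byte[] bytes, int rowOffset, int rowLength) {
            this.bytes = bytes;
            this.rowOffset = rowOffset;
            this.rowLength = rowLength;
        }

        public byte[] getRowArray() { return bytes; }
        public int getRowOffset() { return rowOffset; }
        public int getRowLength() { return rowLength; }
    }

    // Replacement for the removed Cell.getRow(): copy the row slice out
    // of the (possibly shared) backing array only when a byte[] is needed.
    public static byte[] cloneRow(SimpleCell cell) {
        return Arrays.copyOfRange(cell.getRowArray(), cell.getRowOffset(),
                cell.getRowOffset() + cell.getRowLength());
    }

    public static void main(String[] args) {
        // The backing array holds more than the row, as in a block-backed cell.
        SimpleCell cell = new SimpleCell("xxrow1yy".getBytes(), 2, 4);
        System.out.println(new String(cloneRow(cell))); // prints row1
    }
}
```

Code written against the old getters has to be rewritten in this style (or via helper copies) before the HBASE-14047 change can be taken, which is why it is a candidate for removal from the backport list.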
[jira] [Updated] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1
[ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Sun updated HBASE-17138:
---
Description:

From the [thread|http://mail-archives.apache.org/mod_mbox/hbase-user/201611.mbox/%3CCAM7-19%2Bn7cEiY4H9iLQ3N9V0NXppOPduZwk-hhgNLEaJfiV3kA%40mail.gmail.com%3E] sharing our experience and performance data of read-path offheap usage in Alibaba search, we could see people are positive about having HBASE-11425 in branch-1, so I'd like to create a JIRA and move the discussion and decision making here.

Echoing some comments from the mail thread:

Bryan: Is the backported patch available anywhere? If it ends up not getting officially backported to branch-1 due to 2.0 around the corner, some of us who build our own deploy may want to integrate it into our builds.

Andrew: Yes, please, the patches will be useful to the community even if we decide not to backport into an official 1.x release.

Enis: I don't see any reason why we cannot backport to branch-1.

Ted: Opening a JIRA would be fine. This makes it easier for people to obtain the patch(es).

Nick: From the DISCUSS thread re: EOL of 1.1, it seems we'll continue to support 1.x releases for some time... I would guess these will be maintained until 2.2 at least. Therefore, offheap patches that have seen production exposure seem like a reasonable candidate for backport, perhaps in a 1.4 or 1.5 release timeframe.

Anoop: Because of some compatibility issues, we decided that this will be done in 2.0 only. Ya, as Andy said, it would be great to share the 1.x backported patches.

The following are all the JIRA ids we have backported:
HBASE-10930 Change Filters and GetClosestRowBeforeTracker to work with Cells (Ram)
HBASE-13373 Squash HFileReaderV3 together with HFileReaderV2 and AbstractHFileReader; ditto for Scanners and BlockReader, etc.
HBASE-13429 Remove deprecated seek/reseek methods from HFileScanner.
HBASE-13450 Purge RawBytesComparator from the writers and readers for HBASE-10800 (Ram)
HBASE-13501 Deprecate/Remove getComparator() in HRegionInfo.
HBASE-12048 Remove deprecated APIs from Filter.
HBASE-10800 Use CellComparator instead of KVComparator (Ram)
HBASE-13679 Change ColumnTracker and SQM to deal with Cell instead of byte[], int, int.
HBASE-13642 Deprecate RegionObserver#postScannerFilterRow CP hook with byte[],int,int args in favor of taking Cell arg.
HBASE-13641 Deprecate Filter#filterRowKey(byte[] buffer, int offset, int length) in favor of filterRowKey(Cell firstRowCell).
HBASE-13827 Delayed scanner close in KeyValueHeap and StoreScanner.
HBASE-13871 Change RegionScannerImpl to deal with Cell instead of byte[], int, int.
HBASE-11911 Break up tests into more fine grained categories (Alex Newman)
HBASE-12059 Create hbase-annotations module
HBASE-12106 Move test annotations to test artifact (Enis Soztutar)
HBASE-13916 Create MultiByteBuffer an aggregation of ByteBuffers.
HBASE-15679 Assertion on wrong variable in TestReplicationThrottler#testThrottling
HBASE-13931 Move Unsafe based operations to UnsafeAccess.
HBASE-12345 Unsafe based ByteBuffer Comparator.
HBASE-13998 Remove CellComparator#compareRows(byte[] left, int loffset, int llength, byte[] right, int roffset, int rlength).
HBASE-13998 Remove CellComparator#compareRows() - Addendum to fix javadoc warning
HBASE-13579 Avoid isCellTTLExpired() for NO-TAG cases (partially backport this patch)
HBASE-13448 New Cell implementation with cached component offsets/lengths.
HBASE-13387 Add ByteBufferedCell an extension to Cell.
HBASE-13387 Add ByteBufferedCell an extension to Cell - addendum.
HBASE-12650 Move ServerName to hbase-common module (partially backport this patch)
HBASE-12296 Filters should work with ByteBufferedCell.
HBASE-14120 ByteBufferUtils#compareTo small optimization.
HBASE-13510 Purge ByteBloomFilter (Ram)
HBASE-13451 Make the HFileBlockIndex blockKeys to Cells so that it could be easy to use in the CellComparators (Ram)
HBASE-13614 Avoid temp KeyOnlyKeyValue objects creation in read hot path (Ram)
HBASE-13939 Make HFileReaderImpl.getFirstKeyInBlock() return a Cell (Ram)
HBASE-13307 Making methods under ScannerV2#next inlineable, faster
HBASE-14020 Unsafe based optimized write in ByteBufferOutputStream.
HBASE-13977 Convert getKey and related APIs to Cell (Ram)
HBASE-11927 Use Native Hadoop Library for HFile checksum. (Apekshit)
HBASE-12213 HFileBlock backed by Array of ByteBuffers (Ram)
HBASE-12084 Remove deprecated APIs from Result.
HBASE-12084 Remove deprecated APIs from Result - shell addendum
HBASE-13754 Allow non KeyValue Cell types also to oswrite.
HBASE-14047 Cleanup deprecated APIs from Cell class (Ashish Singhi)
HBASE-13817 ByteBufferOutputStream - add writeInt support.
HBASE-12374 Change DBEs to work with new BB based cell.
HBASE-14116 Change ByteBuff.getXXXStrictlyForward to relative position based reads
HBASE-14073 TestRemoteTable
[jira] [Commented] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1
[ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15725028#comment-15725028 ]

Yu Sun commented on HBASE-17138:

Done; you and [~anoop.hbase] please see the Description.
[jira] [Updated] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1
[ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-17138: --- Description: From the [thread|http://mail-archives.apache.org/mod_mbox/hbase-user/201611.mbox/%3CCAM7-19%2Bn7cEiY4H9iLQ3N9V0NXppOPduZwk-hhgNLEaJfiV3kA%40mail.gmail.com%3E] of sharing our experience and performance data of read-path offheap usage in Alibaba search, we could see people are positive to have HBASE-11425 in branch-1, so I'd like to create a JIRA and move the discussion and decision making here. Echoing some comments from the mail thread: Bryan: Is the backported patch available anywhere? If it ends up not getting officially backported to branch-1 due to 2.0 around the corner, some of us who build our own deploy may want to integrate into our builds Andrew: Yes, please, the patches will be useful to the community even if we decide not to backport into an official 1.x release. Enis: I don't see any reason why we cannot backport to branch-1. Ted: Opening a JIRA would be fine. This makes it easier for people to obtain the patch(es) Nick: From the DISCUSS thread re: EOL of 1.1, it seems we'll continue to support 1.x releases for some time... I would guess these will be maintained until 2.2 at least. Therefore, offheap patches that have seen production exposure seem like a reasonable candidate for backport, perhaps in a 1.4 or 1.5 release timeframe. Anoop: Because of some compatibility issues, we decide that this will be done in 2.0 only.. Ya as Andy said, it would be great to share the 1.x backported patches. The following is all the jira ids we have back ported: HBASE-10930 Change Filters and GetClosestRowBeforeTracker to work with Cells (Ram) HBASE-13373 Squash HFileReaderV3 together with HFileReaderV2 and AbstractHFileReader; ditto for Scanners and BlockReader, etc. HBASE-13429 Remove deprecated seek/reseek methods from HFileScanner. 
HBASE-13450 - Purge RawBytescomparator from the writers and readers for HBASE-10800 (Ram) HBASE-13501 - Deprecate/Remove getComparator() in HRegionInfo. HBASE-12048 Remove deprecated APIs from Filter. HBASE-10800 - Use CellComparator instead of KVComparator (Ram) HBASE-13679 Change ColumnTracker and SQM to deal with Cell instead of byte[], int, int. HBASE-13642 Deprecate RegionObserver#postScannerFilterRow CP hook with byte[],int,int args in favor of taking Cell arg. HBASE-13641 Deperecate Filter#filterRowKey(byte[] buffer, int offset, int length) in favor of filterRowKey(Cell firstRowCell). HBASE-13827 Delayed scanner close in KeyValueHeap and StoreScanner. HBASE-13871 Change RegionScannerImpl to deal with Cell instead of byte[], int, int. HBASE-11911 Break up tests into more fine grained categories (Alex Newman) HBASE-12059 Create hbase-annotations module HBASE-12106 Move test annotations to test artifact (Enis Soztutar) HBASE-13916 Create MultiByteBuffer an aggregation of ByteBuffers. HBASE-15679 Assertion on wrong variable in TestReplicationThrottler#testThrottling HBASE-13931 Move Unsafe based operations to UnsafeAccess. HBASE-12345 Unsafe based ByteBuffer Comparator. HBASE-13998 Remove CellComparator#compareRows(byte[] left, int loffset, int llength, byte[] right, int roffset, int rlength). HBASE-13998 Remove CellComparator#compareRows()- Addendum to fix javadoc warn HBASE-13579 Avoid isCellTTLExpired() for NO-TAG cases (partially backport this patch) HBASE-13448 New Cell implementation with cached component offsets/lengths. HBASE-13387 Add ByteBufferedCell an extension to Cell. HBASE-13387 Add ByteBufferedCell an extension to Cell - addendum. HBASE-12650 Move ServerName to hbase-common module (partially backport this patch) HBASE-12296 Filters should work with ByteBufferedCell. HBASE-14120 ByteBufferUtils#compareTo small optimization. 
HBASE-13510 - Purge ByteBloomFilter (Ram) HBASE-13451 - Make the HFileBlockIndex blockKeys to Cells so that it could be easy to use in the CellComparators (Ram) HBASE-13614 - Avoid temp KeyOnlyKeyValue temp objects creations in read hot path (Ram) HBASE-13939 - Make HFileReaderImpl.getFirstKeyInBlock() to return a Cell (Ram) HBASE-13307 Making methods under ScannerV2#next inlineable, faster HBASE-14020 Unsafe based optimized write in ByteBufferOutputStream. HBASE-13977 - Convert getKey and related APIs to Cell (Ram) HBASE-11927 Use Native Hadoop Library for HFile checksum. (Apekshit) HBASE-12213 HFileBlock backed by Array of ByteBuffers (Ram) HBASE-12084 Remove deprecated APIs from Result. HBASE-12084 Remove deprecated APIs from Result - shell addendum HBASE-13754 Allow non KeyValue Cell types also to oswrite. HBASE-14047 - Cleanup deprecated APIs from Cell class (Ashish Singhi) HBASE-13817 ByteBufferOuputStream - add writeInt support. HBASE-12374 Change DBEs to work with new BB based cell. HBASE-14116 Change ByteBuff.getXXXStrictlyForward to relative position based reads HBASE-14073 TestRemoteTable.te
[jira] [Commented] (HBASE-17138) Backport read-path offheap (HBASE-11425) to branch-1
[ https://issues.apache.org/jira/browse/HBASE-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15721697#comment-15721697 ] Yu Sun commented on HBASE-17138: Yes, [~anoop.hbase] is right: there are some building-block JIRAs that are not under HBASE-11425. To resolve the merge conflicts I have backported about 77 patches in total to our customized branch. Should I list all the JIRA ids I have backported as sub-tasks here for you and [~ram_krish] to check? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-17020) keylen in midkey() dont computed correctly
[ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-17020: --- Attachment: HBASE-17020-branch-0.98.patch Submitted a patch for 0.98. > keylen in midkey() dont computed correctly > -- > > Key: HBASE-17020 > URL: https://issues.apache.org/jira/browse/HBASE-17020 > Project: HBase > Issue Type: Bug > Components: HFile >Reporter: Yu Sun >Assignee: Yu Sun > Attachments: HBASE-17020-branch-0.98.patch, HBASE-17020-v1.patch, > HBASE-17020-v2.patch, HBASE-17020-v2.patch, HBASE-17020-v3-branch1.1.patch > > > in CellBasedKeyBlockIndexReader.midkey(): > {code} > ByteBuff b = midLeafBlock.getBufferWithoutHeader(); > int numDataBlocks = b.getIntAfterPosition(0); > int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * > (midKeyEntry + 1)); > int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry > + 2)) - keyRelOffset; > {code} > the local variable keyLen computed here actually equals the total length of: > SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length; > the code is: > {code} > void add(byte[] firstKey, long blockOffset, int onDiskDataSize, > long curTotalNumSubEntries) { > // Record the offset for the secondary index > secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize); > curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD > + firstKey.length; > {code} > when the midkey is the last entry of a leaf-level index block, this may throw: > {quote} > 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] > regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region > pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.] 
> java.lang.ArrayIndexOutOfBoundsException > at > org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936) > at > org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519) > at > org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520) > at > org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706) > at > org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126) > at > org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983) > at > org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77) > at > org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756) > at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513) > at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471) > at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75) > at > org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259) > at java.lang.Thread.run(Thread.java:756) > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
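To make the off-by-overhead arithmetic in the report above concrete, here is a small standalone model of the secondary-index offset marks. This is an illustrative sketch, not the actual HFileBlockIndex code: the class name MidkeyLenSketch and the sample key lengths are made up, and it assumes SECONDARY_INDEX_ENTRY_OVERHEAD is 12 bytes (a long offset plus an int on-disk size per entry).

```java
import java.util.ArrayList;
import java.util.List;

public class MidkeyLenSketch {
    // Assumed layout: each non-root index entry carries a long offset and an
    // int on-disk size before the key bytes, i.e. 12 bytes of overhead.
    static final int SECONDARY_INDEX_ENTRY_OVERHEAD = 8 + 4;

    /**
     * Models the writer's add(): records one offset mark per entry, then
     * returns {buggyKeyLen, fixedKeyLen} for the requested entry. The buggy
     * value is the raw delta between consecutive offset marks (what midkey()
     * reads); it exceeds the real key length by the per-entry overhead.
     */
    static int[] midKeyLens(int[] keyLens, int midKeyEntry) {
        List<Integer> offsetMarks = new ArrayList<>();
        int curTotalNonRootEntrySize = 0;
        for (int kl : keyLens) {
            offsetMarks.add(curTotalNonRootEntrySize);
            curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD + kl;
        }
        int keyRelOffset = offsetMarks.get(midKeyEntry);
        int nextOffset = (midKeyEntry + 1 < offsetMarks.size())
            ? offsetMarks.get(midKeyEntry + 1)
            : curTotalNonRootEntrySize; // last entry: bounded by the block total
        int buggyKeyLen = nextOffset - keyRelOffset;
        return new int[] { buggyKeyLen, buggyKeyLen - SECONDARY_INDEX_ENTRY_OVERHEAD };
    }

    public static void main(String[] args) {
        // Last entry of the block: the case the stack trace above hit.
        int[] r = midKeyLens(new int[] { 7, 13, 5 }, 2);
        System.out.println("buggy=" + r[0] + " fixed=" + r[1]); // buggy=17 fixed=5
    }
}
```

Copying buggyKeyLen bytes for the last entry runs past the key data, which is consistent with the ArrayIndexOutOfBoundsException in the stack trace; subtracting SECONDARY_INDEX_ENTRY_OVERHEAD yields the actual first-key length.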
[jira] [Updated] (HBASE-17020) keylen in midkey() dont computed correctly
[ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-17020: --- Attachment: HBASE-17020-v3-branch1.1.patch Attached a patch for branch-1.0, 1.1, 1.2 and 1.3.
[jira] [Issue Comment Deleted] (HBASE-17020) keylen in midkey() dont computed correctly
[ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-17020: --- Comment: was deleted (was: ok, i will prepare patch for other branches)
[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly
[ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15653621#comment-15653621 ] Yu Sun commented on HBASE-17020: OK, I will prepare patches for the other branches.
[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly
[ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15647552#comment-15647552 ] Yu Sun commented on HBASE-17020: Thanks, Ram.
[jira] [Comment Edited] (HBASE-17020) keylen in midkey() dont computed correctly
[ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15647145#comment-15647145 ] Yu Sun edited comment on HBASE-17020 at 11/8/16 10:15 AM: -- attach version v2, add a ut to reproduce ArrayIndexOutOfBoundsException was (Author: haoran): attach version v2, add a ut to reproduct ArrayIndexOutOfBoundsException
[jira] [Updated] (HBASE-17020) keylen in midkey() dont computed correctly
[ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Sun updated HBASE-17020:
---------------------------
    Attachment: HBASE-17020-v2.patch

Attaching version v2, which adds a unit test reproducing the ArrayIndexOutOfBoundsException.

> keylen in midkey() dont computed correctly
> ------------------------------------------
>
>         Key: HBASE-17020
>         URL: https://issues.apache.org/jira/browse/HBASE-17020
>     Project: HBase
>  Issue Type: Bug
>  Components: HFile
>    Reporter: Yu Sun
>    Assignee: Yu Sun
> Attachments: HBASE-17020-v1.patch, HBASE-17020-v2.patch
>
> In CellBasedKeyBlockIndexReader.midkey():
> {code}
> ByteBuff b = midLeafBlock.getBufferWithoutHeader();
> int numDataBlocks = b.getIntAfterPosition(0);
> int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry + 1));
> int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry + 2)) - keyRelOffset;
> {code}
> the local variable keyLen ends up holding the total length SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length, because the index writer records cumulative entry sizes:
> {code}
> void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
>     long curTotalNumSubEntries) {
>   // Record the offset for the secondary index
>   secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
>   curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length;
> {code}
> When the midkey is the last entry of a leaf-level index block, this may throw:
> {quote}
> 2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
> java.lang.ArrayIndexOutOfBoundsException
> at org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
> at org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
> at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
> at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
> at org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
> at org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
> at org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
> at org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
> at org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
> at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
> at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:756)
> {quote}
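The off-by-overhead arithmetic described in the issue can be sketched in a small self-contained example. The class and method names below are hypothetical illustrations, not HBase code; only the 12-byte value mirrors HBase's SECONDARY_INDEX_ENTRY_OVERHEAD (an 8-byte offset plus a 4-byte on-disk size per entry):

```java
public class MidkeyLenSketch {
    // Per-entry fixed overhead: long blockOffset (8) + int onDiskDataSize (4).
    static final int SECONDARY_INDEX_ENTRY_OVERHEAD = 12;

    /** Cumulative offset marks, built the way the index writer records them:
     *  each mark is the total non-root entry size accumulated so far. */
    static int[] offsetMarks(byte[][] keys) {
        int[] marks = new int[keys.length + 1];
        for (int i = 0; i < keys.length; i++) {
            marks[i + 1] = marks[i] + SECONDARY_INDEX_ENTRY_OVERHEAD + keys[i].length;
        }
        return marks;
    }

    /** keyLen as midkey() computed it: the difference of consecutive marks.
     *  This includes the fixed per-entry overhead, so it is 12 bytes too large. */
    static int buggyKeyLen(int[] marks, int entry) {
        return marks[entry + 1] - marks[entry];
    }

    /** Corrected keyLen: subtract the per-entry overhead to get only the key bytes. */
    static int fixedKeyLen(int[] marks, int entry) {
        return buggyKeyLen(marks, entry) - SECONDARY_INDEX_ENTRY_OVERHEAD;
    }

    public static void main(String[] args) {
        byte[][] keys = { "aaa".getBytes(), "bbbbb".getBytes(), "cc".getBytes() };
        int[] marks = offsetMarks(keys);
        // For the middle key ("bbbbb", 5 bytes) the buggy length is 17, not 5.
        System.out.println(buggyKeyLen(marks, 1) + " vs " + fixedKeyLen(marks, 1));
    }
}
```

Reading `buggyKeyLen` bytes for the last entry of a leaf block walks 12 bytes past the end of the entry data, which is what triggers the ArrayIndexOutOfBoundsException above.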
[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly
[ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15636854#comment-15636854 ]

Yu Sun commented on HBASE-17020:

No, older versions have the same issue; I believe this bug has existed for several years.
[jira] [Issue Comment Deleted] (HBASE-17020) keylen in midkey() dont computed correctly
[ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Sun updated HBASE-17020:
---------------------------
    Comment: was deleted

(was: Yes, we encountered this exception in our cluster. I tried to write a UT to reproduce it but found it hard, because although in CellBasedKeyBlockIndexReader.midkey()
{code}
byte[] bytes = b.toBytes(keyOffset, keyLen);
{code}
keyOffset + keyLen exceeds b.limit() and b.capacity() (by SECONDARY_INDEX_ENTRY_OVERHEAD == 12 bytes), it is still smaller than b.array().length, because HBase always reads the next block header (33 bytes) into the same buffer. So in ByteBufferUtils.copyFromBufferToArray
{code}
public static void copyFromBufferToArray(byte[] out, ByteBuffer in, int sourceOffset,
    int destinationOffset, int length) {
  if (in.hasArray()) {
    System.arraycopy(in.array(), sourceOffset + in.arrayOffset(), out, destinationOffset, length);
  } else if (UNSAFE_AVAIL) {
    UnsafeAccess.copy(in, sourceOffset, out, destinationOffset, length);
{code}
only in.array().length bounds the copy, so in most cases the ArrayIndexOutOfBoundsException does not occur. I will try to improve the UT this weekend.)
[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly
[ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15636784#comment-15636784 ]

Yu Sun commented on HBASE-17020:

Yes, we encountered this exception in our cluster. I tried to write a UT to reproduce it but found it hard, because although in CellBasedKeyBlockIndexReader.midkey()
{code}
byte[] bytes = b.toBytes(keyOffset, keyLen);
{code}
keyOffset + keyLen exceeds b.limit() and b.capacity() (by SECONDARY_INDEX_ENTRY_OVERHEAD == 12 bytes), it is still smaller than b.array().length, because HBase always reads the next block header (33 bytes) into the same buffer. So in ByteBufferUtils.copyFromBufferToArray
{code}
public static void copyFromBufferToArray(byte[] out, ByteBuffer in, int sourceOffset,
    int destinationOffset, int length) {
  if (in.hasArray()) {
    System.arraycopy(in.array(), sourceOffset + in.arrayOffset(), out, destinationOffset, length);
  } else if (UNSAFE_AVAIL) {
    UnsafeAccess.copy(in, sourceOffset, out, destinationOffset, length);
{code}
only in.array().length bounds the copy, so in most cases the ArrayIndexOutOfBoundsException does not occur. I will try to improve the UT this weekend.
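The over-read behavior described in this comment, where System.arraycopy checks only the backing array's bounds rather than the buffer's logical limit, can be demonstrated with plain java.nio. The class and method names here are illustrative stand-ins, not HBase code; the 33-byte pad mimics the next block header read ahead into the same buffer:

```java
import java.nio.ByteBuffer;

public class OverreadSketch {
    /** Returns true if copying `len` bytes starting at `off` out of the buffer's
     *  backing array succeeds. Mirrors the hasArray() branch of
     *  copyFromBufferToArray: System.arraycopy only enforces the backing array's
     *  bounds, not buf.limit(), so reads past the logical block end can succeed. */
    static boolean copyIgnoresLimit(ByteBuffer buf, int off, int len) {
        byte[] out = new byte[len];
        try {
            System.arraycopy(buf.array(), off + buf.arrayOffset(), out, 0, len);
            return true;
        } catch (ArrayIndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 45-byte backing array, but the "block" logically ends at 12: the extra
        // 33 bytes stand in for the next block's header that was read ahead.
        ByteBuffer buf = ByteBuffer.wrap(new byte[12 + 33]);
        buf.limit(12);
        System.out.println(copyIgnoresLimit(buf, 4, 20)); // past limit, inside array: true
        System.out.println(copyIgnoresLimit(buf, 4, 60)); // past the array itself: false
    }
}
```

This matches the comment's point: the bad keyLen usually lands inside the read-ahead header bytes, so the exception only fires in the rarer case where the copy runs past the backing array too.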
[jira] [Commented] (HBASE-17020) keylen in midkey() dont computed correctly
[ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15636786#comment-15636786 ]

Yu Sun commented on HBASE-17020:

Yes, we encountered this exception in our cluster. I tried to write a UT to reproduce it but found it hard, because although in CellBasedKeyBlockIndexReader.midkey()
{code}
byte[] bytes = b.toBytes(keyOffset, keyLen);
{code}
keyOffset + keyLen exceeds b.limit() and b.capacity() (by SECONDARY_INDEX_ENTRY_OVERHEAD == 12 bytes), it is still smaller than b.array().length, because HBase always reads the next block header (33 bytes) into the same buffer. So in ByteBufferUtils.copyFromBufferToArray
{code}
public static void copyFromBufferToArray(byte[] out, ByteBuffer in, int sourceOffset,
    int destinationOffset, int length) {
  if (in.hasArray()) {
    System.arraycopy(in.array(), sourceOffset + in.arrayOffset(), out, destinationOffset, length);
  } else if (UNSAFE_AVAIL) {
    UnsafeAccess.copy(in, sourceOffset, out, destinationOffset, length);
{code}
only in.array().length bounds the copy, so in most cases the ArrayIndexOutOfBoundsException does not occur. I will try to improve the UT this weekend.
[jira] [Updated] (HBASE-17020) keylen in midkey() dont computed correctly
[ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Sun updated HBASE-17020:
---------------------------
    Attachment: HBASE-17020-v1.patch

Patch version v1 attached; is a UT required?
[jira] [Updated] (HBASE-17020) keylen in midkey() dont computed correctly
[ https://issues.apache.org/jira/browse/HBASE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Sun updated HBASE-17020:
---------------------------
    Summary: keylen in midkey() dont computed correctly (was: keylen in midkey() should be computed correctly)
[jira] [Created] (HBASE-17020) keylen in midkey() should be computed correctly
Yu Sun created HBASE-17020:
---------------------------
    Summary: keylen in midkey() should be computed correctly
        Key: HBASE-17020
        URL: https://issues.apache.org/jira/browse/HBASE-17020
    Project: HBase
 Issue Type: Bug
 Components: HFile
   Reporter: Yu Sun
   Assignee: Yu Sun

In CellBasedKeyBlockIndexReader.midkey():
{code}
ByteBuff b = midLeafBlock.getBufferWithoutHeader();
int numDataBlocks = b.getIntAfterPosition(0);
int keyRelOffset = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry + 1));
int keyLen = b.getIntAfterPosition(Bytes.SIZEOF_INT * (midKeyEntry + 2)) - keyRelOffset;
{code}
the local variable keyLen ends up holding the total length SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length, because the index writer records cumulative entry sizes:
{code}
void add(byte[] firstKey, long blockOffset, int onDiskDataSize,
    long curTotalNumSubEntries) {
  // Record the offset for the secondary index
  secondaryIndexOffsetMarks.add(curTotalNonRootEntrySize);
  curTotalNonRootEntrySize += SECONDARY_INDEX_ENTRY_OVERHEAD + firstKey.length;
{code}
When the midkey is the last entry of a leaf-level index block, this may throw:
{quote}
2016-10-01 12:27:55,186 ERROR [MemStoreFlusher.0] regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region pora_6_item_feature,0061:,1473838922457.12617bc4ebbfd171018bf96ac9bdd2a7.]
java.lang.ArrayIndexOutOfBoundsException
at org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:936)
at org.apache.hadoop.hbase.nio.SingleByteBuff.toBytes(SingleByteBuff.java:303)
at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.midkey(HFileBlockIndex.java:419)
at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.midkey(HFileReaderImpl.java:1519)
at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1520)
at org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:706)
at org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
at org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1983)
at org.apache.hadoop.hbase.regionserver.ConstantFamilySizeRegionSplitPolicy.getSplitPoint(ConstantFamilySizeRegionSplitPolicy.java:77)
at org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7756)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:513)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
at java.lang.Thread.run(Thread.java:756)
{quote}
[jira] [Updated] (HBASE-16704) Scan will broken while work with KeyValueCodecWithTags
[ https://issues.apache.org/jira/browse/HBASE-16704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Sun updated HBASE-16704:
---------------------------
    Assignee: Anoop Sam John

> Scan will broken while work with KeyValueCodecWithTags
> ------------------------------------------------------
>
>              Key: HBASE-16704
>              URL: https://issues.apache.org/jira/browse/HBASE-16704
>          Project: HBase
>       Issue Type: Bug
> Affects Versions: 2.0.0
>         Reporter: Yu Sun
>         Assignee: Anoop Sam John
>
> Scans always break when LIMIT is set to more than 1 and the region server's hbase.client.rpc.codec is set to org.apache.hadoop.hbase.codec.KeyValueCodecWithTags.
> How to reproduce:
> 1. 1 master + 1 rs, codec set to KeyValueCodecWithTags.
> 2. Create a table table_1024B_30g with 1 cf and only 1 qualifier, then load some data with ycsb.
> 3. scan 'table_1024B_30g', {LIMIT => 2, STARTROW => 'user5499'} (STARTROW can be any valid start row).
> 4. The scan fails.
> This appears to be a bug in KeyValueCodecWithTags; after some investigation, I found that some keys are not serialized correctly.
[jira] [Created] (HBASE-16704) Scan will broken while work with KeyValueCodecWithTags
Yu Sun created HBASE-16704:
---------------------------
         Summary: Scan will broken while work with KeyValueCodecWithTags
             Key: HBASE-16704
             URL: https://issues.apache.org/jira/browse/HBASE-16704
         Project: HBase
      Issue Type: Bug
Affects Versions: 2.0.0
        Reporter: Yu Sun

Scans always break when LIMIT is set to more than 1 and the region server's hbase.client.rpc.codec is set to org.apache.hadoop.hbase.codec.KeyValueCodecWithTags.

How to reproduce:
1. 1 master + 1 rs, codec set to KeyValueCodecWithTags.
2. Create a table table_1024B_30g with 1 cf and only 1 qualifier, then load some data with ycsb.
3. scan 'table_1024B_30g', {LIMIT => 2, STARTROW => 'user5499'} (STARTROW can be any valid start row).
4. The scan fails.

This appears to be a bug in KeyValueCodecWithTags; after some investigation, I found that some keys are not serialized correctly.
[jira] [Updated] (HBASE-16609) Fake cells EmptyByteBufferedCell created in read path not implementing SettableSequenceId
[ https://issues.apache.org/jira/browse/HBASE-16609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Sun updated HBASE-16609:
---------------------------
    Description:

I backported the 2.0 offheap read path to hbase-1.1.2, and when testing I encountered a problem similar to HBASE-15379. Here is the stack trace:
{noformat}
java.io.IOException: java.lang.UnsupportedOperationException: Cell is not of type org.apache.hadoop.hbase.SettableSequenceId
at org.apache.hadoop.hbase.CellUtil.setSequenceId(CellUtil.java:915)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.setCurrentCell(StoreFileScanner.java:203)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.requestSeek(StoreFileScanner.java:338)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:821)
at org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:809)
at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:636)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:153)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5611)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5750)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5551)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5528)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5515)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2125)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2068)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32201)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:790)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:102)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
{noformat}
This occurs in the read path when offheap is used, mostly because the ByteBuffer-backed cells do not implement the SettableSequenceId interface.

    was:

I backported the 2.0 offheap read path to hbase-1.1.2, and when testing I encountered a problem similar to HBASE-14099. Here is the stack trace:
{noformat}
java.io.IOException: java.lang.UnsupportedOperationException: Cell is not of type org.apache.hadoop.hbase.SettableSequenceId
at org.apache.hadoop.hbase.CellUtil.setSequenceId(CellUtil.java:915)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.setCurrentCell(StoreFileScanner.java:203)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.requestSeek(StoreFileScanner.java:338)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:821)
at org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:809)
at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:636)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:153)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5611)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5750)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5551)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5528)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5515)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2125)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2068)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32201)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:790)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:102)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
{noformat}
this will occur in read path when offheap is used. mostly due to ByteBuffer backed Cells
[jira] [Commented] (HBASE-16609) Fake cells EmptyByteBufferedCell created in read path not implementing SettableSequenceId
[ https://issues.apache.org/jira/browse/HBASE-16609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483138#comment-15483138 ] Yu Sun commented on HBASE-16609: Sorry, I made a mistake; it should be this JIRA: HBASE-15379.
> Fake cells EmptyByteBufferedCell created in read path not implementing
> SettableSequenceId
> ---
>
> Key: HBASE-16609
> URL: https://issues.apache.org/jira/browse/HBASE-16609
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Yu Sun
> Assignee: Yu Sun
> Fix For: 2.0.0
>
> Attachments: HBASE-16609-v1.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16609) Fake cells EmptyByteBufferedCell created in read path not implementing SettableSequenceId
[ https://issues.apache.org/jira/browse/HBASE-16609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-16609:
---
Attachment: HBASE-16609-v1.patch

Attaching patch v1.
[jira] [Created] (HBASE-16609) Fake cells EmptyByteBufferedCell created in read path not implementing SettableSequenceId
Yu Sun created HBASE-16609:
--
Summary: Fake cells EmptyByteBufferedCell created in read path not implementing SettableSequenceId
Key: HBASE-16609
URL: https://issues.apache.org/jira/browse/HBASE-16609
Project: HBase
Issue Type: Bug
Reporter: Yu Sun
Assignee: Yu Sun
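The UnsupportedOperationException above comes from the type check inside CellUtil.setSequenceId: only cells that opt in via SettableSequenceId can have a sequence id stamped on them, so a fake ByteBuffer-backed cell that omits the interface blows up on the seek path. A minimal, self-contained Java sketch of the failure mode and the shape of the fix; the interfaces below are simplified stand-ins, not the real HBase classes:

```java
// Simplified stand-ins for the HBase interfaces, so this sketch compiles on its own.
interface Cell {
    long getSequenceId();
}

interface SettableSequenceId {
    void setSequenceId(long seqId);
}

final class CellUtilSketch {
    // Mirrors the behavior of CellUtil.setSequenceId: it can only mutate
    // cells that also implement SettableSequenceId.
    static void setSequenceId(Cell cell, long seqId) {
        if (!(cell instanceof SettableSequenceId)) {
            throw new UnsupportedOperationException(
                "Cell is not of type SettableSequenceId");
        }
        ((SettableSequenceId) cell).setSequenceId(seqId);
    }
}

// A "fake" cell like EmptyByteBufferedCell before the patch: it is a Cell
// but not a SettableSequenceId, so setSequenceId throws on it.
final class BrokenFakeCell implements Cell {
    public long getSequenceId() { return 0; }
}

// The shape of the fix: the fake cell also implements SettableSequenceId.
final class FixedFakeCell implements Cell, SettableSequenceId {
    private long seqId;
    public long getSequenceId() { return seqId; }
    public void setSequenceId(long seqId) { this.seqId = seqId; }
}
```

With that change, the StoreFileScanner.setCurrentCell path can stamp the sequence id on the fake cell instead of throwing.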
[jira] [Commented] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407761#comment-15407761 ] Yu Sun commented on HBASE-16287: ok
> LruBlockCache size should not exceed acceptableSize too many
>
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
> Issue Type: Improvement
> Components: BlockCache
> Reporter: Yu Sun
> Assignee: Yu Sun
> Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch, HBASE-16287-v3.patch, HBASE-16287-v4.patch, HBASE-16287-v5.patch, HBASE-16287-v6.patch, HBASE-16287-v7.patch, HBASE-16287-v8.patch, HBASE-16287-v9.patch
>
> Our regionserver has the following configuration:
> -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We only use the block cache, with hfile.block.cache.size = 0.3 in hbase-site.xml, so under this configuration the LRU block cache size will be (32g - 1g) * 0.3 = 9.3g. But in some scenarios, some of the regionservers run into continuous full GCs for hours and, most importantly, after the full GCs most of the objects in the old generation are not collected. So we dumped the heap, analyzed it with MAT, and observed an obvious memory leak in LruBlockCache, which occupied about 16g of memory. We then set the LruBlockCache log level to TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, cachingAccesses=99462650031, cachingHits=93468334621, cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, evictedPerRun=20051.93359375
> {quote}
> We can see the block cache size has exceeded acceptableSize by far too much, which makes the full GCs even worse.
> After some investigation, I found that in this function:
> {code:borderStyle=solid}
> public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>     final boolean cacheDataInL1) {
> {code}
> no matter how full the block cache already is, the block is simply put into it. But if the eviction thread is not fast enough, the block cache size will increase significantly.
> So I think we should add a check here: for example, if the block cache size > 1.2 * acceptableSize(), just return and don't put blocks in until the block cache size is back under the watermark. If this sounds reasonable, I can make a small patch for it.
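The check proposed above can be sketched as follows. This is a hypothetical, self-contained version, not the actual LruBlockCache code: the names (hardLimitFactor, failedInserts, acceptableFactor) are illustrative, and the real cacheBlock signature takes a BlockCacheKey and a Cacheable as quoted in the description:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the proposed hard-limit guard for cacheBlock():
// reject inserts once the cache has grown past hardLimitFactor * acceptableSize,
// instead of accepting blocks faster than the eviction thread can reclaim them.
final class LruCacheSketch {
    private final long maxSize;
    private final float acceptableFactor = 0.99f; // fraction of maxSize considered "full enough"
    private final float hardLimitFactor = 1.2f;   // the proposed cap: 1.2 * acceptableSize
    private final AtomicLong currentSize = new AtomicLong();
    private final AtomicLong failedInserts = new AtomicLong(); // stand-in for stats.failInsert()
    private final ConcurrentHashMap<String, byte[]> map = new ConcurrentHashMap<>();

    LruCacheSketch(long maxSize) { this.maxSize = maxSize; }

    long acceptableSize() { return (long) (maxSize * acceptableFactor); }

    // Returns true if the block was cached, false if rejected by the hard limit.
    boolean cacheBlock(String key, byte[] block) {
        // Proposed change: while the eviction thread is this far behind,
        // count the failed insert and return without caching.
        if (currentSize.get() + block.length > hardLimitFactor * acceptableSize()) {
            failedInserts.incrementAndGet();
            return false;
        }
        if (map.putIfAbsent(key, block) == null) {
            currentSize.addAndGet(block.length);
        }
        return true;
    }

    long failedInsertCount() { return failedInserts.get(); }
}
```

The key design point is that rejection is bounded and cheap: once the cache is past the hard limit, every extra block is dropped at the door, so heap growth stalls at roughly 1.2 * acceptableSize rather than running away while eviction lags.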
[jira] [Commented] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407762#comment-15407762 ] Yu Sun commented on HBASE-16287: ok
[jira] [Commented] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407598#comment-15407598 ] Yu Sun commented on HBASE-16287: Thanks for the comments; v9 fixes this. BTW, I'd like to fix the evictionInProgress problem if [~chenheng] and [~anoop.hbase] don't have time (smile)(smile)
[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-16287:
---
Attachment: HBASE-16287-v9.patch

Resubmitting v9 for a QA run. The main change: calculate acceptableSize just once.
[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-16287:
---
Attachment: HBASE-16287-v8.patch

v8 is based on v6 plus [~yuzhih...@gmail.com]'s advice.
[jira] [Commented] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407101#comment-15407101 ] Yu Sun commented on HBASE-16287: [~chenheng] Thanks for the reminder; yes, you are right, maxSize may change at runtime. I will resubmit another version based on v6 with Ted Yu's advice, is that OK?
[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-16287:
---
Attachment: HBASE-16287-v7.patch

Resubmitting v7 for a QA run. v7 contains the following two changes:
1. calculate hardLimitSize in advance
2. call stats.failInsert() when putting a block into the cache fails
[jira] [Commented] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406969#comment-15406969 ] Yu Sun commented on HBASE-16287: Thanks for [~yuzhih...@gmail.com]'s comments; yes, I will update the patch to fix this.
[jira] [Commented] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406103#comment-15406103 ] Yu Sun commented on HBASE-16287: yes, I have tried this patch on some rs of our real cluster, it looks good. > LruBlockCache size should not exceed acceptableSize too many > > > Key: HBASE-16287 > URL: https://issues.apache.org/jira/browse/HBASE-16287 > Project: HBase > Issue Type: Improvement > Components: BlockCache >Reporter: Yu Sun >Assignee: Yu Sun > Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch, > HBASE-16287-v3.patch, HBASE-16287-v4.patch, HBASE-16287-v5.patch, > HBASE-16287-v6.patch > > > Our regionserver has a configuation as bellow: > -Xmn4g -Xms32g -Xmx32g -XX:SurvriorRatio=2 -XX:+UseConcMarkSweepGC > also we only use blockcache,and set hfile.block.cache.size = 0.3 in > hbase_site.xml,so under this configuration, the lru block cache size will > be(32g-1g)*0.3=9.3g. but in some scenarios,some of the rs will occur > continuous FullGC for hours and most importantly, after FullGC most of the > object in old will not be GCed. so we dump the heap and analyse with MAT and > we observed a obvious memory leak in LruBlockCache, which occpy about 16g > memory, then we set set class LruBlockCache log level to TRACE and observed > this in log: > {quote} > 2016-07-22 12:17:58,158 INFO [LruBlockCacheStatsExecutor] > hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, > blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, > , cachingAccesses=99462650031, cachingHits=93468334621, > cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, > evictedPerRun=20051.93359375{quote} > we can see blockcache size has exceeded acceptableSize too many, which will > cause the FullGC more seriously. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-16287: --- Attachment: HBASE-16287-v6.patch retry
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-16287: --- Attachment: HBASE-16287-v5.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-16287: --- Attachment: HBASE-16287-v4.patch The failed UT seems to have no relation to the changes I made, and I also can't reproduce the UT failure on my local machine, so I resubmit the patch for further checking.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-16287: --- Attachment: HBASE-16287-v3.patch submit patch v3, rebased to master -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16300) LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly
[ https://issues.apache.org/jira/browse/HBASE-16300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398770#comment-15398770 ] Yu Sun commented on HBASE-16300: thanks [~stack] and [~carp84] > LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size > correctly > > > Key: HBASE-16300 > URL: https://issues.apache.org/jira/browse/HBASE-16300 > Project: HBase > Issue Type: Bug >Reporter: Yu Sun >Assignee: Yu Sun > Attachments: HBASE-16300-v1.patch > > > In the current master {{LruBlockCache}}, CACHE_FIXED_OVERHEAD is calculated as > this: > {code} > public final static long CACHE_FIXED_OVERHEAD = ClassSize.align( > (3 * Bytes.SIZEOF_LONG) + (10 * ClassSize.REFERENCE) + > (5 * Bytes.SIZEOF_FLOAT) + (2 * Bytes.SIZEOF_BOOLEAN) > + ClassSize.OBJECT); > {code} > After some investigation, I think there is something wrong here. In {{class > LruBlockCache}}, excluding static variables (which belong to the class), there are > 4 long variables (maxBlockSize, maxSize, blockSize and overhead), 9 reference > variables and 2 boolean variables, so the above code will not calculate the > LruBlockCache instance size correctly. > The current related UT does not fail mostly because the result is 8-byte aligned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398623#comment-15398623 ] Yu Sun commented on HBASE-16287: will attach patch v3 after {{HBASE-16300}} is committed; otherwise TestHeapSize will fail when a new field of type float is added to LruBlockCache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16300) LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly
[ https://issues.apache.org/jira/browse/HBASE-16300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398566#comment-15398566 ] Yu Sun commented on HBASE-16300: this small bug was likely introduced by {{HBASE-14793}}, which added a new long field {{private final long maxBlockSize;}} to LruBlockCache, but in the update: {code} public final static long CACHE_FIXED_OVERHEAD = ClassSize.align( - (3 * Bytes.SIZEOF_LONG) + (9 * ClassSize.REFERENCE) + - (5 * Bytes.SIZEOF_FLOAT) + Bytes.SIZEOF_BOOLEAN + (3 * Bytes.SIZEOF_LONG) + (10 * ClassSize.REFERENCE) + + (5 * Bytes.SIZEOF_FLOAT) + (2 * Bytes.SIZEOF_BOOLEAN) + ClassSize.OBJECT); {code} the long field was counted as a reference. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
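A quick sanity check of why the miscount went unnoticed: when a reference and a long are both tallied as 8 bytes, counting one long as a reference changes nothing numerically. The constants below are assumptions for illustration only, not ClassSize's actual values.

```java
// Recompute the fixed overhead with the buggy and the corrected field counts.
// Assumed sizes: 8-byte long, 8-byte reference, 4-byte float, 1-byte boolean,
// 16-byte object header, 8-byte alignment.
public class OverheadSketch {
    static final int LONG = 8, REF = 8, FLOAT = 4, BOOLEAN = 1, OBJECT = 16;

    static long align(long n) { return (n + 7) & ~7L; } // round up to a multiple of 8

    public static void main(String[] args) {
        // Buggy count: maxBlockSize (a long) tallied as a 10th reference.
        long buggy = align(3 * LONG + 10 * REF + 5 * FLOAT + 2 * BOOLEAN + OBJECT);
        // Corrected count: 4 longs, 9 references.
        long fixed = align(4 * LONG + 9 * REF + 5 * FLOAT + 2 * BOOLEAN + OBJECT);
        System.out.println(buggy == fixed); // prints true: equal only because REF == LONG here
    }
}
```

If references were 4 bytes (compressed oops) the two tallies would diverge, which is the kind of environment where a heap-size UT would catch the miscount.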
[jira] [Commented] (HBASE-16300) LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly
[ https://issues.apache.org/jira/browse/HBASE-16300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398529#comment-15398529 ] Yu Sun commented on HBASE-16300: "org.apache.hadoop.hbase.io.TestHeapSize.testSizes" will cover this change, so no new UT is needed, I think. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16300) LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly
[ https://issues.apache.org/jira/browse/HBASE-16300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398031#comment-15398031 ] Yu Sun commented on HBASE-16300: after some investigation, the current UTs such as {{org.apache.hadoop.hbase.io.TestHeapSize}} pass mostly due to alignment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16300) LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly
[ https://issues.apache.org/jira/browse/HBASE-16300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-16300: --- Attachment: HBASE-16300-v1.patch attach patch v1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16300) LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly
[ https://issues.apache.org/jira/browse/HBASE-16300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-16300: --- Description: In the current master {{LruBlockCache}}, CACHE_FIXED_OVERHEAD is calculated as this: {code} public final static long CACHE_FIXED_OVERHEAD = ClassSize.align( (3 * Bytes.SIZEOF_LONG) + (10 * ClassSize.REFERENCE) + (5 * Bytes.SIZEOF_FLOAT) + (2 * Bytes.SIZEOF_BOOLEAN) + ClassSize.OBJECT); {code} After some investigation, I think there is something wrong here. In {{class LruBlockCache}}, excluding static variables (which belong to the class), there are 4 long variables (maxBlockSize, maxSize, blockSize and overhead), 9 reference variables and 2 boolean variables, so the above code will not calculate the LruBlockCache instance size correctly. The current related UT does not fail mostly because the result is 8-byte aligned. was: In the current master {{LruBlockCache}}, CACHE_FIXED_OVERHEAD is calculated as this: {code} public final static long CACHE_FIXED_OVERHEAD = ClassSize.align( (3 * Bytes.SIZEOF_LONG) + (10 * ClassSize.REFERENCE) + (5 * Bytes.SIZEOF_FLOAT) + (2 * Bytes.SIZEOF_BOOLEAN) + ClassSize.OBJECT); {code} After some investigation, I think there is something wrong here. In {{class LruBlockCache}}, excluding static variables (which belong to the class), there are 4 long variables (maxBlockSize, maxSize, blockSize and overhead), 9 reference variables and 2 boolean variables, so the above code will not calculate the LruBlockCache instance size correctly.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16300) LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly
[ https://issues.apache.org/jira/browse/HBASE-16300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-16300: --- Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16300) LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly
Yu Sun created HBASE-16300: -- Summary: LruBlockCache.CACHE_FIXED_OVERHEAD should calculate LruBlockCache size correctly Key: HBASE-16300 URL: https://issues.apache.org/jira/browse/HBASE-16300 Project: HBase Issue Type: Bug Reporter: Yu Sun Assignee: Yu Sun -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-16287: --- Attachment: HBASE-16287-v2.patch attach patch v2 to fix the failed UT This patch also contains a fix for org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testStoreFileCacheOnWrite(), which failed because of this patch. It simply sets LruBlockCache.LRU_HARD_CAPACITY_LIMIT_FACTOR_CONFIG_NAME to 2.0f; if we don't apply this change, TestCacheOnWrite.testStoreFileCacheOnWrite() fails and the output log is: {quote} 2016-07-28 23:02:49,801 INFO [main] hfile.CacheConfig(285): blockCache=LruBlockCache{blockCount=0, currentSize=159452224, freeSize=-25234496, maxSize=134217728, heapSize=159452224, minSize=127506840, minFactor=0.95, multiSize=63753420, multiFactor=0.5, singleSize=31876710, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=true, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=true, prefetchOnOpen=false 2016-07-28 23:02:49,807 DEBUG [main] hfile.HFile$WriterFactory(345): Unable to set drop behind on /home/hongxi.sy/hbase/hbase-server/target/test-data/b1c99d85-27e3-4796-a66b-324feb06c620/test_cache_on_write/9174b12e141143acb9d4be7b6e7165a9 {quote} From the log above we can see currentSize > 1.2f * maxSize * DEFAULT_ACCEPTABLE_FACTOR, that is 159452224 > 159450660.864, so the block being read is not put into the LRU cache and the assert fails. Here I just increase the hard limit factor to make the LRU cache large enough for all the blocks of the file being read.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
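The arithmetic quoted from that test log can be checked directly. A tiny sketch, assuming 0.99 for DEFAULT_ACCEPTABLE_FACTOR (consistent with the quoted 159450660.864 threshold):

```java
// Verify the threshold comparison from the TestCacheOnWrite log:
// the cached size barely exceeds hardLimitFactor * acceptableSize,
// and relaxing the factor to 2.0 keeps caching enabled for the test.
public class HardLimitCheck {
    public static void main(String[] args) {
        long currentSize = 159452224L;            // from the quoted log
        long maxSize = 134217728L;                // 128 MB, from the quoted log
        double acceptable = maxSize * 0.99;       // assumed DEFAULT_ACCEPTABLE_FACTOR
        System.out.println(currentSize > 1.2 * acceptable); // true: block rejected
        System.out.println(currentSize > 2.0 * acceptable); // false: block accepted
    }
}
```

This shows why the test failure was marginal: the cache was only about 1.5 KB over the 1.2x threshold, so bumping the factor to 2.0 for that test is enough headroom.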
[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-16287: --- Attachment: HBASE-16287-v1.patch attach patch v1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-16287: --- Assignee: Yu Sun Status: Patch Available (was: Open)
> LruBlockCache size should not exceed acceptableSize too many
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
> Issue Type: Improvement
> Components: BlockCache
> Reporter: Yu Sun
> Assignee: Yu Sun
[jira] [Commented] (HBASE-16287) BlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395252#comment-15395252 ] Yu Sun commented on HBASE-16287: [~chenheng] thanks, I will upload the patch later today.
> BlockCache size should not exceed acceptableSize too many
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
> Issue Type: Improvement
> Components: BlockCache
> Reporter: Yu Sun
[jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-16287: --- Summary: LruBlockCache size should not exceed acceptableSize too many (was: BlockCache size should not exceed acceptableSize too many)
> LruBlockCache size should not exceed acceptableSize too many
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
> Issue Type: Improvement
> Components: BlockCache
> Reporter: Yu Sun
[jira] [Commented] (HBASE-16287) BlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395250#comment-15395250 ] Yu Sun commented on HBASE-16287: [~anoop.hbase] sorry for my late reply, and thanks for your comments. Yes, you can open another JIRA for the L2 cache, and I will change this JIRA's title soon.
> BlockCache size should not exceed acceptableSize too many
> Key: HBASE-16287
> URL: https://issues.apache.org/jira/browse/HBASE-16287
> Project: HBase
> Issue Type: Improvement
> Components: BlockCache
> Reporter: Yu Sun
[jira] [Comment Edited] (HBASE-16287) BlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395233#comment-15395233 ] Yu Sun edited comment on HBASE-16287 at 7/27/16 8:33 AM: -
{quote} Why -1g? We calc the BC size by conf xmx value * BC percentage. {quote}
Under this JVM configuration (-Xmn4g -XX:SurvivorRatio=2), one survivor space will be 4g/(2+1+1) = 1g, and at any time (except between a young GC and some full GCs, not CMS cycles) at least one of the two survivor spaces is empty and contains no objects. So if we ask the JVM for the max heap size, it will return Xmx minus one survivor space.
{code:borderStyle=solid}
public static synchronized BlockCache instantiateBlockCache(Configuration conf) {
  if (GLOBAL_BLOCK_CACHE_INSTANCE != null) return GLOBAL_BLOCK_CACHE_INSTANCE;
  if (blockCacheDisabled) return null;
  MemoryUsage mu = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
  LruBlockCache l1 = getL1(conf, mu);
{code}
{code:borderStyle=solid}
static long getLruCacheSize(final Configuration conf, final MemoryUsage mu) {
  float cachePercentage = conf.getFloat(HConstants.HFILE_BLOCK_CACHE_SIZE_KEY,
      HConstants.HFILE_BLOCK_CACHE_SIZE_DEFAULT);
  if (cachePercentage <= 0.0001f) {
    blockCacheDisabled = true;
    return -1;
  }
  if (cachePercentage > 1.0) {
    throw new IllegalArgumentException(HConstants.HFILE_BLOCK_CACHE_SIZE_KEY
        + " must be between 0.0 and 1.0, and not > 1.0");
  }
  // Calculate the amount of heap to give the heap.
  return (long) (mu.getMax() * cachePercentage);
}
{code}
The code above is how HBase computes the block cache size, and the key point is how mu.getMax() is calculated. mu comes from a JNI call; the native methods in http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/file/58e586f18da6/src/share/native/sun/management/MemoryImpl.c delegate to the JVM's management (jmm) interface, for example:
{code:borderStyle=solid}
JNIEXPORT jobject JNICALL
Java_sun_management_MemoryImpl_getMemoryManagers0(JNIEnv *env, jclass dummy) {
  return jmm_interface->GetMemoryManagers(env, NULL);
}
{code}
The heap usage itself is built by jmm_GetMemoryUsage, implemented in http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/b9b4bc1e05e2/src/share/vm/services/management.cpp; part of that function is listed below:
{code:borderStyle=solid}
// Returns a java/lang/management/MemoryUsage object representing
// the memory usage for the heap or non-heap memory.
JVM_ENTRY(jobject, jmm_GetMemoryUsage(JNIEnv* env, jboolean heap))
  ResourceMark rm(THREAD);
  // Calculate the memory usage
  size_t total_init = 0;
  size_t total_used = 0;
  size_t total_committed = 0;
  size_t total_max = 0;
  bool has_undefined_init_size = false;
  bool has_undefined_max_size = false;
  ..
  ..
  MemoryUsage usage((heap ? InitialHeapSize : total_init),
                    total_used,
                    total_committed,
                    (heap ? Universe::heap()->max_capacity() : total_max));
  Handle obj = MemoryService::create_MemoryUsage_obj(usage, CHECK_NULL);
  return JNIHandles::make_local(env, obj());
JVM_END
{code}
According to the constructor of MemoryUsage, the _maxSize field is initialized from Universe::heap()->max_capacity(), which is also implemented in the JVM. Take the CMS GC for example (PS and G1 are almost the same): http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/b9b4bc1e05e2/src/share/vm/memory/genCollectedHeap.cpp
{code:borderStyle=solid}
size_t GenCollectedHeap::max_capacity() const {
  size_t res = 0;
  for (int i = 0; i < _n_gens; i++) {
    res += _gens[i]->max_capacity();
  }
  return res;
}
{code}
In the code above, _n_gens is 2, representing the two generations (young and old), and max_capacity() is a virtual call. For the young generation under the CMS GC, max_capacity() is implemented in http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/b9b4bc1e05e2/src/share/vm/memory/defNewGeneration.cpp:
{code:borderStyle=solid}
size_t DefNewGeneration::max_capacity() const {
  const size_t alignment = GenCollectedHeap::heap()->collector_policy()->min_alignment();
  const size_t reserved_bytes = reserved().byte_size();
  return reserved_bytes - compute_survivor_size(reserved_bytes, alignment);
}
{code}
reserved_bytes is just the Xmn we set, so we can see the JVM calculates the young generation's max_capacity as Xmn minus one survivor space. Note that under the CMS GC the adaptive size policy is explicitly disabled in the JVM, so the two survivor spaces are always the same size.
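The arithmetic in the comment above can be checked directly. The sketch below recomputes the numbers from the flags in the report (-Xmn4g -Xmx32g -XX:SurvivorRatio=2, hfile.block.cache.size = 0.3), mirroring the DefNewGeneration::max_capacity() behavior: one survivor space is young gen / (ratio + 2), and the reported max heap excludes it. This is a standalone illustration, not JVM or HBase code.

```java
// Recompute the sizes from the reported flags:
//   -Xmn4g -Xmx32g -XX:SurvivorRatio=2, hfile.block.cache.size = 0.3
// With SurvivorRatio=2, the young gen splits eden:s0:s1 = 2:1:1, so one
// survivor space is xmn / (ratio + 2). The JVM's reported max heap is
// Xmx minus that one survivor space, as described in the comment above.
public class HeapMaxSketch {
    static final long GB = 1024L * 1024 * 1024;

    static long survivorSize(long xmn, int survivorRatio) {
        return xmn / (survivorRatio + 2);
    }

    static long reportedMaxHeap(long xmx, long xmn, int survivorRatio) {
        return xmx - survivorSize(xmn, survivorRatio);
    }

    // Same formula as getLruCacheSize(): reported max * cache percentage.
    static long lruCacheSize(long reportedMax, float cachePercentage) {
        return (long) (reportedMax * cachePercentage);
    }

    public static void main(String[] args) {
        long max = reportedMaxHeap(32 * GB, 4 * GB, 2);         // 31g, not 32g
        System.out.println(max / (double) GB);                  // 31.0
        System.out.println(lruCacheSize(max, 0.3f) / (double) GB); // ~9.3
    }
}
```

This reproduces the (32g - 1g) * 0.3 = 9.3g block cache size from the issue description.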
[jira] [Updated] (HBASE-16287) BlockCache size should not exceed acceptableSize too many
[ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun updated HBASE-16287: --- Description:
Our regionserver has a configuration as below:
-Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
We only use the block cache and set hfile.block.cache.size = 0.3 in hbase-site.xml, so under this configuration the LRU block cache size will be (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run into continuous full GCs for hours and, most importantly, after a full GC most of the objects in the old generation are not collected. We dumped the heap, analyzed it with MAT, and observed an obvious memory leak in LruBlockCache, which occupied about 16g of memory. We then set the LruBlockCache log level to TRACE and observed this in the log:
{quote}
2016-07-22 12:17:58,158 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, cachingAccesses=99462650031, cachingHits=93468334621, cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, evictedPerRun=20051.93359375
{quote}
We can see the block cache size has exceeded acceptableSize by far too much, which makes the full GCs even worse.
After some investigation, I found that in this function:
{code:borderStyle=solid}
public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
    final boolean cacheDataInL1) {
{code}
no matter how full the block cache already is, the block is simply put into it; if the evict thread is not fast enough, the block cache size will grow significantly. So I think we should add a check here: for example, if the block cache size > 1.2 * acceptableSize(), just return and don't put blocks into the cache until its size is back under the watermark. If this sounds reasonable, I can prepare a small patch.
[jira] [Created] (HBASE-16287) BlockCache size should not exceed acceptableSize too many
Yu Sun created HBASE-16287: -- Summary: BlockCache size should not exceed acceptableSize too many Key: HBASE-16287 URL: https://issues.apache.org/jira/browse/HBASE-16287 Project: HBase Issue Type: Improvement Components: BlockCache Reporter: Yu Sun
Our regionserver has a configuration as below:
-Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
We only use the block cache and set hfile.block.cache.size = 0.3 in hbase-site.xml, so under this configuration the LRU block cache size will be (32g - 1g) * 0.3 = 9.3g. But in some scenarios some of the regionservers run into continuous full GCs for hours and, most importantly, after a full GC most of the objects in the old generation are not collected. We dumped the heap, analyzed it with MAT, and observed an obvious memory leak in LruBlockCache, which occupied about 16g of memory. We then set the LruBlockCache log level to TRACE and observed this in the log:
2016-07-22 12:17:58,158 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, cachingAccesses=99462650031, cachingHits=93468334621, cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, evictedPerRun=20051.93359375
We can see the block cache size has exceeded acceptableSize by far too much, which makes the full GCs even worse. After some investigation, I found that in this function:
public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory, final boolean cacheDataInL1) {
no matter how full the block cache already is, the block is simply put into it; if the evict thread is not fast enough, the block cache size will grow significantly. So I think we should add a check here: for example, if the block cache size > 1.2 * acceptableSize(), just return and don't put blocks into the cache until its size is back under the watermark. If this sounds reasonable, I can prepare a small patch.
[jira] [Assigned] (HBASE-15325) ResultScanner allowing partial result will miss the rest of the row if the region is moved between two rpc requests
[ https://issues.apache.org/jira/browse/HBASE-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Sun reassigned HBASE-15325: -- Assignee: Yu Sun (was: Phil Yang)
> ResultScanner allowing partial result will miss the rest of the row if the region is moved between two rpc requests
> Key: HBASE-15325
> URL: https://issues.apache.org/jira/browse/HBASE-15325
> Project: HBase
> Issue Type: Bug
> Components: dataloss, Scanners
> Affects Versions: 1.2.0, 1.1.3
> Reporter: Phil Yang
> Assignee: Yu Sun
> Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 1.4.0
> Attachments: 15325-test.txt, HBASE-15325-v1.txt, HBASE-15325-v10.patch, HBASE-15325-v11.patch, HBASE-15325-v2.txt, HBASE-15325-v3.txt, HBASE-15325-v5.txt, HBASE-15325-v6.1.txt, HBASE-15325-v6.2.txt, HBASE-15325-v6.3.txt, HBASE-15325-v6.4.txt, HBASE-15325-v6.5.txt, HBASE-15325-v6.txt, HBASE-15325-v7.patch, HBASE-15325-v8.patch, HBASE-15325-v9.patch
>
> HBASE-11544 allows a scan RPC to return part of a row to reduce memory usage for one RPC request, and the client can setAllowPartial or setBatch to get several cells of a row instead of the whole row.
> However, the state of the scanner is saved on the server, and we need it to get the next part if there was a partial result before. If the region moves to another RS, the client will get a NotServingRegionException and open a new scanner against the new RS, which will be regarded as a new scan starting from the end of this row. So the remaining cells of that row will be missing.
[jira] [Commented] (HBASE-15325) ResultScanner allowing partial result will miss the rest of the row if the region is moved between two rpc requests
[ https://issues.apache.org/jira/browse/HBASE-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195127#comment-15195127 ] Yu Sun commented on HBASE-15325:

In the current implementation (before applying your patch), when a DNRIOE occurs, HBase will first call:

// An exception was thrown which makes any partial results that we were collecting
// invalid. The scanner will need to be reset to the beginning of a row.
clearPartialResults();

to clear the partialResults list, and the next scan will start from Bytes.add(lastResult.getRow(), new byte[1]), not including the current row. So I think this will miss the data of the whole row, not just some cells, right?

> ResultScanner allowing partial result will miss the rest of the row if the
> region is moved between two rpc requests
> ---
>
> Key: HBASE-15325
> URL: https://issues.apache.org/jira/browse/HBASE-15325
> Project: HBase
> Issue Type: Bug
> Components: dataloss, Scanners
> Affects Versions: 1.2.0, 1.1.3
> Reporter: Phil Yang
> Assignee: Phil Yang
> Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 1.4.0
>
> Attachments: 15325-test.txt, HBASE-15325-v1.txt, HBASE-15325-v10.patch,
> HBASE-15325-v2.txt, HBASE-15325-v3.txt, HBASE-15325-v5.txt,
> HBASE-15325-v6.1.txt, HBASE-15325-v6.2.txt, HBASE-15325-v6.3.txt,
> HBASE-15325-v6.4.txt, HBASE-15325-v6.5.txt, HBASE-15325-v6.txt,
> HBASE-15325-v7.patch, HBASE-15325-v8.patch, HBASE-15325-v9.patch
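A minimal sketch of the restart-key computation the comment refers to. The helpers below merely mimic HBase's Bytes.add(row, new byte[1]) and the unsigned Bytes.compareTo semantics (they are not the actual client code): appending one zero byte to the last row key yields the smallest key strictly greater than it, so a scanner reopened there skips every remaining cell of that row, which is the data loss being discussed:

```java
import java.util.Arrays;

public class RestartKey {
    /** Mimics HBase's Bytes.add(row, new byte[1]): the row key plus one trailing 0x00. */
    static byte[] nextAfterRow(byte[] row) {
        return Arrays.copyOf(row, row.length + 1); // copyOf zero-fills the extra byte
    }

    /** Unsigned lexicographic comparison, as HBase row keys are ordered. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }
}
```

Because "row1\x00" sorts strictly after "row1" but before any other row key, a scan restarted at that key can never revisit cells of "row1"; that is why clearing partial results and restarting past the row drops the whole row rather than just the uncollected cells.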