[jira] [Commented] (HBASE-12075) Preemptive Fast Fail
[ https://issues.apache.org/jira/browse/HBASE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184416#comment-14184416 ] Manukranth Kolloju commented on HBASE-12075:

I can add a simpler example illustrating what we can do in the release notes. The NoOpRetryableCallerInterceptor will be used by default, so the default client behavior doesn't change: as long as hbase.client.enable.fast.fail.mode is set to false, the code will use the NoOpInterceptor. About the "New" in getNewRpcRetryingCallerFactory: I too felt that it didn't sound much like a builder method, but I didn't particularly like (create/build)RpcRetryingCallerFactory either. I didn't have a strong preference, so I left it as is and commented the same on the diff. I can make the classes which I am not using in the server tests package private.

Preemptive Fast Fail
Key: HBASE-12075
URL: https://issues.apache.org/jira/browse/HBASE-12075
Project: HBase
Issue Type: Sub-task
Components: Client
Affects Versions: 0.99.0, 2.0.0, 0.98.6.1
Reporter: Manukranth Kolloju
Assignee: Manukranth Kolloju
Attachments: 0001-Add-a-test-case-for-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-Implement-Preemptive-Fast-Fail.patch, 0001-Implement-Preemptive-Fast-Fail.patch, 0001-Implement-Preemptive-Fast-Fail.patch, 0001-Implement-Preemptive-Fast-Fail.patch, 0001-Implement-Preemptive-Fast-Fail.patch

In multi-threaded clients, we use a feature developed on the 0.89-fb branch called Preemptive Fast Fail. It allows client threads that would potentially fail to fail fast. The idea behind the feature is that, among the hundreds of client threads, we allow one thread to try to establish a connection with the regionserver, and if that succeeds, we mark the server as a live node again. Meanwhile, other threads trying to establish a connection to the same server would otherwise sit in timeouts, which is effectively unfruitful. In those cases we can return appropriate exceptions to those clients instead of letting them retry.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
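The gating idea described above can be sketched in a few lines. This is a minimal illustration, not the actual HBASE-12075 patch: class and method names are made up, and the real interceptor also tracks failure timestamps and thresholds.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of preemptive fast fail: after a server failure,
// exactly one thread is allowed to probe the server; all other threads
// fail immediately instead of burning their timeouts on a dead server.
public class FastFailSketch {
    // Servers currently suspected dead, mapped to a "probe in flight" flag.
    private final Map<String, AtomicBoolean> failing = new ConcurrentHashMap<>();

    public void markFailed(String server) {
        failing.putIfAbsent(server, new AtomicBoolean(false));
    }

    // Called when a probe succeeds: the server is a live node again.
    public void markAlive(String server) {
        failing.remove(server);
    }

    // Returns true if this thread may attempt the connection.
    // Only the first caller after a failure gets to probe; the rest
    // should be given a fast-fail exception instead of retrying.
    public boolean mayAttempt(String server) {
        AtomicBoolean probing = failing.get(server);
        if (probing == null) {
            return true;                           // server not suspected
        }
        return probing.compareAndSet(false, true); // one winner probes
    }
}
```

A caller that gets `false` back would throw a preemptive exception immediately rather than entering the normal retry loop.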
[jira] [Commented] (HBASE-12345) Unsafe based Comparator for BB
[ https://issues.apache.org/jira/browse/HBASE-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184436#comment-14184436 ] Anoop Sam John commented on HBASE-12345:

Yep, as per the test I did in HBASE-11425. Added a comment there; just copying it here:
{quote}
Testing with 2 million Cells, with a single cell per row. Writing all cells to a BB/DBB and seeking to the last kv (to make the compare walk across all cells in the BB/DBB). The seek code is like what we have in ScannerV3#blockSeek. With an RK length of 17 bytes (the first 13 bytes the same), I get almost the same result. With an RK length of 117 bytes (the first 113 bytes the same), the DBB based read is ~3% slower.
{quote}
In that test the read and compare were from HBB and DBB, and those are almost the same. But our CellComparator has an Unsafe based optimization, which my old test did not use. With Unsafe based reads from HBB#array() [this is what happens in HFileReaderV2/V3] there is a significant perf difference vs. DBB: with an RK length of 117 bytes and 2 million cells, seeking to the last cell, the DBB test is 50% slower. I am thinking of doing Unsafe based compares for data in DBB as well. Once we have Unsafe based access from both DBB and HBB, we are in better shape: the DBB version of the above test is ~12% slower than the old HBB.array() based compares. Will raise a subtask and attach the approach there.

Unsafe based Comparator for BB
Key: HBASE-12345
URL: https://issues.apache.org/jira/browse/HBASE-12345
Project: HBase
Issue Type: Sub-task
Components: regionserver, Scanners
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Attachments: HBASE-12345.patch
[jira] [Commented] (HBASE-12345) Unsafe based Comparator for BB
[ https://issues.apache.org/jira/browse/HBASE-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184437#comment-14184437 ] Anoop Sam John commented on HBASE-12345:

We can expose APIs like getLong/getInt in BBUtil which use Unsafe, if it is available, and use those to read from the BB. We will need that in HFileReaderV2/V3 seek, next, etc. Also, when the Cell is backed by a buffer and the lengths, like rkLength, tagsLength, etc., are part of the buffer, we can make use of the API for faster reads.
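The "Unsafe if available, else fall back" pattern described above can be sketched as follows. This is an illustrative stand-in, not the actual HBase BBUtil API; the class name, method shape, and fallback policy are assumptions. Note that Unsafe reads in native byte order, so the sketch byte-swaps on little-endian hardware to match ByteBuffer's big-endian default.

```java
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch of a BBUtil-style helper: read a long from a heap ByteBuffer via
// sun.misc.Unsafe when it can be loaded, else fall back to ByteBuffer.getLong.
public class BBUtilSketch {
    private static final sun.misc.Unsafe UNSAFE = loadUnsafe();

    private static sun.misc.Unsafe loadUnsafe() {
        try {
            Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (sun.misc.Unsafe) f.get(null);
        } catch (Throwable t) {
            return null;  // Unsafe unavailable on this JVM; use the fallback
        }
    }

    public static long getLong(ByteBuffer buf, int offset) {
        if (UNSAFE != null && buf.hasArray()) {
            // Heap buffer: address = byte[] base offset + array offset + index.
            long v = UNSAFE.getLong(buf.array(),
                (long) sun.misc.Unsafe.ARRAY_BYTE_BASE_OFFSET
                    + buf.arrayOffset() + offset);
            // Unsafe reads in native order; normalize to big-endian semantics.
            return ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN
                ? v : Long.reverseBytes(v);
        }
        return buf.getLong(offset);  // direct buffers etc.: plain BB read
    }
}
```

The real win discussed in the comment comes from also having an Unsafe path for direct buffers (via their native address), which this heap-only sketch omits.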
[jira] [Commented] (HBASE-12313) Redo the hfile index length optimization so cell-based rather than serialized KV key
[ https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184438#comment-14184438 ] Anoop Sam John commented on HBASE-12313:

{code}
for (Cell cell : rr.rawCells()) {
-  resultSize += CellUtil.estimatedLengthOf(cell);
+  resultSize += CellUtil.estimatedSerializedSizeOf(cell);
{code}
estimatedLengthOf was returning the total length; estimatedSerializedSizeOf() counts an extra 4 bytes. Do you really want this change, Stack?

Redo the hfile index length optimization so cell-based rather than serialized KV key
Key: HBASE-12313
URL: https://issues.apache.org/jira/browse/HBASE-12313
Project: HBase
Issue Type: Sub-task
Components: regionserver, Scanners
Reporter: stack
Assignee: stack
Attachments: 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 12313v5.txt

Trying to remove the API that returns the 'key' of a KV serialized into a byte array is thorny. I tried to move the first and last key serializations and the hfile index entries over to be Cell, but the patch was turning massive. Here is a smaller patch that just redoes the optimization that tries to find 'short' midpoints between the last key of the last block and the first key of the next block, so it is Cell-based rather than byte array based (which presumed keys serialized in a certain way). Adds unit tests, which we didn't have before. Also removes CellKey. Not needed... at least not yet; it's just a utility for toString.
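The 4-byte difference being debated can be made concrete with a little arithmetic. This sketch follows the KeyValue wire layout (4-byte key length, 4-byte value length, then key and value bytes); the constant and helper names are illustrative, not the actual CellUtil code.

```java
// Illustrative arithmetic for the "extra 4 bytes" discussed above:
// a serialized KV is (4-byte key length) + (4-byte value length)
// + key bytes + value bytes, and a serialized-size estimate may
// additionally count the 4-byte total-length int that precedes the
// KV when it is written with a length prefix.
public class KvSizeSketch {
    static final int KEY_LENGTH_SIZE = 4;    // int: key length
    static final int VALUE_LENGTH_SIZE = 4;  // int: value length
    static final int LENGTH_PREFIX = 4;      // int: whole-KV length on the wire

    // Total length of the KV itself (what a getLength-style method returns).
    static int kvLength(int keyLen, int valueLen) {
        return KEY_LENGTH_SIZE + VALUE_LENGTH_SIZE + keyLen + valueLen;
    }

    // Size including the length prefix (an estimatedSerializedSizeOf-style count).
    static int serializedSizeWithPrefix(int keyLen, int valueLen) {
        return LENGTH_PREFIX + kvLength(keyLen, valueLen);
    }
}
```

So for any key/value lengths, the two estimates differ by exactly the 4-byte prefix, which is the discrepancy the comment asks about.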
[jira] [Comment Edited] (HBASE-12313) Redo the hfile index length optimization so cell-based rather than serialized KV key
[ https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184438#comment-14184438 ] Anoop Sam John edited comment on HBASE-12313 at 10/26/14 9:12 AM:

{code}
for (Cell cell : rr.rawCells()) {
-  resultSize += CellUtil.estimatedLengthOf(cell);
+  resultSize += CellUtil.estimatedSerializedSizeOf(cell);
{code}
estimatedLengthOf was returning the total length; estimatedSerializedSizeOf() counts an extra 4 bytes. Do you really want this change, Stack? Is replacing estimatedLengthOf with estimatedSerializedSizeOf correct?
[jira] [Comment Edited] (HBASE-12313) Redo the hfile index length optimization so cell-based rather than serialized KV key
[ https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184438#comment-14184438 ] Anoop Sam John edited comment on HBASE-12313 at 10/26/14 10:16 AM:

{code}
for (Cell cell : rr.rawCells()) {
-  resultSize += CellUtil.estimatedLengthOf(cell);
+  resultSize += CellUtil.estimatedSerializedSizeOf(cell);
{code}
estimatedLengthOf was returning the total length; estimatedSerializedSizeOf() counts an extra 4 bytes. Do you really want this change, Stack? Is replacing estimatedLengthOf with estimatedSerializedSizeOf correct?

{code}
+  private static int getSumOfKeyElementLengths(final Cell cell) {
+    return cell.getRowLength() + cell.getFamilyLength()
+        + cell.getQualifierLength()
+        + cell.getValueLength()
+        + cell.getTagsLength()
+        + KeyValue.TIMESTAMP_TYPE_SIZE;
+  }
+
+  public static int estimatedSerializedSizeOfKey(final Cell cell) {
+    if (cell instanceof KeyValue) return ((KeyValue)cell).getKeyLength();
+    // This will be a low estimate. Will do for now.
+    return getSumOfKeyElementLengths(cell);
+  }
{code}
getSumOfKeyElementLengths -- should it be including the lengths of tags and value?

{code}
-    return cell.getRowLength() + cell.getFamilyLength() + cell.getQualifierLength()
-        + cell.getValueLength() + cell.getTagsLength() + KeyValue.TIMESTAMP_TYPE_SIZE;
+    // TODO: Add sizing of references that hold the row, family, etc., arrays.
+    return estimatedSerializedSizeOf(cell);
{code}
No need to add the extra 4 bytes for heapSize, which will come in via estimatedSerializedSizeOf (?)

{code}
+  public static String getCellKeyAsString(Cell cell) {
+    StringBuilder sb = new StringBuilder(Bytes.toStringBinary(
+        cell.getRowArray(), cell.getRowOffset(), cell.getRowLength()));
+    sb.append(cell.getFamilyLength() == 0? "" :
+        Bytes.toStringBinary(cell.getFamilyArray(), cell.getFamilyOffset(), cell.getFamilyLength()));
+    sb.append(cell.getQualifierLength() == 0? "" :
+        Bytes.toStringBinary(cell.getQualifierArray(), cell.getQualifierOffset(),
+            cell.getQualifierLength()));
{code}
Can we add a separator between the rk, f and q parts?

{code}
-    // h goes to the next block
-    assertEquals(-2, scanner.seekTo(toKV(h, tagUsage)));
+    // 'h' does not exist so we will get a '1' back for not found.
+    assertEquals(0, scanner.seekTo(toKV(i, tagUsage)));
     assertEquals(i, toRowStr(scanner.getKeyValue()));
{code}
What if we do a seekTo of 'h' only?

{code}
-    assertEquals(1, blockIndexReader.rootBlockContainingKey(
-        toKV(h, tagUsage)));
+    // 'h', being the midpoint between 'g' and 'i', used to be the block index key because of the
+    // little optimization done creating block index keys where we try to get a midpoint and then
+    // make this midpoint as short as possible so index blocks are kept tight. But now, we won't do
+    // the 'optimization' -- create a new key -- if there is no gain to be had by way of making
+    // a shorter key; in this case we just use the start key in the index. This means the below
+    // test changes. Looking for 'h', it'll be in the 0 block rather than the 1 block now (though 'h'
+    // does not exist in this file).
+    assertEquals(0, blockIndexReader.rootBlockContainingKey(toKV(h, tagUsage)));
{code}
Read your comment to see why the change was made. Will this change in the midpoint calc cause any issue in reads?
[jira] [Commented] (HBASE-12075) Preemptive Fast Fail
[ https://issues.apache.org/jira/browse/HBASE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184503#comment-14184503 ] Ted Yu commented on HBASE-12075:

bq. hbase.client.enable.fast.fail.mode is set to false

Since the above config takes a boolean value, maybe call it hbase.client.fast.fail.enabled?
[jira] [Commented] (HBASE-12313) Redo the hfile index length optimization so cell-based rather than serialized KV key
[ https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184538#comment-14184538 ] stack commented on HBASE-12313:

bq. Do you really want this change, Stack?

This patch cleans up the CellUtil methods that do size counting; there were a few too many methods, each only slightly different from the others. In this particular case we are just doing an estimate, and serialized size is probably closest to what we are putting on the wire at this stage. I don't see a problem with it being slightly different from what was there before (what was there before was an 'estimate'). Do you?

bq. Is replacing estimatedLengthOf with estimatedSerializedSizeOf correct?

Where we were using estimatedLengthOf (what is this anyway -- smile? Serialized 'length'? Size on heap? Or size of the serialized KeyValue byte array -- which is going away?), we were talking serialized size. I thought estimatedSerializedSizeOf more appropriate where I did the replaces.

bq. No need to add the extra 4 bytes for heapSize, which will come in via estimatedSerializedSizeOf

Are you referring to the TODO? I'd think that serialized size and heap size will be calculated differently when we get around to it.

bq. Can we add a separator between the rk, f and q parts?

Whoops. Will fix.

bq. What if we do a seekTo of 'h' only?

There is no 'h' in the dataset; it was an 'artificial' midpoint. If you seek to 'h', you end up in the second block, which starts with 'i'.

bq. Will this change in the midpoint calc cause any issue in reads?

I don't believe so. This whole area was without tests previously; I made the mid calc code stand apart and added a bunch of tests in this patch. Also, as part of making this patch, I ran the old code and the new side by side and threw an exception whenever the results did not agree as our unit test suite ran, then looked at each case to see if the difference was legit. What I found was that the differences were because we made midkeys even when there was no advantage (as in the above 'h' case -- no need to make a midkey if all sizes are the same).
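The short-midpoint optimization being discussed can be sketched with plain byte arrays. This is a simplified illustration of the idea, not the actual HBase implementation (which works on Cells and handles key structure): given the last key of the previous block and the first key of the next block, produce a key that sorts after the first and at or before the second, as short as possible, so index blocks stay tight.

```java
import java.util.Arrays;

// Sketch of the 'short midpoint' used for hfile block index keys.
public class MidpointSketch {
    // Returns a key k with left < k <= right, as short as we can cheaply make it.
    static byte[] shortMidpoint(byte[] left, byte[] right) {
        int minLen = Math.min(left.length, right.length);
        int diff = 0;
        while (diff < minLen && left[diff] == right[diff]) diff++;
        if (diff < minLen && (left[diff] & 0xff) + 1 < (right[diff] & 0xff)) {
            // Room between the differing bytes: bump left's byte and truncate.
            byte[] mid = Arrays.copyOf(left, diff + 1);
            mid[diff] = (byte) (left[diff] + 1);
            return mid;
        }
        // No shorter key gains anything; just use the start key of the next
        // block, as the patch above now does instead of fabricating a midkey.
        return right;
    }
}
```

With left = "g" and right = "i" this fabricates the 'artificial' midpoint "h" from the test discussion; with left = "g" and right = "h" there is nothing to gain, so the start key "h" is used directly.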
[jira] [Created] (HBASE-12346) Scan's default auths behavior under Visibility labels
Jerry He created HBASE-12346:

Summary: Scan's default auths behavior under Visibility labels
Key: HBASE-12346
URL: https://issues.apache.org/jira/browse/HBASE-12346
Project: HBase
Issue Type: Bug
Components: API, security
Affects Versions: 0.99.1, 0.98.7
Reporter: Jerry He

In Visibility Labels security, a set of labels (auths) is administered and associated with a user. During a scan, a user can normally only see cell data that is covered by the user's label set (auths). A Scan uses setAuthorizations to indicate it wants to use those auths to access the cells. Similarly in the shell:
{code}
scan 'table1', AUTHORIZATIONS => ['private']
{code}
But it is a surprise to find that setAuthorizations seems to be 'mandatory' under the default visibility label security setting: every scan needs setAuthorizations before it can get any cells, even when the cells are under labels the requesting user is part of. The following steps illustrate the issue.
Run as superuser:
{code}
1. create a visibility label called 'private'
2. create 'table1'
3. put into 'table1' data and label the data as 'private'
4. set_auths 'user1', 'private'
5. grant 'user1', 'RW', 'table1'
{code}
Run as 'user1':
{code}
1. scan 'table1'
   This shows no cells.
2. scan 'table1', AUTHORIZATIONS => ['private']
   This will show all the data.
{code}
I am not sure if this is expected by design or a bug. But a more reasonable, more backward compatible (for client applications), and less surprising default behavior would probably look like this: a scan's default auths, if its Authorizations attribute is not set explicitly, should be all the auths the requesting user is administered and allowed on the server. If scan.setAuthorizations is used, then the server further filters the auths during the scan: the input auths minus whatever is not in the user's label set on the server.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
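The behavior the report describes comes down to a subset check, which can be sketched as below. These names are illustrative, not the actual VisibilityController code; the point is only that with an empty effective auth set (the current default when setAuthorizations is not called), every labeled cell is filtered out.

```java
import java.util.Set;

// Sketch of the visibility check: a cell is returned only if the scan's
// effective auths cover every label on the cell.
public class VisibilitySketch {
    static boolean cellVisible(Set<String> cellLabels, Set<String> scanAuths) {
        return scanAuths.containsAll(cellLabels);
    }
}
```

Under the proposed default, the effective auths for a scan without explicit Authorizations would be the user's full administered set, so a cell labeled 'private' would be visible to user1 without any shell arguments.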
[jira] [Commented] (HBASE-12346) Scan's default auths behavior under Visibility labels
[ https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184587#comment-14184587 ] Jerry He commented on HBASE-12346:

Per this Accumulo doc, http://accumulo.apache.org/1.6/examples/visibility.html, the default authorizations for a scan are the user's entire set of authorizations.
[jira] [Updated] (HBASE-12346) Scan's default auths behavior under Visibility labels
[ https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry He updated HBASE-12346:

Attachment: HBASE-12346-master.patch
[jira] [Commented] (HBASE-12346) Scan's default auths behavior under Visibility labels
[ https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184601#comment-14184601 ] Jerry He commented on HBASE-12346:

Attached a patch, in case everyone agrees with the proposed change to the default auths behavior.
[jira] [Commented] (HBASE-12346) Scan's default auths behavior under Visibility labels
[ https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184641#comment-14184641 ] Andrew Purtell commented on HBASE-12346:

The default scan label generator has the behavior you describe: if you don't ask for any authorizations, you don't get any. There is another label generator that will do what you want, but it will force the user's assigned set. Label generators are stackable. It could make sense to change all of this around a bit, so the default configuration starts with the generator that adds the labels assigned to the user in the labels table, with another generator stacked on top that adds auths passed in via a Scan attribute. This would be more flexible than the result after the proposed patch is applied.
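The stacking idea suggested above can be sketched as a chain of generators whose contributions accumulate into the effective auth set. The interface and method names here are made up for illustration; they are not the actual ScanLabelGenerator API.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of stacked scan label generators: one generator might pull the
// user's assigned auths from the labels table, the next might add auths
// passed in on the Scan; the effective set is the union of contributions.
public class StackedGeneratorsSketch {
    interface LabelGenerator {
        Set<String> getLabels(String user, Set<String> requestedAuths);
    }

    static Set<String> effectiveAuths(List<LabelGenerator> stack,
                                      String user, Set<String> requestedAuths) {
        Set<String> out = new HashSet<>();
        for (LabelGenerator g : stack) {
            out.addAll(g.getLabels(user, requestedAuths));
        }
        return out;
    }
}
```

With such a stack, a scan with no explicit Authorizations still gets the user's table-assigned labels, while setAuthorizations can only add (or, in a filtering generator, narrow) on top of that.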
[jira] [Commented] (HBASE-12346) Scan's default auths behavior under Visibility labels
[ https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184644#comment-14184644 ] Andrew Purtell commented on HBASE-12346: Also, I don't believe we should or even can aim to be transparently like Accumulo. The labels feature should be most useful and relevant for HBase users building HBase applications. Maybe the proposal here by consensus meets that test, but that would be independent of what any Accumulo documentation says (or doesn't). We did aim for some familiarity in the design of the API and shell commands but I'm not sure in retrospect that's more harmful (because we aren't going to get Accumulo exact semantics with a tag based implementation) than helpful. Scan's default auths behavior under Visibility labels - Key: HBASE-12346 URL: https://issues.apache.org/jira/browse/HBASE-12346 Project: HBase Issue Type: Bug Components: API, security Affects Versions: 0.98.7, 0.99.1 Reporter: Jerry He Attachments: HBASE-12346-master.patch In Visibility Labels security, a set of labels (auths) are administered and associated with a user. A user can normally only see cell data during scan that are part of the user's label set (auths). Scan uses setAuthorizations to indicates its wants to use the auths to access the cells. Similarly in the shell: {code} scan 'table1', AUTHORIZATIONS = ['private'] {code} But it is a surprise to find that setAuthorizations seems to be 'mandatory' in the default visibility label security setting. Every scan needs to setAuthorizations before the scan can get any cells even the cells are under the labels the request user is part of. The following steps will illustrate the issue: Run as superuser. {code} 1. create a visibility label called 'private' 2. create 'table1' 3. put into 'table1' data and label the data as 'private' 4. set_auths 'user1', 'private' 5. grant 'user1', 'RW', 'table1' {code} Run as 'user1': {code} 1. scan 'table1' This show no cells. 2. 
scan 'table1', AUTHORIZATIONS => ['private'] This will show all the data. {code} I am not sure if this is expected by design or a bug. But a more reasonable, more backward compatible (for client applications), and less surprising default behavior should probably look like this: A scan's default auths, if its Authorizations attribute is not set explicitly, should be all the auths the requesting user has been granted on the server. If scan.setAuthorizations is used, then the server further filters the auths during the scan: use the input auths minus whatever is not in the user's label set on the server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
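The default proposed in the description can be sketched in plain Java (a hypothetical helper for illustration only; this is not the actual VisibilityController code): if the scan requests no auths, fall back to everything the user is granted; otherwise keep only the requested auths that are in the user's label set.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the proposed default-auths behavior; the class and
// method names are hypothetical, not part of the real HBase visibility code.
public class EffectiveAuths {

    // Compute the auths the server would apply for a scan.
    public static List<String> effectiveAuths(List<String> requested, List<String> granted) {
        if (requested == null || requested.isEmpty()) {
            // Proposed default: use all the auths the requesting user is granted.
            return new ArrayList<>(granted);
        }
        // Otherwise: the input auths minus whatever is not in the user's label set.
        List<String> effective = new ArrayList<>();
        for (String auth : requested) {
            if (granted.contains(auth)) {
                effective.add(auth);
            }
        }
        return effective;
    }
}
```

Under this sketch, a scan with no Authorizations set behaves like a scan that explicitly requested the user's full label set, which is the backward-compatible behavior the description argues for.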
[jira] [Commented] (HBASE-8607) Allow custom filters and coprocessors to be updated for a region server without requiring a restart
[ https://issues.apache.org/jira/browse/HBASE-8607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184667#comment-14184667 ] Julian Wissmann commented on HBASE-8607: Andrew, your suggestion sounds really interesting. I've just been thinking about it for a while in order to estimate how big an effort prototyping this would be. The way I understand it, the idea is that we have an OSGi Coprocessor that the regular coprocessors are registered to as an OSGi Service. However, for this to work, there will either need to be a Service Registry on each Region, or we go with Distributed OSGi and dump it in the client. Either way, there also needs to be a mechanism to check service availability on the Regions from the client side. Right now, I'd consider the version with each region server holding its own service registry to be quite feasible. I'm thinking of the following approach: The OSGi Coprocessor will discover Bundles; the only service provided by the Bundle will actually be starting discovered coprocessors with its own environment. That way the client side will be rather clean and the actual coprocessors will behave as usual (their own protocol and client), thus allowing the OSGi Coprocessor to be rather simple and maximizing flexibility. The only required client-side functionality will then be representing the Service Registry and starting region-side services by name (at least I can't think of another way considering that protobuf is in the middle of it). Any more thoughts on this? 
Allow custom filters and coprocessors to be updated for a region server without requiring a restart --- Key: HBASE-8607 URL: https://issues.apache.org/jira/browse/HBASE-8607 Project: HBase Issue Type: New Feature Components: regionserver Reporter: James Taylor One solution to allowing custom filters and coprocessors to be updated for a region server without requiring a restart might be to run the HBase server in an OSGi container (maybe there are other approaches as well?). Typically, applications that use coprocessors and custom filters also have shared classes underneath, so putting the burden on the user to include some kind of version name in the class is not adequate. Including the version name in the package might work in some cases (at least until dependent jars start to change as well), but is cumbersome and overburdens the app developer. Regardless of what approach is taken, we'd need to define the life cycle of the coprocessors and custom filters when a new version is loaded. For example, in-flight invocations could continue to use the old version while new invocations would use the new ones. Once the in-flight invocations are complete, the old code/jar could be unloaded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
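As a rough illustration of the per-region-server service registry idea from the comment above, here is a plain-Java (non-OSGi) sketch; every class and method name is hypothetical, and a real OSGi service registry would carry much more metadata (versions, bundle lifecycle, etc.).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a per-region-server coprocessor service registry,
// roughly the role the OSGi service registry would play in this proposal.
public class CoprocessorRegistry {

    // A stand-in for whatever interface discovered coprocessor bundles expose.
    public interface CoprocessorService {
        String name();
    }

    private final Map<String, CoprocessorService> services = new ConcurrentHashMap<>();

    // Registering under an existing name replaces the old version; returning the
    // previous service is the hook where the JIRA's lifecycle question lives
    // (in-flight invocations on the old version, unloading the old jar, etc.).
    public CoprocessorService register(CoprocessorService service) {
        return services.put(service.name(), service);
    }

    // Clients would look up region-side services by name, as the comment suggests.
    public CoprocessorService lookup(String name) {
        return services.get(name);
    }

    public boolean unregister(String name) {
        return services.remove(name) != null;
    }
}
```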
[jira] [Commented] (HBASE-12343) Document recommended configuration for 0.98 from HBASE-11964
[ https://issues.apache.org/jira/browse/HBASE-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184668#comment-14184668 ] Misty Stanley-Jones commented on HBASE-12343: - +1 from me, as long as you're sure about the technical content. :) Document recommended configuration for 0.98 from HBASE-11964 Key: HBASE-12343 URL: https://issues.apache.org/jira/browse/HBASE-12343 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0 Attachments: HBASE-12343.patch We're not committing the configuration changes from HBASE-11964 to 0.98 but they should be the recommended configuration for replication. Add a paragraph to the replication section of the manual on this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12343) Document recommended configuration for 0.98 from HBASE-11964
[ https://issues.apache.org/jira/browse/HBASE-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184669#comment-14184669 ] Misty Stanley-Jones commented on HBASE-12343: - By the way, this would have come up in my JIRA filter of docs issues to review if it had the Documentation component. Document recommended configuration for 0.98 from HBASE-11964 Key: HBASE-12343 URL: https://issues.apache.org/jira/browse/HBASE-12343 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0 Attachments: HBASE-12343.patch We're not committing the configuration changes from HBASE-11964 to 0.98 but they should be the recommended configuration for replication. Add a paragraph to the replication section of the manual on this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12343) Document recommended configuration for 0.98 from HBASE-11964
[ https://issues.apache.org/jira/browse/HBASE-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184678#comment-14184678 ] Andrew Purtell commented on HBASE-12343: Thanks! I'll remember that for next time. Document recommended configuration for 0.98 from HBASE-11964 Key: HBASE-12343 URL: https://issues.apache.org/jira/browse/HBASE-12343 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 2.0.0 Attachments: HBASE-12343.patch We're not committing the configuration changes from HBASE-11964 to 0.98 but they should be the recommended configuration for replication. Add a paragraph to the replication section of the manual on this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12075) Preemptive Fast Fail
[ https://issues.apache.org/jira/browse/HBASE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184681#comment-14184681 ] stack commented on HBASE-12075: --- Ok [~manukranthk]. If the default doesn't change, if the classes are private, and if there is an explanation and example of how to use this stuff, I'd be good w/ commit. Preemptive Fast Fail Key: HBASE-12075 URL: https://issues.apache.org/jira/browse/HBASE-12075 Project: HBase Issue Type: Sub-task Components: Client Affects Versions: 0.99.0, 2.0.0, 0.98.6.1 Reporter: Manukranth Kolloju Assignee: Manukranth Kolloju Attachments: 0001-Add-a-test-case-for-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 0001-Implement-Preemptive-Fast-Fail.patch, 0001-Implement-Preemptive-Fast-Fail.patch, 0001-Implement-Preemptive-Fast-Fail.patch, 0001-Implement-Preemptive-Fast-Fail.patch, 0001-Implement-Preemptive-Fast-Fail.patch In multi-threaded clients, we use a feature developed on the 0.89-fb branch called Preemptive Fast Fail. This allows client threads which would potentially fail to fail fast. The idea behind this feature is that we allow, among the hundreds of client threads, one thread to try to establish a connection with the regionserver, and if that succeeds, we mark it as a live node again. Meanwhile, other threads trying to establish a connection to the same server would otherwise sit in timeouts, which is effectively unfruitful. We can in those cases return appropriate exceptions to those clients instead of letting them retry. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
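The behavior described above can be sketched as follows. This is a hypothetical illustration, not the actual interceptor code from the patch (the real implementation works through the RpcRetryingCallerInterceptor machinery mentioned in the comments): when a server is marked as failed, only one thread at a time is allowed to probe it, and every other thread can be failed fast instead of waiting out a connection timeout.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch of preemptive fast fail; all names here are hypothetical.
public class FastFailTracker {

    static class FailureInfo {
        final AtomicBoolean probeInFlight = new AtomicBoolean(false);
    }

    private final Map<String, FailureInfo> failedServers = new ConcurrentHashMap<>();

    // Called when a connection attempt to a server fails.
    public void markFailed(String server) {
        failedServers.putIfAbsent(server, new FailureInfo());
    }

    // Called by the probing thread once the server responds again
    // ("mark it as a live node again").
    public void markAlive(String server) {
        failedServers.remove(server);
    }

    // Returns true if this thread may attempt the call: either the server is
    // considered healthy, or this thread won the race to be the single prober.
    // Callers that get false would throw an exception immediately instead of
    // piling into connection timeouts.
    public boolean mayAttempt(String server) {
        FailureInfo info = failedServers.get(server);
        if (info == null) {
            return true;
        }
        return info.probeInFlight.compareAndSet(false, true);
    }
}
```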
[jira] [Assigned] (HBASE-11792) Organize PerformanceEvaluation usage output
[ https://issues.apache.org/jira/browse/HBASE-11792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Stanley-Jones reassigned HBASE-11792: --- Assignee: Misty Stanley-Jones Organize PerformanceEvaluation usage output --- Key: HBASE-11792 URL: https://issues.apache.org/jira/browse/HBASE-11792 Project: HBase Issue Type: Improvement Components: Performance, test Reporter: Nick Dimiduk Assignee: Misty Stanley-Jones Priority: Minor Labels: beginner PerformanceEvaluation has enjoyed a good bit of attention recently. All the new features are muddled together. It would be nice to organize the output of the Options list according to some scheme. I was thinking we could group entries by when they're used. For example *General options* - nomapred - rows - oneCon - ... *Table Creation/Write tests* - compress - flushCommits - valueZipf - ... *Read tests* - filterAll - multiGet - replicas - ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11985) Document sizing rules of thumb
[ https://issues.apache.org/jira/browse/HBASE-11985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184697#comment-14184697 ] Misty Stanley-Jones commented on HBASE-11985: - {quote} Indicating that 50 to 100 regions are recommended for 1 to 2 CFs would be a useful clarification. It would also be a good way for customers to be aware of the impact of increasing the number of column families. {quote} Thanks [~gkamat] {quote} If you are storing time-based machine data or logging information and the load is distributed by device id or service id + time, you can end up with the pattern where older data regions never have additional writes beyond a certain age. This can occur when the solution involves something like HBase for new data (for example the last 30 days) + Impala for older data. In these situations, you can end up with a small number of active regions + a set of older regions no longer being written to. For these situations you can tolerate a greater number of regions, as your main resource consumption is driven by the active regions. This, of course, is very dependent on the type of load and query patterns. {quote} Thanks [~rstokes] {quote} if only one CF is busy with writes, only that one accumulates memory. That is the same with inactive (only read-from) regions for a single CF. {quote} Thanks [~larsgeorge], you also had a diagram to illustrate this but the link I have doesn't work now. Can you point me there? Document sizing rules of thumb -- Key: HBASE-11985 URL: https://issues.apache.org/jira/browse/HBASE-11985 Project: HBase Issue Type: Task Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones I'm looking for tuning/sizing rules of thumb to put in the Ref Guide. Info I have gleaned so far: A reasonable region size is between 10 GB and 50 GB. A reasonable maximum cell size is 1 MB to 10 MB. 
If your cells are larger than 10 MB, consider storing the cell contents in HDFS and storing a reference to the location in HBase. The pending MOB work covers the 10 MB - 64 MB window. When you size your regions and cells, keep in mind that a region cannot split within a row. If your row size is too large, or your region size is too small, you can end up with a single row per region, which is not a good pattern. It is also possible that one big column causes splits while other columns are tiny, and this may not be great. A large # of columns probably means you are doing it wrong. Column names need to be short because they get stored with every value (barring encoding). They don't need to be self-documenting as in an RDBMS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
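The rules of thumb above can be checked with back-of-the-envelope arithmetic. The input figures below (20 GB regions, 1.5 TB of data per region server) are assumptions chosen for illustration, not numbers from the issue:

```java
// Back-of-the-envelope region sizing using the rules of thumb above;
// the input figures are assumptions for illustration only.
public class RegionSizing {

    // Number of regions needed for a given data size and region size, in GB.
    public static long regionCount(long dataGb, long regionGb) {
        return (dataGb + regionGb - 1) / regionGb; // round up
    }

    public static void main(String[] args) {
        long regionGb = 20;           // within the 10-50 GB "reasonable" range
        long dataPerServerGb = 1500;  // assumed 1.5 TB per region server
        long regions = regionCount(dataPerServerGb, regionGb);
        // 1500 / 20 = 75 regions, inside the suggested 50-100 band for 1-2 CFs.
        System.out.println(regions + " regions per server");
    }
}
```

Run the other direction, the same arithmetic says that staying within 50-100 regions per server at 10-50 GB per region implies very roughly 0.5-5 TB of data per region server.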
[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091
[ https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184698#comment-14184698 ] stack commented on HBASE-12285: --- After changing the jenkins config for branch-1 to remove -Dmaven.test.redirectTestOutputToFile=true and stuff continued to pass, I've just set branch-1 back to use DEBUG again from WARN. Builds are failing, possibly because of SUREFIRE-1091 - Key: HBASE-12285 URL: https://issues.apache.org/jira/browse/HBASE-12285 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dima Spivak Assignee: Dima Spivak Priority: Blocker Attachments: HBASE-12285_branch-1_v1.patch Our branch-1 builds on builds.apache.org have been failing in recent days after we switched over to an official version of Surefire a few days back (HBASE-4955). The version we're using, 2.17, is hit by a bug ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results in an IOException, which looks like what we're seeing on Jenkins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12346) Scan's default auths behavior under Visibility labels
[ https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184699#comment-14184699 ] Jerry He commented on HBASE-12346: -- Hi, [~apurtell] Thanks for the comment. I agree that we should not and cannot follow Accumulo blindly. On the contrary, if you think about it, it is probably more ok for Accumulo to force their scanner applications to call setAuthorizations(), since Accumulo has had it since the beginning. At least there is no backward compatibility issue. For us, asking users to re-write their read applications in order to use visibility labels is less desirable and not practical. While doing enablement and advocacy work for this feature, the feedback I got includes 'confusion'. Stacking multiple label generators will do the trick. But it is probably more suitable for advanced users and will complicate things. I think a reasonable and practical out-of-the-box experience is more important. Scan's default auths behavior under Visibility labels - Key: HBASE-12346 URL: https://issues.apache.org/jira/browse/HBASE-12346 Project: HBase Issue Type: Bug Components: API, security Affects Versions: 0.98.7, 0.99.1 Reporter: Jerry He Attachments: HBASE-12346-master.patch In Visibility Labels security, a set of labels (auths) is administered and associated with a user. A user can normally only see cell data during a scan that is part of the user's label set (auths). A Scan uses setAuthorizations to indicate that it wants to use those auths to access the cells. Similarly in the shell: {code} scan 'table1', AUTHORIZATIONS => ['private'] {code} But it is a surprise to find that setAuthorizations seems to be 'mandatory' in the default visibility label security setting. Every scan needs to call setAuthorizations before it can get any cells, even if the cells are under labels in the requesting user's label set. The following steps illustrate the issue: Run as superuser. {code} 1. create a visibility label called 'private' 2. 
create 'table1' 3. put into 'table1' data and label the data as 'private' 4. set_auths 'user1', 'private' 5. grant 'user1', 'RW', 'table1' {code} Run as 'user1': {code} 1. scan 'table1' This shows no cells. 2. scan 'table1', AUTHORIZATIONS => ['private'] This will show all the data. {code} I am not sure if this is expected by design or a bug. But a more reasonable, more backward compatible (for client applications), and less surprising default behavior should probably look like this: A scan's default auths, if its Authorizations attribute is not set explicitly, should be all the auths the requesting user has been granted on the server. If scan.setAuthorizations is used, then the server further filters the auths during the scan: use the input auths minus whatever is not in the user's label set on the server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11179) API parity between mapred and mapreduce
[ https://issues.apache.org/jira/browse/HBASE-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-11179: -- Fix Version/s: (was: 0.99.2) 2.0.0 API parity between mapred and mapreduce --- Key: HBASE-11179 URL: https://issues.apache.org/jira/browse/HBASE-11179 Project: HBase Issue Type: Sub-task Components: mapreduce Reporter: Nick Dimiduk Labels: beginner Fix For: 2.0.0 This ticket is for bringing the mapred package up to feature parity with mapreduce. Might become an umbrella ticket in and of itself. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11179) API parity between mapred and mapreduce
[ https://issues.apache.org/jira/browse/HBASE-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184701#comment-14184701 ] stack commented on HBASE-11179: --- Moved it out of 1.0. Move back if I have it wrong [~ndimiduk] API parity between mapred and mapreduce --- Key: HBASE-11179 URL: https://issues.apache.org/jira/browse/HBASE-11179 Project: HBase Issue Type: Sub-task Components: mapreduce Reporter: Nick Dimiduk Labels: beginner Fix For: 2.0.0 This ticket is for bringing the mapred package up to feature parity with mapreduce. Might become an umbrella ticket in and of itself. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091
[ https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184721#comment-14184721 ] Hudson commented on HBASE-12285: FAILURE: Integrated in HBase-1.0 #364 (See [https://builds.apache.org/job/HBase-1.0/364/]) HBASE-12285 Builds are failing, possibly because of SUREFIRE-1091 -- Setting log level back to DEBUG from WARN (stack: rev 65c60ce873b4216dc2d05c28191e7f1a724de8b5) * hbase-server/src/test/resources/log4j.properties Builds are failing, possibly because of SUREFIRE-1091 - Key: HBASE-12285 URL: https://issues.apache.org/jira/browse/HBASE-12285 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dima Spivak Assignee: Dima Spivak Priority: Blocker Attachments: HBASE-12285_branch-1_v1.patch Our branch-1 builds on builds.apache.org have been failing in recent days after we switched over to an official version of Surefire a few days back (HBASE-4955). The version we're using, 2.17, is hit by a bug ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results in an IOException, which looks like what we're seeing on Jenkins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091
[ https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184723#comment-14184723 ] Dima Spivak commented on HBASE-12285: - Damn, looks like just removing the output redirection isn't enough ([stream error is back|https://builds.apache.org/view/All/job/HBase-1.0/364/console]). Might as well move logging back to WARN, [~stack]. One thing that's a bit strange is that it doesn't seem to be caused by any particular test since the same number of tests completed before the error occurs is constant (3430, as seen [here|https://builds.apache.org/view/All/job/HBase-1.0/348/testReport/] and [here|https://builds.apache.org/view/All/job/HBase-1.0/347/testReport/]) even though the tests are run in random order. It's also worth pointing out that a large number of tests produce logs over 5 MB (some over 50 MB) even though the test that uncovered SUREFIRE-1091 only had to output 1 MB. And, of course, there's the why does this only hit branch-1? question that I can't answer either. I'll keep digging... Builds are failing, possibly because of SUREFIRE-1091 - Key: HBASE-12285 URL: https://issues.apache.org/jira/browse/HBASE-12285 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dima Spivak Assignee: Dima Spivak Priority: Blocker Attachments: HBASE-12285_branch-1_v1.patch Our branch-1 builds on builds.apache.org have been failing in recent days after we switched over to an official version of Surefire a few days back (HBASE-4955). The version we're using, 2.17, is hit by a bug ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results in an IOException, which looks like what we're seeing on Jenkins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12187) Review in source the paper Simple Testing Can Prevent Most Critical Failures
[ https://issues.apache.org/jira/browse/HBASE-12187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184734#comment-14184734 ] Ding Yuan commented on HBASE-12187: --- I have implemented the three checks from aspirator in error-prone version 1.1.2. These three checks are: (1) catch blocks that ignore the exception (including those containing only a log printing statement); (2) aborting the system on exception over-catch; (3) catch blocks containing TODO or FIXME in comments. Among them, (1) is a bit complicated since I included quite a few false-positive suppression heuristics as described in the paper. I have tested all three checks on HBase-0.99.0. The first check found 111 cases, while the other two found fewer than 10 each. I have attached the reported cases as attachments. Currently I have assigned all three checks ERROR severity, so if one thinks that a case is fine, an annotation like @SuppressWarnings("EmptyCatch") is needed to get the compilation to succeed. I am attaching the patch to error-prone v1.1.2, which contains the three added checks. I have also uploaded my error-prone repository to: https://github.com/diy1/error-prone-aspirator Please let me know how I can further help. Cheers, Ding Review in source the paper Simple Testing Can Prevent Most Critical Failures -- Key: HBASE-12187 URL: https://issues.apache.org/jira/browse/HBASE-12187 Project: HBase Issue Type: Bug Reporter: stack Priority: Critical Review the helpful paper https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf It describes 'catastrophic failures', especially issues where exceptions are thrown but not properly handled. Their static analysis tool Aspirator turns up a bunch of the obvious offenders (Let's add it to test-patch.sh alongside findbugs?). This issue is about going through the code base making sub-issues to root out these and others (Don't we have the test described in figure #6 already? I thought we did? 
If we don't, need to add). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
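Toy examples of the pattern the first check flags, and of the annotated suppression described in the comment above (the method names here are made up for illustration):

```java
// Toy illustrations of the empty-catch pattern targeted by check (1).
public class CatchPatterns {

    // Check (1): a catch block that silently ignores the exception.
    static int parseOrZeroBad(String s) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) {
            // swallowed: no handling, no rethrow -- this is what gets flagged
        }
        return 0;
    }

    // A deliberate, annotated suppression: since the checks run at ERROR
    // severity, intentional empty catches must carry an annotation like this
    // for compilation to succeed.
    @SuppressWarnings("EmptyCatch")
    static int parseOrZeroIntentional(String s) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) {
            // intentionally ignored: any unparsable input defaults to zero
        }
        return 0;
    }
}
```

Checks (2) and (3) are similar in spirit: (2) flags code that aborts the whole process from a catch of an overly broad exception type, and (3) flags catch blocks whose only content is a TODO/FIXME comment.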
[jira] [Updated] (HBASE-12187) Review in source the paper Simple Testing Can Prevent Most Critical Failures
[ https://issues.apache.org/jira/browse/HBASE-12187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ding Yuan updated HBASE-12187: -- Attachment: todoInCatch.warnings.txt emptyCatch.warnings.txt abortInOvercatch.warnings.txt HBASE-12187.patch Review in source the paper Simple Testing Can Prevent Most Critical Failures -- Key: HBASE-12187 URL: https://issues.apache.org/jira/browse/HBASE-12187 Project: HBase Issue Type: Bug Reporter: stack Priority: Critical Attachments: HBASE-12187.patch, abortInOvercatch.warnings.txt, emptyCatch.warnings.txt, todoInCatch.warnings.txt Review the helpful paper https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf It describes 'catastrophic failures', especially issues where exceptions are thrown but not properly handled. Their static analysis tool Aspirator turns up a bunch of the obvious offenders (Let's add it to test-patch.sh alongside findbugs?). This issue is about going through the code base making sub-issues to root out these and others (Don't we have the test described in figure #6 already? I thought we did? If we don't, need to add). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12326) Document scanner timeout workarounds in troubleshooting section
[ https://issues.apache.org/jira/browse/HBASE-12326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184756#comment-14184756 ] Misty Stanley-Jones commented on HBASE-12326: - Anyone available to review? [~saint@gmail.com] [~apurtell] [~ndimiduk] perhaps? Document scanner timeout workarounds in troubleshooting section --- Key: HBASE-12326 URL: https://issues.apache.org/jira/browse/HBASE-12326 Project: HBase Issue Type: Bug Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Attachments: HBASE-12326.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12346) Scan's default auths behavior under Visibility labels
[ https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184772#comment-14184772 ] Anoop Sam John commented on HBASE-12346: We have EnforcingScanLabelGenerator, which will always give back the user's auth labels and ignore whatever is passed in the Scan Authorizations. Even if the scan passes a subset of the user's auth labels, that is ignored and all user auths will be assigned. And this is not the default impl; the DefaultScanLabelGenerator, when no Authorizations are passed in the Scan, gives back no results. I am more inclined towards doing what this jira proposes. That looks easy to use. We can have a stack of ScanLabelGenerators and achieve what the user wants. But a default behaving as proposed would be more user friendly, IMO. I was also thinking about this some time back, to make things easier. What do you say, Andy? Scan's default auths behavior under Visibility labels - Key: HBASE-12346 URL: https://issues.apache.org/jira/browse/HBASE-12346 Project: HBase Issue Type: Bug Components: API, security Affects Versions: 0.98.7, 0.99.1 Reporter: Jerry He Attachments: HBASE-12346-master.patch In Visibility Labels security, a set of labels (auths) is administered and associated with a user. A user can normally only see cell data during a scan that is part of the user's label set (auths). A Scan uses setAuthorizations to indicate that it wants to use those auths to access the cells. Similarly in the shell: {code} scan 'table1', AUTHORIZATIONS => ['private'] {code} But it is a surprise to find that setAuthorizations seems to be 'mandatory' in the default visibility label security setting. Every scan needs to call setAuthorizations before it can get any cells, even if the cells are under labels in the requesting user's label set. The following steps illustrate the issue: Run as superuser. {code} 1. create a visibility label called 'private' 2. create 'table1' 3. put into 'table1' data and label the data as 'private' 4. 
set_auths 'user1', 'private' 5. grant 'user1', 'RW', 'table1' {code} Run as 'user1': {code} 1. scan 'table1' This shows no cells. 2. scan 'table1', AUTHORIZATIONS => ['private'] This will show all the data. {code} I am not sure if this is expected by design or a bug. But a more reasonable, more backward compatible (for client applications), and less surprising default behavior should probably look like this: A scan's default auths, if its Authorizations attribute is not set explicitly, should be all the auths the requesting user has been granted on the server. If scan.setAuthorizations is used, then the server further filters the auths during the scan: use the input auths minus whatever is not in the user's label set on the server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12313) Redo the hfile index length optimization so cell-based rather than serialized KV key
[ https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184782#comment-14184782 ] Anoop Sam John commented on HBASE-12313: bq. Where we were using estimatedLengthOf (What is this anyways – smile? Serialized 'length' or size on heap? Or size of the serialized KeyValue byte array – which is going away), we were talking serialized size. I was thinking estimatedSerializedSizeOf would be more appropriate where I did the replaces. This is mostly used in metric calc now; some extra bytes are ok there. It is used in the SizedCellScanner size calc also, but I cannot see that this size is really being used now. Since it is an estimate, I am ok with the change. Basically the diff btw estimatedSizeOf and estimatedLengthOf is that the former has an extra INT of size. This is because when we serialize the KV over the wire, we write the KV length (4 bytes) first, followed by the kv buffer (KL, VL, Key and Value). getSumOfKeyElementLengths: This is supposed to add the rk, cf, q and ts/type parts, but in the patch we end up adding the value and tags parts also. Am I missing something? Do we need to fix this? Redo the hfile index length optimization so cell-based rather than serialized KV key Key: HBASE-12313 URL: https://issues.apache.org/jira/browse/HBASE-12313 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: stack Assignee: stack Attachments: 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 12313v5.txt Trying to remove the API that returns the 'key' of a KV serialized into a byte array is thorny. I tried to move over the first and last key serializations and the hfile index entries to be Cell-based but the patch was turning massive. 
Here is a smaller patch that just redoes the optimization that tries to find 'short' midpoints between the last key of the last block and the first key of the next block, so it is Cell-based rather than byte-array based (presuming Keys are serialized in a certain way). Adds unit tests which we didn't have before. Also removes CellKey. Not needed... at least not yet. It's just a utility for toString. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
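The size relationship discussed in the comment can be sketched numerically. The field widths follow the comment (4-byte total length, then KL, VL, key bytes, value bytes); the method bodies below are a hypothetical illustration, not the actual KeyValueUtil code:

```java
// Sketch of the KeyValue size arithmetic discussed above; the layout
// (4-byte length prefix, then KL, VL, Key, Value) follows the comment, but
// these method bodies are illustrative, not the real HBase implementations.
public class KvSizes {

    static final int INT_SIZE = 4;

    // 'length': the KL and VL ints plus the key and value bytes themselves.
    public static int estimatedLengthOf(int keyLen, int valueLen) {
        return INT_SIZE + INT_SIZE + keyLen + valueLen;
    }

    // 'size': the same, plus the leading 4-byte total-length prefix written
    // before the kv buffer when serializing over the wire -- hence the
    // "extra INT" difference between the two.
    public static int estimatedSizeOf(int keyLen, int valueLen) {
        return INT_SIZE + estimatedLengthOf(keyLen, valueLen);
    }
}
```

For example, a 10-byte key with a 20-byte value gives a length of 4 + 4 + 10 + 20 = 38 bytes and a size of 42 bytes, exactly one INT apart.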
[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091
[ https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184793#comment-14184793 ] stack commented on HBASE-12285: --- Ok. I'll set it back. I'll put back the redirect-to-logs config too. Builds are failing, possibly because of SUREFIRE-1091 - Key: HBASE-12285 URL: https://issues.apache.org/jira/browse/HBASE-12285 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dima Spivak Assignee: Dima Spivak Priority: Blocker Attachments: HBASE-12285_branch-1_v1.patch Our branch-1 builds on builds.apache.org have been failing in recent days after we switched over to an official version of Surefire a few days back (HBASE-4955). The version we're using, 2.17, is hit by a bug ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results in an IOException, which looks like what we're seeing on Jenkins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091
[ https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184794#comment-14184794 ] stack commented on HBASE-12285: --- Set it back. Maybe next up is hosting our own surefire build, but we should spend some time on tests that log 50 MB for sure; we will only annoy people logging that much. Builds are failing, possibly because of SUREFIRE-1091 - Key: HBASE-12285 URL: https://issues.apache.org/jira/browse/HBASE-12285 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dima Spivak Assignee: Dima Spivak Priority: Blocker Attachments: HBASE-12285_branch-1_v1.patch Our branch-1 builds on builds.apache.org have been failing in recent days after we switched over to an official version of Surefire a few days back (HBASE-4955). The version we're using, 2.17, is hit by a bug ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results in an IOException, which looks like what we're seeing on Jenkins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12313) Redo the hfile index length optimization so cell-based rather than serialized KV key
[ https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184796#comment-14184796 ] stack commented on HBASE-12313: --- bq. Basically the diff btw estimatedSizeOf and estimatedLengthOf was the former is having an INT size extra. I think in the end we want serialized size and heap size and maybe an estimated size that would be cheaper to calculate than either of the former for places where it is not that important. The patch makes a start on it. The sizings that are in this patch as I see it cause no problem; perhaps a slight overcount but it's for metrics only -- not for anything important (You agree?) bq. Am I missing something? Do we need to fix this? No, you are right, but 'do we need to fix it?' Is it ok that the size calculated is 'rough', approximate, in this context, or do you think otherwise? Redo the hfile index length optimization so cell-based rather than serialized KV key Key: HBASE-12313 URL: https://issues.apache.org/jira/browse/HBASE-12313 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: stack Assignee: stack Attachments: 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 12313v5.txt Trying to remove API that returns the 'key' of a KV serialized into a byte array is thorny. I tried to move over the first and last key serializations and the hfile index entries to be cell but the patch was turning massive. Here is a smaller patch that just redoes the optimization that tries to find 'short' midpoints between last key of last block and first key of next block so it is Cell-based rather than byte array based (presuming Keys serialized in a certain way). 
Adds unit tests which we didn't have before. Also removes CellKey. Not needed... at least not yet. It's just a utility for toString. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11912) Catch some bad practices at compile time with error-prone
[ https://issues.apache.org/jira/browse/HBASE-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184801#comment-14184801 ] stack commented on HBASE-11912: --- You commit your fixup [~apurtell]? Looks like we need this (smile). See over in HBASE-12187. Catch some bad practices at compile time with error-prone - Key: HBASE-11912 URL: https://issues.apache.org/jira/browse/HBASE-11912 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Attachments: HBASE-11912.patch, HBASE-11912.patch, HBASE-11912.patch, HBASE-11912.patch Google's error-prone (https://code.google.com/p/error-prone/) wraps javac with some additional static analysis that will generate additional warnings or errors at compile time if certain bug patterns (https://code.google.com/p/error-prone/wiki/BugPatterns) are detected. What's nice about this approach, as opposed to findbugs, is the compile time detection and erroring out prevent the detected problems from getting into the codebase up front. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
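The value of compile-time bug-pattern detection is easiest to see with a concrete example. The snippet below is a hypothetical illustration (not from the HBASE-11912 patch) of the kind of reference-equality mistake that error-prone-style checks flag at compile time but plain javac happily accepts:

```java
// Hypothetical illustration of a classic bug pattern that static checkers
// such as error-prone flag: == on object types compares references, not
// contents, yet javac compiles it without complaint.
public class ReferenceEqualityDemo {
    public static boolean buggyCompare(String a, String b) {
        return a == b; // reference identity: a bug-pattern checker warns here
    }

    public static boolean correctCompare(String a, String b) {
        return a.equals(b); // value equality: what was almost always intended
    }

    public static void main(String[] args) {
        String x = new String("hbase");
        String y = new String("hbase"); // equal contents, distinct objects
        System.out.println(buggyCompare(x, y));   // false
        System.out.println(correctCompare(x, y)); // true
    }
}
```

Catching this at compile time, rather than via a later findbugs run, is exactly the "erroring out up front" benefit described above.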
[jira] [Commented] (HBASE-12207) A script to help keep your Git repo fresh
[ https://issues.apache.org/jira/browse/HBASE-12207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184811#comment-14184811 ] Misty Stanley-Jones commented on HBASE-12207: - No further comments. Since this is just a script and not part of HBase itself, I will go ahead and commit it (fixing the tabs in the output on commit). A script to help keep your Git repo fresh - Key: HBASE-12207 URL: https://issues.apache.org/jira/browse/HBASE-12207 Project: HBase Issue Type: Improvement Components: documentation, scripts Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Attachments: HBASE-12207-v1.patch, HBASE-12207-v2.patch, HBASE-12207-v3.patch, HBASE-12207-v4.patch, HBASE-12207-v5.patch, HBASE-12207-v6.patch, HBASE-12207.patch I have a script that does a {code}git pull --rebase{code} on each tracking branch, and then attempts an automatic rebase of each local branch against its tracking branch. It also prompts you to delete local branches for HBASE- JIRAs that have been closed. I think this script may help to enforce good Git practices. It may be a good candidate to be included in dev-support/. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12207) A script to help keep your Git repo fresh
[ https://issues.apache.org/jira/browse/HBASE-12207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Stanley-Jones updated HBASE-12207: Attachment: HBASE-12207-v7.patch What I committed to Master. A script to help keep your Git repo fresh - Key: HBASE-12207 URL: https://issues.apache.org/jira/browse/HBASE-12207 Project: HBase Issue Type: Improvement Components: documentation, scripts Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Attachments: HBASE-12207-v1.patch, HBASE-12207-v2.patch, HBASE-12207-v3.patch, HBASE-12207-v4.patch, HBASE-12207-v5.patch, HBASE-12207-v6.patch, HBASE-12207-v7.patch, HBASE-12207.patch I have a script that does a {code}git pull --rebase{code} on each tracking branch, and then attempts an automatic rebase of each local branch against its tracking branch. It also prompts you to delete local branches for HBASE- JIRAs that have been closed. I think this script may help to enforce good Git practices. It may be a good candidate to be included in dev-support/. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12207) A script to help keep your Git repo fresh
[ https://issues.apache.org/jira/browse/HBASE-12207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Stanley-Jones updated HBASE-12207: Resolution: Fixed Fix Version/s: 2.0.0 Status: Resolved (was: Patch Available) A script to help keep your Git repo fresh - Key: HBASE-12207 URL: https://issues.apache.org/jira/browse/HBASE-12207 Project: HBase Issue Type: Improvement Components: documentation, scripts Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Fix For: 2.0.0 Attachments: HBASE-12207-v1.patch, HBASE-12207-v2.patch, HBASE-12207-v3.patch, HBASE-12207-v4.patch, HBASE-12207-v5.patch, HBASE-12207-v6.patch, HBASE-12207-v7.patch, HBASE-12207.patch I have a script that does a {code}git pull --rebase{code} on each tracking branch, and then attempts an automatic rebase of each local branch against its tracking branch. It also prompts you to delete local branches for HBASE- JIRAs that have been closed. I think this script may help to enforce good Git practices. It may be a good candidate to be included in dev-support/. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12313) Redo the hfile index length optimization so cell-based rather than serialized KV key
[ https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184820#comment-14184820 ] Anoop Sam John commented on HBASE-12313: bq. The sizings that are in this patch as I see it cause no problem; perhaps a slight overcount but its for metrics only – not for anything important (You agree?) Yes, I am ok with it, Stack. bq. No you are right but 'do we need to fix it?' It is ok that the size calculated is 'rough', approx As you can see below, estimatedSerializedSizeOfKey() returns the key parts' lengths only when it is a KeyValue. But when it is a non-KV Cell, we end up adding the value and tags lengths as well. This won't be a slight change, as the value length can normally be very large. I am concerned over this. {code}
+  private static int getSumOfKeyElementLengths(final Cell cell) {
+    return cell.getRowLength() + cell.getFamilyLength() +
+        cell.getQualifierLength() +
+        cell.getValueLength() +
+        cell.getTagsLength() +
+        KeyValue.TIMESTAMP_TYPE_SIZE;
+  }
+
+  public static int estimatedSerializedSizeOfKey(final Cell cell) {
+    if (cell instanceof KeyValue) return ((KeyValue)cell).getKeyLength();
+    // This will be a low estimate. Will do for now.
+    return getSumOfKeyElementLengths(cell);
+  }
{code} Redo the hfile index length optimization so cell-based rather than serialized KV key Key: HBASE-12313 URL: https://issues.apache.org/jira/browse/HBASE-12313 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: stack Assignee: stack Attachments: 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 12313v5.txt Trying to remove API that returns the 'key' of a KV serialized into a byte array is thorny. 
I tried to move over the first and last key serializations and the hfile index entries to be cell but patch was turning massive. Here is a smaller patch that just redoes the optimization that tries to find 'short' midpoints between last key of last block and first key of next block so it is Cell-based rather than byte array based (presuming Keys serialized in a certain way). Adds unit tests which we didn't have before. Also remove CellKey. Not needed... at least not yet. Its just utility for toString. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
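The overcount Anoop is concerned about can be made concrete with plain arithmetic. The sketch below is a stand-alone illustration (plain ints, no HBase types; the constant 9 matches KeyValue's 8-byte timestamp plus 1-byte type): the key-only estimate versus a sum that also folds in value and tags lengths, which is where a non-KeyValue Cell with a large value inflates the number.

```java
// Stand-alone sketch of the two size estimates being discussed.
// No HBase types are used; component lengths are plain ints.
public class KeySizeSketch {
    // 8-byte timestamp + 1-byte type, as in KeyValue.TIMESTAMP_TYPE_SIZE.
    static final int TIMESTAMP_TYPE_SIZE = 9;

    // Key-only estimate: what estimatedSerializedSizeOfKey intends to return.
    static int keyOnly(int rowLen, int familyLen, int qualifierLen) {
        return rowLen + familyLen + qualifierLen + TIMESTAMP_TYPE_SIZE;
    }

    // Sum that also adds value and tags lengths: for a non-KeyValue Cell
    // with a large value, this estimate is inflated well beyond the key.
    static int withValueAndTags(int rowLen, int familyLen, int qualifierLen,
                                int valueLen, int tagsLen) {
        return keyOnly(rowLen, familyLen, qualifierLen) + valueLen + tagsLen;
    }

    public static void main(String[] args) {
        int row = 10, family = 2, qualifier = 8, value = 64 * 1024, tags = 0;
        // A 64KB value turns a 29-byte key estimate into ~64KB.
        System.out.println(keyOnly(row, family, qualifier));
        System.out.println(withValueAndTags(row, family, qualifier, value, tags));
    }
}
```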
[jira] [Commented] (HBASE-11419) After increasing TTL value of a hbase table having pre-split regions and decreasing TTL value, table becomes inaccessible.
[ https://issues.apache.org/jira/browse/HBASE-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184832#comment-14184832 ] Prabhu Joseph commented on HBASE-11419: --- Hi Lars, This issue happens in Distributed mode. We have two regionservers. I have attached our hbase-site.xml. Hbase version is hbase 0.94.6 After increasing TTL value of a hbase table having pre-split regions and decreasing TTL value, table becomes inaccessible. -- Key: HBASE-11419 URL: https://issues.apache.org/jira/browse/HBASE-11419 Project: HBase Issue Type: Bug Components: HFile Affects Versions: 0.94.6 Environment: Linux x86_64 Reporter: Prabhu Joseph Priority: Blocker Attachments: HBaseExporter.java, account.csv Original Estimate: 96h Remaining Estimate: 96h After increasing and decreasing the TTL value of a Hbase Table , table gets inaccessible. Scan table not working. Scan in hbase shell throws java.lang.IllegalStateException: Block index not loaded at com.google.common.base.Preconditions.checkState(Preconditions.java:145) at org.apache.hadoop.hbase.io.hfile.HFileReaderV1.blockContainingKey(HFileReaderV1.java:181) at org.apache.hadoop.hbase.io.hfile.HFileReaderV1$AbstractScannerV1.seekTo(HFileReaderV1.java:426) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:226) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:145) at org.apache.hadoop.hbase.regionserver.StoreScanner.init(StoreScanner.java:131) at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2015) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.init(HRegion.java:3706) at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1761) at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1753) at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1730) at 
org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2409) at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11419) After increasing TTL value of a hbase table having pre-split regions and decreasing TTL value, table becomes inaccessible.
[ https://issues.apache.org/jira/browse/HBASE-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated HBASE-11419: -- Attachment: hbase-site.xml After increasing TTL value of a hbase table having pre-split regions and decreasing TTL value, table becomes inaccessible. -- Key: HBASE-11419 URL: https://issues.apache.org/jira/browse/HBASE-11419 Project: HBase Issue Type: Bug Components: HFile Affects Versions: 0.94.6 Environment: Linux x86_64 Reporter: Prabhu Joseph Priority: Blocker Attachments: HBaseExporter.java, account.csv, hbase-site.xml Original Estimate: 96h Remaining Estimate: 96h After increasing and decreasing the TTL value of a Hbase Table , table gets inaccessible. Scan table not working. Scan in hbase shell throws java.lang.IllegalStateException: Block index not loaded at com.google.common.base.Preconditions.checkState(Preconditions.java:145) at org.apache.hadoop.hbase.io.hfile.HFileReaderV1.blockContainingKey(HFileReaderV1.java:181) at org.apache.hadoop.hbase.io.hfile.HFileReaderV1$AbstractScannerV1.seekTo(HFileReaderV1.java:426) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:226) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:145) at org.apache.hadoop.hbase.regionserver.StoreScanner.init(StoreScanner.java:131) at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2015) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.init(HRegion.java:3706) at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1761) at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1753) at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1730) at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2409) at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HBASE-10780) HFilePrettyPrinter#processFile should return immediately if file does not exists.
[ https://issues.apache.org/jira/browse/HBASE-10780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singhi reassigned HBASE-10780: - Assignee: Ashish Singhi HFilePrettyPrinter#processFile should return immediately if file does not exists. - Key: HBASE-10780 URL: https://issues.apache.org/jira/browse/HBASE-10780 Project: HBase Issue Type: Bug Components: HFile Affects Versions: 0.94.11 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Minor Attachments: HBASE-10780.patch HFilePrettyPrinter#processFile should return immediately if the file does not exist, the same as HLogPrettyPrinter#run does: {code}if (!fs.exists(file)) { System.err.println("ERROR, file doesnt exist: " + file); }{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
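The missing piece in the quoted snippet is the early return after the error message. A minimal stand-alone sketch of that behavior, with java.io.File standing in for the HDFS FileSystem#exists check (the method name and the -2 return code are hypothetical, not from the patch):

```java
import java.io.File;

// Minimal sketch of the proposed early-return behavior. java.io.File
// stands in for the HDFS FileSystem#exists check; names and the error
// code are hypothetical illustrations, not the actual patch.
public class ProcessFileSketch {
    // Prints an error and returns a non-zero code when the file is absent,
    // instead of falling through into the pretty-printing logic.
    static int processFile(File file) {
        if (!file.exists()) {
            System.err.println("ERROR, file doesnt exist: " + file);
            return -2; // return immediately; do not attempt to read the file
        }
        // ... the actual HFile pretty-printing would happen here ...
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(processFile(new File("/no/such/file/for/sure")));
    }
}
```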
[jira] [Updated] (HBASE-12304) CellCounter will throw AIOBE when output directory is not specified
[ https://issues.apache.org/jira/browse/HBASE-12304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singhi updated HBASE-12304: -- Attachment: HBASE-12304-0.98.patch HBASE-12304-v3.patch Attached patch for master and 0.98 branch. Please review. CellCounter will throw AIOBE when output directory is not specified --- Key: HBASE-12304 URL: https://issues.apache.org/jira/browse/HBASE-12304 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.98.5 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Minor Attachments: HBASE-12304-0.98.patch, HBASE-12304-v2.patch, HBASE-12304-v3.patch, HBase-12304.patch CellCounter will throw ArrayIndexOutOfBoundsException when output directory is not specified instead it should display the usage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
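The CellCounter fix amounts to guarding the argument count before indexing into args and printing usage instead of throwing. A hypothetical stand-alone sketch of that guard (the usage string and method names are illustrative, not the actual patch):

```java
// Hypothetical sketch of guarding argument access before indexing, which
// is the shape of fix that avoids the ArrayIndexOutOfBoundsException when
// the output directory argument is missing. Names are illustrative only.
public class ArgsGuardSketch {
    // Returns true when enough arguments were supplied; otherwise prints
    // usage and lets the caller exit cleanly instead of indexing args[1].
    static boolean checkArgs(String[] args, int required) {
        if (args.length < required) {
            System.err.println(
                "Usage: CellCounter <tablename> <outputDir> [options]");
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // With only a table name and no output directory, we print usage
        // rather than reading past the end of the array.
        System.out.println(checkArgs(new String[]{"mytable"}, 2));
        System.out.println(checkArgs(new String[]{"mytable", "/out"}, 2));
    }
}
```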
[jira] [Updated] (HBASE-10780) HFilePrettyPrinter#processFile should return immediately if file does not exists.
[ https://issues.apache.org/jira/browse/HBASE-10780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singhi updated HBASE-10780: -- Attachment: HBASE-10780-v2.patch Thanks Ted for looking into it. Updated the patch as per your suggestion. Please review. HFilePrettyPrinter#processFile should return immediately if file does not exists. - Key: HBASE-10780 URL: https://issues.apache.org/jira/browse/HBASE-10780 Project: HBase Issue Type: Bug Components: HFile Affects Versions: 0.94.11 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Minor Attachments: HBASE-10780-v2.patch, HBASE-10780.patch HFilePrettyPrinter#processFile should return immediately if the file does not exist, the same as HLogPrettyPrinter#run does: {code}if (!fs.exists(file)) { System.err.println("ERROR, file doesnt exist: " + file); }{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12304) CellCounter will throw AIOBE when output directory is not specified
[ https://issues.apache.org/jira/browse/HBASE-12304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184839#comment-14184839 ] Hadoop QA commented on HBASE-12304: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12677242/HBASE-12304-0.98.patch against trunk revision . ATTACHMENT ID: 12677242 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11469//console This message is automatically generated. CellCounter will throw AIOBE when output directory is not specified --- Key: HBASE-12304 URL: https://issues.apache.org/jira/browse/HBASE-12304 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.98.5 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Minor Attachments: HBASE-12304-0.98.patch, HBASE-12304-v2.patch, HBASE-12304-v3.patch, HBase-12304.patch CellCounter will throw ArrayIndexOutOfBoundsException when output directory is not specified instead it should display the usage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11992) Backport HBASE-11367 (Pluggable replication endpoint) to 0.98
[ https://issues.apache.org/jira/browse/HBASE-11992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184840#comment-14184840 ] ramkrishna.s.vasudevan commented on HBASE-11992: [~apurtell],[~apurt...@yahoo.com] Please take a look at the RB. It would be useful to get the feature that depends on this (HBASE-11639) into 0.98. Backport HBASE-11367 (Pluggable replication endpoint) to 0.98 - Key: HBASE-11992 URL: https://issues.apache.org/jira/browse/HBASE-11992 Project: HBase Issue Type: Task Reporter: Andrew Purtell Assignee: ramkrishna.s.vasudevan Attachments: HBASE-11992_0.98_1.patch, hbase-11367_0.98.patch ReplicationSource tails the logs for each peer. HBASE-11367 introduces ReplicationEndpoint which is customizable per peer. ReplicationEndpoint is run in the same RS process and instantiated per replication peer per region server. Implementations of this interface handle the actual shipping of WAL edits to the remote cluster. This issue is for backporting HBASE-11367 to 0.98. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091
[ https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184843#comment-14184843 ] Hudson commented on HBASE-12285: SUCCESS: Integrated in HBase-1.0 #365 (See [https://builds.apache.org/job/HBase-1.0/365/]) HBASE-12285 Builds are failing, possibly because of SUREFIRE-1091 -- Setting log level back to DEBUG TO WARN -- second time (stack: rev 862faca7a4f82e032572f8426851968cf7ba017c) * hbase-server/src/test/resources/log4j.properties Builds are failing, possibly because of SUREFIRE-1091 - Key: HBASE-12285 URL: https://issues.apache.org/jira/browse/HBASE-12285 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dima Spivak Assignee: Dima Spivak Priority: Blocker Attachments: HBASE-12285_branch-1_v1.patch Our branch-1 builds on builds.apache.org have been failing in recent days after we switched over to an official version of Surefire a few days back (HBASE-4955). The version we're using, 2.17, is hit by a bug ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results in an IOException, which looks like what we're seeing on Jenkins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-12347) Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh
Misty Stanley-Jones created HBASE-12347: --- Summary: Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh Key: HBASE-12347 URL: https://issues.apache.org/jira/browse/HBASE-12347 Project: HBase Issue Type: Sub-task Components: scripts Reporter: Misty Stanley-Jones Priority: Minor The rebase_all_hbase_branches.sh script is unable to detect that HBASE-12207 is closed, because for that one JIRA, the curl command that detects the status is returning the status, but also the text from Hadoop QA for each patch it has evaluated on the JIRA: {code}
$ curl -s https://issues.apache.org/jira/browse/HBASE-12207 | grep resolution-val
<span id="resolution-val" class="value">resolved
+ jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\">//"|cut -d'<' -f 1)
+ jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\">//"|cut -d'<' -f 1)
+ jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\">//"|cut -d'<' -f 1)
+ jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\">//"|cut -d'<' -f 1)
+ jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\">//"|cut -d'<' -f 1)
+ jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\">//"|cut -d'<' -f 1)
+ jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\">//"|cut -d'<' -f 1)
+ jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\">//"|cut -d'<' -f 1)
{code} All but the top line of output are from parsing comments from Hadoop QA. I think this is an edge case that will only apply to patches against that section of dev-support/rebase_all_git_branches.sh. However, I'm not sure of the cleanest way to fix it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12347) Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh
[ https://issues.apache.org/jira/browse/HBASE-12347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Stanley-Jones updated HBASE-12347: Issue Type: Bug (was: Sub-task) Parent: (was: HBASE-12207) Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh Key: HBASE-12347 URL: https://issues.apache.org/jira/browse/HBASE-12347 Project: HBase Issue Type: Bug Components: scripts Reporter: Misty Stanley-Jones Priority: Minor Fix For: 2.0.0 The rebase_all_hbase_branches.sh script is unable to detect that HBASE-12207 is closed, because for that one JIRA, the curl command that detects the status is returning the status, but also the text from Hadoop QA for each patch it has evaluated on the JIRA: {code} $ curl -s https://issues.apache.org/jira/browse/HBASE-12207 | grep resolution-val span id=resolution-val class=value resolved + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1)br/ + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1) + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1)br/ + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1) + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1)br/ + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1) + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1)br/ + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1) {code} All but the top line of output are from parsing comments from Hadoop QA. I think this is an edge case that will only to patches against that section of dev-support/rebase_all_git_branches.sh. 
However, I'm not sure of the cleanest way to fix it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HBASE-12347) Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh
[ https://issues.apache.org/jira/browse/HBASE-12347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Stanley-Jones reassigned HBASE-12347: --- Assignee: Misty Stanley-Jones Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh Key: HBASE-12347 URL: https://issues.apache.org/jira/browse/HBASE-12347 Project: HBase Issue Type: Bug Components: scripts Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Priority: Minor Fix For: 2.0.0 The rebase_all_hbase_branches.sh script is unable to detect that HBASE-12207 is closed, because for that one JIRA, the curl command that detects the status is returning the status, but also the text from Hadoop QA for each patch it has evaluated on the JIRA: {code} $ curl -s https://issues.apache.org/jira/browse/HBASE-12207 | grep resolution-val span id=resolution-val class=value resolved + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1)br/ + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1) + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1)br/ + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1) + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1)br/ + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1) + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1)br/ + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1) {code} All but the top line of output are from parsing comments from Hadoop QA. I think this is an edge case that will only to patches against that section of dev-support/rebase_all_git_branches.sh. 
However, I'm not sure of the cleanest way to fix it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12347) Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh
[ https://issues.apache.org/jira/browse/HBASE-12347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184858#comment-14184858 ] Misty Stanley-Jones commented on HBASE-12347: - [~busbey] figured out a less brittle way than grepping the HTML: {code} curl -s 'https://issues.apache.org/jira/rest/api/2/issue/HBASE-5699?fields=resolution'|grep -q '{resolution:null}' {code} status is 0 if true (unresolved), 1 if false (something other than unresolved). I'll make a patch. Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh Key: HBASE-12347 URL: https://issues.apache.org/jira/browse/HBASE-12347 Project: HBase Issue Type: Bug Components: scripts Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Priority: Minor Fix For: 2.0.0 The rebase_all_hbase_branches.sh script is unable to detect that HBASE-12207 is closed, because for that one JIRA, the curl command that detects the status is returning the status, but also the text from Hadoop QA for each patch it has evaluated on the JIRA: {code} $ curl -s https://issues.apache.org/jira/browse/HBASE-12207 | grep resolution-val span id=resolution-val class=value resolved + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1)br/ + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1) + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1)br/ + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1) + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1)br/ + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1) + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e s/.*class=\value\ //|cut -d'' -f 1)br/ + 
jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\">//"|cut -d'<' -f 1) {code} All but the top line of output are from parsing comments from Hadoop QA. I think this is an edge case that will only apply to patches against that section of dev-support/rebase_all_git_branches.sh. However, I'm not sure of the cleanest way to fix it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
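The REST-based check Sean suggested boils down to string-matching the resolution field in the JSON that `/rest/api/2/issue/<KEY>?fields=resolution` returns. A stand-alone sketch of that predicate, applied to the JSON body rather than making the HTTP call (the response shape shown is what JIRA's v2 REST API returns; in the real JSON the field name is quoted, i.e. "resolution":null):

```java
// Stand-alone sketch of the resolution check applied to the JSON body that
// `curl .../rest/api/2/issue/<KEY>?fields=resolution` would return. Doing
// the match on the JSON body avoids grepping the rendered HTML page, which
// is what broke when Hadoop QA comments echoed the grep pattern back.
public class ResolutionCheckSketch {
    static boolean isUnresolved(String json) {
        // Strip whitespace so pretty-printed responses also match.
        return json.replaceAll("\\s", "").contains("\"resolution\":null");
    }

    public static void main(String[] args) {
        System.out.println(
            isUnresolved("{\"fields\":{\"resolution\":null}}"));
        System.out.println(
            isUnresolved("{\"fields\":{\"resolution\":{\"name\":\"Fixed\"}}}"));
    }
}
```

This mirrors the shell one-liner's exit-status logic: true (status 0) for unresolved, false otherwise.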
[jira] [Updated] (HBASE-12347) Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh
[ https://issues.apache.org/jira/browse/HBASE-12347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Stanley-Jones updated HBASE-12347: Attachment: HBASE-12347.patch Ready for review. To test, make a branch called HBASE-12207 and see if it gets picked up by the script as resolved. That JIRA ID is the only one that seems to be affected by this (though it would be easy to make a fake case). Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh Key: HBASE-12347 URL: https://issues.apache.org/jira/browse/HBASE-12347 Project: HBase Issue Type: Bug Components: scripts Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Priority: Minor Fix For: 2.0.0 Attachments: HBASE-12347.patch The rebase_all_hbase_branches.sh script is unable to detect that HBASE-12207 is closed, because for that one JIRA, the curl command that detects the status returns not only the status, but also the text from Hadoop QA for each patch it has evaluated on the JIRA:
{code}
$ curl -s https://issues.apache.org/jira/browse/HBASE-12207 | grep resolution-val
<span id="resolution-val" class="value resolved">
+ jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\" //"|cut -d'<' -f 1)<br/>
+ jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\" //"|cut -d'<' -f 1)
+ jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\" //"|cut -d'<' -f 1)<br/>
+ jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\" //"|cut -d'<' -f 1)
+ jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\" //"|cut -d'<' -f 1)<br/>
+ jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\" //"|cut -d'<' -f 1)
+ jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\" //"|cut -d'<' -f 1)<br/>
+ jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed -e "s/.*class=\"value\" //"|cut -d'<' -f 1)
{code}
All but the top line of output come from parsing comments left by Hadoop QA. I think this is an edge case that will only apply to patches against that section of dev-support/rebase_all_git_branches.sh. However, I'm not sure of the cleanest way to fix it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12347) Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh
[ https://issues.apache.org/jira/browse/HBASE-12347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Stanley-Jones updated HBASE-12347: Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-12345) Unsafe based Comparator for BB
[ https://issues.apache.org/jira/browse/HBASE-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184864#comment-14184864 ] ramkrishna.s.vasudevan commented on HBASE-12345: I will check this patch later tomorrow. I created a simple patch like this by reading the Unsafe APIs, and will do some tests with the attached patch; I have some doubts about it. Unsafe based Comparator for BB --- Key: HBASE-12345 URL: https://issues.apache.org/jira/browse/HBASE-12345 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Attachments: HBASE-12345.patch
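An Unsafe-based lexicographic comparator of the kind discussed here could be sketched roughly as follows. This is a minimal sketch, not the attached patch; the class and method names are made up for illustration. It compares eight bytes at a time via Unsafe.getLong and falls back to a byte-at-a-time loop for the tail; on a little-endian JVM, the words must be byte-reversed before an unsigned compare to preserve lexicographic order.

{code}
// Sketch only -- not the attached patch; names here are hypothetical.
import java.lang.reflect.Field;
import java.nio.ByteOrder;
import sun.misc.Unsafe;

final class UnsafeBBComparator {
    private static final Unsafe UNSAFE;
    private static final long BYTE_ARRAY_BASE;
    private static final boolean LITTLE_ENDIAN =
        ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN;

    static {
        try {
            // theUnsafe is private; grab it reflectively (the standard pre-Java-9 trick)
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            UNSAFE = (Unsafe) f.get(null);
            BYTE_ARRAY_BASE = UNSAFE.arrayBaseOffset(byte[].class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Unsigned lexicographic compare of two byte[] ranges, 8 bytes at a time. */
    static int compare(byte[] a, int aOff, int aLen,
                       byte[] b, int bOff, int bLen) {
        int minLen = Math.min(aLen, bLen);
        int i = 0;
        for (; i + 8 <= minLen; i += 8) {       // word-at-a-time fast path
            long la = UNSAFE.getLong(a, BYTE_ARRAY_BASE + aOff + i);
            long lb = UNSAFE.getLong(b, BYTE_ARRAY_BASE + bOff + i);
            if (la != lb) {
                if (LITTLE_ENDIAN) {            // restore big-endian byte order
                    la = Long.reverseBytes(la);
                    lb = Long.reverseBytes(lb);
                }
                return Long.compareUnsigned(la, lb) < 0 ? -1 : 1;
            }
        }
        for (; i < minLen; i++) {               // byte-at-a-time tail
            int diff = (a[aOff + i] & 0xff) - (b[bOff + i] & 0xff);
            if (diff != 0) return diff;
        }
        return aLen - bLen;                     // shared prefix: shorter sorts first
    }
}
{code}

The endianness handling is the part that usually hides bugs: Unsafe.getLong reads in native byte order, so a naive signed long compare would order bytes incorrectly on x86.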
[jira] [Commented] (HBASE-12282) Ensure Cells and its implementations work with Buffers also
[ https://issues.apache.org/jira/browse/HBASE-12282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184865#comment-14184865 ] ramkrishna.s.vasudevan commented on HBASE-12282: +1 for a new Cell interface that extends the current Cell but adds BB APIs to it. Everywhere in the comparator we will have a condition-based check. One thing to note is that for the comparators, one cell can be the BB-based cell and the other can be byte[]-backed. The changes to KV are hacky, but that is basically to make things work and ensure that we have a KV backed by buffer and byte[]. At least the fake keys that we create could directly be buffer based, so those comparisons can be buffer based only. Ensure Cells and its implementations work with Buffers also --- Key: HBASE-12282 URL: https://issues.apache.org/jira/browse/HBASE-12282 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: 0.99.1 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0, 0.99.2 Attachments: HBASE-12224_2.patch This issue can be used to brainstorm and then do the necessary changes for the offheap work. All implementations of cells deal with byte[], but when we change the HFileBlocks/Readers to work purely with Buffers, the byte[] usage would mean that the data is always copied onheap. Cell may need some interface change to implement this.
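The condition-based check described above (either side of a comparison may be buffer-backed or byte[]-backed) could look roughly like the sketch below. The names ByteBufferCell, getRowByteBuffer, and getRowPosition are hypothetical illustrations, not the final interface:

{code}
import java.nio.ByteBuffer;

// Rough illustration of a Cell extension with ByteBuffer accessors.
// All names below are hypothetical, not the final HBase API.
final class CellSketch {

    interface Cell {
        byte[] getRowArray();
        int getRowOffset();
        short getRowLength();
    }

    /** Extension exposing the row via a (possibly off-heap) ByteBuffer. */
    interface ByteBufferCell extends Cell {
        ByteBuffer getRowByteBuffer();
        int getRowPosition();
    }

    /** Comparator with the condition-based check: either side may be buffer- or byte[]-backed. */
    static int compareRows(Cell a, Cell b) {
        int aLen = a.getRowLength(), bLen = b.getRowLength();
        int min = Math.min(aLen, bLen);
        for (int i = 0; i < min; i++) {
            int av = rowByteAt(a, i) & 0xff;
            int bv = rowByteAt(b, i) & 0xff;
            if (av != bv) return av - bv;
        }
        return aLen - bLen;
    }

    // Read one row byte without copying a buffer-backed row onheap.
    private static byte rowByteAt(Cell c, int i) {
        if (c instanceof ByteBufferCell) {
            ByteBufferCell bbc = (ByteBufferCell) c;
            return bbc.getRowByteBuffer().get(bbc.getRowPosition() + i);
        }
        return c.getRowArray()[c.getRowOffset() + i];
    }

    static Cell onHeap(final byte[] row) {
        return new Cell() {
            public byte[] getRowArray() { return row; }
            public int getRowOffset() { return 0; }
            public short getRowLength() { return (short) row.length; }
        };
    }

    static ByteBufferCell offHeap(byte[] row) {
        final ByteBuffer bb = ByteBuffer.allocateDirect(row.length);
        bb.put(row);
        bb.flip();
        final int len = row.length;
        return new ByteBufferCell() {
            public ByteBuffer getRowByteBuffer() { return bb; }
            public int getRowPosition() { return 0; }
            public byte[] getRowArray() {        // byte[] access forces an onheap copy
                byte[] copy = new byte[len];
                bb.duplicate().get(copy);
                return copy;
            }
            public int getRowOffset() { return 0; }
            public short getRowLength() { return (short) len; }
        };
    }
}
{code}

This shows the copy-avoidance argument in the description: the buffer-backed cell can be compared through getRowByteBuffer without touching getRowArray, whereas falling back to getRowArray forces the onheap copy.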