[jira] [Commented] (HBASE-12075) Preemptive Fast Fail

2014-10-26 Thread Manukranth Kolloju (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184416#comment-14184416
 ] 

Manukranth Kolloju commented on HBASE-12075:


I can add a simpler example illustrating what we can do in the release 
notes. The NoOpRetryableCallerInterceptor will be used by default, so the default 
client behavior doesn't change. As long as 
hbase.client.enable.fast.fail.mode is set to false, the code will use the 
NoOpInterceptor.
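For illustration, a minimal client-side sketch of turning the feature on (only the 
property name comes from this issue; the connection call is just the normal 0.98-era 
client API, and the interceptor wiring itself stays internal to the client):
{code}
Configuration conf = HBaseConfiguration.create();
// Defaults to false, i.e. the NoOpRetryableCallerInterceptor and unchanged client behavior.
conf.setBoolean("hbase.client.enable.fast.fail.mode", true);
HConnection connection = HConnectionManager.createConnection(conf);
{code}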
About the 'new' in getNewRpcRetryingCallerFactory, I too felt that it 
didn't sound much like a builder method. On the other hand, I didn't 
particularly like (create/build)RpcRetryingCallerFactory either. I didn't have 
a strong preference, so I left it as it was and commented the same on the 
diff. I can make the classes which I am not using in the server tests 
package private.

 Preemptive Fast Fail
 

 Key: HBASE-12075
 URL: https://issues.apache.org/jira/browse/HBASE-12075
 Project: HBase
  Issue Type: Sub-task
  Components: Client
Affects Versions: 0.99.0, 2.0.0, 0.98.6.1
Reporter: Manukranth Kolloju
Assignee: Manukranth Kolloju
 Attachments: 0001-Add-a-test-case-for-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-Implement-Preemptive-Fast-Fail.patch, 
 0001-Implement-Preemptive-Fast-Fail.patch, 
 0001-Implement-Preemptive-Fast-Fail.patch, 
 0001-Implement-Preemptive-Fast-Fail.patch, 
 0001-Implement-Preemptive-Fast-Fail.patch


 In multi threaded clients, we use a feature developed on the 0.89-fb branch 
 called Preemptive Fast Fail. It allows client threads which would 
 potentially fail to fail fast. The idea behind this feature is that we allow, 
 among the hundreds of client threads, one thread to try to establish a 
 connection with the regionserver, and if that succeeds, we mark it as a live 
 node again. Meanwhile, other threads trying to establish a connection 
 to the same server would otherwise just run into timeouts, which is effectively 
 unfruitful. In those cases we can return appropriate exceptions to those 
 clients instead of letting them retry.
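A minimal sketch of that idea (hypothetical class and exception names, not the actual 
patch, which wires this into the client's RPC retrying caller through an interceptor):
{code}
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicBoolean;

class FastFailException extends IOException {}

class FastFailTracker {
  private final ConcurrentMap<String, AtomicBoolean> failingServers =
      new ConcurrentHashMap<String, AtomicBoolean>();

  /** Record that calls to this server are failing with connect errors. */
  void markFailing(String server) {
    failingServers.putIfAbsent(server, new AtomicBoolean(false));
  }

  /** One thread is let through to probe the suspect server; every other thread fails fast. */
  void checkBeforeCall(String server) throws FastFailException {
    AtomicBoolean probeInFlight = failingServers.get(server);
    if (probeInFlight == null) return;                     // server considered healthy
    if (probeInFlight.compareAndSet(false, true)) return;  // this thread becomes the prober
    throw new FastFailException();                         // everyone else fails fast
  }

  /** Probe result: on success the server is live again, on failure let another thread probe later. */
  void probeFinished(String server, boolean succeeded) {
    if (succeeded) {
      failingServers.remove(server);
    } else {
      AtomicBoolean probeInFlight = failingServers.get(server);
      if (probeInFlight != null) probeInFlight.set(false);
    }
  }
}
{code}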



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12345) Unsafe based Comparator for BB

2014-10-26 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184436#comment-14184436
 ] 

Anoop Sam John commented on HBASE-12345:


Yep.  As per the test I did in HBASE-11425, I added a comment there; just 
copying that comment here:

{quote}
Testing with 2 million Cells with a single cell per row.
Writing all cells to a BB/DBB and seeking to the last kv (to make the compare go 
across all cells in the BB/DBB).
The seek code is like what we have in ScannerV3#blockSeek.
With an RK length of 17 bytes (1st 13 bytes the same), I get almost the same result.
With an RK length of 117 bytes (1st 113 bytes the same), the DBB based read is ~3% 
slower.
{quote}
Well, in this test the read and compare were from HBB and DBB and those are 
almost the same.
In the case of our CellComparator we have an Unsafe based optimization. In my old test 
this was not in use. With an Unsafe based read from HBB#array() [this is what 
happens in HFileReaderV2/V3] there is a significant perf diff with DBB. With an RK 
length of 117 bytes and 2 million cells, seeking to the last cell, the DBB test 
is 50% slower.

I am thinking of doing Unsafe based compares for data in DBB as well.

I just did Unsafe based access from DBB/HBB and we are in better shape: the DBB based 
version of the above test is ~12% slower than the old HBB.array() based compares. 
Will raise a subtask and attach the approach there.
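For context, a rough sketch of the kind of word-at-a-time Unsafe compare under 
discussion (JDK-internal APIs; not the HBase patch itself; only equality is tested on 
whole longs, so native byte order does not matter, and ordering falls back to bytes):
{code}
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import sun.misc.Unsafe;

public class UnsafeBBCompare {
  private static final Unsafe UNSAFE;
  private static final long BYTE_ARRAY_BASE;
  static {
    try {
      Field f = Unsafe.class.getDeclaredField("theUnsafe");
      f.setAccessible(true);
      UNSAFE = (Unsafe) f.get(null);
      BYTE_ARRAY_BASE = UNSAFE.arrayBaseOffset(byte[].class);
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

  /** Lexicographically compare len bytes of an on-heap array against a direct ByteBuffer. */
  static int compare(byte[] left, int lOff, ByteBuffer right, int rOff, int len) {
    long rightAddr = ((sun.nio.ch.DirectBuffer) right).address() + rOff;
    int i = 0;
    // 8 bytes at a time: stop at the first differing word.
    for (; i + 8 <= len; i += 8) {
      long l = UNSAFE.getLong(left, BYTE_ARRAY_BASE + lOff + i);
      long r = UNSAFE.getLong(rightAddr + i);
      if (l != r) break;  // resolve the ordering byte by byte below
    }
    for (; i < len; i++) {
      int a = left[lOff + i] & 0xff;
      int b = right.get(rOff + i) & 0xff;
      if (a != b) return a - b;
    }
    return 0;
  }
}
{code}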


 Unsafe based Comparator for BB 
 ---

 Key: HBASE-12345
 URL: https://issues.apache.org/jira/browse/HBASE-12345
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Attachments: HBASE-12345.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12345) Unsafe based Comparator for BB

2014-10-26 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184437#comment-14184437
 ] 

Anoop Sam John commented on HBASE-12345:


We can expose getLong/getInt etc. APIs in BBUtil which use Unsafe, if it is 
available, and use those to read from the BB. We will need that in HFileReaderV2/V3 
seek, next etc.  Also, when the Cell is backed by a buffer and the lengths, like 
rklength, tagsLength etc., are part of the buffer, we can make use of the API for 
faster reads.
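A hedged sketch of what such a helper could look like (only the BBUtil name and the 
getLong/getInt idea come from this comment; everything else is an assumption):
{code}
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import sun.misc.Unsafe;

public final class BBUtil {
  private static final Unsafe UNSAFE;
  static {
    try {
      Field f = Unsafe.class.getDeclaredField("theUnsafe");
      f.setAccessible(true);
      UNSAFE = (Unsafe) f.get(null);
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }
  private static final boolean LITTLE_ENDIAN =
      ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN;

  /** Absolute int read from a direct BB via Unsafe; falls back to the plain ByteBuffer API. */
  public static int toInt(ByteBuffer buf, int offset) {
    if (buf.isDirect()) {
      int v = UNSAFE.getInt(((sun.nio.ch.DirectBuffer) buf).address() + offset);
      // Serialized lengths (rklength, tagsLength, ...) are big-endian; Unsafe reads native order.
      return LITTLE_ENDIAN ? Integer.reverseBytes(v) : v;
    }
    return buf.getInt(offset);  // absolute get, does not move the buffer position
  }
}
{code}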

 Unsafe based Comparator for BB 
 ---

 Key: HBASE-12345
 URL: https://issues.apache.org/jira/browse/HBASE-12345
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Attachments: HBASE-12345.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12313) Redo the hfile index length optimization so cell-based rather than serialized KV key

2014-10-26 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184438#comment-14184438
 ] 

Anoop Sam John commented on HBASE-12313:


{code}
   for (Cell cell : rr.rawCells()) {
-resultSize += CellUtil.estimatedLengthOf(cell);
+resultSize += CellUtil.estimatedSerializedSizeOf(cell);
{code}
estimatedLengthOf was returning the total length; estimatedSerializedSizeOf() 
counts an extra 4 bytes on top of that.  Do you really want that change, Stack?

 Redo the hfile index length optimization so cell-based rather than serialized 
 KV key
 

 Key: HBASE-12313
 URL: https://issues.apache.org/jira/browse/HBASE-12313
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: stack
Assignee: stack
 Attachments: 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 12313v5.txt


 Trying to remove API that returns the 'key' of a KV serialized into a byte 
 array is thorny.
 I tried to move over the first and last key serializations and the hfile 
 index entries to be cell but patch was turning massive.  Here is a smaller 
 patch that just redoes the optimization that tries to find 'short' midpoints 
 between last key of last block and first key of next block so it is 
 Cell-based rather than byte array based (presuming Keys serialized in a 
 certain way).  Adds unit tests which we didn't have before.
 Also remove CellKey.  Not needed... at least not yet.  Its just utility for 
 toString.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-12313) Redo the hfile index length optimization so cell-based rather than serialized KV key

2014-10-26 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184438#comment-14184438
 ] 

Anoop Sam John edited comment on HBASE-12313 at 10/26/14 9:12 AM:
--

{code}
   for (Cell cell : rr.rawCells()) {
-resultSize += CellUtil.estimatedLengthOf(cell);
+resultSize += CellUtil.estimatedSerializedSizeOf(cell);
{code}
estimatedLengthOf was returning the total length; estimatedSerializedSizeOf() 
counts an extra 4 bytes on top of that.  Do you really want that change, Stack?

Is replacing estimatedLengthOf with estimatedSerializedSizeOf correct?


was (Author: anoop.hbase):
{code}
   for (Cell cell : rr.rawCells()) {
-resultSize += CellUtil.estimatedLengthOf(cell);
+resultSize += CellUtil.estimatedSerializedSizeOf(cell);
{code}
estimatedLengthOf was returning the total length; estimatedSerializedSizeOf() 
counts an extra 4 bytes on top of that.  Do you really want that change, Stack?

 Redo the hfile index length optimization so cell-based rather than serialized 
 KV key
 

 Key: HBASE-12313
 URL: https://issues.apache.org/jira/browse/HBASE-12313
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: stack
Assignee: stack
 Attachments: 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 12313v5.txt


 Trying to remove API that returns the 'key' of a KV serialized into a byte 
 array is thorny.
 I tried to move over the first and last key serializations and the hfile 
 index entries to be cell but patch was turning massive.  Here is a smaller 
 patch that just redoes the optimization that tries to find 'short' midpoints 
 between last key of last block and first key of next block so it is 
 Cell-based rather than byte array based (presuming Keys serialized in a 
 certain way).  Adds unit tests which we didn't have before.
 Also remove CellKey.  Not needed... at least not yet.  Its just utility for 
 toString.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-12313) Redo the hfile index length optimization so cell-based rather than serialized KV key

2014-10-26 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184438#comment-14184438
 ] 

Anoop Sam John edited comment on HBASE-12313 at 10/26/14 10:16 AM:
---

{code}
   for (Cell cell : rr.rawCells()) {
-resultSize += CellUtil.estimatedLengthOf(cell);
+resultSize += CellUtil.estimatedSerializedSizeOf(cell);
{code}
estimatedLengthOf was returning the total length; estimatedSerializedSizeOf() 
counts an extra 4 bytes on top of that.  Do you really want that change, Stack?

Is replacing estimatedLengthOf with estimatedSerializedSizeOf correct?

{code}
+  private static int getSumOfKeyElementLengths(final Cell cell) {
+return cell.getRowLength() + cell.getFamilyLength() +
+cell.getQualifierLength() +
+cell.getValueLength() +
+cell.getTagsLength() +
+KeyValue.TIMESTAMP_TYPE_SIZE;
+  }
+
+  public static int estimatedSerializedSizeOfKey(final Cell cell) {
+if (cell instanceof KeyValue) return ((KeyValue)cell).getKeyLength();
+// This will be a low estimate.  Will do for now.
+return getSumOfKeyElementLengths(cell);
+  }
{code}
getSumOfKeyElementLengths - including lengths of tags and value?

{code}
-    return cell.getRowLength() + cell.getFamilyLength() + cell.getQualifierLength()
-        + cell.getValueLength() + cell.getTagsLength() + KeyValue.TIMESTAMP_TYPE_SIZE;
+    // TODO: Add sizing of references that hold the row, family, etc., arrays.
+    return estimatedSerializedSizeOf(cell);
{code}
No need to add the extra 4 bytes for heapSize, which will come in via 
estimatedSerializedSizeOf (?)

{code}
+  public static String getCellKeyAsString(Cell cell) {
+    StringBuilder sb = new StringBuilder(Bytes.toStringBinary(
+      cell.getRowArray(), cell.getRowOffset(), cell.getRowLength()));
+    sb.append(cell.getFamilyLength() == 0? "" :
+      Bytes.toStringBinary(cell.getFamilyArray(), cell.getFamilyOffset(), cell.getFamilyLength()));
+    sb.append(cell.getQualifierLength() == 0? "" :
+      Bytes.toStringBinary(cell.getQualifierArray(), cell.getQualifierOffset(),
+        cell.getQualifierLength()));
{code}
Can we add a separator in between rk, f and q parts?

{code}
-    // h goes to the next block
-    assertEquals(-2, scanner.seekTo(toKV("h", tagUsage)));
+    // 'h' does not exist so we will get a '1' back for not found.
+    assertEquals(0, scanner.seekTo(toKV("i", tagUsage)));
     assertEquals("i", toRowStr(scanner.getKeyValue()));
{code}
What if we do seekTo 'h' only?


{code}
-    assertEquals(1, blockIndexReader.rootBlockContainingKey(
-        toKV("h", tagUsage)));
+    // 'h', being midpoint between 'g' and 'i', used to be the block index key because of the
+    // little optimization done creating block index keys where we try to get a midpoint and then
+    // make this midpoint as short as possible so index blocks are kept tight. But now, we won't do
+    // the 'optimization' -- create new key -- if there is no gain to be had by way of making
+    // a shorter key; in this case we just use the start key in the index.  This means the below
+    // test changes.  Looking for 'h', it'll be in the 0 block rather than 1 block now (though 'h'
+    // does not exist in this file).
+    assertEquals(0, blockIndexReader.rootBlockContainingKey(toKV("h", tagUsage)));
{code}
Read your comment to see why the change. Will this change in midpoint calc cause any 
issue in reads?


was (Author: anoop.hbase):
{code}
   for (Cell cell : rr.rawCells()) {
-resultSize += CellUtil.estimatedLengthOf(cell);
+resultSize += CellUtil.estimatedSerializedSizeOf(cell);
{code}
estimatedLengthOf was returning the total length; estimatedSerializedSizeOf() 
counts an extra 4 bytes on top of that.  Do you really want that change, Stack?

Is replacing estimatedLengthOf with estimatedSerializedSizeOf correct?

 Redo the hfile index length optimization so cell-based rather than serialized 
 KV key
 

 Key: HBASE-12313
 URL: https://issues.apache.org/jira/browse/HBASE-12313
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: stack
Assignee: stack
 Attachments: 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 12313v5.txt


 Trying to remove API that returns the 'key' of a KV serialized into a byte 
 array is thorny.
 I tried to move over the first and last key serializations and the hfile 
 index entries to be cell but patch was turning massive.  Here is 

[jira] [Commented] (HBASE-12075) Preemptive Fast Fail

2014-10-26 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184503#comment-14184503
 ] 

Ted Yu commented on HBASE-12075:


bq. hbase.client.enable.fast.fail.mode is set to false
Since the above config takes a boolean value, maybe call it 
hbase.client.fast.fail.enabled ?

 Preemptive Fast Fail
 

 Key: HBASE-12075
 URL: https://issues.apache.org/jira/browse/HBASE-12075
 Project: HBase
  Issue Type: Sub-task
  Components: Client
Affects Versions: 0.99.0, 2.0.0, 0.98.6.1
Reporter: Manukranth Kolloju
Assignee: Manukranth Kolloju
 Attachments: 0001-Add-a-test-case-for-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-Implement-Preemptive-Fast-Fail.patch, 
 0001-Implement-Preemptive-Fast-Fail.patch, 
 0001-Implement-Preemptive-Fast-Fail.patch, 
 0001-Implement-Preemptive-Fast-Fail.patch, 
 0001-Implement-Preemptive-Fast-Fail.patch


 In multi threaded clients, we use a feature developed on the 0.89-fb branch 
 called Preemptive Fast Fail. It allows client threads which would 
 potentially fail to fail fast. The idea behind this feature is that we allow, 
 among the hundreds of client threads, one thread to try to establish a 
 connection with the regionserver, and if that succeeds, we mark it as a live 
 node again. Meanwhile, other threads trying to establish a connection 
 to the same server would otherwise just run into timeouts, which is effectively 
 unfruitful. In those cases we can return appropriate exceptions to those 
 clients instead of letting them retry.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12313) Redo the hfile index length optimization so cell-based rather than serialized KV key

2014-10-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184538#comment-14184538
 ] 

stack commented on HBASE-12313:
---

bq. Do you really want that change, Stack?

This patch cleans up the CellUtil methods that do size counting.  There were a 
few too many methods each only slightly different from each other.  In this 
particular case, we are just doing an estimate and serialized size is probably 
closest to what we are putting on wire at this stage.  I don't see a problem 
that it is slightly different from what was there before (what was there before 
was an 'estimate').  Do you?

bq. Is replacing estimatedLengthOf with estimatedSerializedSizeOf correct?

Where we were using estimatedLengthOf (What is this anyways -- smile? 
Serialized 'length' or size on heap?  Or size of the serialized KeyValue byte 
array -- which is going away), we were talking serialized size.  I was thinking 
estimatedSerializedSizeOf more appropriate where I did the replaces.

bq. No need to add the extra 4 bytes for heapSize, which will come in via 
estimatedSerializedSizeOf

Are you referring to the TODO? I'd think that serialized size and heap size 
will be calculated differently when we get around to it.

bq. Can we add a separator in between rk, f and q parts?

Whoops.  Will fix.

bq. What if we do seekTo 'h' only ?

There is no 'h' in the dataset.  It was an 'artificial' midpoint.  If you seek to 
'h', you end up in the second block, which starts with 'i'.

bq. Will this change in midpoint calc cause any issue in reads?

I don't believe so.  This whole area was without tests previously.  I made the 
mid calc code stand apart and added a bunch of tests in this patch.  Also, as part 
of making this patch, I ran both the old code and the new side by side and threw an 
exception whenever the results did not agree while our unit test suite ran.  I looked 
at each case to see if the difference was legit.  What I found was that the differences 
were because we made midkeys even when there was no advantage (as in the above 'h' case 
-- no need to make a midkey if all sizes are the same).
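For readers following the midkey discussion, a rough sketch of the optimization being 
described, on plain row bytes (hypothetical helper, not the patch itself): pick a 
separator that sorts after the last key of the previous block and at or before the 
first key of the next block, but only when it is actually shorter.
{code}
/** Return a fake key k with lastKeyOfPrevBlock < k <= firstKeyOfNextBlock, shorter if possible. */
static byte[] shortestSeparator(byte[] lastKeyOfPrevBlock, byte[] firstKeyOfNextBlock) {
  int common = 0;
  int min = Math.min(lastKeyOfPrevBlock.length, firstKeyOfNextBlock.length);
  while (common < min && lastKeyOfPrevBlock[common] == firstKeyOfNextBlock[common]) {
    common++;
  }
  // No gain to be had: a truncated key would be as long as the next block's first key,
  // so just use that key in the index -- the 'h' case in the test above goes away.
  if (common + 1 >= firstKeyOfNextBlock.length) {
    return firstKeyOfNextBlock;
  }
  // A (common + 1)-byte prefix of the next block's first key already sorts strictly after
  // the previous block's last key, so it is a valid, shorter index entry.
  return java.util.Arrays.copyOf(firstKeyOfNextBlock, common + 1);
}
{code}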



 Redo the hfile index length optimization so cell-based rather than serialized 
 KV key
 

 Key: HBASE-12313
 URL: https://issues.apache.org/jira/browse/HBASE-12313
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: stack
Assignee: stack
 Attachments: 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 12313v5.txt


 Trying to remove API that returns the 'key' of a KV serialized into a byte 
 array is thorny.
 I tried to move over the first and last key serializations and the hfile 
 index entries to be cell but patch was turning massive.  Here is a smaller 
 patch that just redoes the optimization that tries to find 'short' midpoints 
 between last key of last block and first key of next block so it is 
 Cell-based rather than byte array based (presuming Keys serialized in a 
 certain way).  Adds unit tests which we didn't have before.
 Also remove CellKey.  Not needed... at least not yet.  Its just utility for 
 toString.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12346) Scan's default auths behavior under Visibility labels

2014-10-26 Thread Jerry He (JIRA)
Jerry He created HBASE-12346:


 Summary: Scan's default auths behavior under Visibility labels
 Key: HBASE-12346
 URL: https://issues.apache.org/jira/browse/HBASE-12346
 Project: HBase
  Issue Type: Bug
  Components: API, security
Affects Versions: 0.99.1, 0.98.7
Reporter: Jerry He


In Visibility Labels security, a set of labels (auths) are administered and 
associated with a user.
A user can normally only see cell data during a scan that is part of the user's 
label set (auths).
Scan uses setAuthorizations to indicate it wants to use those auths to access 
the cells.
Similarly in the shell:
{code}
scan 'table1', AUTHORIZATIONS => ['private']
{code}
But it is a surprise to find that setAuthorizations seems to be 'mandatory' in 
the default visibility label security setting.  Every scan needs to call 
setAuthorizations before it can get any cells, even when the cells are under 
labels the requesting user is part of.

The following steps illustrate the issue.

Run as superuser:
{code}
1. create a visibility label called 'private'
2. create 'table1'
3. put data into 'table1' and label the data as 'private'
4. set_auths 'user1', 'private'
5. grant 'user1', 'RW', 'table1'
{code}
Run as 'user1':
{code}
1. scan 'table1'
This shows no cells.
2. scan 'table1', AUTHORIZATIONS => ['private']
This shows all the data.
{code}

I am not sure if this is expected by design or a bug.
But a more reasonable, more backward compatible (for client applications), and less 
surprising default behavior would probably look like this:

A scan's default auths, if its Authorizations attribute is not set explicitly, 
should be all the auths the requesting user is administered and allowed on the 
server.

If scan.setAuthorizations is used, then the server further filters the auths 
during the scan: it uses the input auths minus whatever is not in the user's label 
set on the server.
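For completeness, the Java client equivalent of the shell example above, assuming an 
already-opened table handle named table (Authorizations is the client-side holder for 
the auth strings):
{code}
Scan scan = new Scan();
// Without this call, the default scan label generator passes no auths at all,
// which is the surprising behavior described above.
scan.setAuthorizations(new Authorizations("private"));
ResultScanner scanner = table.getScanner(scan);
{code}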






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12346) Scan's default auths behavior under Visibility labels

2014-10-26 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184587#comment-14184587
 ] 

Jerry He commented on HBASE-12346:
--

In this Accumulo doc:

http://accumulo.apache.org/1.6/examples/visibility.html

The default authorizations for a scan are the user's entire set of 
authorizations.

 Scan's default auths behavior under Visibility labels
 -

 Key: HBASE-12346
 URL: https://issues.apache.org/jira/browse/HBASE-12346
 Project: HBase
  Issue Type: Bug
  Components: API, security
Affects Versions: 0.98.7, 0.99.1
Reporter: Jerry He

 In Visibility Labels security, a set of labels (auths) are administered and 
 associated with a user.
 A user can normally  only see cell data during scan that are part of the 
 user's label set (auths).
 Scan uses setAuthorizations to indicates its wants to use the auths to access 
 the cells.
 Similarly in the shell:
 {code}
 scan 'table1', AUTHORIZATIONS = ['private']
 {code}
 But it is a surprise to find that setAuthorizations seems to be 'mandatory' 
 in the default visibility label security setting.  Every scan needs to 
 setAuthorizations before the scan can get any cells even the cells are under 
 the labels the request user is part of.
 The following steps will illustrate the issue:
 Run as superuser.
 {code}
 1. create a visibility label called 'private'
 2. create 'table1'
 3. put into 'table1' data and label the data as 'private'
 4. set_auths 'user1', 'private'
 5. grant 'user1', 'RW', 'table1'
 {code}
 Run as 'user1':
 {code}
 1. scan 'table1'
 This show no cells.
 2. scan 'table1', scan 'table1', AUTHORIZATIONS = ['private']
 This will show all the data.
 {code}
 I am not sure if this is expected by design or a bug.
 But a more reasonable, more client application backward compatible, and less 
 surprising default behavior should probably look like this:
 A scan's default auths, if its Authorizations attributes is not set 
 explicitly, should be all the auths the request user is administered and 
 allowed on the server.
 If scan.setAuthorizations is used, then the server further filter the auths 
 during scan: use the input auths minus what is not in user's label set on the 
 server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12346) Scan's default auths behavior under Visibility labels

2014-10-26 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-12346:
-
Attachment: HBASE-12346-master.patch

 Scan's default auths behavior under Visibility labels
 -

 Key: HBASE-12346
 URL: https://issues.apache.org/jira/browse/HBASE-12346
 Project: HBase
  Issue Type: Bug
  Components: API, security
Affects Versions: 0.98.7, 0.99.1
Reporter: Jerry He
 Attachments: HBASE-12346-master.patch


 In Visibility Labels security, a set of labels (auths) are administered and 
 associated with a user.
 A user can normally  only see cell data during scan that are part of the 
 user's label set (auths).
 Scan uses setAuthorizations to indicates its wants to use the auths to access 
 the cells.
 Similarly in the shell:
 {code}
 scan 'table1', AUTHORIZATIONS = ['private']
 {code}
 But it is a surprise to find that setAuthorizations seems to be 'mandatory' 
 in the default visibility label security setting.  Every scan needs to 
 setAuthorizations before the scan can get any cells even the cells are under 
 the labels the request user is part of.
 The following steps will illustrate the issue:
 Run as superuser.
 {code}
 1. create a visibility label called 'private'
 2. create 'table1'
 3. put into 'table1' data and label the data as 'private'
 4. set_auths 'user1', 'private'
 5. grant 'user1', 'RW', 'table1'
 {code}
 Run as 'user1':
 {code}
 1. scan 'table1'
 This show no cells.
 2. scan 'table1', scan 'table1', AUTHORIZATIONS = ['private']
 This will show all the data.
 {code}
 I am not sure if this is expected by design or a bug.
 But a more reasonable, more client application backward compatible, and less 
 surprising default behavior should probably look like this:
 A scan's default auths, if its Authorizations attributes is not set 
 explicitly, should be all the auths the request user is administered and 
 allowed on the server.
 If scan.setAuthorizations is used, then the server further filter the auths 
 during scan: use the input auths minus what is not in user's label set on the 
 server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12346) Scan's default auths behavior under Visibility labels

2014-10-26 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184601#comment-14184601
 ] 

Jerry He commented on HBASE-12346:
--

Attached a patch, in case everyone agrees with the proposed change to the default auths 
behavior.

 Scan's default auths behavior under Visibility labels
 -

 Key: HBASE-12346
 URL: https://issues.apache.org/jira/browse/HBASE-12346
 Project: HBase
  Issue Type: Bug
  Components: API, security
Affects Versions: 0.98.7, 0.99.1
Reporter: Jerry He
 Attachments: HBASE-12346-master.patch


 In Visibility Labels security, a set of labels (auths) are administered and 
 associated with a user.
 A user can normally  only see cell data during scan that are part of the 
 user's label set (auths).
 Scan uses setAuthorizations to indicates its wants to use the auths to access 
 the cells.
 Similarly in the shell:
 {code}
 scan 'table1', AUTHORIZATIONS = ['private']
 {code}
 But it is a surprise to find that setAuthorizations seems to be 'mandatory' 
 in the default visibility label security setting.  Every scan needs to 
 setAuthorizations before the scan can get any cells even the cells are under 
 the labels the request user is part of.
 The following steps will illustrate the issue:
 Run as superuser.
 {code}
 1. create a visibility label called 'private'
 2. create 'table1'
 3. put into 'table1' data and label the data as 'private'
 4. set_auths 'user1', 'private'
 5. grant 'user1', 'RW', 'table1'
 {code}
 Run as 'user1':
 {code}
 1. scan 'table1'
 This show no cells.
 2. scan 'table1', scan 'table1', AUTHORIZATIONS = ['private']
 This will show all the data.
 {code}
 I am not sure if this is expected by design or a bug.
 But a more reasonable, more client application backward compatible, and less 
 surprising default behavior should probably look like this:
 A scan's default auths, if its Authorizations attributes is not set 
 explicitly, should be all the auths the request user is administered and 
 allowed on the server.
 If scan.setAuthorizations is used, then the server further filter the auths 
 during scan: use the input auths minus what is not in user's label set on the 
 server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12346) Scan's default auths behavior under Visibility labels

2014-10-26 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184641#comment-14184641
 ] 

Andrew Purtell commented on HBASE-12346:


The default scan label generator has the behavior you describe: if you don't 
ask for any authorizations, you don't get any. There is another label generator 
that will do what you want by forcing the user's assigned set. Label 
generators are stackable. It could make sense to change all of this around a 
bit and have the default configuration start with the generator that adds labels 
assigned to the user in the labels table, with another generator stacked on top 
that adds auths passed in on a Scan attribute. This would be more flexible than 
the result after the proposed patch is applied. 
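If I read the stacking right, the configuration would look something along these lines 
(property and class names as I recall them from the 0.98 visibility code; treat them as 
assumptions to verify):
{code}
Configuration conf = HBaseConfiguration.create();
// First generator feeds in the auths defined for the user in the labels table,
// the second adds/filters the auths passed in on the Scan attribute.
conf.set("hbase.regionserver.scan.visibility.label.generator.class",
    "org.apache.hadoop.hbase.security.visibility.FeedUserAuthScanLabelGenerator,"
        + "org.apache.hadoop.hbase.security.visibility.DefinedSetFilterScanLabelGenerator");
{code}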

 Scan's default auths behavior under Visibility labels
 -

 Key: HBASE-12346
 URL: https://issues.apache.org/jira/browse/HBASE-12346
 Project: HBase
  Issue Type: Bug
  Components: API, security
Affects Versions: 0.98.7, 0.99.1
Reporter: Jerry He
 Attachments: HBASE-12346-master.patch


 In Visibility Labels security, a set of labels (auths) are administered and 
 associated with a user.
 A user can normally  only see cell data during scan that are part of the 
 user's label set (auths).
 Scan uses setAuthorizations to indicates its wants to use the auths to access 
 the cells.
 Similarly in the shell:
 {code}
 scan 'table1', AUTHORIZATIONS = ['private']
 {code}
 But it is a surprise to find that setAuthorizations seems to be 'mandatory' 
 in the default visibility label security setting.  Every scan needs to 
 setAuthorizations before the scan can get any cells even the cells are under 
 the labels the request user is part of.
 The following steps will illustrate the issue:
 Run as superuser.
 {code}
 1. create a visibility label called 'private'
 2. create 'table1'
 3. put into 'table1' data and label the data as 'private'
 4. set_auths 'user1', 'private'
 5. grant 'user1', 'RW', 'table1'
 {code}
 Run as 'user1':
 {code}
 1. scan 'table1'
 This show no cells.
 2. scan 'table1', scan 'table1', AUTHORIZATIONS = ['private']
 This will show all the data.
 {code}
 I am not sure if this is expected by design or a bug.
 But a more reasonable, more client application backward compatible, and less 
 surprising default behavior should probably look like this:
 A scan's default auths, if its Authorizations attributes is not set 
 explicitly, should be all the auths the request user is administered and 
 allowed on the server.
 If scan.setAuthorizations is used, then the server further filter the auths 
 during scan: use the input auths minus what is not in user's label set on the 
 server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12346) Scan's default auths behavior under Visibility labels

2014-10-26 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184644#comment-14184644
 ] 

Andrew Purtell commented on HBASE-12346:


Also, I don't believe we should or even can aim to be transparently like 
Accumulo. The labels feature should be most useful and relevant for HBase users 
building HBase applications. Maybe the proposal here by consensus meets that 
test, but that would be independent of what any Accumulo documentation says (or 
doesn't). We did aim for some familiarity in the design of the API and shell 
commands, but in retrospect I'm not sure whether that's more harmful (because we aren't 
going to get exact Accumulo semantics with a tag based implementation) than 
helpful. 

 Scan's default auths behavior under Visibility labels
 -

 Key: HBASE-12346
 URL: https://issues.apache.org/jira/browse/HBASE-12346
 Project: HBase
  Issue Type: Bug
  Components: API, security
Affects Versions: 0.98.7, 0.99.1
Reporter: Jerry He
 Attachments: HBASE-12346-master.patch


 In Visibility Labels security, a set of labels (auths) are administered and 
 associated with a user.
 A user can normally  only see cell data during scan that are part of the 
 user's label set (auths).
 Scan uses setAuthorizations to indicates its wants to use the auths to access 
 the cells.
 Similarly in the shell:
 {code}
 scan 'table1', AUTHORIZATIONS = ['private']
 {code}
 But it is a surprise to find that setAuthorizations seems to be 'mandatory' 
 in the default visibility label security setting.  Every scan needs to 
 setAuthorizations before the scan can get any cells even the cells are under 
 the labels the request user is part of.
 The following steps will illustrate the issue:
 Run as superuser.
 {code}
 1. create a visibility label called 'private'
 2. create 'table1'
 3. put into 'table1' data and label the data as 'private'
 4. set_auths 'user1', 'private'
 5. grant 'user1', 'RW', 'table1'
 {code}
 Run as 'user1':
 {code}
 1. scan 'table1'
 This show no cells.
 2. scan 'table1', scan 'table1', AUTHORIZATIONS = ['private']
 This will show all the data.
 {code}
 I am not sure if this is expected by design or a bug.
 But a more reasonable, more client application backward compatible, and less 
 surprising default behavior should probably look like this:
 A scan's default auths, if its Authorizations attributes is not set 
 explicitly, should be all the auths the request user is administered and 
 allowed on the server.
 If scan.setAuthorizations is used, then the server further filter the auths 
 during scan: use the input auths minus what is not in user's label set on the 
 server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-8607) Allow custom filters and coprocessors to be updated for a region server without requiring a restart

2014-10-26 Thread Julian Wissmann (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184667#comment-14184667
 ] 

Julian Wissmann commented on HBASE-8607:


Andrew, your suggestion sounds really interesting. I've been thinking 
about it for a while in order to estimate how big an effort prototyping this 
would be. The way I understand it, the idea is that we have an OSGi 
coprocessor that the regular coprocessors are registered to as an OSGi 
service. However, for this to work, there will either need to be a service 
registry on each region server, or we go with Distributed OSGi and put it in the 
client. Either way, there also needs to be a mechanism to check service 
availability on the regions from the client side.

Right now, I'd consider the version where each region server holds its own 
service registry quite feasible. I'm thinking of the following approach: 
the OSGi coprocessor will discover bundles, and the only service provided by a 
bundle will actually be starting the discovered coprocessors within its own 
environment. That way the client side will be rather clean and the actual 
coprocessors will behave as usual (their own protocol and client), allowing 
the OSGi coprocessor to be rather simple and maximizing flexibility. The 
only required client side functionality will then be querying the service 
registry and starting region side services by name (at least I can't think of 
another way, considering that protobuf is in the middle of it). 

Any more thoughts on this?
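To make the shape of that concrete, a hypothetical bundle activator that registers a 
coprocessor implementation as an OSGi service (MyRegionObserver is made up; only the 
standard OSGi API is used):
{code}
import org.apache.hadoop.hbase.Coprocessor;
import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;
import org.osgi.framework.ServiceRegistration;

public class CoprocessorActivator implements BundleActivator {
  private ServiceRegistration registration;

  @Override
  public void start(BundleContext context) {
    // The host coprocessor would look these services up by name and start them
    // in their own environment, as described above.
    registration = context.registerService(Coprocessor.class.getName(), new MyRegionObserver(), null);
  }

  @Override
  public void stop(BundleContext context) {
    // Let in-flight invocations drain, then unregister so the old jar can be unloaded.
    registration.unregister();
  }
}
{code}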

 Allow custom filters and coprocessors to be updated for a region server 
 without requiring a restart
 ---

 Key: HBASE-8607
 URL: https://issues.apache.org/jira/browse/HBASE-8607
 Project: HBase
  Issue Type: New Feature
  Components: regionserver
Reporter: James Taylor

 One solution to allowing custom filters and coprocessors to be updated for a 
 region server without requiring a restart might be to run the HBase server in 
 an OSGi container (maybe there are other approaches as well?). Typically, 
 applications that use coprocessors and custom filters also have shared 
 classes underneath, so putting the burden on the user to include some kind of 
 version name in the class is not adequate. Including the version name in the 
 package might work in some cases (at least until dependent jars start to 
 change as well), but is cumbersome and overburdens the app developer.
 Regardless of what approach is taken, we'd need to define the life cycle of 
 the coprocessors and custom filters when a new version is loaded. For 
 example, in-flight invocations could continue to use the old version while 
 new invocations would use the new ones. Once the in-flight invocations are 
 complete, the old code/jar could be unloaded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12343) Document recommended configuration for 0.98 from HBASE-11964

2014-10-26 Thread Misty Stanley-Jones (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184668#comment-14184668
 ] 

Misty Stanley-Jones commented on HBASE-12343:
-

+1 from me, as long as you're sure about the technical content. :) 

 Document recommended configuration for 0.98 from HBASE-11964
 

 Key: HBASE-12343
 URL: https://issues.apache.org/jira/browse/HBASE-12343
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 2.0.0

 Attachments: HBASE-12343.patch


 We're not committing the configuration changes from HBASE-11964 to 0.98 but 
 they should be the recommend configuration for replication. Add a paragraph 
 to the replication section of the manual on this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12343) Document recommended configuration for 0.98 from HBASE-11964

2014-10-26 Thread Misty Stanley-Jones (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184669#comment-14184669
 ] 

Misty Stanley-Jones commented on HBASE-12343:
-

By the way, this would have  come up in my JIRA filter of docs issues to review 
if it had the Documentation component.

 Document recommended configuration for 0.98 from HBASE-11964
 

 Key: HBASE-12343
 URL: https://issues.apache.org/jira/browse/HBASE-12343
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 2.0.0

 Attachments: HBASE-12343.patch


 We're not committing the configuration changes from HBASE-11964 to 0.98 but 
 they should be the recommend configuration for replication. Add a paragraph 
 to the replication section of the manual on this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12343) Document recommended configuration for 0.98 from HBASE-11964

2014-10-26 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184678#comment-14184678
 ] 

Andrew Purtell commented on HBASE-12343:


Thanks! I'll remember that for next time 

 Document recommended configuration for 0.98 from HBASE-11964
 

 Key: HBASE-12343
 URL: https://issues.apache.org/jira/browse/HBASE-12343
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 2.0.0

 Attachments: HBASE-12343.patch


 We're not committing the configuration changes from HBASE-11964 to 0.98 but 
 they should be the recommend configuration for replication. Add a paragraph 
 to the replication section of the manual on this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12075) Preemptive Fast Fail

2014-10-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184681#comment-14184681
 ] 

stack commented on HBASE-12075:
---

Ok [~manukranthk]. If the default doesn't change, if the classes are private, and if 
there is an explanation and example of how to use this stuff, I'd be good w/ 
commit.

 Preemptive Fast Fail
 

 Key: HBASE-12075
 URL: https://issues.apache.org/jira/browse/HBASE-12075
 Project: HBase
  Issue Type: Sub-task
  Components: Client
Affects Versions: 0.99.0, 2.0.0, 0.98.6.1
Reporter: Manukranth Kolloju
Assignee: Manukranth Kolloju
 Attachments: 0001-Add-a-test-case-for-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
 0001-Implement-Preemptive-Fast-Fail.patch, 
 0001-Implement-Preemptive-Fast-Fail.patch, 
 0001-Implement-Preemptive-Fast-Fail.patch, 
 0001-Implement-Preemptive-Fast-Fail.patch, 
 0001-Implement-Preemptive-Fast-Fail.patch


 In multi threaded clients, we use a feature developed on the 0.89-fb branch 
 called Preemptive Fast Fail. It allows client threads which would 
 potentially fail to fail fast. The idea behind this feature is that we allow, 
 among the hundreds of client threads, one thread to try to establish a 
 connection with the regionserver, and if that succeeds, we mark it as a live 
 node again. Meanwhile, other threads trying to establish a connection 
 to the same server would otherwise just run into timeouts, which is effectively 
 unfruitful. In those cases we can return appropriate exceptions to those 
 clients instead of letting them retry.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-11792) Organize PerformanceEvaluation usage output

2014-10-26 Thread Misty Stanley-Jones (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misty Stanley-Jones reassigned HBASE-11792:
---

Assignee: Misty Stanley-Jones

 Organize PerformanceEvaluation usage output
 ---

 Key: HBASE-11792
 URL: https://issues.apache.org/jira/browse/HBASE-11792
 Project: HBase
  Issue Type: Improvement
  Components: Performance, test
Reporter: Nick Dimiduk
Assignee: Misty Stanley-Jones
Priority: Minor
  Labels: beginner

 PerformanceEvaluation has enjoyed a good bit of attention recently. All the 
 new features are muddled together. It would be nice to organize the output of 
 the Options list according to some scheme. I was thinking you'd group 
 entries by when they're used. For example
 *General options*
 - nomapred
 - rows
 - oneCon
 - ...
 *Table Creation/Write tests*
 - compress
 - flushCommits
 - valueZipf
 - ...
 *Read tests*
 - filterAll
 - multiGet
 - replicas
 - ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11985) Document sizing rules of thumb

2014-10-26 Thread Misty Stanley-Jones (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184697#comment-14184697
 ] 

Misty Stanley-Jones commented on HBASE-11985:
-

{quote}
Indicating that 50 to 100 regions are recommended for between 1
to 2 CF would be a useful clarification.  It would also be a good way
for customers to be aware of the impact of increasing the number of
column families.
{quote}

Thanks [~gkamat]

{quote}
If you are storing time based machine data or logging information and the load 
is distributed by device id or service id + time , you can end up with the 
pattern where older data regions never have additional writes beyond a certain 
age. This can occur when the solution involves something like Hbase for new 
data (for example last 30 days) + Impala for older data 
In these situations, you can end up with a small number of active regions + a 
set of older regions no longer being written.

For these situations you can tolerate greater number of regions as your main 
resource consumption is driven by the active regions. This, of course, is very 
dependent on type of load and query patterns.
{quote}

Thanks [~rstokes]

{quote}
if only one CF is busy with writes, only that one accumulates memory. That is 
the same with inactive (only read-from) regions for the a single CF. 
{quote}

Thanks [~larsgeorge], you also had a diagram to illustrate this but the link I 
have doesn't work now. Can you point me there?

 Document sizing rules of thumb
 --

 Key: HBASE-11985
 URL: https://issues.apache.org/jira/browse/HBASE-11985
 Project: HBase
  Issue Type: Task
  Components: documentation
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones

 I'm looking for tuning/sizing rules of thumb to put in the Ref Guide.
 Info I have gleaned so far:
 A reasonable region size is between 10 GB and 50 GB.
 A reasonable maximum cell size is 1 MB to 10 MB. If your cells are larger 
 than 10 MB, consider storing the cell contents in HDFS and storing a 
 reference to the location in HBase. Pending MOB work for 10 MB - 64 MB window.
 When you size your regions and cells, keep in mind that a region cannot split 
 across a row. If your row size is too large, or your region size is too 
 small, you can end up with a single row per region, which is not a good 
 pattern. It is also possible that one big column causes splits while other 
 columns are tiny, and this may not be great.
 A large # of columns probably means you are doing it wrong.
 Column names need to be short because they get stored for every value 
 (barring encoding). Don't need to be self-documenting like in RDBMS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091

2014-10-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184698#comment-14184698
 ] 

stack commented on HBASE-12285:
---

After changing the jenkins config for branch-1 to remove  
-Dmaven.test.redirectTestOutputToFile=true and stuff continued to pass, I've 
just set branch-1 back to use DEBUG again from WARN.

 Builds are failing, possibly because of SUREFIRE-1091
 -

 Key: HBASE-12285
 URL: https://issues.apache.org/jira/browse/HBASE-12285
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Dima Spivak
Assignee: Dima Spivak
Priority: Blocker
 Attachments: HBASE-12285_branch-1_v1.patch


 Our branch-1 builds on builds.apache.org have been failing in recent days 
 after we switched over to an official version of Surefire a few days back 
 (HBASE-4955). The version we're using, 2.17, is hit by a bug 
 ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results 
 in an IOException, which looks like what we're seeing on Jenkins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12346) Scan's default auths behavior under Visibility labels

2014-10-26 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184699#comment-14184699
 ] 

Jerry He commented on HBASE-12346:
--

Hi, [~apurtell]

Thanks for the comment.

I agree that we should not and cannot follow Accumulo blindly.
On the contrary, if you think about it, it is probably more OK for Accumulo to 
force their scanner applications to setAuthorizations(), since Accumulo has had 
it since the beginning.  At least there is no backward compatibility issue.

For us, asking users to re-write their read applications in order to use 
visibility labels is less desirable and not practical.
While doing enablement and advocacy work for this feature, the feedback I got 
included 'confusion'.

Stacking multiple label generators would do the trick. But it is probably more 
suitable for advanced users and will complicate things.
I think a reasonable and practical out-of-the-box experience is more important.


 Scan's default auths behavior under Visibility labels
 -

 Key: HBASE-12346
 URL: https://issues.apache.org/jira/browse/HBASE-12346
 Project: HBase
  Issue Type: Bug
  Components: API, security
Affects Versions: 0.98.7, 0.99.1
Reporter: Jerry He
 Attachments: HBASE-12346-master.patch


 In Visibility Labels security, a set of labels (auths) are administered and 
 associated with a user.
 A user can normally  only see cell data during scan that are part of the 
 user's label set (auths).
 Scan uses setAuthorizations to indicates its wants to use the auths to access 
 the cells.
 Similarly in the shell:
 {code}
 scan 'table1', AUTHORIZATIONS = ['private']
 {code}
 But it is a surprise to find that setAuthorizations seems to be 'mandatory' 
 in the default visibility label security setting.  Every scan needs to 
 setAuthorizations before the scan can get any cells even the cells are under 
 the labels the request user is part of.
 The following steps will illustrate the issue:
 Run as superuser.
 {code}
 1. create a visibility label called 'private'
 2. create 'table1'
 3. put into 'table1' data and label the data as 'private'
 4. set_auths 'user1', 'private'
 5. grant 'user1', 'RW', 'table1'
 {code}
 Run as 'user1':
 {code}
 1. scan 'table1'
 This show no cells.
 2. scan 'table1', scan 'table1', AUTHORIZATIONS = ['private']
 This will show all the data.
 {code}
 I am not sure if this is expected by design or a bug.
 But a more reasonable, more client application backward compatible, and less 
 surprising default behavior should probably look like this:
 A scan's default auths, if its Authorizations attributes is not set 
 explicitly, should be all the auths the request user is administered and 
 allowed on the server.
 If scan.setAuthorizations is used, then the server further filter the auths 
 during scan: use the input auths minus what is not in user's label set on the 
 server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11179) API parity between mapred and mapreduce

2014-10-26 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-11179:
--
Fix Version/s: (was: 0.99.2)
   2.0.0

 API parity between mapred and mapreduce
 ---

 Key: HBASE-11179
 URL: https://issues.apache.org/jira/browse/HBASE-11179
 Project: HBase
  Issue Type: Sub-task
  Components: mapreduce
Reporter: Nick Dimiduk
  Labels: beginner
 Fix For: 2.0.0


 This ticket is for bringing the mapred package up to feature parity with 
 mapreduce. Might become an umbrella ticket in and of itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11179) API parity between mapred and mapreduce

2014-10-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184701#comment-14184701
 ] 

stack commented on HBASE-11179:
---

Moved it out of 1.0. Move back if I have it wrong [~ndimiduk]

 API parity between mapred and mapreduce
 ---

 Key: HBASE-11179
 URL: https://issues.apache.org/jira/browse/HBASE-11179
 Project: HBase
  Issue Type: Sub-task
  Components: mapreduce
Reporter: Nick Dimiduk
  Labels: beginner
 Fix For: 2.0.0


 This ticket is for bringing the mapred package up to feature parity with 
 mapreduce. Might become an umbrella ticket in and of itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091

2014-10-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184721#comment-14184721
 ] 

Hudson commented on HBASE-12285:


FAILURE: Integrated in HBase-1.0 #364 (See 
[https://builds.apache.org/job/HBase-1.0/364/])
HBASE-12285 Builds are failing, possibly because of SUREFIRE-1091 -- Setting 
log level back to DEBUG from WARN (stack: rev 
65c60ce873b4216dc2d05c28191e7f1a724de8b5)
* hbase-server/src/test/resources/log4j.properties


 Builds are failing, possibly because of SUREFIRE-1091
 -

 Key: HBASE-12285
 URL: https://issues.apache.org/jira/browse/HBASE-12285
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Dima Spivak
Assignee: Dima Spivak
Priority: Blocker
 Attachments: HBASE-12285_branch-1_v1.patch


 Our branch-1 builds on builds.apache.org have been failing in recent days 
 after we switched over to an official version of Surefire a few days back 
 (HBASE-4955). The version we're using, 2.17, is hit by a bug 
 ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results 
 in an IOException, which looks like what we're seeing on Jenkins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091

2014-10-26 Thread Dima Spivak (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184723#comment-14184723
 ] 

Dima Spivak commented on HBASE-12285:
-

Damn, looks like just removing the output redirection isn't enough ([stream 
error is back|https://builds.apache.org/view/All/job/HBase-1.0/364/console]). 
Might as well move logging back to WARN, [~stack]. One thing that's a bit 
strange is that it doesn't seem to be caused by any particular test, since the 
number of tests completed before the error occurs is constant (3430, as 
seen [here|https://builds.apache.org/view/All/job/HBase-1.0/348/testReport/] 
and [here|https://builds.apache.org/view/All/job/HBase-1.0/347/testReport/]) 
even though the tests are run in random order. It's also worth pointing out 
that a large number of tests produce logs over 5 MB (some over 50 MB) even 
though the test that uncovered SUREFIRE-1091 only had to output 1 MB. And, of 
course, there's the 'why does this only hit branch-1?' question that I can't 
answer either. I'll keep digging...

 Builds are failing, possibly because of SUREFIRE-1091
 -

 Key: HBASE-12285
 URL: https://issues.apache.org/jira/browse/HBASE-12285
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Dima Spivak
Assignee: Dima Spivak
Priority: Blocker
 Attachments: HBASE-12285_branch-1_v1.patch


 Our branch-1 builds on builds.apache.org have been failing in recent days 
 after we switched over to an official version of Surefire a few days back 
 (HBASE-4955). The version we're using, 2.17, is hit by a bug 
 ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results 
 in an IOException, which looks like what we're seeing on Jenkins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12187) Review in source the paper Simple Testing Can Prevent Most Critical Failures

2014-10-26 Thread Ding Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184734#comment-14184734
 ] 

Ding Yuan commented on HBASE-12187:
---

I have implemented the three checks from Aspirator in error-prone version 
1.1.2. These three checks are:
(1) Catch block that ignores an exception (including one containing only a log 
printing statement);
(2) Aborting the system on exception over-catch;
(3) Catch block containing TODO or FIXME in comments.

Among them, (1) is a bit complicated, since I included quite a few 
false-positive suppression heuristics as described in the paper.

I have tested all three checks on HBase-0.99.0. The first check found 111 
cases, while the other two found fewer than 10 each. I have attached the 
reported cases as attachments.

Currently I have assigned all three checks ERROR severity, so if one thinks 
that a case is fine, an annotation like @SuppressWarnings("EmptyCatch") is 
needed to get the compilation to succeed.

I am attaching the patch to error-prone v1.1.2, which contains the three added 
checks. I have also uploaded my error-prone repository to:
https://github.com/diy1/error-prone-aspirator

Please let me know how I can further help.
Cheers,
Ding
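
For illustration only (the class and method names below are made up, not taken 
from the attached patch), this is the kind of pattern check (1) flags and how 
the @SuppressWarnings annotation mentioned above could acknowledge an 
intentional case:

{code}
import java.io.Closeable;
import java.io.IOException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class QuietCloser {
  private static final Log LOG = LogFactory.getLog(QuietCloser.class);

  // Flagged by check (1): the exception is swallowed, with only a log
  // statement in the catch block.
  public static void closeQuietly(Closeable c) {
    try {
      if (c != null) c.close();
    } catch (IOException e) {
      LOG.debug("close failed", e);
    }
  }

  // Same pattern, but explicitly acknowledged so an ERROR-severity check
  // would let compilation succeed.
  @SuppressWarnings("EmptyCatch")
  public static void closeIgnoringErrors(Closeable c) {
    try {
      if (c != null) c.close();
    } catch (IOException e) {
      // Intentionally ignored: best-effort cleanup.
    }
  }
}
{code}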

 Review in source the paper Simple Testing Can Prevent Most Critical Failures
 --

 Key: HBASE-12187
 URL: https://issues.apache.org/jira/browse/HBASE-12187
 Project: HBase
  Issue Type: Bug
Reporter: stack
Priority: Critical

 Review the helpful paper 
 https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf
 It describes 'catastrophic failures', especially issues where exceptions are 
 thrown but not properly handled.  Their static analysis tool Aspirator turns 
 up a bunch of the obvious offenders (Lets add to test-patch.sh alongside 
 findbugs?).  This issue is about going through code base making sub-issues to 
 root out these and others (Don't we have the test described in figure #6 
 already? I thought we did?  If we don't, need to add).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12187) Review in source the paper Simple Testing Can Prevent Most Critical Failures

2014-10-26 Thread Ding Yuan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ding Yuan updated HBASE-12187:
--
Attachment: todoInCatch.warnings.txt
emptyCatch.warnings.txt
abortInOvercatch.warnings.txt
HBASE-12187.patch

 Review in source the paper Simple Testing Can Prevent Most Critical Failures
 --

 Key: HBASE-12187
 URL: https://issues.apache.org/jira/browse/HBASE-12187
 Project: HBase
  Issue Type: Bug
Reporter: stack
Priority: Critical
 Attachments: HBASE-12187.patch, abortInOvercatch.warnings.txt, 
 emptyCatch.warnings.txt, todoInCatch.warnings.txt


 Review the helpful paper 
 https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf
 It describes 'catastrophic failures', especially issues where exceptions are 
 thrown but not properly handled.  Their static analysis tool Aspirator turns 
 up a bunch of the obvious offenders (Lets add to test-patch.sh alongside 
 findbugs?).  This issue is about going through code base making sub-issues to 
 root out these and others (Don't we have the test described in figure #6 
 already? I thought we did?  If we don't, need to add).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12326) Document scanner timeout workarounds in troubleshooting section

2014-10-26 Thread Misty Stanley-Jones (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184756#comment-14184756
 ] 

Misty Stanley-Jones commented on HBASE-12326:
-

Anyone available to review? [~saint@gmail.com] [~apurtell] [~ndimiduk] 
perhaps?

 Document scanner timeout workarounds in troubleshooting section
 ---

 Key: HBASE-12326
 URL: https://issues.apache.org/jira/browse/HBASE-12326
 Project: HBase
  Issue Type: Bug
  Components: documentation
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
 Attachments: HBASE-12326.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12346) Scan's default auths behavior under Visibility labels

2014-10-26 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184772#comment-14184772
 ] 

Anoop Sam John commented on HBASE-12346:


We have EnforcingScanLabelGenerator, which always gives back the user's auth 
labels and ignores whatever is passed in the Scan Authorizations. Even if the 
scan passes a subset of the user's auth labels, that is ignored and all of the 
user's auths are assigned. But this is not the default impl.
With DefaultScanLabelGenerator, passing no Authorizations in the Scan gives 
back no results.
I am more inclined towards doing what this jira proposes. That looks easier to 
use. We can have a stack of ScanLabelGenerators and achieve what the user 
wants, but a default behaving as described would be more user friendly IMO.
I was also thinking about this some time back, to make things easier. What do 
you say, Andy?

 Scan's default auths behavior under Visibility labels
 -

 Key: HBASE-12346
 URL: https://issues.apache.org/jira/browse/HBASE-12346
 Project: HBase
  Issue Type: Bug
  Components: API, security
Affects Versions: 0.98.7, 0.99.1
Reporter: Jerry He
 Attachments: HBASE-12346-master.patch


 In Visibility Labels security, a set of labels (auths) is administered and 
 associated with a user.
 A user can normally only see cell data during a scan that is part of the 
 user's label set (auths).
 Scan uses setAuthorizations to indicate that it wants to use those auths to 
 access the cells.
 Similarly in the shell:
 {code}
 scan 'table1', AUTHORIZATIONS => ['private']
 {code}
 But it is a surprise to find that setAuthorizations seems to be 'mandatory' 
 in the default visibility label security setting.  Every scan needs to call 
 setAuthorizations before it can get any cells, even when the cells are under 
 labels the requesting user is part of.
 The following steps will illustrate the issue:
 Run as superuser.
 {code}
 1. create a visibility label called 'private'
 2. create 'table1'
 3. put into 'table1' data and label the data as 'private'
 4. set_auths 'user1', 'private'
 5. grant 'user1', 'RW', 'table1'
 {code}
 Run as 'user1':
 {code}
 1. scan 'table1'
 This shows no cells.
 2. scan 'table1', AUTHORIZATIONS => ['private']
 This will show all the data.
 {code}
 I am not sure if this is expected by design or a bug.
 But a more reasonable, more backward compatible (for client applications), 
 and less surprising default behavior should probably look like this:
 A scan's default auths, if its Authorizations attribute is not set 
 explicitly, should be all the auths the requesting user has been administered 
 and is allowed on the server.
 If scan.setAuthorizations is used, then the server further filters the auths 
 during the scan: it uses the input auths minus whatever is not in the user's 
 label set on the server.
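
 For reference, a minimal client-side sketch in Java of the two cases 
 described above (table and label names are the ones from the steps; the 
 HTable-based API shown is the pre-1.0 client style, used here only for 
 illustration):
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hbase.HBaseConfiguration;
 import org.apache.hadoop.hbase.client.HTable;
 import org.apache.hadoop.hbase.client.Result;
 import org.apache.hadoop.hbase.client.ResultScanner;
 import org.apache.hadoop.hbase.client.Scan;
 import org.apache.hadoop.hbase.security.visibility.Authorizations;

 public class VisibilityScanExample {
   public static void main(String[] args) throws Exception {
     Configuration conf = HBaseConfiguration.create();
     HTable table = new HTable(conf, "table1");
     try {
       // Case 1: no explicit authorizations. With the default
       // ScanLabelGenerator this currently returns no cells, even though
       // user1 holds the 'private' label.
       ResultScanner noAuths = table.getScanner(new Scan());
       noAuths.close();

       // Case 2: explicit authorizations. This returns the 'private' cells.
       Scan scan = new Scan();
       scan.setAuthorizations(new Authorizations("private"));
       ResultScanner withAuths = table.getScanner(scan);
       for (Result r : withAuths) {
         System.out.println(r);
       }
       withAuths.close();
     } finally {
       table.close();
     }
   }
 }
 {code}
 Under the proposal above, case 1 would behave as if the user's full auth set 
 had been passed.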



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12313) Redo the hfile index length optimization so cell-based rather than serialized KV key

2014-10-26 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184782#comment-14184782
 ] 

Anoop Sam John commented on HBASE-12313:


bq.Where we were using estimatedLengthOf (What is this anyways – smile? 
Serialized 'length' or size on heap? Or size of the serialized KeyValue byte 
array – which is going away), we were talking serialized size. I was thinking 
estimatedSerializedSizeOf more appropriate where I did the replaces.
This is mostly used in metric calc now; some extra bytes are ok there. It is 
used in the SizedCellScanner size calc also, but I can not see that size really 
being used now.
Since it is an estimate, I am ok with the change. Basically the diff between 
estimatedSizeOf and estimatedLengthOf was that the former has an extra INT 
size. This is because when we serialize the KV over the wire, we write the KV 
length (4 bytes) first, followed by the kv buffer (KL, VL, Key and Value).

getSumOfKeyElementLengths: this is supposed to add the rk, cf, q and ts/type 
parts, but in the patch we end up adding the value and tags parts also. Am I 
missing something? Do we need to fix this?



 Redo the hfile index length optimization so cell-based rather than serialized 
 KV key
 

 Key: HBASE-12313
 URL: https://issues.apache.org/jira/browse/HBASE-12313
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: stack
Assignee: stack
 Attachments: 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 12313v5.txt


 Trying to remove API that returns the 'key' of a KV serialized into a byte 
 array is thorny.
 I tried to move over the first and last key serializations and the hfile 
 index entries to be cell but patch was turning massive.  Here is a smaller 
 patch that just redoes the optimization that tries to find 'short' midpoints 
 between last key of last block and first key of next block so it is 
 Cell-based rather than byte array based (presuming Keys serialized in a 
 certain way).  Adds unit tests which we didn't have before.
 Also remove CellKey.  Not needed... at least not yet.  It's just a utility 
 for toString.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091

2014-10-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184793#comment-14184793
 ] 

stack commented on HBASE-12285:
---

Ok. I'll set it back.  I'll put back the record-to-logs config too.

 Builds are failing, possibly because of SUREFIRE-1091
 -

 Key: HBASE-12285
 URL: https://issues.apache.org/jira/browse/HBASE-12285
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Dima Spivak
Assignee: Dima Spivak
Priority: Blocker
 Attachments: HBASE-12285_branch-1_v1.patch


 Our branch-1 builds on builds.apache.org have been failing in recent days 
 after we switched over to an official version of Surefire a few days back 
 (HBASE-4955). The version we're using, 2.17, is hit by a bug 
 ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results 
 in an IOException, which looks like what we're seeing on Jenkins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091

2014-10-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184794#comment-14184794
 ] 

stack commented on HBASE-12285:
---

Set it back.  Maybe next up is hosting our own surefire build, but we should 
spend some time on tests that log 50MB for sure; we will only annoy people 
logging that much.

 Builds are failing, possibly because of SUREFIRE-1091
 -

 Key: HBASE-12285
 URL: https://issues.apache.org/jira/browse/HBASE-12285
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Dima Spivak
Assignee: Dima Spivak
Priority: Blocker
 Attachments: HBASE-12285_branch-1_v1.patch


 Our branch-1 builds on builds.apache.org have been failing in recent days 
 after we switched over to an official version of Surefire a few days back 
 (HBASE-4955). The version we're using, 2.17, is hit by a bug 
 ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results 
 in an IOException, which looks like what we're seeing on Jenkins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12313) Redo the hfile index length optimization so cell-based rather than serialized KV key

2014-10-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184796#comment-14184796
 ] 

stack commented on HBASE-12313:
---

bq. Basically the diff between estimatedSizeOf and estimatedLengthOf was that 
the former has an extra INT size.

I think in the end we want serialized size and heap size and maybe an estimated 
size that would be cheaper to calculate than either of the former for places 
where it is not that important.

The patch makes a start on it.

The sizings in this patch, as I see it, cause no problem; perhaps a slight 
overcount, but it's for metrics only -- not for anything important (You 
agree?)

bq. Am I missing something? Do we need to fix this?

No, you are right, but do we need to fix it? It is ok that the size calculated 
is 'rough', approximate, in this context -- or do you think otherwise?



 Redo the hfile index length optimization so cell-based rather than serialized 
 KV key
 

 Key: HBASE-12313
 URL: https://issues.apache.org/jira/browse/HBASE-12313
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: stack
Assignee: stack
 Attachments: 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 12313v5.txt


 Trying to remove API that returns the 'key' of a KV serialized into a byte 
 array is thorny.
 I tried to move over the first and last key serializations and the hfile 
 index entries to be cell but patch was turning massive.  Here is a smaller 
 patch that just redoes the optimization that tries to find 'short' midpoints 
 between last key of last block and first key of next block so it is 
 Cell-based rather than byte array based (presuming Keys serialized in a 
 certain way).  Adds unit tests which we didn't have before.
 Also remove CellKey.  Not needed... at least not yet.  It's just a utility 
 for toString.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11912) Catch some bad practices at compile time with error-prone

2014-10-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184801#comment-14184801
 ] 

stack commented on HBASE-11912:
---

You commit your fixup [~apurtell]?  Looks like we need this (smile).  See over 
in HBASE-12187.

 Catch some bad practices at compile time with error-prone
 -

 Key: HBASE-11912
 URL: https://issues.apache.org/jira/browse/HBASE-11912
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Attachments: HBASE-11912.patch, HBASE-11912.patch, HBASE-11912.patch, 
 HBASE-11912.patch


 Google's error-prone (https://code.google.com/p/error-prone/) wraps javac 
 with some additional static analysis that will generate additional warnings 
 or errors at compile time if certain bug patterns 
 (https://code.google.com/p/error-prone/wiki/BugPatterns) are detected. What's 
 nice about this approach, as opposed to findbugs, is the compile time 
 detection and erroring out prevent the detected problems from getting into 
 the codebase up front.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12207) A script to help keep your Git repo fresh

2014-10-26 Thread Misty Stanley-Jones (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184811#comment-14184811
 ] 

Misty Stanley-Jones commented on HBASE-12207:
-

No further comments. Since this is just a script and not part of HBase itself, 
I will go ahead and commit it (fixing the tabs in the output on commit).

 A script to help keep your Git repo fresh
 -

 Key: HBASE-12207
 URL: https://issues.apache.org/jira/browse/HBASE-12207
 Project: HBase
  Issue Type: Improvement
  Components: documentation, scripts
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
 Attachments: HBASE-12207-v1.patch, HBASE-12207-v2.patch, 
 HBASE-12207-v3.patch, HBASE-12207-v4.patch, HBASE-12207-v5.patch, 
 HBASE-12207-v6.patch, HBASE-12207.patch


 I have a script that does a {code}git pull --rebase{code} on each tracking 
 branch, and then attempts an automatic rebase of each local branch against 
 its tracking branch. It also prompts you to delete local branches for HBASE- 
 JIRAs that have been closed. I think this script may help to enforce good Git 
 practices. It may be a good candidate to be included in dev-support/.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12207) A script to help keep your Git repo fresh

2014-10-26 Thread Misty Stanley-Jones (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misty Stanley-Jones updated HBASE-12207:

Attachment: HBASE-12207-v7.patch

What I committed to Master.

 A script to help keep your Git repo fresh
 -

 Key: HBASE-12207
 URL: https://issues.apache.org/jira/browse/HBASE-12207
 Project: HBase
  Issue Type: Improvement
  Components: documentation, scripts
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
 Attachments: HBASE-12207-v1.patch, HBASE-12207-v2.patch, 
 HBASE-12207-v3.patch, HBASE-12207-v4.patch, HBASE-12207-v5.patch, 
 HBASE-12207-v6.patch, HBASE-12207-v7.patch, HBASE-12207.patch


 I have a script that does a {code}git pull --rebase{code} on each tracking 
 branch, and then attempts an automatic rebase of each local branch against 
 its tracking branch. It also prompts you to delete local branches for HBASE- 
 JIRAs that have been closed. I think this script may help to enforce good Git 
 practices. It may be a good candidate to be included in dev-support/.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12207) A script to help keep your Git repo fresh

2014-10-26 Thread Misty Stanley-Jones (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misty Stanley-Jones updated HBASE-12207:

   Resolution: Fixed
Fix Version/s: 2.0.0
   Status: Resolved  (was: Patch Available)

 A script to help keep your Git repo fresh
 -

 Key: HBASE-12207
 URL: https://issues.apache.org/jira/browse/HBASE-12207
 Project: HBase
  Issue Type: Improvement
  Components: documentation, scripts
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
 Fix For: 2.0.0

 Attachments: HBASE-12207-v1.patch, HBASE-12207-v2.patch, 
 HBASE-12207-v3.patch, HBASE-12207-v4.patch, HBASE-12207-v5.patch, 
 HBASE-12207-v6.patch, HBASE-12207-v7.patch, HBASE-12207.patch


 I have a script that does a {code}git pull --rebase{code} on each tracking 
 branch, and then attempts an automatic rebase of each local branch against 
 its tracking branch. It also prompts you to delete local branches for HBASE- 
 JIRAs that have been closed. I think this script may help to enforce good Git 
 practices. It may be a good candidate to be included in dev-support/.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12313) Redo the hfile index length optimization so cell-based rather than serialized KV key

2014-10-26 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184820#comment-14184820
 ] 

Anoop Sam John commented on HBASE-12313:


bq. The sizings in this patch, as I see it, cause no problem; perhaps a slight 
overcount, but it's for metrics only -- not for anything important (You 
agree?)
Yes, I am ok with it, Stack.

bq. No, you are right, but do we need to fix it? It is ok that the size 
calculated is 'rough', approximate
As you can see below, estimatedSerializedSizeOfKey() returns only the key part 
lengths when the cell is a KeyValue. But when it is a non-KV Cell, we end up 
adding the value and tags lengths as well. This won't be a slight change, as 
the value length can normally be very large.  I am concerned about this.
{code}
+  private static int getSumOfKeyElementLengths(final Cell cell) {
+return cell.getRowLength() + cell.getFamilyLength() +
+cell.getQualifierLength() +
+cell.getValueLength() +
+cell.getTagsLength() +
+KeyValue.TIMESTAMP_TYPE_SIZE;
+  }
+
+  public static int estimatedSerializedSizeOfKey(final Cell cell) {
+if (cell instanceof KeyValue) return ((KeyValue)cell).getKeyLength();
+// This will be a low estimate.  Will do for now.
+return getSumOfKeyElementLengths(cell);
+  }
{code}
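
For clarity, a sketch (not the committed code) of the fix being suggested here, 
dropping the value and tags lengths so only the key portion is counted:

{code}
  // Sketch only: sum just the key element lengths (row, family, qualifier,
  // timestamp + type), excluding value and tags, so a large value no longer
  // inflates the estimated serialized key size.
  private static int getSumOfKeyElementLengths(final Cell cell) {
    return cell.getRowLength() + cell.getFamilyLength() +
        cell.getQualifierLength() +
        KeyValue.TIMESTAMP_TYPE_SIZE;
  }
{code}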


 Redo the hfile index length optimization so cell-based rather than serialized 
 KV key
 

 Key: HBASE-12313
 URL: https://issues.apache.org/jira/browse/HBASE-12313
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: stack
Assignee: stack
 Attachments: 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 12313v5.txt


 Trying to remove API that returns the 'key' of a KV serialized into a byte 
 array is thorny.
 I tried to move over the first and last key serializations and the hfile 
 index entries to be cell but patch was turning massive.  Here is a smaller 
 patch that just redoes the optimization that tries to find 'short' midpoints 
 between last key of last block and first key of next block so it is 
 Cell-based rather than byte array based (presuming Keys serialized in a 
 certain way).  Adds unit tests which we didn't have before.
 Also remove CellKey.  Not needed... at least not yet.  It's just a utility 
 for toString.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11419) After increasing TTL value of a hbase table having pre-split regions and decreasing TTL value, table becomes inaccessible.

2014-10-26 Thread Prabhu Joseph (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184832#comment-14184832
 ] 

Prabhu Joseph commented on HBASE-11419:
---

Hi Lars,

  This issue happens in Distributed mode. We have two regionservers. I have 
attached our hbase-site.xml.

  HBase version is 0.94.6.


 After increasing TTL value of a hbase table having pre-split regions and 
 decreasing TTL value, table becomes inaccessible.
 --

 Key: HBASE-11419
 URL: https://issues.apache.org/jira/browse/HBASE-11419
 Project: HBase
  Issue Type: Bug
  Components: HFile
Affects Versions: 0.94.6
 Environment: Linux x86_64 
Reporter: Prabhu Joseph
Priority: Blocker
 Attachments: HBaseExporter.java, account.csv

   Original Estimate: 96h
  Remaining Estimate: 96h

 After increasing and then decreasing the TTL value of an HBase table, the 
 table becomes inaccessible. Scanning the table does not work.
 A scan in the hbase shell throws:
 java.lang.IllegalStateException: Block index not loaded
 at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
 at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV1.blockContainingKey(HFileReaderV1.java:181)
 at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV1$AbstractScannerV1.seekTo(HFileReaderV1.java:426)
 at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:226)
 at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:145)
 at 
 org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:131)
 at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2015)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:3706)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1761)
 at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1753)
 at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1730)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2409)
 at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11419) After increasing TTL value of a hbase table having pre-split regions and decreasing TTL value, table becomes inaccessible.

2014-10-26 Thread Prabhu Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated HBASE-11419:
--
Attachment: hbase-site.xml

 After increasing TTL value of a hbase table having pre-split regions and 
 decreasing TTL value, table becomes inaccessible.
 --

 Key: HBASE-11419
 URL: https://issues.apache.org/jira/browse/HBASE-11419
 Project: HBase
  Issue Type: Bug
  Components: HFile
Affects Versions: 0.94.6
 Environment: Linux x86_64 
Reporter: Prabhu Joseph
Priority: Blocker
 Attachments: HBaseExporter.java, account.csv, hbase-site.xml

   Original Estimate: 96h
  Remaining Estimate: 96h

 After increasing and then decreasing the TTL value of an HBase table, the 
 table becomes inaccessible. Scanning the table does not work.
 A scan in the hbase shell throws:
 java.lang.IllegalStateException: Block index not loaded
 at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
 at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV1.blockContainingKey(HFileReaderV1.java:181)
 at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV1$AbstractScannerV1.seekTo(HFileReaderV1.java:426)
 at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:226)
 at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:145)
 at 
 org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:131)
 at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2015)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:3706)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1761)
 at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1753)
 at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1730)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2409)
 at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-10780) HFilePrettyPrinter#processFile should return immediately if file does not exists.

2014-10-26 Thread Ashish Singhi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Singhi reassigned HBASE-10780:
-

Assignee: Ashish Singhi

 HFilePrettyPrinter#processFile should return immediately if file does not 
 exists.
 -

 Key: HBASE-10780
 URL: https://issues.apache.org/jira/browse/HBASE-10780
 Project: HBase
  Issue Type: Bug
  Components: HFile
Affects Versions: 0.94.11
Reporter: Ashish Singhi
Assignee: Ashish Singhi
Priority: Minor
 Attachments: HBASE-10780.patch


 HFilePrettyPrinter#processFile should return immediately if file does not 
 exists same like HLogPrettyPrinter#run
 {code}
 if (!fs.exists(file)) {
   System.err.println("ERROR, file doesnt exist: " + file);
 }{code}
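
 A hedged sketch of the kind of early return being asked for (the method 
 signature and return code here are simplified for illustration and may not 
 match the actual HFilePrettyPrinter):
 {code}
   // Sketch: bail out early, mirroring HLogPrettyPrinter#run, instead of
   // failing later inside the HFile reader.
   private int processFile(FileSystem fs, Path file) throws IOException {
     if (!fs.exists(file)) {
       System.err.println("ERROR, file doesnt exist: " + file);
       return -2;  // illustrative non-zero code
     }
     // ... existing pretty-printing logic ...
     return 0;
   }
 {code}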



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12304) CellCounter will throw AIOBE when output directory is not specified

2014-10-26 Thread Ashish Singhi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Singhi updated HBASE-12304:
--
Attachment: HBASE-12304-0.98.patch
HBASE-12304-v3.patch

Attached patch for master and 0.98 branch.
Please review.

 CellCounter will throw AIOBE when output directory is not specified
 ---

 Key: HBASE-12304
 URL: https://issues.apache.org/jira/browse/HBASE-12304
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.98.5
Reporter: Ashish Singhi
Assignee: Ashish Singhi
Priority: Minor
 Attachments: HBASE-12304-0.98.patch, HBASE-12304-v2.patch, 
 HBASE-12304-v3.patch, HBase-12304.patch


 CellCounter will throw ArrayIndexOutOfBoundsException when output directory 
 is not specified instead it should display the usage.
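
 A minimal sketch of the kind of guard such a patch might add (the argument 
 positions and usage text below are illustrative, not taken from the actual 
 CellCounter source):
 {code}
   // Sketch: validate the argument count up front and print usage instead of
   // letting an ArrayIndexOutOfBoundsException escape from args[1].
   public static void main(String[] args) throws Exception {
     if (args.length < 2) {
       System.err.println("Usage: CellCounter <tablename> <outputDir> "
           + "[reportSeparator] [regex or prefix for row filter]");
       System.exit(-1);
     }
     // ... configure and run the MapReduce job as before ...
   }
 {code}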



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-10780) HFilePrettyPrinter#processFile should return immediately if file does not exists.

2014-10-26 Thread Ashish Singhi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Singhi updated HBASE-10780:
--
Attachment: HBASE-10780-v2.patch

Thanks Ted for looking into it.
Updated the patch as per your suggestion.
Please review.

 HFilePrettyPrinter#processFile should return immediately if file does not 
 exists.
 -

 Key: HBASE-10780
 URL: https://issues.apache.org/jira/browse/HBASE-10780
 Project: HBase
  Issue Type: Bug
  Components: HFile
Affects Versions: 0.94.11
Reporter: Ashish Singhi
Assignee: Ashish Singhi
Priority: Minor
 Attachments: HBASE-10780-v2.patch, HBASE-10780.patch


 HFilePrettyPrinter#processFile should return immediately if file does not 
 exists same like HLogPrettyPrinter#run
 {code}
 if (!fs.exists(file)) {
   System.err.println("ERROR, file doesnt exist: " + file);
 }{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12304) CellCounter will throw AIOBE when output directory is not specified

2014-10-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184839#comment-14184839
 ] 

Hadoop QA commented on HBASE-12304:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12677242/HBASE-12304-0.98.patch
  against trunk revision .
  ATTACHMENT ID: 12677242

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11469//console

This message is automatically generated.

 CellCounter will throw AIOBE when output directory is not specified
 ---

 Key: HBASE-12304
 URL: https://issues.apache.org/jira/browse/HBASE-12304
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.98.5
Reporter: Ashish Singhi
Assignee: Ashish Singhi
Priority: Minor
 Attachments: HBASE-12304-0.98.patch, HBASE-12304-v2.patch, 
 HBASE-12304-v3.patch, HBase-12304.patch


 CellCounter will throw ArrayIndexOutOfBoundsException when output directory 
 is not specified instead it should display the usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11992) Backport HBASE-11367 (Pluggable replication endpoint) to 0.98

2014-10-26 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184840#comment-14184840
 ] 

ramkrishna.s.vasudevan commented on HBASE-11992:


[~apurtell],[~apurt...@yahoo.com]
Please take a look at the RB. It would be useful to get the feature dependent 
on this (HBASE-11639) into 0.98.

 Backport HBASE-11367 (Pluggable replication endpoint) to 0.98
 -

 Key: HBASE-11992
 URL: https://issues.apache.org/jira/browse/HBASE-11992
 Project: HBase
  Issue Type: Task
Reporter: Andrew Purtell
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-11992_0.98_1.patch, hbase-11367_0.98.patch


 ReplicationSource tails the logs for each peer. HBASE-11367 introduces 
 ReplicationEndpoint which is customizable per peer. ReplicationEndpoint is 
 run in the same RS process and instantiated per replication peer per region 
 server. Implementations of this interface handle the actual shipping of WAL 
 edits to the remote cluster.
 This issue is for backporting HBASE-11367 to 0.98.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091

2014-10-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184843#comment-14184843
 ] 

Hudson commented on HBASE-12285:


SUCCESS: Integrated in HBase-1.0 #365 (See 
[https://builds.apache.org/job/HBase-1.0/365/])
HBASE-12285 Builds are failing, possibly because of SUREFIRE-1091 -- Setting 
log level back from DEBUG to WARN -- second time (stack: rev 
862faca7a4f82e032572f8426851968cf7ba017c)
* hbase-server/src/test/resources/log4j.properties


 Builds are failing, possibly because of SUREFIRE-1091
 -

 Key: HBASE-12285
 URL: https://issues.apache.org/jira/browse/HBASE-12285
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Dima Spivak
Assignee: Dima Spivak
Priority: Blocker
 Attachments: HBASE-12285_branch-1_v1.patch


 Our branch-1 builds on builds.apache.org have been failing in recent days 
 after we switched over to an official version of Surefire a few days back 
 (HBASE-4955). The version we're using, 2.17, is hit by a bug 
 ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results 
 in an IOException, which looks like what we're seeing on Jenkins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12347) Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh

2014-10-26 Thread Misty Stanley-Jones (JIRA)
Misty Stanley-Jones created HBASE-12347:
---

 Summary: Fix the edge case where Hadoop QA's parsing of attached 
patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh
 Key: HBASE-12347
 URL: https://issues.apache.org/jira/browse/HBASE-12347
 Project: HBase
  Issue Type: Sub-task
  Components: scripts
Reporter: Misty Stanley-Jones
Priority: Minor


The dev-support/rebase_all_git_branches.sh script is unable to detect that 
HBASE-12207 is closed, because for that one JIRA, the curl command that detects 
the status is returning the status, but also the text from Hadoop QA for each 
patch it has evaluated on the JIRA:

{code}
$ curl -s https://issues.apache.org/jira/browse/HBASE-12207 | grep 
resolution-val
<span id="resolution-val" class="value"> resolved 
+   jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
-e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
+   jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
-e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
+   jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
-e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
+   jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
-e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
+   jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
-e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
+   jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
-e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
+   jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
-e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
+   jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
-e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
{code}

All but the top line of output are from parsing comments from Hadoop QA. I 
think this is an edge case that will only apply to patches against that 
section of dev-support/rebase_all_git_branches.sh. However, I'm not sure of 
the cleanest way to fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12347) Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh

2014-10-26 Thread Misty Stanley-Jones (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misty Stanley-Jones updated HBASE-12347:

Issue Type: Bug  (was: Sub-task)
Parent: (was: HBASE-12207)

 Fix the edge case where Hadoop QA's parsing of attached patches breaks the 
 JIRA status checker in dev-support/rebase_all_git_branches.sh
 

 Key: HBASE-12347
 URL: https://issues.apache.org/jira/browse/HBASE-12347
 Project: HBase
  Issue Type: Bug
  Components: scripts
Reporter: Misty Stanley-Jones
Priority: Minor
 Fix For: 2.0.0


 The dev-support/rebase_all_git_branches.sh script is unable to detect that 
 HBASE-12207 is closed, because for that one JIRA, the curl command that 
 detects the status is returning the status, but also the text from Hadoop QA 
 for each patch it has evaluated on the JIRA:
 {code}
 $ curl -s https://issues.apache.org/jira/browse/HBASE-12207 | grep 
 resolution-val
 <span id="resolution-val" class="value"> resolved 
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 {code}
 All but the top line of output are from parsing comments from Hadoop QA. I 
 think this is an edge case that will only apply to patches against that 
 section of dev-support/rebase_all_git_branches.sh. However, I'm not sure of 
 the cleanest way to fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-12347) Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh

2014-10-26 Thread Misty Stanley-Jones (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misty Stanley-Jones reassigned HBASE-12347:
---

Assignee: Misty Stanley-Jones

 Fix the edge case where Hadoop QA's parsing of attached patches breaks the 
 JIRA status checker in dev-support/rebase_all_git_branches.sh
 

 Key: HBASE-12347
 URL: https://issues.apache.org/jira/browse/HBASE-12347
 Project: HBase
  Issue Type: Bug
  Components: scripts
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
Priority: Minor
 Fix For: 2.0.0


 The dev-support/rebase_all_git_branches.sh script is unable to detect that 
 HBASE-12207 is closed, because for that one JIRA, the curl command that 
 detects the status is returning the status, but also the text from Hadoop QA 
 for each patch it has evaluated on the JIRA:
 {code}
 $ curl -s https://issues.apache.org/jira/browse/HBASE-12207 | grep 
 resolution-val
 <span id="resolution-val" class="value"> resolved 
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 {code}
 All but the top line of output are from parsing comments from Hadoop QA. I 
 think this is an edge case that will only apply to patches against that 
 section of dev-support/rebase_all_git_branches.sh. However, I'm not sure of 
 the cleanest way to fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12347) Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh

2014-10-26 Thread Misty Stanley-Jones (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184858#comment-14184858
 ] 

Misty Stanley-Jones commented on HBASE-12347:
-

[~busbey] figured out a less brittle way than grepping the HTML:
{code}
curl -s 
'https://issues.apache.org/jira/rest/api/2/issue/HBASE-5699?fields=resolution'|grep
 -q '{"resolution":null}'
{code}

The exit status is 0 if true (unresolved), 1 if false (something other than 
unresolved). I'll make a patch.

 Fix the edge case where Hadoop QA's parsing of attached patches breaks the 
 JIRA status checker in dev-support/rebase_all_git_branches.sh
 

 Key: HBASE-12347
 URL: https://issues.apache.org/jira/browse/HBASE-12347
 Project: HBase
  Issue Type: Bug
  Components: scripts
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
Priority: Minor
 Fix For: 2.0.0


 The dev-support/rebase_all_git_branches.sh script is unable to detect that 
 HBASE-12207 is closed, because for that one JIRA, the curl command that 
 detects the status is returning the status, but also the text from Hadoop QA 
 for each patch it has evaluated on the JIRA:
 {code}
 $ curl -s https://issues.apache.org/jira/browse/HBASE-12207 | grep 
 resolution-val
 <span id="resolution-val" class="value"> resolved 
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 {code}
 All but the top line of output are from parsing comments from Hadoop QA. I 
 think this is an edge case that will only apply to patches against that 
 section of dev-support/rebase_all_git_branches.sh. However, I'm not sure of 
 the cleanest way to fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12347) Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh

2014-10-26 Thread Misty Stanley-Jones (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misty Stanley-Jones updated HBASE-12347:

Attachment: HBASE-12347.patch

Ready for review. To test, make a branch called HBASE-12207 and see if it gets 
picked up by the script as resolved. That JIRA ID is the only one that seems to 
be affected by this (though it would be easy to make a fake case).

 Fix the edge case where Hadoop QA's parsing of attached patches breaks the 
 JIRA status checker in dev-support/rebase_all_git_branches.sh
 

 Key: HBASE-12347
 URL: https://issues.apache.org/jira/browse/HBASE-12347
 Project: HBase
  Issue Type: Bug
  Components: scripts
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
Priority: Minor
 Fix For: 2.0.0

 Attachments: HBASE-12347.patch


 The dev-support/rebase_all_git_branches.sh script is unable to detect that 
 HBASE-12207 is closed, because for that one JIRA, the curl command that 
 detects the status is returning the status, but also the text from Hadoop QA 
 for each patch it has evaluated on the JIRA:
 {code}
 $ curl -s https://issues.apache.org/jira/browse/HBASE-12207 | grep 
 resolution-val
 <span id="resolution-val" class="value"> resolved 
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 {code}
 All but the top line of output are from parsing comments from Hadoop QA. I 
 think this is an edge case that will only apply to patches against that 
 section of dev-support/rebase_all_git_branches.sh. However, I'm not sure of 
 the cleanest way to fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12347) Fix the edge case where Hadoop QA's parsing of attached patches breaks the JIRA status checker in dev-support/rebase_all_git_branches.sh

2014-10-26 Thread Misty Stanley-Jones (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misty Stanley-Jones updated HBASE-12347:

Status: Patch Available  (was: Open)

 Fix the edge case where Hadoop QA's parsing of attached patches breaks the 
 JIRA status checker in dev-support/rebase_all_git_branches.sh
 

 Key: HBASE-12347
 URL: https://issues.apache.org/jira/browse/HBASE-12347
 Project: HBase
  Issue Type: Bug
  Components: scripts
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
Priority: Minor
 Fix For: 2.0.0

 Attachments: HBASE-12347.patch


 The dev-support/rebase_all_git_branches.sh script is unable to detect that 
 HBASE-12207 is closed, because for that one JIRA, the curl command that 
 detects the status is returning the status, but also the text from Hadoop QA 
 for each patch it has evaluated on the JIRA:
 {code}
 $ curl -s https://issues.apache.org/jira/browse/HBASE-12207 | grep 
 resolution-val
 <span id="resolution-val" class="value"> resolved 
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)<br/>
 + jira_status=$(curl -s $jira_url/$jira|grep resolution-val|sed 
 -e "s/.*class=\"value\"> //"|cut -d'<' -f 1)
 {code}
 All but the top line of output are from parsing comments from Hadoop QA. I 
 think this is an edge case that will only apply to patches against that 
 section of dev-support/rebase_all_git_branches.sh. However, I'm not sure of 
 the cleanest way to fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12345) Unsafe based Comparator for BB

2014-10-26 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184864#comment-14184864
 ] 

ramkrishna.s.vasudevan commented on HBASE-12345:


Will check this patch later tomorrow. I created a simple patch like this after 
reading the Unsafe APIs. Will do some tests with the attached patch; I have 
some doubts about it.
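
For readers following along, a minimal sketch of the general technique under 
discussion (8-bytes-at-a-time lexicographic comparison via sun.misc.Unsafe). 
This is not the attached patch; it assumes a little-endian platform and 
on-heap byte[] backing:

{code}
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public final class UnsafeComparerSketch {
  private static final Unsafe THE_UNSAFE;
  private static final long BYTE_ARRAY_BASE;
  static {
    try {
      Field f = Unsafe.class.getDeclaredField("theUnsafe");
      f.setAccessible(true);
      THE_UNSAFE = (Unsafe) f.get(null);
      BYTE_ARRAY_BASE = THE_UNSAFE.arrayBaseOffset(byte[].class);
    } catch (Exception e) {
      throw new ExceptionInInitializerError(e);
    }
  }

  /** Lexicographic compare reading 8 bytes per step; a ByteBuffer variant
   *  would read from the buffer's array or its direct memory address. */
  public static int compare(byte[] a, int aOff, int aLen,
                            byte[] b, int bOff, int bLen) {
    int minLen = Math.min(aLen, bLen);
    int i = 0;
    for (; i + 8 <= minLen; i += 8) {
      long la = THE_UNSAFE.getLong(a, BYTE_ARRAY_BASE + aOff + i);
      long lb = THE_UNSAFE.getLong(b, BYTE_ARRAY_BASE + bOff + i);
      if (la != lb) {
        // Swap to big-endian so numeric order matches byte order on x86.
        la = Long.reverseBytes(la);
        lb = Long.reverseBytes(lb);
        // Unsigned compare via the sign-flip trick.
        return (la + Long.MIN_VALUE) < (lb + Long.MIN_VALUE) ? -1 : 1;
      }
    }
    // Remaining tail bytes, one at a time.
    for (; i < minLen; i++) {
      int diff = (a[aOff + i] & 0xff) - (b[bOff + i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return aLen - bLen;
  }
}
{code}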

 Unsafe based Comparator for BB 
 ---

 Key: HBASE-12345
 URL: https://issues.apache.org/jira/browse/HBASE-12345
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Attachments: HBASE-12345.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12282) Ensure Cells and its implementations work with Buffers also

2014-10-26 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184865#comment-14184865
 ] 

ramkrishna.s.vasudevan commented on HBASE-12282:


+1 for a new Cell interface that extends the current Cell but adds BB APIs to 
it. Everywhere in the comparator we will have a condition-based check. One 
thing to note is that, for the comparators, one cell can be the BB-based cell 
while the other one is backed by a byte[] array.
The changes to KV are hacky, but that is basically to make things work and 
ensure that we have a KV backed by both a buffer and a byte[]. At least the 
fake keys that we create could directly be buffer based, so those comparisons 
can be buffer based only.
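
As a strawman only (the interface and method names below are invented for this 
comment, not an actual HBase API), the kind of Cell extension being discussed 
might look like:

{code}
import java.nio.ByteBuffer;

import org.apache.hadoop.hbase.Cell;

// Strawman: a Cell that can also expose its components as ByteBuffer slices,
// so buffer-backed (possibly off-heap) cells need not copy into byte[] first.
public interface BufferBackedCell extends Cell {
  ByteBuffer getRowBuffer();
  int getRowOffsetInBuffer();

  ByteBuffer getFamilyBuffer();
  int getFamilyOffsetInBuffer();

  ByteBuffer getQualifierBuffer();
  int getQualifierOffsetInBuffer();

  ByteBuffer getValueBuffer();
  int getValueOffsetInBuffer();
}
{code}

A comparator would then check each side with instanceof and pick the 
buffer-vs-buffer, buffer-vs-byte[], or byte[]-vs-byte[] path, as noted above.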

 Ensure Cells and its implementations work with Buffers also
 ---

 Key: HBASE-12282
 URL: https://issues.apache.org/jira/browse/HBASE-12282
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver, Scanners
Affects Versions: 0.99.1
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 2.0.0, 0.99.2

 Attachments: HBASE-12224_2.patch


 This issue can be used to brainstorm and then make the necessary changes for 
 the offheap work.  All impls of Cell deal with byte[], but when we change the 
 HFileBlocks/Readers to work purely with Buffers, the byte[] usage would mean 
 that the data is always copied onheap.  Cell may need some interface change 
 to implement this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)