[jira] [Commented] (HBASE-13301) Possible memory leak in BucketCache

2015-04-10 Thread zhangduo (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490819#comment-14490819
 ] 

zhangduo commented on HBASE-13301:
--

Any other questions? [~ndimiduk]
Thanks.

> Possible memory leak in BucketCache
> ---
>
> Key: HBASE-13301
> URL: https://issues.apache.org/jira/browse/HBASE-13301
> Project: HBase
>  Issue Type: Bug
>  Components: BlockCache
>Reporter: zhangduo
>Assignee: zhangduo
> Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2
>
> Attachments: HBASE-13301-0.98.patch, HBASE-13301-branch-1.0.patch, 
> HBASE-13301-branch-1.0.patch, HBASE-13301-branch-1.patch, 
> HBASE-13301-testcase.patch, HBASE-13301-testcase_v1.patch, HBASE-13301.patch, 
> HBASE-13301_v1.patch, HBASE-13301_v2.patch, HBASE-13301_v3.patch
>
>
> {code:title=BucketCache.java}
> public boolean evictBlock(BlockCacheKey cacheKey) {
>   ...
>   if (bucketEntry.equals(backingMap.remove(cacheKey))) {
>     bucketAllocator.freeBlock(bucketEntry.offset());
>     realCacheSize.addAndGet(-1 * bucketEntry.getLength());
>     blocksByHFile.remove(cacheKey.getHfileName(), cacheKey);
>     if (removedBlock == null) {
>       this.blockNumber.decrementAndGet();
>     }
>   } else {
>     return false;
>   }
>   ...
> {code}
> I think the problem is here. We remove a BucketEntry that we should not have 
> removed, but we do not put it back and do not do any cleanup.
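A later comment in this thread mentions switching to the two-argument ConcurrentMap.remove as the fix. A minimal sketch of that idea (an illustration, not the attached patch) could look like:

{code:title=BucketCache.java (sketch of the two-argument remove idea)}
// remove(key, value) only removes the entry when it still maps to this exact BucketEntry,
// so evictBlock never pulls out (and then leaks) an entry owned by someone else.
if (backingMap.remove(cacheKey, bucketEntry)) {
  bucketAllocator.freeBlock(bucketEntry.offset());
  realCacheSize.addAndGet(-1 * bucketEntry.getLength());
  blocksByHFile.remove(cacheKey.getHfileName(), cacheKey);
  if (removedBlock == null) {
    this.blockNumber.decrementAndGet();
  }
} else {
  return false;
}
{code}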



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13301) Possible memory leak in BucketCache

2015-04-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490799#comment-14490799
 ] 

Hadoop QA commented on HBASE-13301:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12724728/HBASE-13301_v3.patch
  against master branch at commit e994b491aca8ab2edeb60a328c690ddbc88f8b51.
  ATTACHMENT ID: 12724728

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 7 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.1 2.5.2 2.6.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13669//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13669//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13669//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13669//console

This message is automatically generated.

> Possible memory leak in BucketCache
> ---
>
> Key: HBASE-13301
> URL: https://issues.apache.org/jira/browse/HBASE-13301
> Project: HBase
>  Issue Type: Bug
>  Components: BlockCache
>Reporter: zhangduo
>Assignee: zhangduo
> Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2
>
> Attachments: HBASE-13301-0.98.patch, HBASE-13301-branch-1.0.patch, 
> HBASE-13301-branch-1.0.patch, HBASE-13301-branch-1.patch, 
> HBASE-13301-testcase.patch, HBASE-13301-testcase_v1.patch, HBASE-13301.patch, 
> HBASE-13301_v1.patch, HBASE-13301_v2.patch, HBASE-13301_v3.patch
>
>
> {code:title=BucketCache.java}
> public boolean evictBlock(BlockCacheKey cacheKey) {
>   ...
>   if (bucketEntry.equals(backingMap.remove(cacheKey))) {
>     bucketAllocator.freeBlock(bucketEntry.offset());
>     realCacheSize.addAndGet(-1 * bucketEntry.getLength());
>     blocksByHFile.remove(cacheKey.getHfileName(), cacheKey);
>     if (removedBlock == null) {
>       this.blockNumber.decrementAndGet();
>     }
>   } else {
>     return false;
>   }
>   ...
> {code}
> I think the problem is here. We remove a BucketEntry that we should not have 
> removed, but we do not put it back and do not do any cleanup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13453) Master should not bind to region server ports

2015-04-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490769#comment-14490769
 ] 

stack commented on HBASE-13453:
---

[~esteban] Put up a patch so we can see what you are thinking, E?

We could use reflection to figure out whether the RS is a Master and, if it is, use the 
Master ports... but I'd agree with the above that it is probably easier on the user to 
keep doing what we did prior to 1.0, where we had separate configs for the Master, at 
least until we are sure that a single location for Master and RegionServer is the way 
to go (IMO, we do not want to do this).

Thanks E.



> Master should not bind to region server ports
> -
>
> Key: HBASE-13453
> URL: https://issues.apache.org/jira/browse/HBASE-13453
> Project: HBase
>  Issue Type: Improvement
>Reporter: Enis Soztutar
>Assignee: Devaraj Das
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 34111-2.txt
>
>
> In 1.0, the master by default binds to the region server ports (rpc and info). We 
> did so thinking that, in the long term, master and meta co-location would be the 
> default and we could merge the master and region server into a single daemon. 
> Over at HBASE-11165, if the conclusion ends up being that meta will not be 
> colocated at all, then the master hosting a region server will just become an 
> implementation detail. [~saint@gmail.com] says that we might never allow the 
> master to host regions. 
> Now, we are stuck in a state where we have made master bind to RS ports in 
> 1.0, which might create some confusion (and frustration) for small cluster 
> users who traditionally used to host a master and a region server on the same 
> node.
> I think we should undo this in 1.1 and use the previous master ports (16000) 
> and not bind to 16030, so that the user does not need to do anything to bring 
> up a RS on the same host. At least users going from 0.98 -> 1.1 will not take 
> a hit. Users going from 1.0 -> 1.1 will see changed default ports. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13078) IntegrationTestSendTraceRequests is a noop

2015-04-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490757#comment-14490757
 ] 

stack commented on HBASE-13078:
---

Purge htrace tests altogether? The old stuff has little relation to the new and 
is not worth salvaging.

> IntegrationTestSendTraceRequests is a noop
> --
>
> Key: HBASE-13078
> URL: https://issues.apache.org/jira/browse/HBASE-13078
> Project: HBase
>  Issue Type: Test
>  Components: integration tests
>Reporter: Nick Dimiduk
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2
>
> Attachments: HBASE-13078-0.98-v1.patch, HBASE-13078-v1.patch, 
> HBASE-13078.patch
>
>
> While pair-debugging with [~jeffreyz] on HBASE-13077, we noticed that 
> IntegrationTestSendTraceRequests doesn't actually assert anything. This test 
> should be converted to use a mini cluster, set up a POJOSpanReceiver, and then 
> verify the collected spans.
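A minimal sketch of that verification approach is below. It assumes the htrace 3.x API (POJOSpanReceiver, Trace, Sampler); the exact packages, constructors, and span names vary by htrace version and branch, so treat the details as assumptions rather than the attached patch:

{code:title=span verification sketch (assumed htrace 3.x API)}
import static org.junit.Assert.assertTrue;

import org.apache.htrace.HTraceConfiguration;
import org.apache.htrace.Sampler;
import org.apache.htrace.Span;
import org.apache.htrace.Trace;
import org.apache.htrace.TraceScope;
import org.apache.htrace.impl.POJOSpanReceiver;

public void verifyTracedScans() throws Exception {
  // Keep spans in memory instead of dropping them, so the test can assert on them.
  POJOSpanReceiver receiver = new POJOSpanReceiver(HTraceConfiguration.EMPTY); // ctor varies by htrace version
  Trace.addReceiver(receiver);

  TraceScope scope = Trace.startSpan("doScans", Sampler.ALWAYS);
  try {
    // ... issue scans/gets against the mini cluster here ...
  } finally {
    scope.close();
  }

  boolean sawScanSpan = false;
  for (Span span : receiver.getSpans()) {
    if (span.getDescription().contains("Scan")) { // span naming is an assumption
      sawScanSpan = true;
    }
  }
  assertTrue("expected at least one scan-related span to be collected", sawScanSpan);
}
{code}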



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13078) IntegrationTestSendTraceRequests is a noop

2015-04-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490755#comment-14490755
 ] 

Hadoop QA commented on HBASE-13078:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12724714/HBASE-13078-0.98-v1.patch
  against 0.98 branch at commit e994b491aca8ab2edeb60a328c690ddbc88f8b51.
  ATTACHMENT ID: 12724714

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.1 2.5.2 2.6.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
26 warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13668//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13668//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13668//artifact/patchprocess/checkstyle-aggregate.html

  Javadoc warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13668//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13668//console

This message is automatically generated.

> IntegrationTestSendTraceRequests is a noop
> --
>
> Key: HBASE-13078
> URL: https://issues.apache.org/jira/browse/HBASE-13078
> Project: HBase
>  Issue Type: Test
>  Components: integration tests
>Reporter: Nick Dimiduk
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2
>
> Attachments: HBASE-13078-0.98-v1.patch, HBASE-13078-v1.patch, 
> HBASE-13078.patch
>
>
> While pair-debugging with [~jeffreyz] on HBASE-13077, we noticed that 
> IntegrationTestSendTraceRequests doesn't actually assert anything. This test 
> should be converted to use a mini cluster, set up a POJOSpanReceiver, and then 
> verify the collected spans.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13078) IntegrationTestSendTraceRequests is a noop

2015-04-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490743#comment-14490743
 ] 

Hadoop QA commented on HBASE-13078:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12724711/HBASE-13078-v1.patch
  against master branch at commit e994b491aca8ab2edeb60a328c690ddbc88f8b51.
  ATTACHMENT ID: 12724711

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.1 2.5.2 2.6.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+assertEquals("Found unexpected number of create table spans", 1, 
traces.get(CREATE_TABLE).get());
+  assertEquals("Found unexpected number of delete table spans", 1, 
traces.get(DELETE_TABLE).get());

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13667//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13667//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13667//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13667//console

This message is automatically generated.

> IntegrationTestSendTraceRequests is a noop
> --
>
> Key: HBASE-13078
> URL: https://issues.apache.org/jira/browse/HBASE-13078
> Project: HBase
>  Issue Type: Test
>  Components: integration tests
>Reporter: Nick Dimiduk
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2
>
> Attachments: HBASE-13078-0.98-v1.patch, HBASE-13078-v1.patch, 
> HBASE-13078.patch
>
>
> While pair-debugging with [~jeffreyz] on HBASE-13077, we noticed that 
> IntegrationTestSendTraceRequests doesn't actually assert anything. This test 
> should be converted to use a mini cluster, set up a POJOSpanReceiver, and then 
> verify the collected spans.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13301) Possible memory leak in BucketCache

2015-04-10 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-13301:
-
Attachment: HBASE-13301_v3.patch

[~ndimiduk] Yes, I tried it on every branch. If you change 
'backingMap.remove(cacheKey, bucketEntry)' back to 
'bucketEntry.equals(backingMap.remove(cacheKey))' in BucketCache.evictBlock, 
the test fails every time.

As for the sleep in the test case:
For the evictThread, it is not easy to add a countdown latch since we expect 
the thread to be blocked on the IdLock. And BucketCache.cacheBlock is a simple 
queue-based async operation, so I don't think it is worth adding more logic 
than a short sleep wait; it is fast.

I extracted the cacheAndWait operation into a method and added some comments to 
explain the reason. I also added a method to IdLock that checks the number of 
waiters on a given id, and the test uses it to confirm the evictThread is 
blocked on the IdLock.

Thanks.
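A rough sketch of what such a waiter-count check could look like; the method and helper names here are hypothetical, not the exact patch:

{code:title=IdLock waiter-count check sketch (hypothetical names)}
// In the test: after triggering eviction, wait until the evict thread actually shows up
// as a waiter on the IdLock entry for the cached block's offset before continuing.
private void waitUntilEvictThreadIsBlocked(IdLock offsetLock, long offset)
    throws InterruptedException {
  // getNumWaiters(id) is a hypothetical helper reporting how many threads are queued on the id.
  while (offsetLock.getNumWaiters(offset) < 1) {
    Thread.sleep(10);
  }
}
{code}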

> Possible memory leak in BucketCache
> ---
>
> Key: HBASE-13301
> URL: https://issues.apache.org/jira/browse/HBASE-13301
> Project: HBase
>  Issue Type: Bug
>  Components: BlockCache
>Reporter: zhangduo
>Assignee: zhangduo
> Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2
>
> Attachments: HBASE-13301-0.98.patch, HBASE-13301-branch-1.0.patch, 
> HBASE-13301-branch-1.0.patch, HBASE-13301-branch-1.patch, 
> HBASE-13301-testcase.patch, HBASE-13301-testcase_v1.patch, HBASE-13301.patch, 
> HBASE-13301_v1.patch, HBASE-13301_v2.patch, HBASE-13301_v3.patch
>
>
> {code:title=BucketCache.java}
> public boolean evictBlock(BlockCacheKey cacheKey) {
>   ...
>   if (bucketEntry.equals(backingMap.remove(cacheKey))) {
>     bucketAllocator.freeBlock(bucketEntry.offset());
>     realCacheSize.addAndGet(-1 * bucketEntry.getLength());
>     blocksByHFile.remove(cacheKey.getHfileName(), cacheKey);
>     if (removedBlock == null) {
>       this.blockNumber.decrementAndGet();
>     }
>   } else {
>     return false;
>   }
>   ...
> {code}
> I think the problem is here. We remove a BucketEntry that we should not have 
> removed, but we do not put it back and do not do any cleanup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction

2015-04-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490731#comment-14490731
 ] 

Lars Hofhansl commented on HBASE-13408:
---

One (maybe simpler) thing we could do is to flush earlier if we have a lot of 
Cells with more versions than MAX_VERSION, or a lot of expired or deleted 
Cells (see also HBASE-4241).

> HBase In-Memory Memstore Compaction
> ---
>
> Key: HBASE-13408
> URL: https://issues.apache.org/jira/browse/HBASE-13408
> Project: HBase
>  Issue Type: New Feature
>Reporter: Eshcar Hillel
> Attachments: HBaseIn-MemoryMemstoreCompactionDesignDocument.pdf
>
>
> A store unit holds a column family in a region, where the memstore is its 
> in-memory component. The memstore absorbs all updates to the store; from time 
> to time these updates are flushed to a file on disk, where they are 
> compacted. Unlike disk components, the memstore is not compacted until it is 
> written to the filesystem and optionally to block-cache. This may result in 
> underutilization of the memory due to duplicate entries per row, for example, 
> when hot data is continuously updated. 
> Generally, the faster the data is accumulated in memory, more flushes are 
> triggered, the data sinks to disk more frequently, slowing down retrieval of 
> data, even if very recent.
> In high-churn workloads, compacting the memstore can help maintain the data 
> in memory, and thereby speed up data retrieval. 
> We suggest a new compacted memstore with the following principles:
> 1.The data is kept in memory for as long as possible
> 2.Memstore data is either compacted or in process of being compacted 
> 3.Allow a panic mode, which may interrupt an in-progress compaction and 
> force a flush of part of the memstore.
> We suggest applying this optimization only to in-memory column families.
> A design document is attached.
> This feature was previously discussed in HBASE-5311.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13446) Add docs warning about missing data for downstream on versions prior to HBASE-13262

2015-04-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490732#comment-14490732
 ] 

Lars Hofhansl commented on HBASE-13446:
---

Let's put it in the docs. You guys are right.

> Add docs warning about missing data for downstream on versions prior to 
> HBASE-13262
> ---
>
> Key: HBASE-13446
> URL: https://issues.apache.org/jira/browse/HBASE-13446
> Project: HBase
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 0.98.0, 1.0.0
>Reporter: Sean Busbey
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 2.0.0, 0.98.13, 1.0.2
>
>
> From conversation at the end of HBASE-13262:
> [~davelatham]
> {quote}
> Should we put a warning somewhere (mailing list? book?) about this? Something 
> like:
> IF (client OR server is <= 0.98.11/1.0.0) AND server has a smaller value for 
> hbase.client.scanner.max.result.size than client does, THEN scan requests 
> that reach the server's hbase.client.scanner.max.result.size are likely to 
> miss data. In particular, 0.98.11 defaults 
> hbase.client.scanner.max.result.size to 2MB but other versions default to 
> larger values, so be very careful using 0.98.11 servers with any other client 
> version.
> {quote}
> [~busbey]
> {quote}
> How about we add a note in the ref guide for upgrades and for
> troubleshooting?
> {quote}
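For reference, the client-side knob in question can be pinned explicitly so that client and server agree. A minimal illustration (an assumption about a workaround, not the documented fix; the 2MB value mirrors the 0.98.11 default quoted above):

{code:title=pinning the client scanner limit (illustrative)}
Configuration conf = HBaseConfiguration.create();
// Match the smaller 0.98.11 default so a newer client does not ask an older server
// for more data per scan RPC than the server is willing to return.
conf.setLong("hbase.client.scanner.max.result.size", 2L * 1024 * 1024);
{code}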



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12006) [JDK 8] KeyStoreTestUtil#generateCertificate fails due to "subject class type invalid"

2015-04-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490727#comment-14490727
 ] 

Hadoop QA commented on HBASE-12006:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12724704/HBASE-12006.patch
  against master branch at commit e994b491aca8ab2edeb60a328c690ddbc88f8b51.
  ATTACHMENT ID: 12724704

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.1 2.5.2 2.6.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13666//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13666//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13666//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13666//console

This message is automatically generated.

> [JDK 8] KeyStoreTestUtil#generateCertificate fails due to "subject class type 
> invalid"
> --
>
> Key: HBASE-12006
> URL: https://issues.apache.org/jira/browse/HBASE-12006
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.99.0, 2.0.0
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2
>
> Attachments: HBASE-12006.patch
>
>
> Running tests on Java 8. All unit tests for branch 0.98 pass. On master 
> branch some variation in the security API is causing a failure in 
> TestSSLHttpServer:
> {noformat}
> Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.181 sec <<< 
> FAILURE! - in org.apache.hadoop.hbase.http.TestSSLHttpServer
> org.apache.hadoop.hbase.http.TestSSLHttpServer  Time elapsed: 0.181 sec  <<< 
> ERROR!
> java.security.cert.CertificateException: Subject class type invalid.
>   at sun.security.x509.X509CertInfo.setSubject(X509CertInfo.java:888)
>   at sun.security.x509.X509CertInfo.set(X509CertInfo.java:415)
>   at 
> org.apache.hadoop.hbase.http.ssl.KeyStoreTestUtil.generateCertificate(KeyStoreTestUtil.java:94)
>   at 
> org.apache.hadoop.hbase.http.ssl.KeyStoreTestUtil.setupSSLConfig(KeyStoreTestUtil.java:246)
>   at 
> org.apache.hadoop.hbase.http.TestSSLHttpServer.setup(TestSSLHttpServer.java:72)
> org.apache.hadoop.hbase.http.TestSSLHttpServer  Time elapsed: 0.181 sec  <<< 
> ERROR!
> java.lang.NullPointerException: null
>   at 
> org.apache.hadoop.hbase.http.TestSSLHttpServer.cleanup(TestSSLHttpServer.java:100)
> Tests in error: 
>   TestSSLHttpServer.setup:72 » Certificate Subject class type invalid.
>   TestSSLHttpServer.cleanup:100 NullPointer
> {noformat}
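For context, the JDK 7 and JDK 8 builds of sun.security.x509 disagree about what X509CertInfo.set(SUBJECT, ...) accepts, which is what produces the "Subject class type invalid" error above. A hedged sketch of a version-tolerant way to set the subject (an assumption, not necessarily the attached patch):

{code:title=KeyStoreTestUtil.generateCertificate sketch (assumption)}
X500Name owner = new X500Name(dn);
X509CertInfo info = new X509CertInfo();
try {
  // JDK 7 expects the CertificateSubjectName wrapper...
  info.set(X509CertInfo.SUBJECT, new CertificateSubjectName(owner));
} catch (CertificateException e) {
  // ...while JDK 8 wants the bare X500Name, otherwise "Subject class type invalid".
  info.set(X509CertInfo.SUBJECT, owner);
}
{code}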



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13453) Master should not bind to region server ports

2015-04-10 Thread Esteban Gutierrez (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490725#comment-14490725
 ] 

Esteban Gutierrez commented on HBASE-13453:
---

[~devaraj] I don't think we need to add a new property; a reflection check should be 
enough to decide whether we should use hbase.master.port to restore the pre-1.0 
behavior.
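A minimal sketch of the pre-1.0 split being discussed, where the master reads its own port settings instead of reusing the region server ones (an illustration of the idea, not the attached patch; the defaults shown are the 1.x constants):

{code:title=master port selection sketch}
Configuration conf = HBaseConfiguration.create();
int masterRpcPort  = conf.getInt("hbase.master.port", 16000);       // master RPC port
int masterInfoPort = conf.getInt("hbase.master.info.port", 16010);  // master info port, distinct from the RS info port 16030
{code}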

> Master should not bind to region server ports
> -
>
> Key: HBASE-13453
> URL: https://issues.apache.org/jira/browse/HBASE-13453
> Project: HBase
>  Issue Type: Improvement
>Reporter: Enis Soztutar
>Assignee: Devaraj Das
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 34111-2.txt
>
>
> In 1.0, the master by default binds to the region server ports (rpc and info). We 
> did so thinking that, in the long term, master and meta co-location would be the 
> default and we could merge the master and region server into a single daemon. 
> Over at HBASE-11165, if the conclusion ends up being that meta will not be 
> colocated at all, then the master hosting a region server will just become an 
> implementation detail. [~saint@gmail.com] says that we might never allow the 
> master to host regions. 
> Now, we are stuck in a state where we have made master bind to RS ports in 
> 1.0, which might create some confusion (and frustration) for small cluster 
> users who traditionally used to host a master and a region server on the same 
> node.
> I think we should undo this in 1.1 and use the previous master ports (16000) 
> and not bind to 16030, so that the user does not need to do anything to bring 
> up a RS on the same host. At least users going from 0.98 -> 1.1 will not take 
> a hit. Users going from 1.0 -> 1.1 will see changed default ports. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13436) Include user name in ADE for scans

2015-04-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490724#comment-14490724
 ] 

Hudson commented on HBASE-13436:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #899 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/899/])
HBASE-13436 Include user name in ADE for scans (ssrungarapu: rev 
188b7d611c644497ff603989e6ba2bb78076738c)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java


> Include user name in ADE for scans
> --
>
> Key: HBASE-13436
> URL: https://issues.apache.org/jira/browse/HBASE-13436
> Project: HBase
>  Issue Type: Improvement
>Reporter: Srikanth Srungarapu
>Assignee: Srikanth Srungarapu
>Priority: Minor
> Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2
>
> Attachments: HBASE-13436.patch
>
>
> Currently, we are not including the user name in the ADE for scans, whereas we do 
> include it for other operations. 
> {code}
> ERROR: org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient 
> permissions (table=test, action=READ)
> {code}
> Bumped into this internally. It helps during debugging. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13453) Master should not bind to region server ports

2015-04-10 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490720#comment-14490720
 ] 

Devaraj Das commented on HBASE-13453:
-

[~esteban] Yes, one can get the old behavior by passing arguments to the startup 
scripts. But the point is to see if we can provide the pre-1.0 behavior w.r.t. port 
binding without needing the deployment tools to change...

> Master should not bind to region server ports
> -
>
> Key: HBASE-13453
> URL: https://issues.apache.org/jira/browse/HBASE-13453
> Project: HBase
>  Issue Type: Improvement
>Reporter: Enis Soztutar
>Assignee: Devaraj Das
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 34111-2.txt
>
>
> In 1.0, the master by default binds to the region server ports (rpc and info). We 
> did so thinking that, in the long term, master and meta co-location would be the 
> default and we could merge the master and region server into a single daemon. 
> Over at HBASE-11165, if the conclusion ends up being that meta will not be 
> colocated at all, then the master hosting a region server will just become an 
> implementation detail. [~saint@gmail.com] says that we might never allow the 
> master to host regions. 
> Now, we are stuck in a state where we have made master bind to RS ports in 
> 1.0, which might create some confusion (and frustration) for small cluster 
> users who traditionally used to host a master and a region server on the same 
> node.
> I think we should undo this in 1.1 and use the previous master ports (16000) 
> and not bind to 16030, so that the user does not need to do anything to bring 
> up a RS on the same host. At least users going from 0.98 -> 1.1 will not take 
> a hit. Users going from 1.0 -> 1.1 will see changed default ports. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13453) Master should not bind to region server ports

2015-04-10 Thread Esteban Gutierrez (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490715#comment-14490715
 ] 

Esteban Gutierrez commented on HBASE-13453:
---

I think this can be addressed via hbase-site.xml or by passing arguments. So far I 
haven't seen this as a blocker that would require adding new configuration properties 
to provide the pre-1.0 behavior.

> Master should not bind to region server ports
> -
>
> Key: HBASE-13453
> URL: https://issues.apache.org/jira/browse/HBASE-13453
> Project: HBase
>  Issue Type: Improvement
>Reporter: Enis Soztutar
>Assignee: Devaraj Das
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 34111-2.txt
>
>
> In 1.0, the master by default binds to the region server ports (rpc and info). We 
> did so thinking that, in the long term, master and meta co-location would be the 
> default and we could merge the master and region server into a single daemon. 
> Over at HBASE-11165, if the conclusion ends up being that meta will not be 
> colocated at all, then the master hosting a region server will just become an 
> implementation detail. [~saint@gmail.com] says that we might never allow the 
> master to host regions. 
> Now, we are stuck in a state where we have made master bind to RS ports in 
> 1.0, which might create some confusion (and frustration) for small cluster 
> users who traditionally used to host a master and a region server on the same 
> node.
> I think we should undo this in 1.1 and use the previous master ports (16000) 
> and not bind to 16030, so that the user does not need to do anything to bring 
> up a RS on the same host. At least users going from 0.98 -> 1.1 will not take 
> a hit. Users going from 1.0 -> 1.1 will see changed default ports. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13453) Master should not bind to region server ports

2015-04-10 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490708#comment-14490708
 ] 

Ted Yu commented on HBASE-13453:


lgtm

> Master should not bind to region server ports
> -
>
> Key: HBASE-13453
> URL: https://issues.apache.org/jira/browse/HBASE-13453
> Project: HBase
>  Issue Type: Improvement
>Reporter: Enis Soztutar
>Assignee: Devaraj Das
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 34111-2.txt
>
>
> In 1.0, the master by default binds to the region server ports (rpc and info). We 
> did so thinking that, in the long term, master and meta co-location would be the 
> default and we could merge the master and region server into a single daemon. 
> Over at HBASE-11165, if the conclusion ends up being that meta will not be 
> colocated at all, then the master hosting a region server will just become an 
> implementation detail. [~saint@gmail.com] says that we might never allow the 
> master to host regions. 
> Now, we are stuck in a state where we have made master bind to RS ports in 
> 1.0, which might create some confusion (and frustration) for small cluster 
> users who traditionally used to host a master and a region server on the same 
> node.
> I think we should undo this in 1.1 and use the previous master ports (16000) 
> and not bind to 16030, so that the user does not need to do anything to bring 
> up a RS on the same host. At least users going from 0.98 -> 1.1 will not take 
> a hit. Users going from 1.0 -> 1.1 will see changed default ports. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5558) Add HBASE-5535 'Make the functions in task monitor synchronized' to 0.92 branch

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5558.
---
Resolution: Not A Problem

> Add HBASE-5535 'Make the functions in task monitor synchronized' to 0.92 
> branch
> ---
>
> Key: HBASE-5558
> URL: https://issues.apache.org/jira/browse/HBASE-5558
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-4630) If you shutdown all RS an active master is never able to recover when RS come back online

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-4630.
---
Resolution: Invalid

> If you shutdown all RS an active master is never able to recover when RS come 
> back online
> -
>
> Key: HBASE-4630
> URL: https://issues.apache.org/jira/browse/HBASE-4630
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Jonathan Gray
>
> I've been doing some isolated benchmarking of a single RS and can repeatedly 
> trigger some craziness in the master if I shutdown the RS.  It is never able 
> to recover after bringing RSs back online.  I seem to see different behavior 
> across different branches / revisions of the 92 branch, but there does seem 
> to be an issue in several of them.
> Putting against 0.92.1 so we don't hold up the release of 0.92.  Should not 
> be a blocker.
> Working on a unit test now.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5183) Render the monitored tasks as a treeview

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5183.
---
Resolution: Incomplete
  Assignee: (was: Mubarak Seyed)

> Render the monitored tasks as a treeview
> 
>
> Key: HBASE-5183
> URL: https://issues.apache.org/jira/browse/HBASE-5183
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>
> Andy made the suggestion here:
> https://issues.apache.org/jira/browse/HBASE-5174?focusedCommentId=13184571&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13184571



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-6144) Master mistakenly splits live server's HLog file

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-6144.
---
  Resolution: Cannot Reproduce
Release Note:   (was: Underlying hadoop is 0.22)

Reopen if still an issue with current code

> Master mistakenly splits live server's HLog file
> 
>
> Key: HBASE-6144
> URL: https://issues.apache.org/jira/browse/HBASE-6144
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>
> RS abcdn0590 is live, but Master does not have it on its onlineserver list. 
> So, Master put up the hlog for splitting as shown in the Master log below:
> {code}
> 2012-05-17 21:43:57,692 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
> task 
> /hbase/splitlog/hdfs%3A%2F%2Fnamenode.xyz.com%2Fhbase%2F.logs%2Fabcdn0590.xyz.com%2C60020%2C1337315957185-splitting%2Fabcdn0590.xyz.com%252C60020%252C1337315957185.1337315957711
>  acquired by abcdn0770.xyz.com,60020,1337315956278. 
> {code}
> After splitting succeeded, Master deleted the file:
> {code}
> 2012-05-17 21:43:58,721 DEBUG 
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
> /hbase/splitlog/hdfs%3A%2F%2Fnamenode.xyz.com%2Fhbase%2F.logs%2Fabcdn0590.xyz.com%2C60020%2C1337315957185-splitting%2Fabcdn0590.xyz.com%252C60020%252C1337315957185.1337315957711
> {code}
> RS abcdn0590 lost the lease to RS abcdn0770 and tried to do a log roll, which 
> closes the current hlog and creates a new one, as shown in the namenode log:
> {code}
> 2012-05-17 21:43:58,422 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
> commitBlockSynchronization(newblock=blk_2867982016684075739_12741027, 
> file=/hbase/.logs/abcdn0590.xyz.com,60020,1337315957185-splitting/abcdn0590.xyz.com%2C60020%2C1337315957185.1337315957711,
>  newgenerationstamp=12911920, newlength=134, newtargets=[10.115.13.24:50010, 
> 10.115.15.46:50010, 10.115.15.23:50010]) successful
> 2012-05-17 21:43:59,883 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.allocateBlock: 
> /hbase/.logs/abcdn0590.xyz.com,60020,1337315957185/abcdn0590.xyz.com%2C60020%2C1337315957185.1337316238882.
>  blk_3811725326431482476_12913541{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[10.115.13.24:50010|RBW], 
> ReplicaUnderConstruction[10.115.17.18:50010|RBW], 
> ReplicaUnderConstruction[10.115.17.15:50010|RBW]]}
> {code}
>  
> When RS 0590 tried to close the old hlog 1337315957711, it received the fatal error 
> below because the original hlog had already been deleted. The fatal error later 
> caused RS abcdn0590 to shut itself down.
> {code}
> 2012-05-17 21:43:58,889 ERROR org.apache.hadoop.hbase.master.HMaster: Region 
> server ^@^@abcdn0590.xyz.com,60020,1337315957185 reported a fatal error:
> ABORTING region server abcdn0590.xyz.com,60020,1337315957185: IOE in log 
> roller
> Cause:
> java.io.FileNotFoundException: File does not exist: 
> hdfs://namenode.xyz.com/hbase/.logs/abcdn0590.xyz.com,60020,1337315957185/abcdn0590.xyz.com%2C60020%2C1337315957185.1337315957711
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:742)
> at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.rollWriter(HLog.java:583)
> at 
> org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:94)
> {code}
>  
> RS abcdn0590 shut down at around 21:44, but in the /hbase/.logs dir it left 
> two subfolders for RS abcdn0590 with the same startcode 1337315957185; 
> they are
> · /hbase/.logs/abcdn0590.xyz.com,60020,1337315957185-splitting/
> · /hbase/.logs/abcdn0590.xyz.com,60020,1337315957185/
>  
> Later on, at around 21:46:30, the Master retried log splitting. This time it 
> still considered RS abcdn0590 a dead RS and tried to put up its hlog for others 
> to grab and split. It found the folder 
> /hbase/.logs/abcdn0590.xyz.com,60020,1337315957185/, and the first step it 
> did was to rename it by adding the suffix -splitting. However, the same 
> folder already existed. The rename function does not handle the case where the 
> destination folder already exists; instead, it puts the src folder under the 
> dst folder, so the path structure looks like dst/src/file. 
> In our case, it is 
> /hbase/.logs.20120518.1204/abcdn0590.xyz.com,60020,1337315957185-splitting/abcdn0590.xyz.com,60020,1337315957185/abcdn0590.xyz.com%2C60020%2C1337315957185.1337316238882.
>  
> From the master log we can see that two folders for the same RS 0590 
> with the same startcode exist:
> {code}
> 2012-05-17 21:46:30,749 INFO org.apache.hadoop.hbase.master.MasterFileSystem: 
> Log folder 
> hdfs://namenode.xyz.com/hbase/.logs/abcdn0590.xyz.com,60020,1329941607395-splitting
>  doesn't belong to a known region server, splitting
> 2012

[jira] [Resolved] (HBASE-6008) copy_tables_desc.rb imports non-existant ZooKeeperWatcher

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-6008.
---
Resolution: Incomplete
  Assignee: (was: Jean-Daniel Cryans)

> copy_tables_desc.rb imports non-existant ZooKeeperWatcher
> -
>
> Key: HBASE-6008
> URL: https://issues.apache.org/jira/browse/HBASE-6008
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.0, 0.92.0, 0.94.0, 0.95.2
>Reporter: Jonathan Hsieh
> Attachments: rm.txt
>
>
> This script, which seems to be part of replication does not work out of the 
> box against 0.90, 0.92, 0.94, or trunk because of this line:
> {code}
> import org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper
> {code}
> This class last appeared in the 0.89 branch, and actually is not needed at all 
> for the script to function.  
> The script is of dubious use -- we may want to consider removing it instead 
> of fixing it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5993) Add a no-read Append

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5993.
---
Resolution: Won't Fix

> Add a no-read Append
> 
>
> Key: HBASE-5993
> URL: https://issues.apache.org/jira/browse/HBASE-5993
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.94.0
>Reporter: Jacques
>Priority: Critical
>
> HBASE-4102 added an atomic append.  For high performance situations, it would 
> be helpful to be able to do appends that don't actually require a read of the 
> existing value.  This would be useful in building a growing set of values.  
> Our original use case was for implementing a form of search in HBase where a 
> cell would contain a list of document ids associated with a particular 
> keyword for search.  However it seems like it would also be useful to provide 
> substantial performance improvements for most Append scenarios.
> Within the client API, the simplest way to implement this would be to 
> leverage the existing Append api.  If the Append is marked as 
> setReturnResults(false), use this code path.  If result return is requested, 
> use the existing Append implementation.  
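A short client-side usage sketch of the hook mentioned above (table is assumed to be an open HTable handle; whether the server can then skip the read is exactly the optimization being proposed here):

{code:title=Append with setReturnResults(false)}
Append append = new Append(Bytes.toBytes("keyword:hbase"));
append.add(Bytes.toBytes("docs"), Bytes.toBytes("ids"), Bytes.toBytes("doc-123,"));
append.setReturnResults(false); // caller does not need the resulting value back
table.append(append);
{code}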



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-6093) Flatten timestamps during flush and compaction

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-6093.
---
Resolution: Incomplete

> Flatten timestamps during flush and compaction
> --
>
> Key: HBASE-6093
> URL: https://issues.apache.org/jira/browse/HBASE-6093
> Project: HBase
>  Issue Type: New Feature
>  Components: io, Performance, regionserver
>Reporter: Matt Corgan
>Priority: Minor
>
> Many applications run with maxVersions=1 and do not care about timestamps, or 
> they will specify one timestamp per row as a normal KeyValue rather than 
> per-cell.
> Then, DataBlockEncoders like those in HBASE-4218 and HBASE-4676 often encode 
> timestamps as diffs from the previous or diffs from the minimum timestamp in 
> the block.  If all timestamps in a block are the same, they will all compress 
> to basically <= 8 bytes total per block.  This can be 10% to 25% space 
> savings for some schemas, and that savings is realized both on disk and in 
> block cache.
> We could add a ColumnFamily setting flattenTimestamps=[true/false].  If true, 
> then all timestamps are modified during a flush/compaction to the 
> currentTimeMillis() at the start of the flush/compaction.  If all timestamps 
> are made identical in a file, then the encoder will be able to eliminate them.
> The simplest use case is probably that where all inserts are type=Put, there 
> are no overwrites, and there are no deletes.  As use cases get more complex, 
> then so does the implementation.  
> For example, what happens when there is a Put and a Delete of the same cell 
> in the same memstore?  Maybe for a flush at t=flushStartTime, the Put gets 
> timestamp=t, and the Delete gets timestamp=t+1.  Or maybe HBASE-4241 could 
> take care of this problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5265) Fix 'revoke' shell command

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5265.
---
Resolution: Invalid
  Assignee: (was: Eugene Koontz)

> Fix 'revoke' shell command
> --
>
> Key: HBASE-5265
> URL: https://issues.apache.org/jira/browse/HBASE-5265
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Andrew Purtell
>
> The 'revoke' shell command needs to be reworked for the AccessControlProtocol 
> implementation that was finalized for 0.92. The permissions being removed 
> must exactly match what was previously granted. No wildcard matching is done 
> server side.
> Allow two forms of the command in the shell for convenience:
> Revocation of a specific grant:
> {code}
> revoke <user>, <table>, <column family> [ , <column qualifier> ]
> {code}
> Have the shell automatically do so for all permissions on a table for a given 
> user:
> {code}
> revoke <user>, <table>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-6074) TestHLog is flaky

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-6074.
---
Resolution: Cannot Reproduce

Reopen if still an issue with current code

> TestHLog is flaky
> -
>
> Key: HBASE-6074
> URL: https://issues.apache.org/jira/browse/HBASE-6074
> Project: HBase
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.92.0
>Reporter: Devaraj Das
> Attachments: 6074-1.patch
>
>
> When I run TestHLog in a loop, I see failures.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5161) Compaction algorithm should prioritize reference files

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5161.
---
   Resolution: Incomplete
Fix Version/s: (was: 0.92.1)

> Compaction algorithm should prioritize reference files
> --
>
> Key: HBASE-5161
> URL: https://issues.apache.org/jira/browse/HBASE-5161
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Priority: Critical
>
> I got myself into a state where my table was un-splittable as long as the 
> insert load was coming in. Emergency flushes because of the low memory 
> barrier don't check the number of store files so it never blocks, to a point 
> where I had in one case 45 store files and the compactions were almost never 
> done on the reference files (had 15 of them, went down by one in 20 minutes). 
> Since you can't split regions with reference files, that region couldn't 
> split and was doomed to just get more store files until the load stopped.
> Marking this as a minor issue, what we really need is a better pushback 
> mechanism but not prioritizing reference files seems wrong.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-6072) Make TableRecordReaderImpl more easily extended

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-6072.
---
Resolution: Not A Problem

> Make TableRecordReaderImpl more easily extended
> ---
>
> Key: HBASE-6072
> URL: https://issues.apache.org/jira/browse/HBASE-6072
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Dave Latham
>Priority: Minor
>
> We have a MR job that is very memory bound.  It reads a potentially large row 
> from hbase, then deserializes it into an (even larger) object representation, 
> then does a fair amount of computation requiring memory.  After converting 
> the Result into our object representation we want to free the memory holding 
> the Result to be available for the actual computation of output values.
> Currently we have our own custom modified copy of TableRecordReaderImpl to be 
> able to set the Result value to null after reading it, but it's almost 
> entirely a duplicate of hbase's TableRecordReaderImpl so we have to manually 
> keep it up to date with changes to the hbase version.  If the value field of 
> TableRecordReaderImpl were protected instead of private we could use a very 
> simple subclass instead.
> Are there any philosophical guidelines about what parts of HBase should or 
> should not be easily extensible?
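A sketch of the kind of subclass the reporter wants to be able to write, assuming the value field were made protected; the class name and override shown here are hypothetical:

{code:title=hypothetical TableRecordReaderImpl subclass}
public class ReleasingTableRecordReader extends TableRecordReaderImpl {
  @Override
  public Result getCurrentValue() throws IOException, InterruptedException {
    Result current = value; // 'value' would need to be protected rather than private
    value = null;           // drop the reference so the large Result can be GC'd during computation
    return current;
  }
}
{code}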



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-2893) Table metacolumns

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-2893.
---
Resolution: Later

> Table metacolumns
> -
>
> Key: HBASE-2893
> URL: https://issues.apache.org/jira/browse/HBASE-2893
> Project: HBase
>  Issue Type: New Feature
>  Components: Coprocessors
>Reporter: Andrew Purtell
>
> Some features like TTLs or access control lists have use cases that call for 
> per-value configurability. 
> Currently in HBase TTLs are set per column family. This leads to potentially 
> awkward "bucketing" of values into column families set up to accommodate the 
> common desired TTLs for all values within -- an unnecessarily wide schema, 
> with resulting unnecessary reduction in I/O locality in access patterns, more 
> store files than otherwise, and so on.
> Over in HBASE-1697 we're considering setting ACLs on column families. 
> However, we are aware of other BT-like systems which support per-value ACLs. 
> This allows for multitenancy in a single table as opposed to really requiring 
> tables for each customer (or, at least column families). The scale out 
> properties for a single table are better than alternatives. I think 
> supporting per-row ACLs would be generally sufficient: customer ID could be 
> part of the row key. We can still plan to maintain column-family level ACLs. 
> We would therefore not have to bloat the store with per-row ACLs for the 
> normal case -- but it would be highly useful to support overrides for 
> particular rows. So how to do that?
> I propose to introduce _metacolumns_. 
> A _metacolumn_ would be a column family intrinsic to every table, created by 
> the system at table create time.  It would be accessible like any other 
> column family, but we expect a default ACL that only allows access by the 
> system and operator principals, and would function like any other, except 
> administrative actions such as renaming or deletion would not be allowed.  
> Into the metacolumn would be stored per-row overrides for such things as ACLs 
> and TTLs. The metacolumn therefore would be as sparse as possible; no storage 
> would required for any overrides if a value is committed with defaults. A 
> reasonably sparse metacolumn for a region may fit entirely within blockcache. 
> It may be possible for all metacolumns on a RS to fit within blockcache 
> without undue pressure on other users. We can aim design effort at this 
> target. 
> The scope of changes required to support this is:
> - Introduce metacolumn concept in the code and into the security model 
> (default ACL): A flag in HCD, a default ACL, and a few additional checks for 
> rejecting disallowed administrative actions.
> - Automatically create metacolumns at table create time.
> - Consult metacolumn as part of processing reads or mutations, perhaps using 
> a bloom filter to shortcut lookups for rows with no metaentries, and apply 
> configuration or security policy overrides if found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-6064) Add timestamp to Mutation Thrift API

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-6064.
---
Resolution: Incomplete
  Assignee: (was: Mikhail Bautin)

> Add timestamp to Mutation Thrift API
> 
>
> Key: HBASE-6064
> URL: https://issues.apache.org/jira/browse/HBASE-6064
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>
> We need to be able to specify per-mutation timestamps in the HBase Thrift 
> API. If the timestamp is not specified, the timestamp passed to the Thrift 
> API method itself (mutateRowTs/mutateRowsTs) should be used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-6075) Improve delete(Latest-timestamp) performance: consider adding a delete_next type

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-6075.
---
Resolution: Not A Problem

> Improve delete(Latest-timestamp) performance: consider adding a delete_next 
> type
> 
>
> Key: HBASE-6075
> URL: https://issues.apache.org/jira/browse/HBASE-6075
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Amitanand Aiyer
>Priority: Minor
>
> Disclaimer: this will only work correctly if the application is not taking 
> control of the timestamp. 
> We have a version of deleteVersion, which deletes the last version, if no 
> specific timestamp is specified  (i.e. timestamp is left as Long.MAX_VALUE)
> On the server side, this translates to deleting the largest timestamped cell 
> in the specified column. Which entails doing a get, and then a delete.
> We don't seem to use this api a whole lot, so not a very high pri task. 
> But, for systems that use the api. We might be able to make this much faster 
> (as fast as the puts) by introducing a new delete type (say 
> DELETE_NEXT_VERSION) that sorts right after put in the column, and just 
> adding it as a put. The deleteTracker can be updated to keep track of this 
> delete_next and accordingly delete the nextKV asked for.
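For reference, this is the client call whose server-side cost is being discussed; with no explicit timestamp the latest version is targeted, which is what currently forces the read-before-delete (0.94/0.98-era client API):

{code:title=deleting the latest version}
Delete delete = new Delete(row);
delete.deleteColumn(family, qualifier); // no timestamp: targets the newest version, so the server must look it up first
table.delete(delete);
{code}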



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-4663) MR based copier for copying HFiles

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-4663.
---
Resolution: Won't Fix
  Assignee: (was: Karthik Ranganathan)

> MR based copier for copying HFiles
> --
>
> Key: HBASE-4663
> URL: https://issues.apache.org/jira/browse/HBASE-4663
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation, regionserver
>Reporter: Karthik Ranganathan
>
> This copier is a modification of the distcp tool in HDFS. It does the 
> following:
> 1. List out all the regions in the HBase cluster for the required table
> 2. Write the above out to a file
> 3. Each mapper 
>3.1 lists all the HFiles for a given region by querying the regionserver
>3.2 copies all the HFiles
>3.3 outputs success if the copy succeeded, failure otherwise. Failed 
> regions are retried in another loop
> 4. Mappers are placed on nodes which have maximum locality for a given region 
> to speed up copying



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-6037) Have a separate hbase-xxx.xml property file that only has client side configuration settings.

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-6037.
---
Resolution: Incomplete

> Have a separate hbase-xxx.xml property file that only has client side 
> configuration settings.
> -
>
> Key: HBASE-6037
> URL: https://issues.apache.org/jira/browse/HBASE-6037
> Project: HBase
>  Issue Type: New Feature
>Reporter: Jonathan Hsieh
>
> It would be good to have a separate hbase-xxx.xml (hbase-client.xml perhaps) 
> file that only contains settings relevant to hbase clients.  This would allow 
> "secrets" to be present in the hbase-site.xml file, and also have a smaller 
> config for clients that may be decoupled from the hmaster and regionservers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-6026) Reconcile .proto files after the dust settles

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-6026.
---
Resolution: Not A Problem

> Reconcile .proto files after the dust settles
> -
>
> Key: HBASE-6026
> URL: https://issues.apache.org/jira/browse/HBASE-6026
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>
> Four or five of us have been working on .proto stuff over the last few months.  
> Andrew did pb work when we were all in diapers.  After HBASE-6000 goes in, 
> which moves all .protos into one place, let's look at doing reconciliation and 
> cleanup.  For example, Andrew made a TableSchema that is also in stuff I 
> added.  I did not use Andrew's because it has stuff I do not want (though I 
> took inspiration from his for what I did do), and ditto for a ColumnSchema -- 
> in mine I call it ColumnFamilySchema.  After .proto resolution, we should 
> try and reconcile method names for the common operations (there was a 
> bunch of overlap around our pb'ing work -- let's make sure we use the same basic 
> pattern all around so it's easier on those who come after us to figure out what's 
> going on).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5980) Scanner responses from RS should include metrics on rows/KVs filtered

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5980.
---
Resolution: Incomplete

> Scanner responses from RS should include metrics on rows/KVs filtered
> -
>
> Key: HBASE-5980
> URL: https://issues.apache.org/jira/browse/HBASE-5980
> Project: HBase
>  Issue Type: Improvement
>  Components: Client, metrics, regionserver
>Affects Versions: 0.95.2
>Reporter: Todd Lipcon
>Priority: Minor
>
> Currently it's difficult to know, when issuing a filter, what percentage of 
> rows were skipped by that filter. We should expose some basic counters back 
> to the client scanner object. For example:
> - number of rows filtered by row key alone (filterRowKey())
> - number of times each filter response was returned by filterKeyValue() - 
> corresponding to Filter.ReturnCode
> What would be slickest is if this could actually return a tree of counters 
> for cases where FilterList or other combining filters are used. But a 
> top-level is a good start.
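
As a rough illustration (an editor's sketch, not part of the issue), the per-scanner counters could look something like the following; the ReturnCode enum here just mirrors Filter.ReturnCode, and the class is a stand-in for whatever the region server would actually ship back in the scan response:

{code:java}
import java.util.EnumMap;
import java.util.Map;

// Sketch only: counters a scanner could accumulate and return to the client.
class ScanFilterMetrics {
  // Mirrors Filter.ReturnCode for illustration purposes.
  enum ReturnCode { INCLUDE, SKIP, NEXT_COL, NEXT_ROW, SEEK_NEXT_USING_HINT }

  private long rowsFilteredByRowKey;  // rows rejected by filterRowKey() alone
  private final Map<ReturnCode, Long> byReturnCode = new EnumMap<>(ReturnCode.class);

  void onRowKeyFiltered() { rowsFilteredByRowKey++; }

  void onKeyValueFiltered(ReturnCode rc) { byReturnCode.merge(rc, 1L, Long::sum); }

  long rowsFilteredByRowKey() { return rowsFilteredByRowKey; }

  long count(ReturnCode rc) { return byReturnCode.getOrDefault(rc, 0L); }
}
{code}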



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5996) Improve multiPut/multiDelete by moving HLog.append and updateTimestamp out of the updateLock.readLock.lock()/unlock() functionality

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5996.
---
Resolution: Incomplete
  Assignee: (was: Amitanand Aiyer)

> Improve multiPut/multiDelete by moving HLog.append and updateTimestamp out of 
> the updateLock.readLock.lock()/unlock() functionality
> ---
>
> Key: HBASE-5996
> URL: https://issues.apache.org/jira/browse/HBASE-5996
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.89.20100924, 0.89-fb, 0.94.0
>Reporter: Amitanand Aiyer
>Priority: Minor
>
> whenever we do a batchMutateWithLocks in HRegion,
> we get the HRegion.updateLock.readLock() ... 
> My understanding is that we need the updateLock.readLock only
> to protect the updates to the memStore.
> (i) HLog.append() has its own serialization/locking using HLog.updateLock.
> We can move this out of the HRegion.updateLock lock grabbing.
> (ii) Updating the timestamp for deletes and puts can also be done before 
> grabbing the lock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5905) Protobuf interface for Admin: split between the internal and the external/customer interface

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5905.
---
Resolution: Not A Problem

> Protobuf interface for Admin: split between the internal and the 
> external/customer interface
> 
>
> Key: HBASE-5905
> URL: https://issues.apache.org/jira/browse/HBASE-5905
> Project: HBase
>  Issue Type: Improvement
>  Components: Client, master, regionserver
>Affects Versions: 0.95.2
>Reporter: Nicolas Liochon
>
> After a short discussion with Stack, I create a jira.
> --
> I'm a little bit confused by the protobuf interface for closeRegion.
> We have two types of closeRegion today:
> 1) the external ones; available in client.HBaseAdmin. They take the server 
> and the region identifier as a parameter and nothing else.
> 2) The internal ones, called for example by the master. They have more 
> parameters (like versionOfClosingNode or transitionInZK).
> When I look at protobuf.ProtobufUtil, I see:
>   public static void closeRegion(final AdminProtocol admin,
>   final byte[] regionName, final boolean transitionInZK) throws 
> IOException {
> CloseRegionRequest closeRegionRequest =
>   RequestConverter.buildCloseRegionRequest(regionName, transitionInZK);
> try {
>   admin.closeRegion(null, closeRegionRequest);
> } catch (ServiceException se) {
>   throw getRemoteException(se);
> }
>   }
> In other words, it seems that we merged the two interfaces into a single one. 
> Is that the intent?
> I checked, the internal fields in closeRegionRequest are all optional (that's 
> good). Still, it means that the end user could use them or at least would 
> need to distinguish between the "optional for functional reasons" and the 
> "optional - do not use".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5929) HBaseAdmin.compact and flush are giving confusing errors for ROOT, META, and regions that don't exist

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5929.
---
Resolution: Cannot Reproduce

Reopen if still an issue with current code

> HBaseAdmin.compact and flush are giving confusing errors for ROOT, META, and 
> regions that don't exist
> -
>
> Key: HBASE-5929
> URL: https://issues.apache.org/jira/browse/HBASE-5929
> Project: HBase
>  Issue Type: Bug
>  Components: Client, shell
>Affects Versions: 0.92.1
> Environment: Linux Ubuntu Lucid 64bit
>Reporter: Aravind Gottipati
>Priority: Minor
>
> I have been noticing that calls to HBaseAdmin.majorCompact throws exceptions 
> randomly for some regions.  I could not find a pattern to these exceptions.  
> The code I have simply does this 
> admin.majorCompact(region.getRegionNameAsString()).  admin is an instance of 
> HBaseAdmin and region is an instance of HRegionInfo.  The exception I get is 
> org.apache.hadoop.hbase.TableNotFoundException: -ROOT-,,0
> at 
> org.apache.hadoop.hbase.client.HBaseAdmin.tableNameString(HBaseAdmin.java:1473)
>  ~[hbase-0.92.1.jar:0.92.1]
> at 
> org.apache.hadoop.hbase.client.HBaseAdmin.compact(HBaseAdmin.java:1235) 
> ~[hbase-0.92.1.jar:0.92.1]
> at 
> org.apache.hadoop.hbase.client.HBaseAdmin.majorCompact(HBaseAdmin.java:1209) 
> ~[hbase-0.92.1.jar:0.92.1]
> at com.stumbleupon.hbaseadmin.HBaseCompact.compactAllServers(Unknown 
> Source) [hbase_compact.jar:na]
> In this case it's the root region, but I get similar exceptions for other 
> tables, like this.
> 2012-05-03 19:03:42,994 WARN  [main] HBaseCompact: Could not compact:
> org.apache.hadoop.hbase.TableNotFoundException: 
> ad_daily,49842:2009-07-10,1269763588508.1997607018
> at 
> org.apache.hadoop.hbase.client.HBaseAdmin.tableNameString(HBaseAdmin.java:1473)
>  ~[hbase-0.92.1.jar:0.92.1]
> at 
> org.apache.hadoop.hbase.client.HBaseAdmin.compact(HBaseAdmin.java:1235) 
> ~[hbase-0.92.1.jar:0.92.1]
> at 
> org.apache.hadoop.hbase.client.HBaseAdmin.majorCompact(HBaseAdmin.java:1209) 
> ~[hbase-0.92.1.jar:0.92.1]
> at 
> org.apache.hadoop.hbase.client.HBaseAdmin.majorCompact(HBaseAdmin.java:1196) 
> ~[hbase-0.92.1.jar:0.92.1]
> at com.stumbleupon.hbaseadmin.HBaseCompact.compactAllServers(Unknown 
> Source) [hbase_compact.jar:na]
> at com.stumbleupon.hbaseadmin.HBaseCompact.main(Unknown Source) 
> [hbase_compact.jar:na]
> I see this on hbase shell as well.  However, I don't see these exceptions if 
> I use admin.majorCompact(region.getRegionName()), so it looks like something 
> gets lost when I use getRegionNameAsString().
> Let me know if I can provide more information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5940) HBase in-cluster backup based on the HDFS hardlink

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5940.
---
Resolution: Invalid
  Assignee: (was: Liyin Tang)

> HBase in-cluster backup based on the HDFS hardlink
> --
>
> Key: HBASE-5940
> URL: https://issues.apache.org/jira/browse/HBASE-5940
> Project: HBase
>  Issue Type: New Feature
>Reporter: Liyin Tang
>
> The motivation for introducing the HardLink operation/API in HDFS is to get a 
> full copy of a file without copying the bytes, 
> so users/applications can create multiple hard links to the same source file 
> instantly. 
> HBase can then make full use of hard links to generate an in-cluster 
> backup snapshot instantly. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5907) enhance HLog pretty printer to print additional useful stats

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5907.
---
Resolution: Incomplete

> enhance HLog pretty printer to print additional useful stats
> 
>
> Key: HBASE-5907
> URL: https://issues.apache.org/jira/browse/HBASE-5907
> Project: HBase
>  Issue Type: Improvement
>Reporter: Kannan Muthukkaruppan
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--D2979.1.patch, 
> ASF.LICENSE.NOT.GRANTED--D2979.2.patch
>
>
> It would be useful for analysis purposes to enhance the HLog pretty printer 
> to optionally print a bunch of additional stats such as:
> 1) # of txns
> 2) # of KVs updated
> 3) avg size of txns
> 4) avg size of KVs
> 5) avg # of KVs written per txn
> 5) unique CF signatures involved in put/delete operatons; and breakdown of 
> some of the above metrics by these signatures, etc.
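
For the aggregate numbers (1)-(5), the printer only needs a few running totals; a minimal editor's sketch follows, with illustrative names and no tie-in to the real HLog reader:

{code:java}
// Sketch only: running aggregates a stats-enabled WAL pretty printer could keep.
class WalEditStats {
  private long txns;        // number of WAL entries (transactions) seen
  private long kvs;         // total KVs across all entries
  private long totalBytes;  // total serialized KV size in bytes

  void recordTxn(int kvCount, long txnBytes) {
    txns++;
    kvs += kvCount;
    totalBytes += txnBytes;
  }

  double avgKvsPerTxn() { return txns == 0 ? 0 : (double) kvs / txns; }
  double avgTxnSizeBytes() { return txns == 0 ? 0 : (double) totalBytes / txns; }
  double avgKvSizeBytes() { return kvs == 0 ? 0 : (double) totalBytes / kvs; }
}
{code}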



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5891) Change Compression Based on Type of Compaction

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5891.
---
Resolution: Not A Problem

> Change Compression Based on Type of Compaction
> --
>
> Key: HBASE-5891
> URL: https://issues.apache.org/jira/browse/HBASE-5891
> Project: HBase
>  Issue Type: New Feature
>Reporter: Nicolas Spiegelberg
>Priority: Minor
>
> We currently use LZO on our production systems because the on-demand 
> decompression speed of GZ is too slow.  That said, many of our 
> major-compacted StoreFiles are infrequently read because of lazy seek 
> optimizations, but they occupy the majority of our disk space.  One idea is 
> to change the type of compression depending upon compaction characteristics 
> (input size or major compaction flag).  This would allow us to have our 
> largest and least-read files be GZ compressed and save space.
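
A minimal editor's sketch of the idea, with an invented policy class and an arbitrary size threshold (the real change would hook into the store's compaction request):

{code:java}
// Sketch only: pick a codec from compaction characteristics.
final class CompactionCompressionPolicy {
  enum Algorithm { LZO, GZ }  // stand-in for HBase's compression algorithm enum

  private static final long LARGE_COMPACTION_BYTES = 1L << 30;  // 1 GB, illustrative

  static Algorithm choose(boolean isMajorCompaction, long totalInputSizeBytes) {
    // Major (or very large) compactions produce big, rarely-read files:
    // trade decompression speed for space and use GZ there.
    if (isMajorCompaction || totalInputSizeBytes > LARGE_COMPACTION_BYTES) {
      return Algorithm.GZ;
    }
    return Algorithm.LZO;
  }
}
{code}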



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5895) Slow query log in trunk is too verbose

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5895.
---
Resolution: Not A Problem

> Slow query log in trunk is too verbose
> --
>
> Key: HBASE-5895
> URL: https://issues.apache.org/jira/browse/HBASE-5895
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.95.2
>Reporter: Todd Lipcon
>Priority: Critical
>
> Running a YCSB workload against trunk, the slow query log ends up logging the 
> entire contents of "mutate" RPCs (in PB-encoded binary). This then makes the 
> logging back up, which makes more slow queries, which makes the whole thing 
> spin out of control. We should only summarize the RPC, rather than printing 
> the whole contents.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5938) Improve documentation of system tests such as TestAcidGuarantees

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5938.
---
Resolution: Incomplete

> Improve documentation of system tests such as TestAcidGuarantees
> 
>
> Key: HBASE-5938
> URL: https://issues.apache.org/jira/browse/HBASE-5938
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Jonathan Hsieh
>
> There are several unit tests that have main methods and can be used as long 
> running system tests.  Currently this includes TestAcidGuarantees 
> (HBASE-5887), but may include more in the future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5899) Local cluster tries to connect to HDFS which makes the startup failed

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5899.
---
Resolution: Invalid

> Local cluster tries to connect to HDFS which makes the startup failed
> -
>
> Key: HBASE-5899
> URL: https://issues.apache.org/jira/browse/HBASE-5899
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0, 0.92.1
> Environment: Mac OS X Lion
>Reporter: Yifeng Jiang
>Priority: Minor
>
> In 0.92, a local HBase cluster won't start because it tries to connect to 
> local HDFS. This error does not happen in 0.90.
> We should not need to connect to HDFS to run a local cluster.
> Here is my hbase-site.xml
> {code:xml}
> 
>   
> hbase.rootdir
> file:///usr/local/hbase/var/hbase
>   
> 
> {code}
> This is the error:
> {noformat}
> 2012-04-30 11:32:22,225 ERROR 
> org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
> java.net.ConnectException: Call to localhost/127.0.0.1:8020 failed on 
> connection exception: java.net.ConnectException: Connection refused
> at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
> at org.apache.hadoop.ipc.Client.call(Client.java:1071)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
> at $Proxy11.getProtocolVersion(Unknown Source)
> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
> at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
> at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:238)
> at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:203)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
> at 
> org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:185)
> at 
> org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:418)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:141)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
> at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1637)
> Caused by: java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
> at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
> at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
> at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
> at org.apache.hadoop.ipc.Client.call(Client.java:1046)
> ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5745) SequenceFileLogReader#getPos may get wrong length on DFS restart

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5745.
---
Resolution: Incomplete
  Assignee: (was: Uma Maheswara Rao G)

> SequenceFileLogReader#getPos may get wrong length on DFS restart
> 
>
> Key: HBASE-5745
> URL: https://issues.apache.org/jira/browse/HBASE-5745
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.95.2
>Reporter: Uma Maheswara Rao G
>Priority: Critical
>
> This is actually a kind of integration bug from the HBase perspective.
> Currently HDFS will count the partial block length as 0 if no 
> locations are found for the partial block. This can happen on DFS restart, 
> before the DNs have completely reported to the NN.
> The scenario is explained in HDFS-3222. This is actually a bug in HDFS; it may 
> be solved in the latest versions.
> So whatever version HBase is using may have this bug. The HMaster may not be 
> able to replay the complete edits if there is an HMaster switch at the 
> same time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5896) slow query log thresholds not in hbase-default.xml

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5896.
---
Resolution: Not A Problem

> slow query log thresholds not in hbase-default.xml
> --
>
> Key: HBASE-5896
> URL: https://issues.apache.org/jira/browse/HBASE-5896
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Todd Lipcon
>Priority: Minor
>
> hbase.ipc.warn.response.time and hbase.ipc.warn.response.size need 
> documentation



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5628) Improve performance of uberhbck

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5628.
---
Resolution: Incomplete

> Improve performance of uberhbck
> ---
>
> Key: HBASE-5628
> URL: https://issues.apache.org/jira/browse/HBASE-5628
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.95.2
>Reporter: Jonathan Hsieh
>
> During reviews of HBASE-5128, several opportunities to investigate for 
> improving the performance of the tool came up.
> - Change regionInfoMap and tablesInfo from TreeMap to HashMap.
> - Change some full region set reloads to be incremental to require fewer 
> passes.
> - Cache meta for subsequent calls of closeRegionSilentlyAndWait



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5805) TestServerCustomProtocol failing intermittently.

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5805.
---
Resolution: Cannot Reproduce

> TestServerCustomProtocol failing intermittently.
> 
>
> Key: HBASE-5805
> URL: https://issues.apache.org/jira/browse/HBASE-5805
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.95.2
>Reporter: Uma Maheswara Rao G
> Attachments: TestServerCustomProtocol.log
>
>
> Trace:
> java.lang.AssertionError: Results should contain region 
> test,ccc,1334638013935.b9d77206f6eb226928b898e66fd1d508. for row 'ccc'
>   at org.junit.Assert.fail(Assert.java:93)
>   at org.junit.Assert.assertTrue(Assert.java:43)
>   at 
> org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol.verifyRegionResults(TestServerCustomProtocol.java:363)
>   at 
> org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol.testNullReturn(TestServerCustomProtocol.java:330)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5846) HBase rpm packing is broken at multiple places

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5846.
---
Resolution: Not A Problem

> HBase rpm packing is broken at multiple places
> --
>
> Key: HBASE-5846
> URL: https://issues.apache.org/jira/browse/HBASE-5846
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.92.1
> Environment: CentOS release 5.7 (Final)
>Reporter: Shrijeet Paliwal
>
> Here is how I executed rpm build: 
> {noformat}
> MAVEN_OPTS="-Xmx2g" mvn clean package assembly:single -Prpm -DskipTests
> {noformat}
> The issues with the rpm build are: 
> * There is no clean (%clean) section in the hbase.spec file. A previous run can 
> leave stuff in RPM_BUILD_ROOT which in turn will fail the build. As a fix I added 
> 'rm -rf $RPM_BUILD_ROOT' to the %clean section.
> * The Buildroot is set to _build_dir. The build fails with this error: 
> {noformat}
> cp: cannot copy a directory, 
> `/data/9adda425-1f1e-4fe5-8a53-83bd2ce5ad45/app/jenkins/workspace/hbase.92/target/rpm/hbase/BUILD',
>  into itself, 
> `/data/9adda425-1f1e-4fe5-8a53-83bd2ce5ad45/app/jenkins/workspace/hbase.92/target/rpm/hbase/BUILD/BUILD'
> {noformat}
> If we set it to ' %{_tmppath}/%{name}-%{version}-root' the build passes.
> * The src/packages/update-hbase-env.sh script will leave an inconsistent state 
> if 'yum update hbase' is executed. It deletes data from /etc/init.d/hbase* 
> and does not put the scripts back during the update. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5858) When HBase gets executed against Hadoop 2.X "SLF4j:CLASSPATH contains multiple SLF4j binding" warning is displayed

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5858.
---
Resolution: Invalid
  Assignee: (was: Roman Shaposhnik)

> When HBase gets executed against Hadoop 2.X "SLF4j:CLASSPATH contains 
> multiple SLF4j binding" warning is displayed
> --
>
> Key: HBASE-5858
> URL: https://issues.apache.org/jira/browse/HBASE-5858
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.92.0, 0.92.1
>Reporter: Roman Shaposhnik
>Priority: Minor
> Attachments: error.JPG
>
>
> Since HBase tries to find the locations of the Hadoop jars from the environment, 
> in certain cases it ends up with extra SLF4j jars coming from the Hadoop 
> classpath. When that happens, the "SLF4j:CLASSPATH contains multiple SLF4j 
> binding" warning gets displayed.
> The good news here is that the warning is just that -- a warning. The bad 
> news is that it seems this issue will be tricky to fix properly. On one hand, 
> we would like Hadoop itself to give us the transitive closure of the set of jar 
> files required (computing that in the HBase script is a maintenance 
> nightmare). On the other hand, that set of jar files could contain some of the 
> very same jars shipped with HBase.
> This problem has existed for quite some time. SLF4j is just the first component 
> to complain.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5859) Optimize the rolling restart script

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5859.
---
Resolution: Not A Problem

> Optimize the rolling restart script
> ---
>
> Key: HBASE-5859
> URL: https://issues.apache.org/jira/browse/HBASE-5859
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver, scripts
>Affects Versions: 0.95.2
>Reporter: Nicolas Liochon
>Priority: Minor
>
> There is a graceful_stop script. It uses this algorithm:
> {noformat}
> for i = 0 to servers.size {
>  regionsInServer = servers[i].regions
>  move servers[i].regions to random
>  stop servers[i]
>  start servers[i]
>  move regionsInServer to servers[i] //filled back with the same regions
> }
> {noformat}
> It would be possible to optimize it while keeping data locality with
> {noformat}
> for i = 0 to servers.size {
>  start servers[i*2+1] on the computer of servers[i] // Two RS on the same box
>  move servers[i].regions to servers[i*2+1]  // The one on the same box
>  stop servers[i]
> }
> {noformat}
> There would be an impact with a fixed port configuration. To fix this, we 
> could:
> - use a range of ports instead of a single port. This could be an issue for 
> the web port.
> - start on a port, then reuse the fixed ones when they become available. This 
> is not very elegant if client code is already using the previous code. 
> Moreover the region server code is written in the meta table.
> - do a mix of the two solutions: a range for the server itself, while waiting 
> for the web port to be available.
> To be discussed...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5832) Add storefile count per region (or per cf even) metrics; add size too?

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5832.
---
Resolution: Not A Problem

> Add storefile count per region (or per cf even) metrics; add size too?
> --
>
> Key: HBASE-5832
> URL: https://issues.apache.org/jira/browse/HBASE-5832
> Project: HBase
>  Issue Type: Improvement
>Reporter: stack
>
> From IRC this morning:
> {code}
> 07:26 < ntelford> is there a way to monitor the number and size of
> store files *per region*?
> 07:26 < ntelford> I know region servers expose a metric on the total
> across all regions, but that's fairly unhelpful to us
> ...
> 08:11 < St^Ack> ntelford: no. number is easy but when you say size,
> you mean size of all the storefiles in the region, not the size per
> storefile (asking because one of the lads is exposing per region
> metrics at mo and those would be easy to add)
> 08:12 < ntelford> St^Ack, for size we're actually interested in the
> individual store file size
> 08:13 < St^Ack> ntelford: how would we do that in metric?  metric
> would be dynamic
> 08:13 < ntelford> specifically, we want to monitor: the maximum,
> minimum, mean (and some percentiles) number of store files within a
> region
> ...
> 08:13 < ntelford> and the maximum, minimum, mean (+ percentiles) size
> of individual store files
> 08:13 < ntelford> :)
> 08:13 < St^Ack> now you are verging on abuse!
> ...
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5822) Pull the "Pseudo-Dist. Extras" page into the reference guide and fix it

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5822.
---
Resolution: Not A Problem
  Assignee: (was: Doug Meil)

> Pull the "Pseudo-Dist. Extras" page into the reference guide and fix it
> ---
>
> Key: HBASE-5822
> URL: https://issues.apache.org/jira/browse/HBASE-5822
> Project: HBase
>  Issue Type: Task
>Reporter: Jean-Daniel Cryans
>Priority: Minor
>
> The "Pseudo-Dist. Extras"[1] on the website should be pulled into the 
> reference guide and should be fixed at the same time, for example it 
> references a hbase-site.xml.psuedo-distributed.template file that doesn't 
> exist since 0.90.0
> Assigning this to our doc master :)
> 1. http://hbase.apache.org/pseudo-distributed.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-4835) ConcurrentModificationException out of ZKConfig.makeZKProps

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-4835.
---
Resolution: Not A Problem

> ConcurrentModificationException out of ZKConfig.makeZKProps
> ---
>
> Key: HBASE-4835
> URL: https://issues.apache.org/jira/browse/HBASE-4835
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Andrew Purtell
> Attachments: HBASE-4835.patch
>
>
> Mikhail reported this from a five-node, three-RS cluster test:
> {code}
> 2011-11-21 01:30:15,188 FATAL 
> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
> ,60020,1321867814890: Initialization of RS failed. Hence 
> aborting RS.
> java.util.ConcurrentModificationException
> at java.util.Hashtable$Enumerator.next(Hashtable.java:1031)
> at org.apache.hadoop.conf.Configuration.iterator(Configuration.java:1042)
> at org.apache.hadoop.hbase.zookeeper.ZKConfig.makeZKProps(ZKConfig.java:75)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKConfig.getZKQuorumServersString(ZKConfig.java:245)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.(ZooKeeperWatcher.java:144)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.(ZooKeeperWatcher.java:124)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1262)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:568)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.(HConnectionManager.java:559)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:183)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.(CatalogTracker.java:177)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:575)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:534)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:642)
> at java.lang.Thread.run(Thread.java:619)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5674) add support in HBase to overwrite hbase timestamp to a version number during major compaction

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5674.
---
Resolution: Incomplete
  Assignee: (was: He Yongqiang)

> add support in HBase to overwrite hbase timestamp to a version number during 
> major compaction
> -
>
> Key: HBASE-5674
> URL: https://issues.apache.org/jira/browse/HBASE-5674
> Project: HBase
>  Issue Type: Improvement
>Reporter: He Yongqiang
>
> Right now, a millisecond-level timestamp is attached to every record. 
> In our case, we only need a version number (mostly it will be just zero, etc.). 
> A millisecond timestamp is too heavy to carry. We should add support for 
> overwriting it to zero during major compaction. 
> KVs before major compaction will keep using the system timestamp. This 
> should be configurable, so that we do not mess things up if the HBase timestamp 
> is specified by the application.
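
A tiny editor's sketch of the transformation being asked for, using a toy Cell class rather than HBase's KeyValue; the opt-in flag stands in for the proposed per-table configuration:

{code:java}
// Sketch only: rewrite a cell's timestamp to a fixed version during major compaction.
final class TimestampRewriter {
  static final long VERSION_ZERO = 0L;

  static class Cell {  // toy stand-in for KeyValue
    final byte[] row, family, qualifier, value;
    final long timestamp;
    Cell(byte[] row, byte[] family, byte[] qualifier, long timestamp, byte[] value) {
      this.row = row; this.family = family; this.qualifier = qualifier;
      this.timestamp = timestamp; this.value = value;
    }
  }

  static Cell rewriteForMajorCompaction(Cell in, boolean overwriteEnabled) {
    if (!overwriteEnabled) {
      return in;  // application-managed timestamps must stay untouched
    }
    return new Cell(in.row, in.family, in.qualifier, VERSION_ZERO, in.value);
  }
}
{code}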



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5723) Simple Design of Secondary Index

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5723.
---
Resolution: Won't Fix

> Simple Design of Secondary Index
> 
>
> Key: HBASE-5723
> URL: https://issues.apache.org/jira/browse/HBASE-5723
> Project: HBase
>  Issue Type: New Feature
>  Components: Coprocessors
>Reporter: ShiXing
>Priority: Minor
> Attachments: Simple Design of HBase SecondaryIndex.pdf
>
>
> Use a coprocessor to create the index, and the primary table's compaction to purge 
> stale data. 
> The attached file is the design of the secondary index.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5752) Blank line in SPLITS_FILE causes Master to crash

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5752.
---
Resolution: Cannot Reproduce

Reopen if still an issue with current code

> Blank line in SPLITS_FILE causes Master to crash
> 
>
> Key: HBASE-5752
> URL: https://issues.apache.org/jira/browse/HBASE-5752
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 0.92.1
>Reporter: Jeremy Carroll
>Priority: Minor
> Attachments: test.txt
>
>
> When creating a new table with the hbase shell, and specifying a SPLITS_FILE 
> with a blank line in it will cause the master to crash.
> Uploading a sample splits file, here are the commands to test the split.
> create 'testTable', {NAME => 'a', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => 
> '0', COMPRESSION => 'NONE', MIN_VERSIONS => '3', TTL => '2147483647', 
> BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, 
> {SPLITS_FILE => '/tmp/test.txt'}
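
One obvious fix is to ignore empty lines when the splits file is parsed; a hedged editor's sketch (not the shell's actual implementation):

{code:java}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Sketch only: read a SPLITS_FILE and silently drop blank lines so an empty
// split key never reaches the master.
final class SplitsFileReader {
  static List<byte[]> readSplits(String path) throws IOException {
    List<byte[]> splits = new ArrayList<>();
    for (String line : Files.readAllLines(Paths.get(path), StandardCharsets.UTF_8)) {
      String trimmed = line.trim();
      if (!trimmed.isEmpty()) {
        splits.add(trimmed.getBytes(StandardCharsets.UTF_8));
      }
    }
    return splits;
  }
}
{code}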



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5769) Use builder pattern to create HServerLoad.RegionLoad

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5769.
---
Resolution: Incomplete

> Use builder pattern to create HServerLoad.RegionLoad
> 
>
> Key: HBASE-5769
> URL: https://issues.apache.org/jira/browse/HBASE-5769
> Project: HBase
>  Issue Type: Task
>Reporter: Ted Yu
>
> Currently, HRegionServer.createRegionLoad() calls RegionLoad ctor with all 
> the parameters.
> This makes adding new members to RegionLoad tedious.
> Builder pattern should be employed to create HServerLoad.RegionLoad
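
A minimal editor's sketch of what the builder could look like, with a small illustrative subset of fields (not the full RegionLoad member list):

{code:java}
// Sketch only: builder for a RegionLoad-like value object. Adding a member
// means adding one field and one setter, not changing a long constructor.
final class RegionLoadInfo {
  private final int storefiles;
  private final int storefileSizeMB;
  private final long readRequestsCount;

  private RegionLoadInfo(Builder b) {
    this.storefiles = b.storefiles;
    this.storefileSizeMB = b.storefileSizeMB;
    this.readRequestsCount = b.readRequestsCount;
  }

  static final class Builder {
    private int storefiles;
    private int storefileSizeMB;
    private long readRequestsCount;

    Builder setStorefiles(int v) { this.storefiles = v; return this; }
    Builder setStorefileSizeMB(int v) { this.storefileSizeMB = v; return this; }
    Builder setReadRequestsCount(long v) { this.readRequestsCount = v; return this; }
    RegionLoadInfo build() { return new RegionLoadInfo(this); }
  }
}
{code}

Call sites would then read like new RegionLoadInfo.Builder().setStorefiles(12).setStorefileSizeMB(512).build().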



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5779) Master goes to infinite loop if AccessControlException occurs while setting cluster id during initialization

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5779.
---
Resolution: Cannot Reproduce

Reopen if still an issue with current code

> Master goes to infinite loop if AccessControlException occurs while setting 
> cluster id during initialization 
> -
>
> Key: HBASE-5779
> URL: https://issues.apache.org/jira/browse/HBASE-5779
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1
>Reporter: Shrijeet Paliwal
>
> Steps to reproduce: 
> - change the permissions of /hbase to a user other than the one running HBase
> - delete hbase.id if it already exists
> - start the master; it will try to create the cluster ID file in /hbase and fail 
> while doing so with org.apache.hadoop.security.AccessControlException
> From this point it will go into an infinite loop. 
> Reason: org.apache.hadoop.hbase.util.FSUtils.setClusterId has a wait > 0 and 
> no control over retries when called during master initialization. 
> Quoting : checkRootDir in MasterFileSystem
> {noformat}
> // Make sure cluster ID exists
> if (!FSUtils.checkClusterIdExists(fs, rd, c.getInt(
> HConstants.THREAD_WAKE_FREQUENCY, 10 * 1000))) {
>   FSUtils.setClusterId(fs, rd, UUID.randomUUID().toString(), c.getInt(
>   HConstants.THREAD_WAKE_FREQUENCY, 10 * 1000));
> }
> {noformat}
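
A hedged editor's sketch of what bounded retries around that call could look like; the names are illustrative and this is not the actual FSUtils API:

{code:java}
import java.io.IOException;
import java.util.concurrent.Callable;

// Sketch only: retry cluster-id creation a bounded number of times so that a
// non-retriable failure (e.g. AccessControlException) surfaces instead of
// looping forever during master initialization.
final class BoundedRetry {
  static <T> T run(Callable<T> attempt, int maxRetries, long waitMs) throws IOException {
    Exception last = null;
    for (int i = 0; i < maxRetries; i++) {
      try {
        return attempt.call();
      } catch (Exception e) {
        last = e;
        try {
          Thread.sleep(waitMs);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          throw new IOException("Interrupted while retrying", ie);
        }
      }
    }
    throw new IOException("Giving up after " + maxRetries + " attempts", last);
  }
}
{code}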



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5702) MasterSchemaChangeTracker.excludeRegionServerForSchemaChanges leaks a MonitoredTask per call

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5702.
---
Resolution: Not A Problem
  Assignee: (was: Subbu M Iyer)

> MasterSchemaChangeTracker.excludeRegionServerForSchemaChanges leaks a 
> MonitoredTask per call
> 
>
> Key: HBASE-5702
> URL: https://issues.apache.org/jira/browse/HBASE-5702
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0
>Reporter: Jean-Daniel Cryans
>Priority: Critical
>
> This bug is so easy to reproduce I'm wondering why it hasn't been reported 
> yet. Stop any number of region servers on a 0.94/6 cluster and you'll see in 
> the master interface one task per stopped region server saying the following:
> |Processing schema change exclusion for region server = 
> sv4r27s44,62023,1333402175340|RUNNING (since 5sec ago)|No schema change in 
> progress. Skipping exclusion for server = sv4r27s44,62023,1333402175340 
> (since 5sec ago)|
> It's gonna stay there until the master cleans it:
> bq. WARN org.apache.hadoop.hbase.monitoring.TaskMonitor: Status Processing 
> schema change exclusion for region server = sv4r27s44,62023,1333402175340: 
> status=No schema change in progress. Skipping exclusion for server = 
> sv4r27s44,62023,1333402175340, state=RUNNING, startTime=1333404636419, 
> completionTime=-1 appears to have been leaked
> It's not clear to me why it's using a MonitoredTask in the first place. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5710) NPE in MiniCluster during metadata scan for a pre-split table with multiple column families

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5710.
---
Resolution: Not A Problem

> NPE in MiniCluster during metadata scan for a pre-split table with multiple 
> column families
> ---
>
> Key: HBASE-5710
> URL: https://issues.apache.org/jira/browse/HBASE-5710
> Project: HBase
>  Issue Type: Bug
>  Components: test, util
>Affects Versions: 0.94.0
> Environment: MiniCluster
>Reporter: James Taylor
>Priority: Minor
>
> In the MiniCluster test environment, an NPE occurs while scanning regions
> of a pre-split table with multiple column families. Without this working
> in the test environment, you cannot write unit tests for these types of
> scenarios.
> Add the following to TestMetaScanner to repro:
>@Test
>public void testMultiFamilyMultiRegionMetaScanner() throws Exception {
>  LOG.info("Starting testMetaScanner");
>  final byte[] TABLENAME = Bytes.toBytes("testMetaScanner");
>  final byte[] FAMILY1 = Bytes.toBytes("family1");
>  final byte[] FAMILY2 = Bytes.toBytes("family2");
>  TEST_UTIL.createTable(TABLENAME, new byte[][] {FAMILY1,FAMILY2});
>  Configuration conf = TEST_UTIL.getConfiguration();
>  HTable table = new HTable(conf, TABLENAME);
>  TEST_UTIL.createMultiRegions(conf, table, FAMILY1,
>  new byte[][]{
>HConstants.EMPTY_START_ROW,
>Bytes.toBytes("region_a"),
>Bytes.toBytes("region_b")});
>  TEST_UTIL.createMultiRegions(conf, table, FAMILY2,
>  new byte[][]{
>HConstants.EMPTY_START_ROW,
>Bytes.toBytes("region_a"),
>Bytes.toBytes("region_b")});
>  // Make sure all the regions are deployed
>  TEST_UTIL.countRows(table);
>  // This fails with an NPE currently
>  MetaScanner.allTableRegions(conf, TABLENAME, false).keySet();
>  table.close();
>}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5587) Remove dns.interface configuration options

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5587.
---
Resolution: Won't Fix

> Remove dns.interface configuration options 
> ---
>
> Key: HBASE-5587
> URL: https://issues.apache.org/jira/browse/HBASE-5587
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.95.2
>Reporter: Eli Collins
>
> Are the {{hbase.*.dns.interface}} configuration options used or needed?  Per 
> HBASE-4109 it looks like these never really worked, at least in cases where 
> the hostname with a trailing dot doesn't resolve. The reason I asked is that 
> while these were introduced in Hadoop, I don't think they're actually used, 
> nor am I convinced bypassing the host for DNS lookups is a good idea (leads 
> to painful bugs where default Java DNS lookups differ from these lookups). 
> HBase started using these via a similar feature in HBASE-1279 and HBASE-1279.
> I filed HADOOP-8156 to remove the API which HBase uses, which is obviously an 
> incompatible change and would need to be worked around here if you wanted to 
> keep this functionality in HBase, ie *if* that were to get checked into 
> Hadoop we'd first need to get you on your own DNS class. Either way I'll 
> update DNS' InterfaceAudience annotation to indicate HBase is a user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5687) Should add backup masters to StorageClusterStatusResource.get()

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5687.
---
Resolution: Incomplete

> Should add backup masters to StorageClusterStatusResource.get()
> ---
>
> Key: HBASE-5687
> URL: https://issues.apache.org/jira/browse/HBASE-5687
> Project: HBase
>  Issue Type: Improvement
>  Components: REST
>Affects Versions: 0.92.1, 0.94.0, 0.95.2
>Reporter: David S. Wang
>
> Changes similar to HBASE-5209/HBASE-5596 need to be added for rest.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5668) HRegionServer.checkFileSystem() should only abort() after fs is down for some time

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5668.
---
Resolution: Incomplete

> HRegionServer.checkFileSystem() should only abort() after fs is down for some 
> time
> --
>
> Key: HBASE-5668
> URL: https://issues.apache.org/jira/browse/HBASE-5668
> Project: HBase
>  Issue Type: Improvement
>Reporter: Prakash Khemani
>
> When checkFileSystem() fails, the region server should wait for some time 
> before aborting. By default, the timeout can be the same as the ZooKeeper session 
> timeout.
> When, say, a rack switch reboots or fails for a few minutes and all the 
> traffic to the region server dies, we don't want the region servers 
> to unnecessarily kill themselves when ongoing compactions or flushes fail.
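
An editor's sketch of the grace-period idea, detached from HRegionServer itself; the timeout value would default to the ZooKeeper session timeout as suggested above:

{code:java}
// Sketch only: abort only after the filesystem has been unhealthy for a full
// grace period, rather than on the first failed check.
class FsHealthTracker {
  private final long graceMillis;
  private long firstFailureTime = -1;  // -1 means currently healthy

  FsHealthTracker(long graceMillis) { this.graceMillis = graceMillis; }

  /** @return true if the region server should abort now. */
  boolean shouldAbort(boolean fsOk, long nowMillis) {
    if (fsOk) {
      firstFailureTime = -1;           // recovered, reset the clock
      return false;
    }
    if (firstFailureTime < 0) {
      firstFailureTime = nowMillis;    // first failure, start the grace period
    }
    return nowMillis - firstFailureTime >= graceMillis;
  }
}
{code}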



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-4653) Master can't easily get rid of LZO compressed tables when the codec isn't available

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-4653.
---
Resolution: Incomplete

> Master can't easily get rid of LZO compressed tables when the codec isn't 
> available
> ---
>
> Key: HBASE-4653
> URL: https://issues.apache.org/jira/browse/HBASE-4653
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.4
>Reporter: Benoit Sigoure
>Priority: Minor
>
> I mistakenly created a table with {{COMPRESSION => LZO}}, and I don't have 
> LZO installed.  I'm running a vanilla 0.90.4 release.  The master is unable 
> to deploy the region of that table because the codec is missing.  I can't get 
> rid of it.  I can't drop the table from  the shell, although it seems I could 
> disable it.  Thankfully I found a workaround for this bug (see further below).
> {code}
> hbase(main):003:0> disable 'mytable'
> 0 row(s) in 1.1010 seconds
> hbase(main):004:0> drop 'mytable'
> [hung forever]
> {code}
> in the logs:
> {code}
> 2011-10-22 03:05:42,153 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Instantiated mytable,,1319278131519.6eb6891a8b072402b5064f4cc68c210d.
> 2011-10-22 03:05:42,154 ERROR 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open 
> of region=mytable,,1319278131519.6eb6891a8b072402b5064f4cc68c210d. 
> java.io.IOException: java.lang.RuntimeException:
> java.lang.ClassNotFoundException: com.hadoop.compression.lzo.LzoCodec
>at 
> org.apache.hadoop.hbase.util.CompressionTest.testCompression(CompressionTest.java:89)
>at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkCompressionCodecs(HRegion.java:2573)
>at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2562)
>at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2550)
>at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:272)
>at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:99)
>at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:156)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: 
> com.hadoop.compression.lzo.LzoCodec
>at 
> org.apache.hadoop.hbase.io.hfile.Compression$Algorithm$1.getCodec(Compression.java:92)
>at 
> org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getCompressor(Compression.java:197)
>at 
> org.apache.hadoop.hbase.util.CompressionTest.testCompression(CompressionTest.java:84)
>... 9 more
> Caused by: java.lang.ClassNotFoundException: 
> com.hadoop.compression.lzo.LzoCodec
>at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>at java.security.AccessController.doPrivileged(Native Method)
>at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>at 
> org.apache.hadoop.hbase.io.hfile.Compression$Algorithm$1.getCodec(Compression.java:87)
>... 11 more
> [...]
> 2011-10-22 03:15:30,620 DEBUG 
> org.apache.hadoop.hbase.master.handler.DeleteTableHandler: Waiting on region 
> to clear regions in transition; 
> mytable,,1319278131519.6eb6891a8b072402b5064f4cc68c210d. state=OPENING, 
> ts=1319278483001
> 2011-10-22 03:15:31,621 DEBUG 
> org.apache.hadoop.hbase.master.handler.DeleteTableHandler: Waiting on region 
> to clear regions in transition;
> mytable,,1319278131519.6eb6891a8b072402b5064f4cc68c210d. state=OPENING, 
> ts=1319278483001
> [repeat message above indefinitely every 1s]
> {code}
> I tried restarting HBase, no luck.  How do I get rid of this table so I can 
> recreate it without {{COMPRESSION => LZO}}?
> h2. Workaround
> Change the schema for each family, restart HBase, drop the table.
> {code}
> hbase(main):004:0> alter 'mytable', {NAME => 'fam1', COMPRESSION => 'NONE'}
> 0 row(s) in 0.1160 seconds
> hbase(main):005:0> alter 'mytable', {NAME => 'fam2', COMPRESSION => 'NONE'}
> 0 row(s) in 0.0480 seconds
> hbase(main):007:0> drop 'mytable'
> ^C
> [hung forever]
> {code}
> [restart HBase]  :(
> {code}
> hbase(main):001:0> disable 'mytable'
> 0 row(s) in 2.5010 seconds
> hbase(main):002:0> drop 'mytable'
> 0 row(s) in 1.1240 seconds
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Resolved] (HBASE-5855) [findbugs] address remaining findbugs warnings

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5855.
---
Resolution: Not A Problem
  Assignee: (was: Uma Maheswara Rao G)

> [findbugs] address remaining findbugs warnings 
> ---
>
> Key: HBASE-5855
> URL: https://issues.apache.org/jira/browse/HBASE-5855
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jonathan Hsieh
>
> As we've been cleaning up the code related to findbugs warnings, new patches 
> are coming in that introduce new warnings.  This would be the last sub-issue, 
> which will clean up any recently introduced warnings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5649) [findbugs] Fix security warning

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5649.
---
Resolution: Not A Problem

> [findbugs] Fix security warning
> ---
>
> Key: HBASE-5649
> URL: https://issues.apache.org/jira/browse/HBASE-5649
> Project: HBase
>  Issue Type: Sub-task
>  Components: scripts
>Reporter: Jonathan Hsieh
>
> See 
> https://builds.apache.org/job/PreCommit-HBASE-Build/1313//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html#Warnings_SECURITY
> Fix possible XSS Vuln.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-5854) [findbugs] Fix hbck findbugs warnings

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-5854:
--
Resolution: Not A Problem
  Assignee: (was: Jonathan Hsieh)
Status: Resolved  (was: Patch Available)

> [findbugs] Fix hbck findbugs warnings
> -
>
> Key: HBASE-5854
> URL: https://issues.apache.org/jira/browse/HBASE-5854
> Project: HBase
>  Issue Type: Sub-task
>  Components: hbck
>Reporter: Jonathan Hsieh
> Attachments: HBASE-5854-v1.patch
>
>
> In the reviews for HBASE-5654, Jon said he'd take on fixing the hbck findbugs 
> warnings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5650) [findbugs] Address extra synchronization on CLSM, Atomic*

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5650.
---
Resolution: Not A Problem

> [findbugs] Address extra synchronization on CLSM, Atomic*
> -
>
> Key: HBASE-5650
> URL: https://issues.apache.org/jira/browse/HBASE-5650
> Project: HBase
>  Issue Type: Sub-task
>  Components: scripts
>Reporter: Jonathan Hsieh
>
> See 
> https://builds.apache.org/job/PreCommit-HBASE-Build/1313//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html#Warnings_MT_CORRECTNESS
> Fix/exclude class JLM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-8463) Suppress Findbugs warnings links in Hadoop QA report if there was no Findbugs warning

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-8463.
---
Resolution: Not A Problem

> Suppress Findbugs warnings links in Hadoop QA report if there was no Findbugs 
> warning
> -
>
> Key: HBASE-8463
> URL: https://issues.apache.org/jira/browse/HBASE-8463
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>
> We often see the following report from Hadoop QA:
> {code}
> +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) 
> warnings.
> ...
> Findbugs warnings: 
> https://builds.apache.org/job/PreCommit-HBASE-Build/5485//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
> Findbugs warnings: 
> https://builds.apache.org/job/PreCommit-HBASE-Build/5485//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
> Findbugs warnings: 
> https://builds.apache.org/job/PreCommit-HBASE-Build/5485//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
> Findbugs warnings: 
> https://builds.apache.org/job/PreCommit-HBASE-Build/5485//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
> Findbugs warnings: 
> https://builds.apache.org/job/PreCommit-HBASE-Build/5485//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
> Findbugs warnings: 
> https://builds.apache.org/job/PreCommit-HBASE-Build/5485//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
> Findbugs warnings: 
> https://builds.apache.org/job/PreCommit-HBASE-Build/5485//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
> Findbugs warnings: 
> https://builds.apache.org/job/PreCommit-HBASE-Build/5485//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
> {code}
> The 8 Findbugs warnings links above can be suppressed since there was no 
> Findbugs warning.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5647) [findbugs] Address "Exposed internal representation" warnings

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5647.
---
Resolution: Not A Problem

> [findbugs] Address "Exposed internal representation" warnings
> -
>
> Key: HBASE-5647
> URL: https://issues.apache.org/jira/browse/HBASE-5647
> Project: HBase
>  Issue Type: Sub-task
>  Components: scripts
>Reporter: Jonathan Hsieh
>
> See 
> https://builds.apache.org/job/PreCommit-HBASE-Build/1313//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
> This class of warning may need to be fixed or excluded.  Fix or justify + 
> exclude.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5643) [findbugs] Fix compareTo/equals/hashcode warnings

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5643.
---
Resolution: Not A Problem
  Assignee: (was: Mubarak Seyed)

> [findbugs] Fix compareTo/equals/hashcode warnings
> -
>
> Key: HBASE-5643
> URL: https://issues.apache.org/jira/browse/HBASE-5643
> Project: HBase
>  Issue Type: Sub-task
>  Components: scripts
>Reporter: Jonathan Hsieh
>
> See 
> https://builds.apache.org/job/PreCommit-HBASE-Build/1313//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
> Fix code to eliminate [Eq,ES,HE] categories.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5648) [findbugs] Fix final/protected/constant declarations.

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5648.
---
Resolution: Not A Problem
  Assignee: (was: Mubarak Seyed)

> [findbugs] Fix final/protected/constant declarations.
> -
>
> Key: HBASE-5648
> URL: https://issues.apache.org/jira/browse/HBASE-5648
> Project: HBase
>  Issue Type: Sub-task
>  Components: scripts
>Reporter: Jonathan Hsieh
>
> See 
> https://builds.apache.org/job/PreCommit-HBASE-Build/1313//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
> Fix warnings from class MS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-3692) Handle RejectedExecutionException in HTable

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-3692.
---
Resolution: Won't Fix

> Handle RejectedExecutionException in HTable
> ---
>
> Key: HBASE-3692
> URL: https://issues.apache.org/jira/browse/HBASE-3692
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.1
>Reporter: Jean-Daniel Cryans
> Attachments: test_datanucleus.zip
>
>
> A user on IRC yesterday had an issue with RejectedExecutionException coming 
> out of HTable sometimes. Apart from being very confusing to the user as it 
> comes with no message at all, it exposes the HTable internals. 
> I think we should handle it and instead throw something like 
> DontUseHTableInMultipleThreadsException or something more clever. In his 
> case, the user had an HTable leak with the pool, which he was able to figure 
> out once I told him what to look for.
> It could be an unchecked exception, and we could consider adding it in 0.90, 
> but I'm marking it for 0.92 at the moment.
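As a rough illustration of the suggestion above (not the actual HTable internals), a wrapper like the following could catch the bare RejectedExecutionException and rethrow it with a message pointing at the likely cause; the class name and message text here are made up.

{code:title=FriendlyBatchSubmit.java}
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;

public class FriendlyBatchSubmit {
  // Wraps the bare RejectedExecutionException in an exception whose message
  // explains the likely cause instead of exposing the thread-pool internals.
  static void submit(ExecutorService pool, Runnable batchTask) throws IOException {
    try {
      pool.execute(batchTask);
    } catch (RejectedExecutionException e) {
      throw new IOException("HTable's internal thread pool rejected the task; "
          + "the table (or its pool) was probably closed or shared across threads", e);
    }
  }
}
{code}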



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5629) Unify OfflineMetaRebuild with Hbck

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5629.
---
Resolution: Incomplete

> Unify OfflineMetaRebuild with Hbck
> --
>
> Key: HBASE-5629
> URL: https://issues.apache.org/jira/browse/HBASE-5629
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.95.2
>Reporter: Jonathan Hsieh
>
> Currently hbck and OfflineMetaRepair share a lot of code, but 
> OfflineMetaRepair is behind in functionality.  It seems that we could merge 
> hbck and OfflineMetaRepair by adding something like a -hdfsOnly or -offline 
> flag to hbck.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5646) [findbugs] Investigate experimental warnings

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5646.
---
Resolution: Not A Problem

> [findbugs] Investigate experimental warnings
> 
>
> Key: HBASE-5646
> URL: https://issues.apache.org/jira/browse/HBASE-5646
> Project: HBase
>  Issue Type: Sub-task
>  Components: scripts
>Reporter: Jonathan Hsieh
>
> See 
> https://builds.apache.org/job/PreCommit-HBASE-Build/1313//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
> analyze and fix/exclude warnings in the experimental section.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5602) Add cache access pattern statistics and report hot blocks/keys

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5602.
---
Resolution: Incomplete

> Add cache access pattern statistics and report hot blocks/keys
> --
>
> Key: HBASE-5602
> URL: https://issues.apache.org/jira/browse/HBASE-5602
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>
> In many practical applications it would be very useful to know how well 
> utilized the block cache is, i.e. how many times we actually access a block 
> once it gets into the cache. This would also allow us to evaluate 
> cache-on-write on flush. In addition, we need to keep track of and report a 
> set of the hottest blocks in the cache, and possibly even the hottest keys. 
> This would allow us to diagnose "hot-row" problems in real time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5543) Add a keepalive option for IPC connections

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5543.
---
Resolution: Not A Problem

> Add a keepalive option for IPC connections
> --
>
> Key: HBASE-5543
> URL: https://issues.apache.org/jira/browse/HBASE-5543
> Project: HBase
>  Issue Type: Improvement
>  Components: Client, Coprocessors, IPC/RPC
>Reporter: Andrew Purtell
>
> On the user list someone wrote in with a connection failure due to a long 
> running coprocessor:
> {quote}
> On Wed, Mar 7, 2012 at 10:59 PM, raghavendhra rahul wrote:
> 2012-03-08 12:03:09,475 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server 
> Responder, call execCoprocessor([B@50cb21, getProjection(), rpc version=1, 
> client version=0, methodsFingerPrint=0), rpc version=1, client version=29, 
> methodsFingerPrint=54742778 from 10.184.17.26:46472: output error
> 2012-03-08 12:03:09,476 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server 
> handler 7 on 60020 caught: java.nio.channels.ClosedChannelException
> {quote}
> I suggested in response that we might consider giving our RPC a keepalive 
> option for calls that may run for a long time (like execCoprocessor).
> LarsH +1ed the idea:
> {quote}
> +1 on "keepalive". It's a shame (especially for long running server code) to 
> do all the work, just to find out at the end that the client has given up.
> Or maybe there should be a way to cancel an operation if the clients decides 
> it does not want to wait any longer (PostgreSQL does that for example). Here 
> that would mean the server would need to check periodically and coprocessors 
> would need to be written to support that - so maybe that's a non-starter.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-4999) Constraints - Enhance checkAndPut to do atomic arbitrary constraint checks

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-4999.
---
Resolution: Incomplete

> Constraints - Enhance checkAndPut to do atomic arbitrary constraint checks
> --
>
> Key: HBASE-4999
> URL: https://issues.apache.org/jira/browse/HBASE-4999
> Project: HBase
>  Issue Type: Improvement
>  Components: Client, Coprocessors
>Reporter: Suraj Varma
>  Labels: CAS, checkAndPut, constraints
>
> Related work: HBASE-4605
> It would be great if checkAndPut (CAS) could be enhanced to not just use a 
> value comparison as a gating factor for the put, but rather have the 
> capability of doing arbitrary constraint checks on the column value (where 
> the current comparator approach is a subset of possible constraints that 
> can be checked). Commonly used constraints (like comparisons) can be provided 
> out of the box and we should have the ability to accept custom constraints 
> set by the client for the checkAndPut call. 
> One use-case would be the ability to implement something like the below in 
> HBase.
> Pseudo sql: 
> update table-name
> set column-name = new-value
> where (column-value - new-value) > threshold-value
> ... where the mutation would go through only if the specified constraint in 
> the where clause is true.
> Current options include using a co-processor to do 
> preCheckAndPut/postCheckAndPut constraint checks - but this is not atomic. 
> i.e. the row lock needs to be released by the co-processor before the real 
> checkAndPut call, thus not meeting the atomic requirement. 
> Everything above is still meant to be at row level (so, no cross-row 
> constraint checking is implied here).
> An ideal end result would be that an HBase client would be able to specify a 
> set of constraints on multiple column qualifiers as part of the checkAndPut 
> call. The call goes through if all the constraints are satisfied or doesn't 
> if any of the constraints fail. And the above checkAndPut should be 
> atomically executed (just like current checkAndPut semantics).
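For contrast, here is a minimal sketch of the value-comparison CAS that exists today, using the client API of that era and a hypothetical table 'metrics' with column cf:counter; the arbitrary-constraint variant proposed above would replace the single expected value with a server-side predicate evaluated under the same row lock.

{code:title=CheckAndPutToday.java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class CheckAndPutToday {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "metrics");   // hypothetical table
    byte[] row = Bytes.toBytes("row1");
    byte[] cf = Bytes.toBytes("cf");
    byte[] q = Bytes.toBytes("counter");

    Put put = new Put(row);
    put.add(cf, q, Bytes.toBytes(90L));

    // Today's CAS: the put is applied only if the current cell value equals the
    // expected value exactly; a threshold-style predicate like the pseudo-SQL
    // above cannot be expressed through this API.
    boolean applied = table.checkAndPut(row, cf, q, Bytes.toBytes(100L), put);
    System.out.println("mutation applied: " + applied);
    table.close();
  }
}
{code}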



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5590) Add more server mode state to jmx output.

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5590.
---
Resolution: Incomplete

> Add more server mode state to jmx output.
> -
>
> Key: HBASE-5590
> URL: https://issues.apache.org/jira/browse/HBASE-5590
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Jonathan Hsieh
>
> Related to HBASE-5325 and HBASE-5533, there is more state information that 
> would be good to expose in a machine-readable fashion.
> Some suggestions for state information include:
> * whether the balancer is on or off
> * whether a master is active or a backup
> * whether we are in hlog recovery mode
> * "tasks" distributed from distributed log splitting
> More suggestions are welcome.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5561) Create HFileSystemFactory

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5561.
---
Resolution: Implemented
  Assignee: (was: dhruba borthakur)

> Create HFileSystemFactory
> -
>
> Key: HBASE-5561
> URL: https://issues.apache.org/jira/browse/HBASE-5561
> Project: HBase
>  Issue Type: Improvement
>  Components: master, regionserver
>Reporter: dhruba borthakur
>
> The HFileSystem object can be used to paper over differences in HDFS 
> versions. Create it using a factory object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5550) ZooKeeper connection to get the clusterId in the HConnectionImplementation constructor should be removed

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5550.
---
Resolution: Incomplete
  Assignee: (was: Nicolas Liochon)

> ZooKeeper connection to get the clusterId in the HConnectionImplementation 
> constructor should be removed
> 
>
> Key: HBASE-5550
> URL: https://issues.apache.org/jira/browse/HBASE-5550
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 0.95.2
>Reporter: Nicolas Liochon
>Priority: Minor
>
> See title.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-4879) Add Constraint support to shell

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-4879.
---
Resolution: Incomplete

> Add Constraint support to shell
> ---
>
> Key: HBASE-4879
> URL: https://issues.apache.org/jira/browse/HBASE-4879
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.94.0
>Reporter: Jesse Yates
>
> Follow-on ticket to HBASE-4605. Extend the shell functionality to include 
> altering a table to add a Constraint. 
> Discussion around this can be found at:
> http://search-hadoop.com/m/EeYb3dezMM



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-2818) Cannot force a region to close when it has no RS entry in META

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-2818.
---
Resolution: Cannot Reproduce

Reopen if still an issue with current code

> Cannot force a region to close when it has no RS entry in META
> --
>
> Key: HBASE-2818
> URL: https://issues.apache.org/jira/browse/HBASE-2818
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.92.1, 0.94.0
>Reporter: Todd Lipcon
>Priority: Critical
>
> I have a region that's open on a server, but META thinks it's not deployed 
> anywhere. I get the following when trying to close it:
> hbase(main):002:0> close_region 
> 'usertable,user302806495,1278457018956.c4ad0681f7be3995490c745861af66ea.', 
> '192.168.42.41:60020'
> ERROR: java.io.IOException: java.io.IOException: 
> java.lang.NullPointerException
> at org.apache.hadoop.hbase.util.Bytes.toLong(Bytes.java:479)
> at org.apache.hadoop.hbase.util.Bytes.toLong(Bytes.java:453)
> at 
> org.apache.hadoop.hbase.master.HMaster.modifyTable(HMaster.java:1021)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5438) A tool to check region balancing for a particular table

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5438.
---
Resolution: Incomplete
  Assignee: (was: Liyin Tang)

> A tool to check region balancing for a particular table
> ---
>
> Key: HBASE-5438
> URL: https://issues.apache.org/jira/browse/HBASE-5438
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
> Attachments: 0001-hbase-5438.patch, 
> ASF.LICENSE.NOT.GRANTED--D1827.1.patch, 
> ASF.LICENSE.NOT.GRANTED--D1827.1.patch, ASF.LICENSE.NOT.GRANTED--D1827.1.patch
>
>
> When debugging a table-level region imbalance problem, I wrote a tool to 
> check how the regions are balanced across all the region servers for a 
> particular table.
> bin/hbase org.jruby.Main region_balance_checker.rb test_table



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5530) Create a framework to test backward compatibility of various HFileBlock disk formats

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5530.
---
Resolution: Not A Problem
  Assignee: (was: dhruba borthakur)

> Create a framework to test backward compatibility of various HFileBlock disk 
> formats
> 
>
> Key: HBASE-5530
> URL: https://issues.apache.org/jira/browse/HBASE-5530
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: dhruba borthakur
>
> We currently have major versions 0 and 1. The HBase checksum patch introduces 
> minor versions 0 and 1. Then the columnar HFileBlock might introduce yet 
> another disk format version. We need a simple but elegant framework to test 
> the compatibility of code with all these disk format versions. We also want 
> to do this without much code duplication in TestHFileBlockCompatibility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-2408) Add envelope around client<->server communication so can pass state along w/ data during interchange

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-2408.
---
Resolution: Incomplete

> Add envelope around client<->server communication so can pass state along w/ 
> data during interchange
> 
>
> Key: HBASE-2408
> URL: https://issues.apache.org/jira/browse/HBASE-2408
> Project: HBase
>  Issue Type: Improvement
>Reporter: stack
>
> This issue is about adding a dimension along which we can pass metadata on 
> changes in client-server communications.  What I mean by this is that while 
> an HTable#get will return a RowResult, we also need to be able to convey 
> messages like "I got the result for this row from a region other than the one 
> you asked for -- update your cache with this new location".
> I can think of two examples where this mechanism could be useful.
> 1. HBASE-72 "'Normal' operation should not depend on throwing of exceptions 
> (e.g. NotServingRegionException)".  Rather than have the server throw a 
> NotServingRegionException as we do now as signal to client to go look 
> elsewhere for the wanted data, we could instead signal the client to look 
> elsewhere by setting a state in the envelope.
> 2. If a client asks for a row and meantime the region has split, if the 
> regionserver queried is hosting the daughter that is carrying the wanted row, 
> it could save the client hops by passing back the wanted row with a message 
> in the envelope that client should update its cache removing parent and 
> replacing with daughter location.
> AVRO rpc carries headers?  We could stuff our enveloping stuff there?  Or, 
> shudder, if we used AVRO HTTP for RPC, we could do our messages as HTTP 
> headers.
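A toy rendering of the envelope idea, just to make the shape concrete; every name below is invented and does not correspond to any actual HBase RPC type.

{code:title=ResponseEnvelope.java}
// Hypothetical response envelope: the payload plus an optional routing hint the
// client should act on (e.g. "this region moved, update your cache").
public class ResponseEnvelope<T> {
  public enum Hint { NONE, REGION_MOVED, REGION_SPLIT }

  private final T payload;
  private final Hint hint;
  private final String newLocation;   // host:port to cache when hint != NONE

  public ResponseEnvelope(T payload, Hint hint, String newLocation) {
    this.payload = payload;
    this.hint = hint;
    this.newLocation = newLocation;
  }

  public T getPayload() { return payload; }
  public Hint getHint() { return hint; }
  public String getNewLocation() { return newLocation; }
}
{code}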



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-3523) Rewrite our client (client 2.0)

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-3523.
---
Resolution: Not A Problem

> Rewrite our client (client 2.0)
> ---
>
> Key: HBASE-3523
> URL: https://issues.apache.org/jira/browse/HBASE-3523
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: stack
>
> Is it just me or do others sense that there is pressure building to redo the 
> client?  If just me, ignore the below... I'll just keep notes in here.  
> Otherwise, what would the requirements for a client rewrite look like?
> + Let out InterruptedException
> + Enveloping of messages or space for metadata that can be passed by client 
> to server and by server to client; e.g. the region a.b.c moved to server 
> x.y.z. or scanner is finished or timeout
> + A different RPC? One with tighter serialization.
> + More sane timeout/retry policy.
> Does it have to support async communication?  Do callbacks?
> What else?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5293) Purge hfile v1 from code base

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5293.
---
Resolution: Done

> Purge hfile v1 from code base
> -
>
> Key: HBASE-5293
> URL: https://issues.apache.org/jira/browse/HBASE-5293
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>
> Remove all hfile v1 references from code base.
> If we do this though, as Matt Corgan suggests up on mailing list, we will 
> need to make sure all hfile v1s in an hbase.rootdir have been compacted out 
> of existence.  We'll probably need to bump the hbase.version to indicate the 
> check for hfile v1s has been run.  A migration script will need to be run 
> that checks the hbase.rootdir for hfile v1s and runs a major compaction if 
> any found.
> I've not put a version on this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-4717) More efficient age-off of old data during major compaction

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-4717.
---
Resolution: Incomplete

> More efficient age-off of old data during major compaction
> --
>
> Key: HBASE-4717
> URL: https://issues.apache.org/jira/browse/HBASE-4717
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.94.0
>Reporter: Todd Lipcon
>
> Many applications need to implement efficient age-off of old data. We 
> currently only perform age-off during major compaction by scanning through 
> all of the KVs. Instead, we could implement the following:
> - Set hbase.hstore.compaction.max.size reasonably small. Thus, older store 
> files contain only smaller finite ranges of time.
> - Periodically run an "age-off compaction". This compaction would scan the 
> current list of storefiles. Any store file that falls entirely out of the TTL 
> time range would be dropped. Store files completely within the time range 
> would be un-altered. Those crossing the time-range boundary could either be 
> left alone or compacted using the existing compaction code.
> I don't have a design in mind for how exactly this would be implemented, but 
> hope to generate some discussion.
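A minimal sketch of the selection step described above, assuming each store file can report its oldest and newest cell timestamps; the StoreFileRange type and everything around it are made up for illustration and are not HBase internals.

{code:title=AgeOffSelector.java}
import java.util.ArrayList;
import java.util.List;

public class AgeOffSelector {
  // Hypothetical view of a store file: just its path and min/max cell timestamps.
  static class StoreFileRange {
    final String path;
    final long minTs, maxTs;
    StoreFileRange(String path, long minTs, long maxTs) {
      this.path = path; this.minTs = minTs; this.maxTs = maxTs;
    }
  }

  // Files whose newest cell is older than (now - ttlMillis) can be dropped without
  // rewriting any data; files entirely inside the TTL window are left untouched;
  // only the files straddling the boundary would need a real compaction.
  static List<StoreFileRange> selectDroppable(List<StoreFileRange> files,
                                              long now, long ttlMillis) {
    long cutoff = now - ttlMillis;
    List<StoreFileRange> droppable = new ArrayList<StoreFileRange>();
    for (StoreFileRange f : files) {
      if (f.maxTs < cutoff) {
        droppable.add(f);
      }
    }
    return droppable;
  }
}
{code}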



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-3220) Coprocessors: Streaming distributed computation framework

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-3220.
---
Resolution: Later

> Coprocessors: Streaming distributed computation framework
> -
>
> Key: HBASE-3220
> URL: https://issues.apache.org/jira/browse/HBASE-3220
> Project: HBase
>  Issue Type: Brainstorming
>  Components: Coprocessors
>Reporter: Andrew Purtell
>
> Consider a computational framework based on a stream processing model. 
> Logically: Generators emit keys (row keys, or full keys with 
> row+column:qualifier), fetch operators join keys to data fetched from the 
> region, filters drop according to (perhaps complex) matching on the keys 
> and/or values, combiners perform aggregation, mutators change values, 
> decorators add data, sinks do something useful with items arriving from the 
> stream, i.e. insert into response buffer, commit to region, replicate to 
> peer. Pipelines execute in parallel. Partitioners can split streams for 
> multithreading. Generators can be observers on a region for anchoring a 
> continuous process or an iterator as the first stage of a pipeline 
> constructed on demand with a terminating condition (like a Hadoop task). Kind 
> of like Cascading within regionserver processes, a nice model if not 
> literally Cascading the implementation. MapReduce can be supported with this 
> model, is a subset of it. Data can be ordered or unordered, depends on the 
> generator. Filters could be stateful or stateless: stateless filters could 
> handle data arriving in any order; stateful filters could be used with an 
> ordered generator.
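To make the pipeline model a bit more concrete, here is a toy sketch of the stage interfaces it implies (generator, filter, sink); every name below is invented and nothing corresponds to an actual coprocessor API.

{code:title=PipelineSketch.java}
import java.util.Iterator;

// Invented stage interfaces for the streaming model described above.
interface Generator<T> { Iterator<T> items(); }
interface StreamFilter<T> { boolean accept(T item); }
interface Sink<T> { void consume(T item); }

class Pipeline {
  // Drive items from the generator through a filter into a sink; a real system
  // would add fetch/combine/mutate stages, partitioning, and parallel execution.
  static <T> void run(Generator<T> gen, StreamFilter<T> filter, Sink<T> sink) {
    for (Iterator<T> it = gen.items(); it.hasNext(); ) {
      T item = it.next();
      if (filter.accept(item)) {
        sink.consume(item);   // e.g. append to a response buffer or commit to the region
      }
    }
  }
}
{code}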



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-2571) Coprocessors: Minitables

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-2571.
---
Resolution: Later

> Coprocessors: Minitables
> 
>
> Key: HBASE-2571
> URL: https://issues.apache.org/jira/browse/HBASE-2571
> Project: HBase
>  Issue Type: New Feature
>  Components: Coprocessors
>Reporter: Andrew Purtell
>
> From 
> http://turing.cs.washington.edu/papers/dataprojects-google-sigmodrecord08.pdf 
> :
> {quote}
> MINITABLES: SAMPLING BIGTABLE
> Alberto Lerner and S. Muthukrishnan
> [...] Because of [BigTable] semantics and storing scheme, skipping N rows is 
> not feasible without actually reading them. Even finding the count of rows in 
> a Bigtable at any point in time can be done only probabilistically. On the 
> bright side, since Bigtable does not provide a relational query engine, we do 
> not need to consider what are suitable sampling methods for various 
> relational operators (like joins) or take into account how sampling errors 
> compound with increasing levels of query composition. 
> _Uniform Random Sampling_
> Our sampling scheme extracts and presents a sample of a Bigtable's rows as if 
> it were a Bigtable itself, which we call a Minitable. The rationale here is 
> that code written to run against a Bigtable can run unchanged against a 
> sample thereof. Our sampling is based on a hash scheme. We pick a convenient 
> hash function that maps the rowname space into a very large keyspace (e.g., a 
> ax+b mod p function, where p is as large as 2128). The rows falling into the 
> first fp keys where f is the relative sample size (it is a fraction), would 
> belong in the sample. Formally, we pick a hash function h : R −> 0..p and if 
> h(r) E [0, fp−1], then add r to the sample. It is easy to see that the 
> expected size of the sample is f * 100% of the Bigtable rowcount independent 
> of the rowcount, and the probability that a particular row r is in the sample 
> is f, as desired. This hash-based sampling method supports maintenance of the 
> sample with each Bigtable mutation (insert, update, or deletion). Only the 
> system may forward relevant mutations from the Bigtable to the Minitable. 
> Otherwise, the latter would behave as just any other Bigtable: it could be 
> backed up and even be replicated. We are currently deploying Minitables in 
> the repository of documents that the crawling system generates. Several 
> Minitables, each with a different sample factor, allow that system to compute 
> aggregates much faster and more often.
> _Biased Sampling_
> Uniform random sampling is quite useful but some scenarios require biased 
> sampling methods. We are currently working on one such extension that we call 
> Mask Sampling. In this scheme, the decision to select a row to the sample is 
> still based on its rowname but now a user may specify a mask m over it. The 
> mask, which can be a regular expression that matches portions of a rowname, 
> is used to group rows together. Two rows belong to a same group if their 
> masks result in the same string. Mask sampling guarantees that if a group is 
> selected to the sample, that group will be adequately represented there, 
> regardless of that group's relative size.
> {quote}
> Clearly minitables can be constructed on the fly by a coprocessor attached to 
> the source table.
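A small sketch of the hash-based sampling rule quoted above: a row belongs to the minitable iff its hash lands in the first f fraction of the hash space, so the decision is stable across mutations of that row. The MD5-based hash below is just a stand-in chosen for illustration.

{code:title=HashRowSampler.java}
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashRowSampler {
  private final double fraction;   // f: relative sample size, e.g. 0.01 for a 1% sample

  public HashRowSampler(double fraction) { this.fraction = fraction; }

  // Include a row iff h(row) falls in the first f*p keys of the hash space p;
  // with a well-mixed hash the expected sample size is f * rowcount, and the
  // same row is always either in or out, regardless of inserts/updates/deletes.
  public boolean inSample(byte[] rowKey) throws NoSuchAlgorithmException {
    MessageDigest md5 = MessageDigest.getInstance("MD5");
    BigInteger h = new BigInteger(1, md5.digest(rowKey));   // value in [0, 2^128)
    double position = h.doubleValue() / Math.pow(2, 128);   // normalized to [0, 1)
    return position < fraction;
  }
}
{code}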



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5262.
---
Resolution: Not A Problem

> Structured event log for HBase for monitoring and auto-tuning performance
> -
>
> Key: HBASE-5262
> URL: https://issues.apache.org/jira/browse/HBASE-5262
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>
> Creating this JIRA to open a discussion about a structured (machine-readable) 
> log that will record events such as compaction start/end times, compaction 
> input/output files, their sizes, the same for flushes, etc. This can be 
> stored e.g. in a new system table in HBase itself. The data from this log can 
> then be analyzed and used to optimize compactions at run time, or otherwise 
> auto-tune HBase configuration to reduce the number of knobs the user has to 
> configure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5462) [monitor] Ganglia metric hbase.master.cluster_requests should exclude the scan meta request generated by master, or create a new metric which could show the real request

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5462.
---
Resolution: Invalid

> [monitor] Ganglia metric hbase.master.cluster_requests should exclude the 
> scan meta request generated by master, or create a new metric which could 
> show the real request from client
> -
>
> Key: HBASE-5462
> URL: https://issues.apache.org/jira/browse/HBASE-5462
> Project: HBase
>  Issue Type: Bug
>  Components: monitoring
>Affects Versions: 0.90.5, 0.92.0
> Environment: hbase 0.90.5
>Reporter: johnyang
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> We have a big table which has 30k regions, but the request rate is not very 
> high (about 50K per day).
> We use the hbase.master.cluster_requests metric to monitor the cluster 
> requests but find that lots of requests are generated by the master, which 
> scans the meta table at regular intervals.
> It is hard for us to monitor the real requests from the client. Is it possible 
> to filter out the meta table scans, or to create a new metric which shows only 
> the real requests from the client?
> Thank you.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5476) Merge flush/merging compaction; save on i/o by merging an existing file with content memstore

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5476.
---
Resolution: Incomplete

> Merge flush/merging compaction; save on i/o by merging an existing file with 
> content memstore
> -
>
> Key: HBASE-5476
> URL: https://issues.apache.org/jira/browse/HBASE-5476
> Project: HBase
>  Issue Type: Improvement
>Reporter: stack
>
> Compactions are slow.  To improve, we can work on making compactions faster 
> (TODO) or make it so compactions have less work to do.  This issue is about 
> the latter; doing something the bigtable paper talks of, where the content of 
> memory is merged with existing content in the filesystem (I'm sure we've 
> discussed this a bunch in the past but can't find an explicit issue for it).
> We save reading a memstore's worth of data back from the filesystem if we 
> merge the flush with a compaction.
> BT seems to include memstore content in minor compactions.  We should look at 
> doing that (would snapshot at start of compaction and then would integrate 
> the snapshot into resultant compacted file).  Alternatively we could flush 
> into a small file that is already in the filesystem (this could be tougher 
> given that we have this flush-compaction separation at the moment... how 
> would we make it so the file we're merging into is not picked up for a normal 
> compaction).
> Doing the merge compaction/flush merge would slow down flushes.  It could 
> back up memory into the global barrier such that we stop taking on writes 
> altogether.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5457) add inline index in data block for data which are not clustered together

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5457.
---
Resolution: Invalid

> add inline index in data block for data which are not clustered together
> 
>
> Key: HBASE-5457
> URL: https://issues.apache.org/jira/browse/HBASE-5457
> Project: HBase
>  Issue Type: New Feature
>Reporter: He Yongqiang
>
> As we go through our data schema, we found we have one large column family 
> which just duplicates data from another column family; it is simply a re-org 
> of the data that clusters it in a different way than the original column 
> family in order to serve another type of query efficiently.
> If we compare this second column family with the similar situation in MySQL, 
> it is like an index in MySQL. So if we could add an inline block index on the 
> required columns, the second column family would no longer be needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5383) Prevent the compaction read requests from changing the hotness of block cache

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5383.
---
Resolution: Incomplete
  Assignee: (was: Liyin Tang)

> Prevent the compaction read requests from changing the hotness of block cache
> -
>
> Key: HBASE-5383
> URL: https://issues.apache.org/jira/browse/HBASE-5383
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>
> The block cache is organized in a sorted way based on LRU or some other 
> algorithm, and it will age out some blocks when the algorithm believes these 
> blocks are not hot any more. 
> The motivation here is to prevent compaction read requests from changing 
> the hotness of the block cache.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5374) useTableNameGlobally is not initialized for ReaderV2

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5374.
---
Resolution: Invalid

> useTableNameGlobally is not initialized for ReaderV2
> 
>
> Key: HBASE-5374
> URL: https://issues.apache.org/jira/browse/HBASE-5374
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0
>Reporter: Ted Yu
>
> SchemaMetrics.useTableNameGlobally is a Boolean object that is not 
> initialized. It depends on the public static method configureGlobally() to 
> initialize it based on the configuration file, but this is only done for the 
> writer, not for the reader. So when invoking the hfile tool,
> {code}
> hbase/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -v -f YourFile
> {code}
> where HFileReaderV2 is invoked, it throws an exception complaining that the 
> flag is null.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-4925) Collect test cases for hadoop/hbase cluster

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-4925.
---
Resolution: Incomplete

> Collect test cases for hadoop/hbase cluster
> ---
>
> Key: HBASE-4925
> URL: https://issues.apache.org/jira/browse/HBASE-4925
> Project: HBase
>  Issue Type: Brainstorming
>  Components: test
>Reporter: Thomas Pan
>
> This entry is used to collect all the useful test cases to verify a 
> hadoop/hbase cluster. This is to follow up on yesterday's hack day at 
> Salesforce. Hopefully the information will be very useful for the whole 
> community.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5370) Allow HBase shell to set HTableDescriptor values

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5370.
---
Resolution: Incomplete

> Allow HBase shell to set HTableDescriptor values
> 
>
> Key: HBASE-5370
> URL: https://issues.apache.org/jira/browse/HBASE-5370
> Project: HBase
>  Issue Type: Improvement
>Reporter: Lars Hofhansl
>Priority: Minor
>
> Currently it does not seem to be possible to set a value on a table's 
> HTableDescriptor (either on creation or afterwards).
> The syntax I have in mind is something like:
> create {NAME=>'table', 'somekey'=>'somevalue'}, 'column'
> In analogy to how we allow a column to be either a string ('column') or an 
> association {NAME=>'column', ...}
> alter would be changed to allow setting arbitrary values.
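For reference, the Java API already allows arbitrary key/value metadata on a table descriptor; below is a minimal sketch of what the proposed shell syntax would map to underneath, using the hypothetical table/key/value names from the example above and the client API of that era (exact signatures may differ by version).

{code:title=CreateTableWithValue.java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateTableWithValue {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    HTableDescriptor htd = new HTableDescriptor("table");   // hypothetical table name
    htd.addFamily(new HColumnDescriptor("column"));
    htd.setValue("somekey", "somevalue");                   // arbitrary descriptor value

    admin.createTable(htd);
    admin.close();
  }
}
{code}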



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5316) Separate the client configuration for HTable and HBaseAdmin

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5316.
---
Resolution: Not A Problem

> Separate the client configuration for HTable and HBaseAdmin
> ---
>
> Key: HBASE-5316
> URL: https://issues.apache.org/jira/browse/HBASE-5316
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
>
> Currently HTable and HBaseAdmin read configurations based on the same config 
> keys, such as hbase.client.retries.number.
> Actually in some cases, the client needs different settings for HTable 
> operations and HBaseAdmin operations.
> One way is to pass different HBaseConfiguration objects to HTable and 
> HBaseAdmin.
> Another much clearer way is to separate the configurations for HTable and 
> HBaseAdmin by using different config keys.
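A minimal sketch of the first workaround mentioned above (separate Configuration objects with different settings), using the client API of that era; the table name and retry counts are made up. The separate-key approach would let a single Configuration carry both settings instead.

{code:title=SplitClientConfigs.java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;

public class SplitClientConfigs {
  public static void main(String[] args) throws Exception {
    // Data-path operations: fail fast with fewer retries.
    Configuration tableConf = HBaseConfiguration.create();
    tableConf.setInt("hbase.client.retries.number", 3);
    HTable table = new HTable(tableConf, "mytable");   // hypothetical table

    // Admin operations: a more patient retry policy.
    Configuration adminConf = HBaseConfiguration.create();
    adminConf.setInt("hbase.client.retries.number", 10);
    HBaseAdmin admin = new HBaseAdmin(adminConf);

    // ... use table and admin ...
    table.close();
    admin.close();
  }
}
{code}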



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-5366) Improve bulk table disable/enable/drop in shell

2015-04-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-5366.
---
Resolution: Invalid

> Improve bulk table disable/enable/drop in shell
> ---
>
> Key: HBASE-5366
> URL: https://issues.apache.org/jira/browse/HBASE-5366
> Project: HBase
>  Issue Type: Task
>Reporter: Ted Yu
>
> HBASE-3506 added regex support for disabling, enabling and dropping tables.
> Currently the list of tables is shown one per line, which may lead to too 
> many lines:
> {code}
> tychangTable998   
> 
> tychangTable999   
>
> Enable the above 901 tables (y/n)?
> y
> {code}
> When the number of tables is high, each line should display multiple tables.
> Disabling/enabling tables may take a long time. We should show the tables 
> that have been disabled/enabled in batches so that the user knows the 
> operation didn't hang.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

