[jira] [Updated] (HBASE-11155) Fix Validation Errors in Ref Guide

2014-05-13 Thread Misty Stanley-Jones (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misty Stanley-Jones updated HBASE-11155:


Attachment: HBASE-11155-schema_design.xml.patch
HBASE-11155-configuration.xml.patch
HBASE-11155-book.xml.patch

> Fix Validation Errors in Ref Guide
> --
>
> Key: HBASE-11155
> URL: https://issues.apache.org/jira/browse/HBASE-11155
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.98.2
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Attachments: HBASE-11155-book.xml.patch, 
> HBASE-11155-configuration.xml.patch, HBASE-11155-schema_design.xml.patch
>
>
> Before I do serious documentation work, I have to fix all of the validation 
> errors that are somehow not causing the Ref Guide to break the builds. I will 
> attach one patch per file -- that's the easiest way I know how to do it. I 
> will try not to make any content changes, only validation changes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2014-05-13 Thread Li Jiajia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Jiajia updated HBASE-11144:
--

Status: Patch Available  (was: Open)

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Li Jiajia
> Attachments: MultiRowRangeFilter.patch, MultiRowRangeFilter2.patch
>
>
> Provide a filter feature to support scanning multiple row key ranges. It can 
> construct the row key ranges from the passed list which can be accessed by 
> each region server. 
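
The idea of testing a row key against a list of ranges can be sketched as below. This is only an illustration under stated assumptions: the class and method names (RowRangeSketch, Range, inAnyRange) are hypothetical and are not taken from the attached MultiRowRangeFilter patches, ranges are assumed to be half-open [start, stop), and a plain linear scan is used where a real filter would probably sort the ranges and seek.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Illustration only: hypothetical names, not the API from the attached patch. */
final class RowRangeSketch {
    /** A half-open row key range [start, stop). */
    static final class Range {
        final byte[] start;
        final byte[] stop;

        Range(String start, String stop) {
            this.start = start.getBytes(StandardCharsets.UTF_8);
            this.stop = stop.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Lexicographic comparison that treats each byte as unsigned. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** True when the row falls inside at least one of the ranges. */
    static boolean inAnyRange(byte[] row, List<Range> ranges) {
        for (Range r : ranges) {
            if (compare(row, r.start) >= 0 && compare(row, r.stop) < 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Range> ranges = Arrays.asList(new Range("a", "c"), new Range("m", "p"));
        for (String key : new String[] {"b", "c", "n", "z"}) {
            boolean hit = inAnyRange(key.getBytes(StandardCharsets.UTF_8), ranges);
            System.out.println(key + " -> " + hit);
        }
    }
}
```

The stop key is treated as exclusive here, which mirrors how a scan's stop row is normally handled; whether the patch makes the same choice is not stated in this thread.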





[jira] [Commented] (HBASE-11154) Document how to use Reverse Scan API

2014-05-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996078#comment-13996078
 ] 

stack commented on HBASE-11154:
---

lgtm [~misty]  How about we call this good and your first doc fixup?  Ok if I 
commit this?

> Document how to use Reverse Scan API
> 
>
> Key: HBASE-11154
> URL: https://issues.apache.org/jira/browse/HBASE-11154
> Project: HBase
>  Issue Type: Task
>  Components: documentation, Scanners
>Affects Versions: 0.98.2
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Attachments: HBASE-11154.patch
>
>






[jira] [Updated] (HBASE-11154) Document how to use Reverse Scan API

2014-05-13 Thread Misty Stanley-Jones (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misty Stanley-Jones updated HBASE-11154:


Attachment: HBASE-11154-2.patch

I think this one will do it. Note that I did fix one problem where " characters 
needed to be escaped as &quot; in a couple places, and that is not directly 
relevant to this fix. I had to do that to get the file to open -- not sure how 
I missed it before. I actually checked a fresh version out so I wouldn't have 
any unintended side effects, so not sure if that mistake was checked in by 
someone else. I'm not going to figure it out tonight.

> Document how to use Reverse Scan API
> 
>
> Key: HBASE-11154
> URL: https://issues.apache.org/jira/browse/HBASE-11154
> Project: HBase
>  Issue Type: Task
>  Components: documentation, Scanners
>Affects Versions: 0.98.2
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Attachments: HBASE-11154-2.patch, HBASE-11154.patch
>
>






[jira] [Commented] (HBASE-11135) Change region sequenceid generation so happens earlier in the append cycle rather than just before added to file

2014-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995972#comment-13995972
 ] 

Hadoop QA commented on HBASE-11135:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12644499/11135v6.txt
  against trunk revision .
  ATTACHMENT ID: 12644499

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 12 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 5 
warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.wal.TestLogRolling

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9509//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9509//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9509//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9509//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9509//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9509//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9509//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9509//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9509//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9509//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9509//console

This message is automatically generated.

> Change region sequenceid generation so happens earlier in the append cycle 
> rather than just before added to file
> 
>
> Key: HBASE-11135
> URL: https://issues.apache.org/jira/browse/HBASE-11135
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: stack
>Assignee: stack
> Attachments: 11135.wip.txt, 11135v2.txt, 11135v5.txt, 11135v5.txt, 
> 11135v5.txt, 11135v6.txt
>
>
> Currently we assign the region edit/sequence id just before we put it in the 
> WAL.  We do it in the single thread that feeds from the ring buffer.  Doing 
> it at this point, we can ensure order, that the edits will be in the file in 
> accordance w/ the ordering of the region sequence id.
> But the point at which region sequence id is assigned an edit is deep down in 
> the WAL system and there is a lag between our putting an edit into the WAL 
> system and the edit actually getting its edit/sequence id.
> This lag -- "late-binding" -- complicates the unification of mvcc and region 
> sequence id, especially around async WAL writes (and, related, for no-WAL 
> writes) -- the parent for this issue (For async, how you get the edit id in 
> our system when the threads have all gone home -- unless you make them wait?)
> Chatting w/ Jeffrey Zhong yesterday, we came up with a crazypants means of 
> getting the region sequence id near-immediately.  We'll run two ringbuffers.  
> The first will mesh all handler threads and the consumer will generate ids 
> (we will have order on other side of this first ring buffer), and then if 
> async or no sync, we will just let the threads return ... updating mvcc just 
> before we let them go.  All other calls will go up on to the second ring 
> buffer to be serviced as now (batching, distribution out among the sync'ing 
> threads).  The first rb will

[jira] [Updated] (HBASE-11108) Split ZKTable into interface and implementation

2014-05-13 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-11108:


Status: Patch Available  (was: Open)

> Split ZKTable into interface and implementation
> ---
>
> Key: HBASE-11108
> URL: https://issues.apache.org/jira/browse/HBASE-11108
> Project: HBase
>  Issue Type: Sub-task
>  Components: Consensus, Zookeeper
>Affects Versions: 0.99.0
>Reporter: Konstantin Boudnik
>Assignee: Mikhail Antonov
> Attachments: HBASE-11108.patch, HBASE-11108.patch, HBASE-11108.patch
>
>
> In HBASE-11071 we are trying to split admin handlers away from ZK. However, a 
> ZKTable instance is being used in multiple places, hence it would be 
> beneficial to hide its implementation behind a well defined interface.





[jira] [Created] (HBASE-11153) http webUI's should redirect to https when enabled

2014-05-13 Thread Nick Dimiduk (JIRA)
Nick Dimiduk created HBASE-11153:


 Summary: http webUI's should redirect to https when enabled
 Key: HBASE-11153
 URL: https://issues.apache.org/jira/browse/HBASE-11153
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.98.0
Reporter: Nick Dimiduk
Priority: Minor


When configured to listen on https, we should redirect non-secure requests to 
the appropriate port/protocol. Currently we respond with a 200 and no data, 
which is perplexing.





[jira] [Updated] (HBASE-10909) Abstract out ZooKeeper usage in HBase

2014-05-13 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-10909:


Attachment: (was: HBase Consensus.pdf)

> Abstract out ZooKeeper usage in HBase
> -
>
> Key: HBASE-10909
> URL: https://issues.apache.org/jira/browse/HBASE-10909
> Project: HBase
>  Issue Type: Umbrella
>  Components: Consensus, Zookeeper
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
> Attachments: HBaseConsensus.pdf, HBaseConsensus.pdf, 
> HBaseConsensus.pdf
>
>
> As some sort of follow-up or initial step towards HBASE-10296.
> Whatever consensus algorithm/library may be chosen, perhaps one of the first 
> practical steps towards this goal would be to better abstract ZK-related API 
> and details, which are now throughout the codebase (mostly leaked through 
> ZkUtil, ZooKeeperWatcher and listeners).
> This jira is umbrella for relevant subtasks.
> As the design doc is in process of peer-review now, please use Google Doc 
> (linked below) instead of pdf.





[jira] [Updated] (HBASE-10909) Abstract out ZooKeeper usage in HBase

2014-05-13 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-10909:


Description: 
As some sort of follow-up or initial step towards HBASE-10296.
Whatever consensus algorithm/library may be chosen, perhaps one of the first 
practical steps towards this goal would be to better abstract ZK-related API 
and details, which are now throughout the codebase (mostly leaked through 
ZkUtil, ZooKeeperWatcher and listeners).

This jira is umbrella for relevant subtasks. Design doc is attached, for 
comments/questions there's a google doc linked.

  was:
As some sort of follow-up or initial step towards HBASE-10296.
Whatever consensus algorithm/library may be chosen, perhaps one of the first 
practical steps towards this goal would be to better abstract ZK-related API 
and details, which are now throughout the codebase (mostly leaked through 
ZkUtil, ZooKeeperWatcher and listeners).

This jira is umbrella for relevant subtasks. Design doc is attached, for 
comments/questions there's google doc linked.


> Abstract out ZooKeeper usage in HBase
> -
>
> Key: HBASE-10909
> URL: https://issues.apache.org/jira/browse/HBASE-10909
> Project: HBase
>  Issue Type: Umbrella
>  Components: Consensus, Zookeeper
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
> Attachments: HBaseConsensus.pdf, HBaseConsensus.pdf, 
> HBaseConsensus.pdf
>
>
> As some sort of follow-up or initial step towards HBASE-10296.
> Whatever consensus algorithm/library may be chosen, perhaps one of the first 
> practical steps towards this goal would be to better abstract ZK-related API 
> and details, which are now throughout the codebase (mostly leaked through 
> ZkUtil, ZooKeeperWatcher and listeners).
> This jira is umbrella for relevant subtasks. Design doc is attached, for 
> comments/questions there's a google doc linked.





[jira] [Updated] (HBASE-10909) Abstract out ZooKeeper usage in HBase

2014-05-13 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-10909:


Description: 
As some sort of follow-up or initial step towards HBASE-10296.
Whatever consensus algorithm/library may be chosen, perhaps one of the first 
practical steps towards this goal would be to better abstract ZK-related API 
and details, which are now throughout the codebase (mostly leaked through 
ZkUtil, ZooKeeperWatcher and listeners).

This jira is umbrella for relevant subtasks. Design doc is attached, for 
comments/questions there's google doc linked.

  was:
As some sort of follow-up or initial step towards HBASE-10296.
Whatever consensus algorithm/library may be chosen, perhaps one of the first 
practical steps towards this goal would be to better abstract ZK-related API 
and details, which are now throughout the codebase (mostly leaked through 
ZkUtil, ZooKeeperWatcher and listeners).

This jira is umbrella for relevant subtasks.

As the design doc is in process of peer-review now, please use Google Doc 
(linked below) instead of pdf.


> Abstract out ZooKeeper usage in HBase
> -
>
> Key: HBASE-10909
> URL: https://issues.apache.org/jira/browse/HBASE-10909
> Project: HBase
>  Issue Type: Umbrella
>  Components: Consensus, Zookeeper
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
> Attachments: HBaseConsensus.pdf, HBaseConsensus.pdf, 
> HBaseConsensus.pdf
>
>
> As some sort of follow-up or initial step towards HBASE-10296.
> Whatever consensus algorithm/library may be chosen, perhaps one of the first 
> practical steps towards this goal would be to better abstract ZK-related API 
> and details, which are now throughout the codebase (mostly leaked through 
> ZkUtil, ZooKeeperWatcher and listeners).
> This jira is umbrella for relevant subtasks. Design doc is attached, for 
> comments/questions there's google doc linked.





[jira] [Commented] (HBASE-10965) Automate detection of presence of Filter#filterRow()

2014-05-13 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996049#comment-13996049
 ] 

ramkrishna.s.vasudevan commented on HBASE-10965:


{code}
+  Class declaringClass = m.getDeclaringClass();
+  Class superCls = declaringClass.getSuperclass();
+  if (declaringClass.equals(clazz) && superCls.equals(filterCls)) {
+    // filter class directly overrides Filter
+    return filter.hasFilterRow();
+  }
{code}
If the filter directly extends the 'Filter' class, would 
'declaringClass.getSuperclass()' return null?  I am not sure.  Can you 
verify this? If so, a null check may be needed.
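
A small standalone check of the reflection behaviour in question, as a sketch only. The Demo* classes are hypothetical stand-ins, not the HBase Filter hierarchy, and the override test shown is one possible formulation rather than the logic in the patch. As far as the Java API goes, Class#getSuperclass() returns null only for Object, interfaces, primitive types and void, so the null case matters when the declaring type is an interface.

```java
import java.lang.reflect.Method;

// Hypothetical stand-ins for illustration only; these are NOT the HBase classes.
abstract class DemoFilter {
    public boolean filterRow() {
        return false;
    }
}

abstract class DemoFilterBase extends DemoFilter { }

class DirectFilter extends DemoFilter {
    @Override
    public boolean filterRow() {
        return true;
    }
}

class PlainFilter extends DemoFilterBase { }

interface DemoMarker { }

final class OverrideCheck {
    /** True when filterRow() as seen from clazz is declared somewhere other than root. */
    static boolean overridesFilterRow(Class<?> clazz, Class<?> root) {
        try {
            Method m = clazz.getMethod("filterRow");
            return !m.getDeclaringClass().equals(root);
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A class that directly extends an abstract class has that class as its superclass.
        System.out.println(DirectFilter.class.getSuperclass());
        // An interface has no superclass, so calling equals() on the result would throw.
        System.out.println(DemoMarker.class.getSuperclass());
        System.out.println(overridesFilterRow(DirectFilter.class, DemoFilter.class));
        System.out.println(overridesFilterRow(PlainFilter.class, DemoFilter.class));
    }
}
```

So if Filter is an abstract class in the branch being patched, a filter extending it directly gets Filter as its superclass and no null appears; if the declaring type can be an interface, guarding superCls before calling equals() on it looks prudent.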

> Automate detection of presence of Filter#filterRow()
> 
>
> Key: HBASE-10965
> URL: https://issues.apache.org/jira/browse/HBASE-10965
> Project: HBase
>  Issue Type: Task
>  Components: Filters
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 10965-v1.txt, 10965-v2.txt, 10965-v3.txt, 10965-v4.txt, 
> 10965-v6.txt, 10965-v7.txt
>
>
> There is potential inconsistency between the return value of 
> Filter#hasFilterRow() and presence of Filter#filterRow().
> Filters may override Filter#filterRow() while leaving return value of 
> Filter#hasFilterRow() being false (inherited from FilterBase).
> Downside to purely depending on hasFilterRow() telling us whether custom 
> filter overrides filterRow(List) or filterRow() is that the check below may 
> be rendered ineffective:
> {code}
>   if (nextKv == KV_LIMIT) {
>     if (this.filter != null && filter.hasFilterRow()) {
>       throw new IncompatibleFilterException(
>         "Filter whose hasFilterRow() returns true is incompatible with scan with limit!");
>     }
> {code}
> When a user forgets to override hasFilterRow(), the above check is 
> ineffective.
> Another limitation is that we cannot optimize FilterList#filterRow() through 
> short circuit when FilterList#hasFilterRow() returns false.
> See 
> https://issues.apache.org/jira/browse/HBASE-11093?focusedCommentId=13985149&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13985149
> This JIRA aims to remove the inconsistency by automatically detecting the 
> presence of overridden Filter#filterRow(). For FilterBase-derived classes, if 
> filterRow() is implemented and not inherited from FilterBase, it is 
> equivalent to having hasFilterRow() return true.
> With precise detection of presence of Filter#filterRow(), the following code 
> from HRegion is no longer needed while backward compatibility is kept.
> {code}
>   return filter != null && (!filter.hasFilterRow())
>   && filter.filterRow();
> {code}





[jira] [Commented] (HBASE-10965) Automate detection of presence of Filter#filterRow()

2014-05-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996106#comment-13996106
 ] 

Lars Hofhansl commented on HBASE-10965:
---

-0 on this change. Just seems to add complexity for little gain.
If we could remove hasFilterRow completely from the interface (which we can do 
in 1.0, I think) that would be a different story.
Now we have the worst of both worlds. Filter still has hasFilterRow() and on 
top of that we have the reflection stuff.


> Automate detection of presence of Filter#filterRow()
> 
>
> Key: HBASE-10965
> URL: https://issues.apache.org/jira/browse/HBASE-10965
> Project: HBase
>  Issue Type: Task
>  Components: Filters
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 10965-v1.txt, 10965-v2.txt, 10965-v3.txt, 10965-v4.txt, 
> 10965-v6.txt, 10965-v7.txt
>
>
> There is potential inconsistency between the return value of 
> Filter#hasFilterRow() and presence of Filter#filterRow().
> Filters may override Filter#filterRow() while leaving return value of 
> Filter#hasFilterRow() being false (inherited from FilterBase).
> Downside to purely depending on hasFilterRow() telling us whether custom 
> filter overrides filterRow(List) or filterRow() is that the check below may 
> be rendered ineffective:
> {code}
>   if (nextKv == KV_LIMIT) {
>     if (this.filter != null && filter.hasFilterRow()) {
>       throw new IncompatibleFilterException(
>         "Filter whose hasFilterRow() returns true is incompatible with scan with limit!");
>     }
> {code}
> When a user forgets to override hasFilterRow(), the above check is 
> ineffective.
> Another limitation is that we cannot optimize FilterList#filterRow() through 
> short circuit when FilterList#hasFilterRow() returns false.
> See 
> https://issues.apache.org/jira/browse/HBASE-11093?focusedCommentId=13985149&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13985149
> This JIRA aims to remove the inconsistency by automatically detecting the 
> presence of overridden Filter#filterRow(). For FilterBase-derived classes, if 
> filterRow() is implemented and not inherited from FilterBase, it is 
> equivalent to having hasFilterRow() return true.
> With precise detection of presence of Filter#filterRow(), the following code 
> from HRegion is no longer needed while backward compatibility is kept.
> {code}
>   return filter != null && (!filter.hasFilterRow())
>   && filter.filterRow();
> {code}





[jira] [Commented] (HBASE-11123) Upgrade instructions from 0.94 to 0.98

2014-05-13 Thread Misty Stanley-Jones (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996107#comment-13996107
 ] 

Misty Stanley-Jones commented on HBASE-11123:
-

Sorry, good catch. I did not even parse the word "rolling" when I was reading 
what he said. Let me fix it and redo the patch.

> Upgrade instructions from 0.94 to 0.98
> --
>
> Key: HBASE-11123
> URL: https://issues.apache.org/jira/browse/HBASE-11123
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 0.98.2
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
>Priority: Minor
> Attachments: HBASE-11123.patch
>
>
> I cloned this from the 0.96 upgrade docs task. It was suggested that we need 
> upgrade instructions from 0.94 to 0.98. I will need source material to even 
> prioritize this. Assuming this is Minor.





[jira] [Commented] (HBASE-11154) Document how to use Reverse Scan API

2014-05-13 Thread Misty Stanley-Jones (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996161#comment-13996161
 ] 

Misty Stanley-Jones commented on HBASE-11154:
-

OK I don't know what to do about this result. I'll ask for help tomorrow in 
figuring it out.

> Document how to use Reverse Scan API
> 
>
> Key: HBASE-11154
> URL: https://issues.apache.org/jira/browse/HBASE-11154
> Project: HBase
>  Issue Type: Task
>  Components: documentation, Scanners
>Affects Versions: 0.98.2
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Attachments: HBASE-11154.patch
>
>






[jira] [Commented] (HBASE-11154) Document how to use Reverse Scan API

2014-05-13 Thread Misty Stanley-Jones (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996162#comment-13996162
 ] 

Misty Stanley-Jones commented on HBASE-11154:
-

OK at least one of the problems is that I used  which is not valid in 
DB5. I will fix that up now and re-submit the patch.

> Document how to use Reverse Scan API
> 
>
> Key: HBASE-11154
> URL: https://issues.apache.org/jira/browse/HBASE-11154
> Project: HBase
>  Issue Type: Task
>  Components: documentation, Scanners
>Affects Versions: 0.98.2
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Attachments: HBASE-11154.patch
>
>






[jira] [Updated] (HBASE-11112) PerformanceEvaluation should document --multiGet option on its printUsage.

2014-05-13 Thread Jean-Marc Spaggiari (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Marc Spaggiari updated HBASE-11112:


Status: Patch Available  (was: Open)

> PerformanceEvaluation should document --multiGet option on its printUsage.
> --
>
> Key: HBASE-11112
> URL: https://issues.apache.org/jira/browse/HBASE-11112
> Project: HBase
>  Issue Type: Bug
>  Components: documentation, Performance
>Affects Versions: 0.98.3
>Reporter: Jean-Marc Spaggiari
>Assignee: Jean-Marc Spaggiari
> Attachments: HBASE-11112-v0-trunk.patch, HBASE-11112-v1-trunk.patch
>
>






[jira] [Commented] (HBASE-11128) Add -target option to ExportSnapshot to export with a different name

2014-05-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995950#comment-13995950
 ] 

Hudson commented on HBASE-11128:


SUCCESS: Integrated in HBase-0.94-security #482 (See 
[https://builds.apache.org/job/HBase-0.94-security/482/])
HBASE-11128 Add -target option to ExportSnapshot to export with a different 
name (mbertozzi: rev 1593778)
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/snapshot/TestExportSnapshot.java


> Add -target option to ExportSnapshot to export with a different name
> 
>
> Key: HBASE-11128
> URL: https://issues.apache.org/jira/browse/HBASE-11128
> Project: HBase
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 0.99.0
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>Priority: Trivial
> Fix For: 0.99.0, 0.96.3, 0.94.20, 0.98.3
>
> Attachments: HBASE-11128-v0.patch
>
>
> Add a "-target" option to export the snapshot using a different name





[jira] [Updated] (HBASE-11143) ageOfLastShippedOp metric is confusing

2014-05-13 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-11143:
--

Attachment: (was: 11143-trunk.txt)

> ageOfLastShippedOp metric is confusing
> --
>
> Key: HBASE-11143
> URL: https://issues.apache.org/jira/browse/HBASE-11143
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.99.0, 0.96.3, 0.94.20, 0.98.3
>
> Attachments: 11143-0.94-v2.txt, 11143-0.94.txt, 11143-trunk.txt
>
>
> We are trying to report on replication lag and find that there is no good 
> single metric to do that.
> ageOfLastShippedOp is close, but unfortunately it is increased even when 
> there is nothing to ship on a particular RegionServer.
> I would like to discuss a few options here:
> Add a new metric: replicationQueueTime (or something) with the above meaning. 
> I.e. if we have something to ship we set the age of that last shipped edit, 
> if we fail we increment that last time (just like we do now). But if there is 
> nothing to replicate we set it to current time (and hence that metric is 
> reported as close to 0).
> Alternatively we could change the meaning of ageOfLastShippedOp to behave 
> that way. That might lead to surprises, but the current behavior is clearly 
> weird when there is nothing to replicate.
> Comments? [~jdcryans], [~stack].
> If the approach sounds good, I'll make a patch for all branches.
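
The proposed metric semantics could be sketched roughly as follows. This is a minimal sketch of the behaviour described above and not code from any of the attached patches; the class name and its methods are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```java
/** Illustration only: hypothetical class, not part of HBase replication metrics. */
final class ReplicationQueueTimeSketch {
    // Timestamp the reported age is measured against.
    private long referenceTs;

    ReplicationQueueTimeSketch(long startTs) {
        this.referenceTs = startTs;
    }

    /** An edit was shipped: measure age from that edit's write time. */
    void onShipped(long editWriteTs) {
        referenceTs = editWriteTs;
    }

    /** Shipping failed: keep the old reference so the reported age keeps growing. */
    void onShipFailed() {
        // intentionally empty
    }

    /** Nothing to replicate: reset the reference to now so the age reads close to 0. */
    void onQueueEmpty(long now) {
        referenceTs = now;
    }

    long ageMillis(long now) {
        return Math.max(0L, now - referenceTs);
    }

    public static void main(String[] args) {
        ReplicationQueueTimeSketch metric = new ReplicationQueueTimeSketch(1000L);
        metric.onShipped(1500L);
        System.out.println("after ship: " + metric.ageMillis(2000L));
        metric.onShipFailed();
        System.out.println("after failure: " + metric.ageMillis(2600L));
        metric.onQueueEmpty(3000L);
        System.out.println("idle: " + metric.ageMillis(3005L));
    }
}
```

The only difference from the behaviour complained about is the onQueueEmpty path: an idle RegionServer stops accumulating age instead of reporting an ever-growing value.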





[jira] [Commented] (HBASE-10944) Remove all kv.getBuffer() and kv.getRow() references existing in the code

2014-05-13 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996070#comment-13996070
 ] 

ramkrishna.s.vasudevan commented on HBASE-10944:


The thing to note here is that in the write path the key is very 
important.  So we could try passing the Cell, creating a KeyOnlyKeyValue from 
it, and using the Cell APIs to create the keys.  We need to see if there is any 
performance impact from that.


> Remove all kv.getBuffer() and kv.getRow() references existing in the code
> -
>
> Key: HBASE-10944
> URL: https://issues.apache.org/jira/browse/HBASE-10944
> Project: HBase
>  Issue Type: Sub-task
>Reporter: ramkrishna.s.vasudevan
> Fix For: 0.99.0
>
>
> kv.getRow() and kv.getBuffer() are still used in places to form key byte[] 
> and row byte[].  Removing all such instances including testcases will make 
> the usage of Cell complete.





[jira] [Updated] (HBASE-11156) Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available

2014-05-13 Thread Jiten (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiten updated HBASE-11156:
--

Description: 
# hbase shell
2014-05-13 14:51:41,582 INFO  [main] Configuration.deprecation: 
hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 0.96.1.1-cdh5.0.0, rUnknown, Thu Mar 27 23:01:59 PDT 2014.

Not able to create table in Hbase. Please help


>  Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use 
> io.native.lib.available
> -
>
> Key: HBASE-11156
> URL: https://issues.apache.org/jira/browse/HBASE-11156
> Project: HBase
>  Issue Type: Bug
>  Components: Admin
>Affects Versions: 0.96.1.1
>Reporter: Jiten
>Priority: Critical
>
> # hbase shell
> 2014-05-13 14:51:41,582 INFO  [main] Configuration.deprecation: 
> hadoop.native.lib is deprecated. Instead, use io.native.lib.available
> HBase Shell; enter 'help' for list of supported commands.
> Type "exit" to leave the HBase Shell
> Version 0.96.1.1-cdh5.0.0, rUnknown, Thu Mar 27 23:01:59 PDT 2014.
> Not able to create table in Hbase. Please help





[jira] [Commented] (HBASE-10924) [region_mover]: Adjust region_mover script to retry unloading a server a configurable number of times in case of region splits/merges

2014-05-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996134#comment-13996134
 ] 

Lars Hofhansl commented on HBASE-10924:
---

Is this an 0.94 issues specifically?

> [region_mover]: Adjust region_mover script to retry unloading a server a 
> configurable number of times in case of region splits/merges
> -
>
> Key: HBASE-10924
> URL: https://issues.apache.org/jira/browse/HBASE-10924
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.94.15
>Reporter: Aleksandr Shulman
>Assignee: Aleksandr Shulman
>  Labels: region_mover, rolling_upgrade
> Fix For: 0.94.20
>
> Attachments: HBASE-10924-0.94-v2.patch, HBASE-10924-0.94-v3.patch
>
>
> Observed behavior:
> In about 5% of cases, my rolling upgrade tests fail because of stuck regions 
> during a region server unload. My theory is that this occurs when region 
> assignment information changes between the time the region list is generated, 
> and the time when the region is to be moved.
> An example of such a region information change is a split or merge.
> Example:
> Regionserver A has 100 regions (#0-#99). The balancer is turned off and the 
> regionmover script is called to unload this regionserver. The regionmover 
> script will generate the list of 100 regions to be moved and then proceed 
> down that list, moving the regions off in series. However, there is a region, 
> #84, that has split into two daughter regions while regions 0-83 were moved. 
> The script will get stuck trying to move #84, time out, and then the failure 
> will bubble up (attempt 1 failed).
> Proposed solution:
> This specific failure mode should be caught and the region_mover script 
> should then retry moving off all the remaining regions. It will now have 16+1 (due 
> to split) regions to move. There is a good chance that it will be able to 
> move all 17 off without issues. However, should it encounter this same issue 
> (attempt 2 failed), it will retry again. This process will continue until the 
> maximum number of unload retry attempts has been reached.
> This is not foolproof, but let's say for the sake of argument that 5% of 
> unload attempts hit this issue, then with a retry count of 3, it will reduce 
> the unload failure probability from 0.05 to 0.000125 (0.05^3).
> Next steps:
> I am looking for feedback on this approach. If it seems like a sensible 
> approach, I will create a strawman patch and test it.
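
The arithmetic and the retry idea in the proposal above can be sketched roughly as follows. This is only an illustration in Java, not the region_mover script itself. UnloadRetrySketch, failureProbability, and unloadWithRetries are hypothetical names, the BooleanSupplier stands in for one full unload attempt, and the probability formula assumes that attempts fail independently.

```java
import java.util.function.BooleanSupplier;

// Illustrative only: UnloadRetrySketch and its methods are hypothetical names,
// not part of HBase or the region_mover script.
class UnloadRetrySketch {

    // Probability that every one of `attempts` independent tries fails,
    // given that a single try fails with probability `perAttemptFailure`.
    static double failureProbability(double perAttemptFailure, int attempts) {
        return Math.pow(perAttemptFailure, attempts);
    }

    // Runs `unloadOnce` up to `maxAttempts` times and returns the number of
    // the attempt that succeeded, or -1 if all attempts failed.
    // `unloadOnce` is assumed to regenerate the region list on each call.
    static int unloadWithRetries(BooleanSupplier unloadOnce, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (unloadOnce.getAsBoolean()) {
                return attempt;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(failureProbability(0.05, 3));
        int[] calls = {0};
        // Simulated unload that fails once (region split mid-run), then succeeds.
        int attempt = unloadWithRetries(() -> ++calls[0] >= 2, 3);
        System.out.println("succeeded on attempt " + attempt);
    }
}
```

Regenerating the region list inside each attempt is what lets a retry pick up the daughter regions created by a split.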





[jira] [Comment Edited] (HBASE-10924) [region_mover]: Adjust region_mover script to retry unloading a server a configurable number of times in case of region splits/merges

2014-05-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996134#comment-13996134
 ] 

Lars Hofhansl edited comment on HBASE-10924 at 5/13/14 6:31 AM:


Is this an 0.94 issue specifically?


was (Author: lhofhansl):
Is this an 0.94 issues specifically?

> [region_mover]: Adjust region_mover script to retry unloading a server a 
> configurable number of times in case of region splits/merges
> -
>
> Key: HBASE-10924
> URL: https://issues.apache.org/jira/browse/HBASE-10924
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.94.15
>Reporter: Aleksandr Shulman
>Assignee: Aleksandr Shulman
>  Labels: region_mover, rolling_upgrade
> Fix For: 0.94.20
>
> Attachments: HBASE-10924-0.94-v2.patch, HBASE-10924-0.94-v3.patch
>
>
> Observed behavior:
> In about 5% of cases, my rolling upgrade tests fail because of stuck regions 
> during a region server unload. My theory is that this occurs when region 
> assignment information changes between the time the region list is generated, 
> and the time when the region is to be moved.
> An example of such a region information change is a split or merge.
> Example:
> Regionserver A has 100 regions (#0-#99). The balancer is turned off and the 
> regionmover script is called to unload this regionserver. The regionmover 
> script will generate the list of 100 regions to be moved and then proceed 
> down that list, moving the regions off in series. However, there is a region, 
> #84, that has split into two daughter regions while regions 0-83 were moved. 
> The script will get stuck trying to move #84, time out, and then the failure 
> will bubble up (attempt 1 failed).
> Proposed solution:
> This specific failure mode should be caught and the region_mover script 
> should then retry moving off all the remaining regions. It will now have 16+1 (due 
> to split) regions to move. There is a good chance that it will be able to 
> move all 17 off without issues. However, should it encounter this same issue 
> (attempt 2 failed), it will retry again. This process will continue until the 
> maximum number of unload retry attempts has been reached.
> This is not foolproof, but let's say for the sake of argument that 5% of 
> unload attempts hit this issue, then with a retry count of 3, it will reduce 
> the unload failure probability from 0.05 to 0.000125 (0.05^3).
> Next steps:
> I am looking for feedback on this approach. If it seems like a sensible 
> approach, I will create a strawman patch and test it.





[jira] [Assigned] (HBASE-11126) Add RegionObserver pre hooks that operate under row lock

2014-05-13 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan reassigned HBASE-11126:
--

Assignee: ramkrishna.s.vasudevan

> Add RegionObserver pre hooks that operate under row lock
> 
>
> Key: HBASE-11126
> URL: https://issues.apache.org/jira/browse/HBASE-11126
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Andrew Purtell
>Assignee: ramkrishna.s.vasudevan
>
> The coprocessor hooks were placed outside of row locks. This was meant to 
> sidestep performance issues arising from significant work done within hook 
> invocations. However as the security code increases in sophistication we are 
> now running into concurrency issues trying to use them as a result of that 
> early decision. Since the initial introduction of coprocessor upcalls there 
> has been some significant refactoring done around them and concurrency 
> control in core has become more complex. This is potentially an issue for 
> many coprocessor users.
> We should do either:
> - Move all existing RegionObserver pre* hooks to execute under row lock.
> - Introduce a new set of RegionObserver pre* hooks that execute under row 
> lock, named to indicate such.
> The second option is less likely to lead to surprises.
> All RegionObserver hook Javadoc should be updated with advice to the 
> coprocessor implementor not to take their own row locks in the hook. If the 
> current thread happens to already have a row lock and they try to take a lock 
> on another row, there is a deadlock risk.
> As always a drawback of adding hooks is the potential for performance impact. 
> We should benchmark the impact and decide if the second option above is a 
> viable choice or if the first option is required.
> Finally, we should introduce a higher level interface for managing the 
> registration of 'user' code for execution from the low level hooks. I filed 
> HBASE-11125 to discuss this further.
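
To make the row-lock concern above concrete, here is a minimal sketch, assuming a simple table of per-row ReentrantLocks. RowLockSketch, PreHook, prePutUnderRowLock, and putUnderRowLock are hypothetical names and do not mirror HBase's RegionObserver API or its locking internals. The only point is that the hook runs while the caller already holds the row lock, so the hook should not try to lock another row.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only: these types are hypothetical and do not mirror HBase internals.
class RowLockSketch {

    // Hypothetical stand-in for a "pre" hook documented to run under the row lock.
    interface PreHook {
        void prePutUnderRowLock(String row);
    }

    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    ReentrantLock lockFor(String row) {
        return locks.computeIfAbsent(row, r -> new ReentrantLock());
    }

    // Takes the row lock, invokes the hook while holding it, then applies the write.
    void putUnderRowLock(String row, PreHook hook, Runnable write) {
        ReentrantLock lock = lockFor(row);
        lock.lock();
        try {
            // The hook sees a consistent row; it must not lock other rows.
            hook.prePutUnderRowLock(row);
            write.run();
        } finally {
            lock.unlock();
        }
    }
}
```

Any work done inside the hook extends the time the row lock is held, which is the performance cost the description suggests benchmarking.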





[jira] [Created] (HBASE-11157) [hbck] NotServingRegionException: Received close for but we are not serving it

2014-05-13 Thread dailidong (JIRA)
dailidong created HBASE-11157:
-

 Summary: [hbck] NotServingRegionException: Received close for 
 but we are not serving it
 Key: HBASE-11157
 URL: https://issues.apache.org/jira/browse/HBASE-11157
 Project: HBase
  Issue Type: Bug
  Components: hbck
Affects Versions: 0.94.13
Reporter: dailidong
Priority: Trivial


If hbck tries to close a region and hits a NotServingRegionException, hbck hangs. 
hbck asks the regionserver to close the region, but that regionserver is not 
serving the region, so we should catch this exception.
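
The handling suggested above might look roughly like the sketch below. It is not the attached patch: CloseRegionSketch, RegionCloser, and NotServingRegionSketchException are hypothetical stand-ins for the real HBase classes, used so the example compiles on its own. It assumes that "not serving" can be treated as "nothing left to close".

```java
// Illustrative only: RegionCloser and NotServingRegionSketchException are hypothetical
// stand-ins, not the HBase classes named in the report.
class CloseRegionSketch {

    static class NotServingRegionSketchException extends Exception {
        NotServingRegionSketchException(String msg) { super(msg); }
    }

    interface RegionCloser {
        void closeRegion(String regionName) throws NotServingRegionSketchException;
    }

    // Returns true if the close request went through, false if the server
    // reported it was not serving the region. Either way the caller can
    // continue with the repair instead of aborting.
    static boolean closeRegionQuietly(RegionCloser server, String regionName) {
        try {
            server.closeRegion(regionName);
            return true;
        } catch (NotServingRegionSketchException e) {
            // The region is already not served here; nothing left to close.
            return false;
        }
    }
}
```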





[jira] [Updated] (HBASE-11144) Filter to support scan multiple row key ranges

2014-05-13 Thread Li Jiajia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Jiajia updated HBASE-11144:
--

Attachment: MultiRowRangeFilter.patch

The attached patch is against trunk.

> Filter to support scan multiple row key ranges
> --
>
> Key: HBASE-11144
> URL: https://issues.apache.org/jira/browse/HBASE-11144
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters
>Reporter: Li Jiajia
> Attachments: MultiRowRangeFilter.patch
>
>
> Filter to support scan multiple row key ranges. It can construct the row key 
> ranges from the passed list which can be accessed by each region server. 
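
As a rough illustration of the idea, and not the attached MultiRowRangeFilter patch, the sketch below sorts and merges a list of row key ranges and then tests whether a row falls into any of them. MultiRangeSketch and RowRange are hypothetical names. String row keys and half-open [start, stop) ranges are assumptions made to keep the example short; HBase row keys are byte arrays.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: RowRange and MultiRangeSketch are hypothetical, and String
// row keys are used instead of byte[] to keep the sketch short.
class MultiRangeSketch {

    // Half-open range [start, stop) over row keys.
    static final class RowRange {
        final String start;
        final String stop;
        RowRange(String start, String stop) { this.start = start; this.stop = stop; }
    }

    // Sorts ranges by start key and merges the ones that overlap or touch.
    static List<RowRange> sortAndMerge(List<RowRange> ranges) {
        List<RowRange> sorted = new ArrayList<>(ranges);
        sorted.sort(Comparator.comparing((RowRange r) -> r.start));
        List<RowRange> merged = new ArrayList<>();
        for (RowRange r : sorted) {
            if (!merged.isEmpty() && merged.get(merged.size() - 1).stop.compareTo(r.start) >= 0) {
                RowRange last = merged.remove(merged.size() - 1);
                String stop = last.stop.compareTo(r.stop) >= 0 ? last.stop : r.stop;
                merged.add(new RowRange(last.start, stop));
            } else {
                merged.add(r);
            }
        }
        return merged;
    }

    // True if the row key falls inside any of the (merged) ranges.
    static boolean contains(List<RowRange> merged, String row) {
        for (RowRange r : merged) {
            if (row.compareTo(r.start) >= 0 && row.compareTo(r.stop) < 0) {
                return true;
            }
        }
        return false;
    }
}
```

With merged, sorted ranges a filter could also skip ahead to the start of the next range instead of testing every row, though that is outside this sketch.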





[jira] [Updated] (HBASE-11157) [hbck] NotServingRegionException: Received close for but we are not serving it

2014-05-13 Thread dailidong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dailidong updated HBASE-11157:
--

Attachment: HBASE-11157.patch

> [hbck] NotServingRegionException: Received close for  but we are 
> not serving it
> ---
>
> Key: HBASE-11157
> URL: https://issues.apache.org/jira/browse/HBASE-11157
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.94.13
>Reporter: dailidong
>Priority: Trivial
> Attachments: HBASE-11157.patch
>
>
> If hbck tries to close a region and hits a NotServingRegionException, hbck hangs. 
> hbck asks the regionserver to close the region, but that regionserver is not 
> serving the region, so we should catch this exception.
> Trying to fix unassigned region...
> Exception in thread "main" org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hbase.NotServingRegionException: Received close for 
> regionName but we are not serving it
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:3204)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:3185)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:323)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1012)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:87)
> at com.sun.proxy.$Proxy7.closeRegion(Unknown Source)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.closeRegionSilentlyAndWait(HBaseFsckRepair.java:150)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.closeRegion(HBaseFsck.java:1565)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1704)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1406)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:419)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:438)
> at org.apache.hadoop.hbase.util.HBaseFsck.exec(HBaseFsck.java:3670)
> at org.apache.hadoop.hbase.util.HBaseFsck.run(HBaseFsck.java:3489)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3483)





[jira] [Updated] (HBASE-11157) [hbck] NotServingRegionException: Received close for but we are not serving it

2014-05-13 Thread dailidong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dailidong updated HBASE-11157:
--

Status: Patch Available  (was: Open)

> [hbck] NotServingRegionException: Received close for  but we are 
> not serving it
> ---
>
> Key: HBASE-11157
> URL: https://issues.apache.org/jira/browse/HBASE-11157
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.94.13
>Reporter: dailidong
>Priority: Trivial
> Attachments: HBASE-11157.patch
>
>
> If hbck tries to close a region and hits a NotServingRegionException, hbck hangs. 
> hbck asks the regionserver to close the region, but that regionserver is not 
> serving the region, so we should catch this exception.
> Trying to fix unassigned region...
> Exception in thread "main" org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hbase.NotServingRegionException: Received close for 
> regionName but we are not serving it
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:3204)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:3185)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:323)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1012)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:87)
> at com.sun.proxy.$Proxy7.closeRegion(Unknown Source)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.closeRegionSilentlyAndWait(HBaseFsckRepair.java:150)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.closeRegion(HBaseFsck.java:1565)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1704)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1406)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:419)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:438)
> at org.apache.hadoop.hbase.util.HBaseFsck.exec(HBaseFsck.java:3670)
> at org.apache.hadoop.hbase.util.HBaseFsck.run(HBaseFsck.java:3489)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3483)





[jira] [Commented] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2014-05-13 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996158#comment-13996158
 ] 

Jerry He commented on HBASE-7912:
-

Hi, [~stack]

Thanks for the comment.

bq. Have you lads had much chance testing it out?

Yes, both our QA team and we have tested this.  This feature has been in our product 
for two releases now.

> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Richard Ding
>Assignee: Richard Ding
> Attachments: HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> We have finally completed the implementation of our backup/restore solution and 
> would like to share it with the community through this jira. 
> We leverage the existing hbase snapshot feature to provide a general 
> solution for common users. Our full backup uses snapshot to capture 
> metadata locally and exportsnapshot to move data to another cluster; 
> the incremental backup uses offline-WALPlayer to back up HLogs; we also 
> leverage globally distributed logroll and flush to improve performance; other 
> added values include convert, merge, progress report, and CLI commands, so 
> that a common user can back up hbase data without in-depth knowledge of hbase. 
>  Our solution also contains some usability features for enterprise users. 
> The detailed design document and CLI commands will be attached to this jira. We 
> plan to use 10~12 subtasks to share each of the following features, and 
> document the detailed implementation in the subtasks: 
> * *Full Backup* : provide local and remote backup/restore for a list of tables
> * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
> backup)
> * *distributed* Logroll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: to build on top of full backup as daily/weekly backup 
> * *Convert*  incremental backup WAL files into hfiles
> * *Merge* several backup images into one (like merging weekly into monthly)
> * *add and remove* table to and from Backup image
> * *Cancel* a backup process
> * backup progress *status*
> * full backup based on *existing snapshot*
> *-*
> *Below is the original description, to keep here as the history for the 
> design and discussion back in 2013*
> There have been attempts in the past to come up with a viable HBase 
> backup/restore solution (e.g., HBASE-4618).  Recently, there are many 
> advancements and new features in HBase, for example, FileLink, Snapshot, and 
> Distributed Barrier Procedure. This is a proposal for a backup/restore 
> solution that utilizes these new features to achieve better performance and 
> consistency. 
>  
> A common practice of backup and restore in databases is to first take a full 
> baseline backup, and then periodically take incremental backups that capture 
> the changes since the full baseline backup. An HBase cluster can store a massive 
> amount of data.  Combining full backups with incremental backups has 
> tremendous benefit for HBase as well.  The following is a typical scenario 
> for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase. 
> # The user schedules periodic incremental backups to capture the changes 
> from the full backup, or from the last incremental backup.
> # The user needs to restore table data to a past point in time.
> # The full backup is restored to the table(s) or to different table name(s).  
> Then the incremental backups that are up to the desired point in time are 
> applied on top of the full backup. 
> We would support the following key features and capabilities.
> * Full backup uses HBase snapshot to capture HFiles.
> * Use HBase WALs to capture incremental changes, but we use bulk load of 
> HFiles for fast incremental restore.
> * Support single table or a set of tables, and column family level backup and 
> restore.
> * Restore to different table names.
> * Support adding additional tables or CF to backup set without interruption 
> of incremental backup schedule.
> * Support rollup/combining of incremental backups into longer period and 
> bigger incremental backups.
> * Unified command line interface for all the above.
> The solution will support HBase backup to FileSystem, either on the same 
> cluster or across clusters.  It has the flexibility to support backup to 
> other devices and servers in the future.  
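
The restore scenario described above (restore a full backup, then apply incremental backups up to the desired point in time) can be sketched as a selection over a backup history. This is only an assumption-laden illustration: RestorePlanSketch and BackupImage are hypothetical names and are not taken from the attached design document, and the history is assumed to be sorted by ascending timestamp.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: BackupImage and RestorePlanSketch are hypothetical names,
// not classes from the attached design document.
class RestorePlanSketch {

    static final class BackupImage {
        final long timestamp;
        final boolean full;
        BackupImage(long timestamp, boolean full) { this.timestamp = timestamp; this.full = full; }
    }

    // Given images sorted by ascending timestamp, returns the latest full backup
    // taken at or before `pointInTime`, followed by every incremental backup
    // after it up to `pointInTime`. Returns an empty list if no full backup qualifies.
    static List<BackupImage> plan(List<BackupImage> history, long pointInTime) {
        List<BackupImage> plan = new ArrayList<>();
        for (BackupImage image : history) {
            if (image.timestamp > pointInTime) {
                break;
            }
            if (image.full) {
                plan.clear(); // a newer full backup supersedes the earlier chain
                plan.add(image);
            } else if (!plan.isEmpty()) {
                plan.add(image);
            }
        }
        return plan;
    }
}
```

The returned list is in apply order: the full backup first, then each incremental on top of it.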





[jira] [Commented] (HBASE-10936) Add zeroByte encoding test

2014-05-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996118#comment-13996118
 ] 

Lars Hofhansl commented on HBASE-10936:
---

Committed to 0.94. Waiting for [~stack] on 0.96.

> Add zeroByte encoding test
> --
>
> Key: HBASE-10936
> URL: https://issues.apache.org/jira/browse/HBASE-10936
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Lars Hofhansl
>Priority: Minor
> Fix For: 0.96.3, 0.94.20
>
> Attachments: 10936-0.94.txt, 10936-0.96.txt, 10936-0.98.txt
>
>






[jira] [Commented] (HBASE-11128) Add -target option to ExportSnapshot to export with a different name

2014-05-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995885#comment-13995885
 ] 

Hudson commented on HBASE-11128:


SUCCESS: Integrated in hbase-0.96-hadoop2 #276 (See 
[https://builds.apache.org/job/hbase-0.96-hadoop2/276/])
HBASE-11128 Add -target option to ExportSnapshot to export with a different 
name (mbertozzi: rev 1593777)
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java
* 
/hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/TestExportSnapshot.java


> Add -target option to ExportSnapshot to export with a different name
> 
>
> Key: HBASE-11128
> URL: https://issues.apache.org/jira/browse/HBASE-11128
> Project: HBase
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 0.99.0
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>Priority: Trivial
> Fix For: 0.99.0, 0.96.3, 0.94.20, 0.98.3
>
> Attachments: HBASE-11128-v0.patch
>
>
> Add a "-target" option to export the snapshot using a different name





[jira] [Commented] (HBASE-11128) Add -target option to ExportSnapshot to export with a different name

2014-05-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995887#comment-13995887
 ] 

Hudson commented on HBASE-11128:


SUCCESS: Integrated in hbase-0.96 #398 (See 
[https://builds.apache.org/job/hbase-0.96/398/])
HBASE-11128 Add -target option to ExportSnapshot to export with a different 
name (mbertozzi: rev 1593777)
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java
* 
/hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/TestExportSnapshot.java


> Add -target option to ExportSnapshot to export with a different name
> 
>
> Key: HBASE-11128
> URL: https://issues.apache.org/jira/browse/HBASE-11128
> Project: HBase
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 0.99.0
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>Priority: Trivial
> Fix For: 0.99.0, 0.96.3, 0.94.20, 0.98.3
>
> Attachments: HBASE-11128-v0.patch
>
>
> Add a "-target" option to export the snapshot using a different name





[jira] [Updated] (HBASE-11112) PerformanceEvaluation should document --multiGet option on its printUsage.

2014-05-13 Thread Jean-Marc Spaggiari (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Marc Spaggiari updated HBASE-11112:


Attachment: HBASE-11112-v1-trunk.patch

So something more like that? Pretty small ;)

> PerformanceEvaluation should document --multiGet option on its printUsage.
> --
>
> Key: HBASE-11112
> URL: https://issues.apache.org/jira/browse/HBASE-11112
> Project: HBase
>  Issue Type: Bug
>  Components: documentation, Performance
>Affects Versions: 0.98.3
>Reporter: Jean-Marc Spaggiari
>Assignee: Jean-Marc Spaggiari
> Attachments: HBASE-11112-v0-trunk.patch, HBASE-11112-v1-trunk.patch
>
>






[jira] [Commented] (HBASE-11154) Document how to use Reverse Scan API

2014-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996132#comment-13996132
 ] 

Hadoop QA commented on HBASE-11154:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12644541/HBASE-11154.patch
  against trunk revision .
  ATTACHMENT ID: 12644541

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+https://issues.apache.org/jira/browse/HBASE-4811";>HBASE-4811 
implements an API to scan a table or a range within a table in reverse, 
reducing the need to optimize your schema for forward or reverse scanning. This 
feature is available in HBase 0.98 and later. See https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setReversed%28boolean";
 /> for more information.
+https://issues.apache.org/jira/browse/HBASE-4811";>HBASE-4811 
implements an API to scan a table or a range within a table in reverse, 
reducing the need to optimize your schema for forward or reverse scanning. This 
feature is available in HBase 0.98 and later. See https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setReversed%28boolean";
 /> for more information.

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9511//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9511//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9511//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9511//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9511//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9511//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9511//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9511//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9511//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9511//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9511//console

This message is automatically generated.

> Document how to use Reverse Scan API
> 
>
> Key: HBASE-11154
> URL: https://issues.apache.org/jira/browse/HBASE-11154
> Project: HBase
>  Issue Type: Task
>  Components: documentation, Scanners
>Affects Versions: 0.98.2
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Attachments: HBASE-11154.patch
>
>






[jira] [Commented] (HBASE-10965) Automate detection of presence of Filter#filterRow()

2014-05-13 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996265#comment-13996265
 ] 

Anoop Sam John commented on HBASE-10965:


My concern is the same. We have to rely on the boolean return value as well as 
auto-detection, so the code is more complex now. I initially thought we could remove 
hasFilterRow(), but as long as we have Filter and FilterBase as two abstract super 
classes, we cannot remove the API.

> Automate detection of presence of Filter#filterRow()
> 
>
> Key: HBASE-10965
> URL: https://issues.apache.org/jira/browse/HBASE-10965
> Project: HBase
>  Issue Type: Task
>  Components: Filters
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 10965-v1.txt, 10965-v2.txt, 10965-v3.txt, 10965-v4.txt, 
> 10965-v6.txt, 10965-v7.txt
>
>
> There is potential inconsistency between the return value of 
> Filter#hasFilterRow() and presence of Filter#filterRow().
> Filters may override Filter#filterRow() while leaving return value of 
> Filter#hasFilterRow() being false (inherited from FilterBase).
> Downside to purely depending on hasFilterRow() telling us whether custom 
> filter overrides filterRow(List) or filterRow() is that the check below may 
> be rendered ineffective:
> {code}
>   if (nextKv == KV_LIMIT) {
>     if (this.filter != null && filter.hasFilterRow()) {
>       throw new IncompatibleFilterException(
>         "Filter whose hasFilterRow() returns true is incompatible with scan with limit!");
>     }
> {code}
> When a user forgets to override hasFilterRow(), the above check is no longer 
> useful.
> Another limitation is that we cannot optimize FilterList#filterRow() through 
> short-circuiting when FilterList#hasFilterRow() returns false.
> See 
> https://issues.apache.org/jira/browse/HBASE-11093?focusedCommentId=13985149&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13985149
> This JIRA aims to remove the inconsistency by automatically detecting the 
> presence of overridden Filter#filterRow(). For FilterBase-derived classes, if 
> filterRow() is implemented and not inherited from FilterBase, it is 
> equivalent to having hasFilterRow() return true.
> With precise detection of presence of Filter#filterRow(), the following code 
> from HRegion is no longer needed while backward compatibility is kept.
> {code}
>   return filter != null && (!filter.hasFilterRow())
>   && filter.filterRow();
> {code}
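
One way to detect an overridden method, shown here only as a sketch and not as what the attached patches do, is to ask reflection which class declares filterRow() for a given filter instance. OverrideDetectionSketch, FilterBaseSketch, PlainFilter, and RowDroppingFilter are hypothetical stand-ins so the example is self-contained; Class.getMethod and Method.getDeclaringClass are standard Java reflection calls.

```java
// Illustrative only: FilterBaseSketch is a hypothetical stand-in for FilterBase;
// the real detection logic in the attached patches may differ.
class OverrideDetectionSketch {

    static abstract class FilterBaseSketch {
        public boolean filterRow() { return false; }
    }

    static class PlainFilter extends FilterBaseSketch { }

    static class RowDroppingFilter extends FilterBaseSketch {
        @Override
        public boolean filterRow() { return true; }
    }

    // True if the filter's class (or an intermediate superclass) overrides
    // filterRow() rather than inheriting it from FilterBaseSketch.
    static boolean overridesFilterRow(FilterBaseSketch filter) {
        try {
            return filter.getClass().getMethod("filterRow").getDeclaringClass()
                    != FilterBaseSketch.class;
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException("filterRow() is declared on the base class", e);
        }
    }
}
```

Since the answer depends only on the class, a real implementation would probably cache the result per class rather than reflect on every call.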





[jira] [Commented] (HBASE-11135) Change region sequenceid generation so happens earlier in the append cycle rather than just before added to file

2014-05-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995336#comment-13995336
 ] 

stack commented on HBASE-11135:
---

[~jeffreyz]

bq. Yes, I can do that. Today I'll migrate my patch and run tests to see if 
there is any issue.

Then we'd get our perf back!

I can make it so we only use the latch at particular times, say at flush time 
so I can be sure of sequence ids... so we can avoid sync under lock.

> Change region sequenceid generation so happens earlier in the append cycle 
> rather than just before added to file
> 
>
> Key: HBASE-11135
> URL: https://issues.apache.org/jira/browse/HBASE-11135
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: stack
>Assignee: stack
> Attachments: 11135.wip.txt, 11135v2.txt, 11135v5.txt, 11135v5.txt, 
> 11135v5.txt
>
>
> Currently we assign the region edit/sequence id just before we put it in the 
> WAL.  We do it in the single thread that feeds from the ring buffer.  Doing 
> it at this point, we can ensure order, that the edits will be in the file in 
> accordance w/ the ordering of the region sequence id.
> But the point at which the region sequence id is assigned to an edit is deep down in 
> the WAL system and there is a lag between our putting an edit into the WAL 
> system and the edit actually getting its edit/sequence id.
> This lag -- "late-binding" -- complicates the unification of mvcc and region 
> sequence id, especially around async WAL writes (and, related, for no-WAL 
> writes) -- the parent for this issue (For async, how you get the edit id in 
> our system when the threads have all gone home -- unless you make them wait?)
> Chatting w/ Jeffrey Zhong yesterday, we came up with a crazypants means of 
> getting the region sequence id near-immediately.  We'll run two ringbuffers.  
> The first will mesh all handler threads and the consumer will generate ids 
> (we will have order on other side of this first ring buffer), and then if 
> async or no sync, we will just let the threads return ... updating mvcc just 
> before we let them go.  All other calls will go up on to the second ring 
> buffer to be serviced as now (batching, distribution out among the sync'ing 
> threads).  The first rb will have no friction and should turn at fast rates 
> compared to the second.  There should not be noticeable slowdown nor do I 
> foresee this refactor interfering w/ our multi-WAL plans.





[jira] [Created] (HBASE-11158) bin/hbase upgrade -check should also check compression codecs

2014-05-13 Thread Jean-Marc Spaggiari (JIRA)
Jean-Marc Spaggiari created HBASE-11158:
---

 Summary: bin/hbase upgrade -check should also check compression 
codecs
 Key: HBASE-11158
 URL: https://issues.apache.org/jira/browse/HBASE-11158
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0
Reporter: Jean-Marc Spaggiari


When upgrading in situ from 0.94 to 0.96 or 0.98, the codecs are usually already 
present on the servers, so all is fine.

But when upgrading by moving the data from one cluster to a brand-new 
one, compression codecs might be missing.

bin/hbase upgrade -check will not report any missing codec, but HBase will not 
be able to start after the upgrade because the codecs are missing.

I think bin/hbase upgrade -check should check the compression codecs configured 
on the tables and make sure they are available on the new cluster. If not, it 
should be reported.
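
The proposed check could look roughly like the sketch below. It is an illustration under assumptions, not the upgrade tool's code: the class and method names are hypothetical, codecs are represented as plain strings, and how the per-table codecs and the locally available codecs are discovered is left out.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch of the proposed -check step.
final class CodecCheckSketch {

    // Returns, per table, the configured codecs that the new cluster lacks.
    // Tables whose codecs are all available do not appear in the result.
    static Map<String, Set<String>> findMissingCodecs(
            Map<String, Set<String>> codecsByTable, Set<String> availableCodecs) {
        Map<String, Set<String>> missing = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : codecsByTable.entrySet()) {
            Set<String> absent = new TreeSet<>(entry.getValue());
            absent.removeAll(availableCodecs);
            if (!absent.isEmpty()) {
                missing.put(entry.getKey(), absent);
            }
        }
        return missing;
    }
}
```

A non-empty result is what the check would report before the upgrade is attempted.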





[jira] [Updated] (HBASE-11119) Update ExportSnapShot to optionally not use a tmp file on external file system

2014-05-13 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-11119:


  Component/s: snapshots
Fix Version/s: 0.98.3
   0.94.20
   0.96.3
   0.99.0
   Issue Type: Improvement  (was: New Feature)

> Update ExportSnapShot to optionally not use a tmp file on external file system
> --
>
> Key: HBASE-11119
> URL: https://issues.apache.org/jira/browse/HBASE-11119
> Project: HBase
>  Issue Type: Improvement
>  Components: snapshots
>Reporter: Ted Malaska
>Assignee: Ted Malaska
>Priority: Minor
> Fix For: 0.99.0, 0.96.3, 0.94.20, 0.98.3
>
> Attachments: HBASE-11119.patch
>
>
> There are FileSystems like S3 where renaming is extremely expensive.  This 
> patch will add a parameter named something like
> use.tmp.folder
> It will default to true, so the default behavior stays the same.  If it is set 
> to false, the files will land in the final destination with no need for a 
> rename. 
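
The proposed switch can be illustrated with a local-filesystem sketch. It is only an illustration under assumptions: java.nio.file stands in for the Hadoop FileSystem API, the ".tmp" staging directory name and the ExportTargetSketch class are hypothetical, and useTmpFolder mirrors the tentative use.tmp.folder parameter named above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch: local files stand in for the external file system.
final class ExportTargetSketch {

    // Copies source into finalDir, either via a staging directory plus rename
    // (useTmpFolder == true, the default behavior) or directly into place.
    static Path export(Path source, Path finalDir, boolean useTmpFolder) {
        try {
            Files.createDirectories(finalDir);
            Path target = finalDir.resolve(source.getFileName());
            if (!useTmpFolder) {
                // No rename: the file lands in its final destination.
                return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
            }
            Path tmpDir = Files.createDirectories(finalDir.resolve(".tmp"));
            Path staged = Files.copy(source, tmpDir.resolve(source.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
            // The rename is the step that is expensive on stores like S3.
            return Files.move(staged, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Small self-check: both modes leave a 3-byte file in the destination,
    // and the staged copy is gone after the rename.
    static boolean demo() {
        try {
            Path src = Files.createTempFile("snapshot-file", ".dat");
            Files.write(src, new byte[] {1, 2, 3});
            Path dest = Files.createTempDirectory("export-dest");
            Path direct = export(src, dest.resolve("direct"), false);
            Path viaTmp = export(src, dest.resolve("staged"), true);
            Path leftover = dest.resolve("staged").resolve(".tmp").resolve(src.getFileName());
            return Files.size(direct) == 3 && Files.size(viaTmp) == 3 && !Files.exists(leftover);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The trade-off is that without the staging step a reader of the destination can observe a partially written file, which is presumably why the default stays true.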





[jira] [Commented] (HBASE-10357) Failover RPC's for scans

2014-05-13 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995620#comment-13995620
 ] 

Devaraj Das commented on HBASE-10357:
-

Ignore the pom.xml changes in the last patch.

> Failover RPC's for scans
> 
>
> Key: HBASE-10357
> URL: https://issues.apache.org/jira/browse/HBASE-10357
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Reporter: Enis Soztutar
> Fix For: 0.99.0
>
> Attachments: 10357-1.txt, 10357-2.txt, 10357-3.2.txt, 10357-3.txt, 
> 10357-4.2.txt, 10357-4.3.txt, 10357-4.txt
>
>
> This is an extension of HBASE-10355 to add failover support for scans. 





[jira] [Updated] (HBASE-11137) Add mapred.TableSnapshotInputFormat

2014-05-13 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-11137:
-

Attachment: HBASE-11137.01.patch.0

Though adjacent, I think these failures are unrelated. Reattaching for rerun.

> Add mapred.TableSnapshotInputFormat
> ---
>
> Key: HBASE-11137
> URL: https://issues.apache.org/jira/browse/HBASE-11137
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce, Performance
>Affects Versions: 0.98.0, 0.96.2
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Attachments: HBASE-11137.00.patch, HBASE-11137.01.patch, 
> HBASE-11137.01.patch.0
>
>
> We should have feature parity between mapreduce and mapred implementations. 
> This is important for Hive.





[jira] [Updated] (HBASE-10935) support snapshot policy where flush memstore can be skipped to prevent production cluster freeze

2014-05-13 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-10935:
--

Fix Version/s: (was: 0.94.20)

I am removing this from 0.94. Can bring it back if requested.

> support snapshot policy where flush memstore can be skipped to prevent 
> production cluster freeze
> 
>
> Key: HBASE-10935
> URL: https://issues.apache.org/jira/browse/HBASE-10935
> Project: HBase
>  Issue Type: New Feature
>  Components: shell, snapshots
>Affects Versions: 0.94.7, 0.94.18
>Reporter: Tianying Chang
>Assignee: Tianying Chang
>Priority: Minor
> Fix For: 0.99.0
>
> Attachments: jira-10935-trunk.patch, jira-10935.patch
>
>
> We are using the snapshot feature to do HBase disaster recovery. We take 
> snapshots in our production cluster periodically. The current flush snapshot 
> policy requires all regions of the table to coordinate to prevent writes and 
> do a flush at the same time. Since we use WALPlayer to fill in the data that 
> is not in the snapshot HFiles, we don't need the snapshot to do a coordinated 
> flush. The snapshot just records all the HFiles that are already there. 
> I added the parameter in the HBase shell, so people can choose to use the 
> NoFlush snapshot when they need it, as below. Otherwise, the default flush 
> snapshot support is not impacted. 
> >snapshot 'TestTable', 'TestSnapshot', 'skipFlush'





[jira] [Updated] (HBASE-11157) [hbck] NotServingRegionException: Received close for but we are not serving it

2014-05-13 Thread dailidong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dailidong updated HBASE-11157:
--

Description: 
If hbck closes a region and then hits a NotServingRegionException, hbck will hang. 
hbck asks the regionserver to close the region, but that regionserver is not 
serving the region, so we should catch this exception.

Trying to fix unassigned region...
Exception in thread "main" org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.NotServingRegionException: Received close for regionName but we are not serving it
at org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:3204)
at org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:3185)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:323)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)

at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1012)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:87)
at com.sun.proxy.$Proxy7.closeRegion(Unknown Source)
at org.apache.hadoop.hbase.util.HBaseFsckRepair.closeRegionSilentlyAndWait(HBaseFsckRepair.java:150)
at org.apache.hadoop.hbase.util.HBaseFsck.closeRegion(HBaseFsck.java:1565)
at org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1704)
at org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1406)
at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:419)
at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:438)
at org.apache.hadoop.hbase.util.HBaseFsck.exec(HBaseFsck.java:3670)
at org.apache.hadoop.hbase.util.HBaseFsck.run(HBaseFsck.java:3489)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3483)

  was:if hbck close a region then meet a NotServerRegionException,hbck will 
hang up . we will close the region on the regionserver, but this regionserver 
is not serving the region, so we should try catch this exception.
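
The fix suggested in the description can be sketched as follows. This is a minimal sketch under assumptions, not the hbck code: the nested NotServingRegionException is a stand-in for org.apache.hadoop.hbase.NotServingRegionException, and RegionCloser and closeRegionSilently are hypothetical names for the single call hbck makes against the regionserver.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch of tolerating "not serving" during an hbck close.
final class HbckCloseSketch {

    // Stand-in for org.apache.hadoop.hbase.NotServingRegionException.
    static final class NotServingRegionException extends IOException {
        NotServingRegionException(String regionName) {
            super("Received close for " + regionName + " but we are not serving it");
        }
    }

    // Hypothetical view of the one regionserver call needed here.
    interface RegionCloser {
        void closeRegion(String regionName) throws IOException;
    }

    // Returns true if the server closed the region, false if it was not
    // serving it in the first place. Either way the repair can continue.
    static boolean closeRegionSilently(RegionCloser server, String regionName) {
        try {
            server.closeRegion(regionName);
            return true;
        } catch (NotServingRegionException e) {
            return false; // already closed from this server's point of view
        } catch (IOException e) {
            throw new UncheckedIOException(e); // other failures still surface
        }
    }
}
```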


> [hbck] NotServingRegionException: Received close for  but we are 
> not serving it
> ---
>
> Key: HBASE-11157
> URL: https://issues.apache.org/jira/browse/HBASE-11157
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.94.13
>Reporter: dailidong
>Priority: Trivial
>
> If hbck closes a region and then hits a NotServingRegionException, hbck will hang. 
> hbck asks the regionserver to close the region, but that regionserver is not 
> serving the region, so we should catch this exception.
> Trying to fix unassigned region...
> Exception in thread "main" org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.NotServingRegionException: Received close for regionName but we are not serving it
> at org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:3204)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:3185)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:323)
> at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1012)
> at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:87)
> at com.sun.proxy.$Proxy7.closeRegion(Unknown Source)
> at org.apache.hadoop.hbase.util.HBaseFsckRepair.closeRegionSilentlyAndWait(HBaseFsckRepair.java:150)
> at org.apache.hadoop.hbase.util.HBaseFsck.closeRegion(HBaseFsck.java:1565)
> at org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1704)
> at org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistenc

[jira] [Created] (HBASE-11156) Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available

2014-05-13 Thread Jiten (JIRA)
Jiten created HBASE-11156:
-

 Summary:  Configuration.deprecation: hadoop.native.lib is 
deprecated. Instead, use io.native.lib.available
 Key: HBASE-11156
 URL: https://issues.apache.org/jira/browse/HBASE-11156
 Project: HBase
  Issue Type: Bug
  Components: Admin
Affects Versions: 0.96.1.1
Reporter: Jiten
Priority: Critical








[jira] [Created] (HBASE-11146) HMaster instantiates both MasterCoprocessorHost and RegionServerCoprocessorHost

2014-05-13 Thread Qiang Tian (JIRA)
Qiang Tian created HBASE-11146:
--

 Summary: HMaster instantiates both MasterCoprocessorHost and 
RegionServerCoprocessorHost
 Key: HBASE-11146
 URL: https://issues.apache.org/jira/browse/HBASE-11146
 Project: HBase
  Issue Type: Bug
  Components: Coprocessors
Affects Versions: 0.99.0
Reporter: Qiang Tian
Assignee: Qiang Tian


See HBASE-11096.

In 0.99, HRegionServer is the base class of HMaster. 
Because master and regionserver share the same run method, the master 
instantiates both MasterCoprocessorHost and RegionServerCoprocessorHost.


Below are example logs: a coprocessor is started and stopped twice, once for the 
real regionserver and once for the RegionServerCoprocessorHost in the master.

2014-05-08 00:33:51,632 INFO [M:0;bdvm135:36021] coprocessor.TestCoprocessorStop$FooCoprocessor(66): start coprocessor on regionserver

2014-05-08 00:33:51,633 INFO [RS:0;bdvm135:47513] coprocessor.TestCoprocessorStop$FooCoprocessor(66): start coprocessor on regionserver

...
2014-05-08 00:34:03,166 INFO [main] regionserver.HRegionServer(1624): call stack of stop
java.io.IOException
at org.apache.hadoop.hbase.regionserver.HRegionServer.stop(HRegionServer.java:1624)
at org.apache.hadoop.hbase.master.ServerManager.shutdownCluster(ServerManager.java:975)
at org.apache.hadoop.hbase.master.HMaster.shutdown(HMaster.java:1623)
at org.apache.hadoop.hbase.util.JVMClusterUtil.shutdown(JVMClusterUtil.java:256)
at org.apache.hadoop.hbase.LocalHBaseCluster.shutdown(LocalHBaseCluster.java:437)
at org.apache.hadoop.hbase.MiniHBaseCluster.shutdown(MiniHBaseCluster.java:519)
at org.apache.hadoop.hbase.coprocessor.TestCoprocessorStop.testStopped(TestCoprocessorStop.java:114)
...
2014-05-08 00:34:03,215 INFO [main] regionserver.HRegionServer(1629): rsHost code path called

2014-05-08 00:34:03,228 DEBUG [main] coprocessor.CoprocessorHost(258): Stop coprocessor org.apache.hadoop.hbase.coprocessor.TestCoprocessorStop$FooCoprocessor
2014-05-08 00:34:03,462 INFO [main] coprocessor.TestCoprocessorStop$FooCoprocessor(88): create file hdfs://localhost:8155/user/tianq/test-data/f0c9423c-e505-4feb-907e-c7bd6e16545b/regionserver1399534399680 return rc true

...

2014-05-08 00:34:03,482 INFO [main] regionserver.HRegionServer(1624): call stack of stop
java.io.IOException
at org.apache.hadoop.hbase.regionserver.HRegionServer.stop(HRegionServer.java:1624)
at org.apache.hadoop.hbase.util.JVMClusterUtil.shutdown(JVMClusterUtil.java:264)
at org.apache.hadoop.hbase.LocalHBaseCluster.shutdown(LocalHBaseCluster.java:437)
at org.apache.hadoop.hbase.MiniHBaseCluster.shutdown(MiniHBaseCluster.java:519)
at org.apache.hadoop.hbase.coprocessor.TestCoprocessorStop.testStopped(TestCoprocessorStop.java:114)
2014-05-08 00:34:03,485 INFO [main] regionserver.HRegionServer(1629): rsHost code path called

2014-05-08 00:34:03,485 DEBUG [main] coprocessor.CoprocessorHost(258): Stop coprocessor org.apache.hadoop.hbase.coprocessor.TestCoprocessorStop$FooCoprocessor
2014-05-08 00:34:03,493 INFO [main] coprocessor.TestCoprocessorStop$FooCoprocessor(88): create file hdfs://localhost:8155/user/tianq/test-data/f0c9423c-e505-4feb-907e-c7bd6e16545b/regionserver1399534399680 return rc false





[jira] [Commented] (HBASE-11135) Change region sequenceid generation so happens earlier in the append cycle rather than just before added to file

2014-05-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995236#comment-13995236
 ] 

stack commented on HBASE-11135:
---

[~jeffreyz] Whatever is most convenient for you, just let me know.  Let me try 
and get hadoopqa to run against this last version.

I tested 0.98.  Even w/ the latch in place doing the hand off of the sequence 
id, we seem to do between 1/3 and 1/2 more throughput than 0.98.

One thought I had: could you update the mvcc in the ring buffer consumer 
thread just after we up the region edit/sequence id?  If so, I could undo the 
latching.

> Change region sequenceid generation so happens earlier in the append cycle 
> rather than just before added to file
> 
>
> Key: HBASE-11135
> URL: https://issues.apache.org/jira/browse/HBASE-11135
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: stack
>Assignee: stack
> Attachments: 11135.wip.txt, 11135v2.txt, 11135v5.txt, 11135v5.txt
>
>
> Currently we assign the region edit/sequence id just before we put it in the 
> WAL.  We do it in the single thread that feeds from the ring buffer.  Doing 
> it at this point, we can ensure order, that the edits will be in the file in 
> accordance w/ the ordering of the region sequence id.
> But the point at which the region sequence id is assigned to an edit is deep down in 
> the WAL system and there is a lag between our putting an edit into the WAL 
> system and the edit actually getting its edit/sequence id.
> This lag -- "late-binding" -- complicates the unification of mvcc and region 
> sequence id, especially around async WAL writes (and, related, for no-WAL 
> writes) -- the parent for this issue. (For async, how do you get the edit id in 
> our system when the threads have all gone home, unless you make them wait?)
> Chatting w/ Jeffrey Zhong yesterday, we came up with a crazypants means of 
> getting the region sequence id near-immediately.  We'll run two ringbuffers.  
> The first will mesh all handler threads and the consumer will generate ids 
> (we will have order on other side of this first ring buffer), and then if 
> async or no sync, we will just let the threads return ... updating mvcc just 
> before we let them go.  All other calls will go up on to the second ring 
> buffer to be serviced as now (batching, distribution out among the sync'ing 
> threads).  The first rb will have no friction and should turn at fast rates 
> compared to the second.  There should not be a noticeable slowdown, nor do I 
> foresee this refactor interfering w/ our multi-WAL plans.





[jira] [Commented] (HBASE-10942) support parallel request cancellation for multi-get

2014-05-13 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996449#comment-13996449
 ] 

Nicolas Liochon commented on HBASE-10942:
-

bq. I think it's important to have way to turn off any new feature in case 
there's a bug in it...
Yep, but the fact that the first code path does not work does not mean that the 
second one works better :-). Especially in this case, there are some nasty 
impacts if the threads are not interrupted.

bq. What do you mean by having CompletionService i.e. what would we use it for
It would allow putting the thread management code in another class (like the 
code for the simple get case, a much simpler case, I agree).
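
As a rough illustration of that suggestion, the sketch below keeps the thread management in one small class built on java.util.concurrent.ExecutorCompletionService. It is a sketch under assumptions, not the patch: FirstResultSketch and firstResult are hypothetical names, and it simply takes whichever call completes first (successfully or not) and interrupts the rest.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical sketch: run the same request in parallel, keep the first
// completion, and cancel (interrupt) the calls that are still running.
final class FirstResultSketch {

    static <T> T firstResult(ExecutorService pool, List<Callable<T>> calls) {
        CompletionService<T> completions = new ExecutorCompletionService<>(pool);
        List<Future<T>> futures = new ArrayList<>();
        for (Callable<T> call : calls) {
            futures.add(completions.submit(call));
        }
        try {
            // take() blocks until any one of the submitted calls has finished.
            return completions.take().get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted waiting for a result", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("first completed call failed", e.getCause());
        } finally {
            // cancel(true) interrupts calls that are still in flight;
            // it is a no-op for the one that already completed.
            for (Future<T> future : futures) {
                future.cancel(true);
            }
        }
    }
}
```

The cancellation in the finally block is the part that addresses the concern above about threads that are never interrupted.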

> support parallel request cancellation for multi-get
> ---
>
> Key: HBASE-10942
> URL: https://issues.apache.org/jira/browse/HBASE-10942
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hbase-10070
>
> Attachments: HBASE-10942.01.patch, HBASE-10942.02.patch, 
> HBASE-10942.patch
>
>






[jira] [Commented] (HBASE-7987) Snapshot Manifest file instead of multiple empty files

2014-05-13 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996227#comment-13996227
 ] 

Matteo Bertozzi commented on HBASE-7987:


I'm +1 for a backport to 0.98, since it is almost a clean backport,
but I guess the decision is up to [~apurtell].

> Snapshot Manifest file instead of multiple empty files
> --
>
> Key: HBASE-7987
> URL: https://issues.apache.org/jira/browse/HBASE-7987
> Project: HBase
>  Issue Type: Improvement
>  Components: snapshots
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
> Fix For: 0.99.0
>
> Attachments: HBASE-7987-v0.patch, HBASE-7987-v1.patch, 
> HBASE-7987-v2.patch, HBASE-7987-v2.sketch, HBASE-7987-v3.patch, 
> HBASE-7987-v4.patch, HBASE-7987-v5.patch, HBASE-7987-v6.patch, 
> HBASE-7987.sketch
>
>
> Currently taking a snapshot means creating one empty file for each file in 
> the source table directory, plus copying the .regioninfo file for each 
> region, the table descriptor file and a snapshotInfo file.
> During the restore or snapshot verification we traverse the filesystem 
> (fs.listStatus()) to find the snapshot files, and we open the .regioninfo 
> files to get the information.
> To avoid hammering the NameNode and having lots of empty files, we can use a 
> manifest file that contains the list of files and information that we need.
> To keep the RS parallelism that we have, each RS can write its own manifest.
> {code}
> message SnapshotDescriptor {
>   required string name;
>   optional string table;
>   optional int64 creationTime;
>   optional Type type;
>   optional int32 version;
> }
> message SnapshotRegionManifest {
>   optional int32 version;
>   required RegionInfo regionInfo;
>   repeated FamilyFiles familyFiles;
>   message StoreFile {
>     required string name;
>     optional Reference reference;
>   }
>   message FamilyFiles {
>     required bytes familyName;
>     repeated StoreFile storeFiles;
>   }
> }
> {code}
> {code}
> /hbase/.snapshot/
> /hbase/.snapshot//snapshotInfo
> /hbase/.snapshot//
> /hbase/.snapshot///tableInfo
> /hbase/.snapshot///regionManifest(.n)
> {code}





[jira] [Commented] (HBASE-11026) Provide option to filter out all rows in PerformanceEvaluation tool

2014-05-13 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996350#comment-13996350
 ] 

Jean-Marc Spaggiari commented on HBASE-11026:
-

Any chance to get this backported to 0.96 and 0.94?

> Provide option to filter out all rows in PerformanceEvaluation tool
> ---
>
> Key: HBASE-11026
> URL: https://issues.apache.org/jira/browse/HBASE-11026
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.99.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.99.0, 0.98.2
>
> Attachments: HBASE-11026_1.patch, HBASE-11026_2.patch, 
> HBASE-11026_4-0.98.patch, HBASE-11026_4.patch
>
>
> PerformanceEvaluation could also be used to check the actual performance of 
> scans on the server side by passing a filter that filters out all the 
> rows.  We can create a test filter, add it to Filter.proto, and set 
> this filter based on input params.  This could be helpful in testing.
> If you feel this is not needed, please feel free to close this issue.





[jira] [Updated] (HBASE-11137) Add mapred.TableSnapshotInputFormat

2014-05-13 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-11137:
-

Status: Patch Available  (was: Open)

> Add mapred.TableSnapshotInputFormat
> ---
>
> Key: HBASE-11137
> URL: https://issues.apache.org/jira/browse/HBASE-11137
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce, Performance
>Affects Versions: 0.96.2, 0.98.0
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Attachments: HBASE-11137.00.patch, HBASE-11137.01.patch, 
> HBASE-11137.01_rerun.patch
>
>
> We should have feature parity between mapreduce and mapred implementations. 
> This is important for Hive.





[jira] [Commented] (HBASE-11148) Provide a distributed procedure to globally roll logs

2014-05-13 Thread Demai Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996529#comment-13996529
 ] 

Demai Ni commented on HBASE-11148:
--

[~jinghe], can we also have this one for 0.98? thanks... Demai

> Provide a distributed procedure to globally roll logs
> -
>
> Key: HBASE-11148
> URL: https://issues.apache.org/jira/browse/HBASE-11148
> Project: HBase
>  Issue Type: New Feature
>Reporter: Jerry He
> Fix For: 0.99.0
>
>
> Propose a distributed procedure here to globally roll logs.
> Currently HBaseAdmin and the HBase shell provide a way to roll the WAL on a 
> single RS.
> Some use cases may require that all the RSs roll the logs at the same time and 
> in a coordinated way. There may also be a requirement that some tasks be done 
> together with the log roll on each region server.





[jira] [Commented] (HBASE-11126) Add RegionObserver pre hooks that operate under row lock

2014-05-13 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996063#comment-13996063
 ] 

ramkrishna.s.vasudevan commented on HBASE-11126:


Can I take this up?  This is needed for visibility deletes. 

> Add RegionObserver pre hooks that operate under row lock
> 
>
> Key: HBASE-11126
> URL: https://issues.apache.org/jira/browse/HBASE-11126
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Andrew Purtell
>
> The coprocessor hooks were placed outside of row locks. This was meant to 
> sidestep performance issues arising from significant work done within hook 
> invocations. However as the security code increases in sophistication we are 
> now running into concurrency issues trying to use them as a result of that 
> early decision. Since the initial introduction of coprocessor upcalls there 
> has been some significant refactoring done around them and concurrency 
> control in core has become more complex. This is potentially an issue for 
> many coprocessor users.
> We should do either:
> - Move all existing RegionObserver pre* hooks to execute under row lock.
> - Introduce a new set of RegionObserver pre* hooks that execute under row 
> lock, named to indicate such.
> The second option is less likely to lead to surprises.
> All RegionObserver hook Javadoc should be updated with advice to the 
> coprocessor implementor not to take their own row locks in the hook. If the 
> current thread happens to already have a row lock and they try to take a lock 
> on another row, there is a deadlock risk.
> As always a drawback of adding hooks is the potential for performance impact. 
> We should benchmark the impact and decide if the second option above is a 
> viable choice or if the first option is required.
> Finally, we should introduce a higher level interface for managing the 
> registration of 'user' code for execution from the low level hooks. I filed 
> HBASE-11125 to discuss this further.





[jira] [Updated] (HBASE-11108) Split ZKTable into interface and implementation

2014-05-13 Thread Mikhail Antonov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Antonov updated HBASE-11108:


Status: Patch Available  (was: Open)

> Split ZKTable into interface and implementation
> ---
>
> Key: HBASE-11108
> URL: https://issues.apache.org/jira/browse/HBASE-11108
> Project: HBase
>  Issue Type: Sub-task
>  Components: Consensus, Zookeeper
>Affects Versions: 0.99.0
>Reporter: Konstantin Boudnik
>Assignee: Mikhail Antonov
> Attachments: HBASE-11108.patch, HBASE-11108.patch
>
>
> In HBASE-11071 we are trying to split admin handlers away from ZK. However, a 
> ZKTable instance is being used in multiple places, hence it would be 
> beneficial to hide its implementation behind a well defined interface.





[jira] [Commented] (HBASE-11140) LocalHBaseCluster should create ConsensusProvider per each server

2014-05-13 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995468#comment-13995468
 ] 

Mikhail Antonov commented on HBASE-11140:
-

Thanks for review Ted

> LocalHBaseCluster should create ConsensusProvider per each server
> -
>
> Key: HBASE-11140
> URL: https://issues.apache.org/jira/browse/HBASE-11140
> Project: HBase
>  Issue Type: Sub-task
>  Components: Consensus
>Affects Versions: 0.99.0
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
> Fix For: 0.99.0
>
> Attachments: HBASE-11140.patch
>
>
> Right now there's a bug where a single ConsensusProvider instance is 
> shared across all threads running region servers and masters within a 
> process, which breaks certain tests in patches that used to pass 
> before.





[jira] [Created] (HBASE-11155) Fix Validation Errors in Ref Guide

2014-05-13 Thread Misty Stanley-Jones (JIRA)
Misty Stanley-Jones created HBASE-11155:
---

 Summary: Fix Validation Errors in Ref Guide
 Key: HBASE-11155
 URL: https://issues.apache.org/jira/browse/HBASE-11155
 Project: HBase
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.98.2
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones


Before I do serious documentation work, I have to fix all of the validation 
errors that are somehow not causing the Ref Guide to break the builds. I will 
attach one patch per file -- that's the easiest way I know how to do it. I will 
try not to make any content changes, only validation changes.





[jira] [Commented] (HBASE-10251) Restore API Compat for PerformanceEvaluation.generateValue()

2014-05-13 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996651#comment-13996651
 ] 

Jean-Daniel Cryans commented on HBASE-10251:


Pass VALUE_LENGTH directly instead of re-declaring it.

> Restore API Compat for PerformanceEvaluation.generateValue()
> 
>
> Key: HBASE-10251
> URL: https://issues.apache.org/jira/browse/HBASE-10251
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.98.0, 0.98.1, 0.99.0, 0.98.2
>Reporter: Aleksandr Shulman
>Assignee: Dima Spivak
>  Labels: api_compatibility
> Fix For: 0.98.0, 0.98.1, 0.99.0, 0.98.2
>
> Attachments: HBASE_10251.patch
>
>
> Observed:
> A couple of my client tests fail to compile against trunk because the method 
> PerformanceEvaluation.generateValue was removed as part of HBASE-8496.
> This is an issue because it was used in a number of places, including unit 
> tests. Since we did not explicitly label this API as private, it's ambiguous 
> as to whether this could/should have been used by people writing apps against 
> 0.96. If they used it, then they would be broken upon upgrade to 0.98 and 
> trunk.
> Potential Solution:
> The method was renamed to generateData, but the logic is still the same. We 
> can reintroduce it as deprecated in 0.98, as a compat shim over generateData. 
> The patch should be a few lines. We may also consider doing so in trunk, but 
> I'd be just as fine with leaving it out.
> More generally, this raises the question about what other code is in this 
> "grey-area", where it is public, is used outside of the package, but is not 
> explicitly labeled with an InterfaceAudience annotation.
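
The compat shim proposed above, with VALUE_LENGTH passed straight through as suggested in the comment, might look roughly like this. It is a sketch under assumptions: the class name is hypothetical, the body of generateData is a simplified stand-in for the real logic, and the value 1000 for VALUE_LENGTH is assumed for illustration.

```java
import java.util.Random;

// Hypothetical sketch of a deprecated shim that delegates to the renamed method.
final class PerformanceEvaluationCompatSketch {

    // Assumed value, for illustration only.
    static final int VALUE_LENGTH = 1000;

    // Simplified stand-in for the renamed method.
    static byte[] generateData(Random r, int length) {
        byte[] data = new byte[length];
        r.nextBytes(data);
        return data;
    }

    // Old name kept so existing callers keep compiling; it only delegates.
    @Deprecated
    static byte[] generateValue(Random r) {
        return generateData(r, VALUE_LENGTH);
    }
}
```

Because the shim only delegates, callers of the old name get the same bytes as callers of generateData for the same Random state.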





[jira] [Commented] (HBASE-11155) Fix Validation Errors in Ref Guide

2014-05-13 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996642#comment-13996642
 ] 

Jonathan Hsieh commented on HBASE-11155:


Thus far these changes look good modulo the nit below.  

What command are you running to get the warnings, and where is the build broken? 
 One of our precommit tests checks the doc compile.

Nit: this looks like it snuck in from another patch.
{code}
   
 Reverse Timestamps
+
+  Reverse Scan API
+  
+https://issues.apache.org/jira/browse/HBASE-4811";>HBASE-4811 
implements an API to scan a table or a range within a table in reverse, 
reducing the need to optimize your schema for forward or reverse scanning. This 
feature is available in HBase 0.98 and later. See https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setReversed%28boolean";
 /> for more information.
+  
+
+
{code}

We generally like to have one patch, one jira, one commit.  How many more of 
these are coming?  

If the bunch will be strung out over more than a few days, I'd rather get 
the nit fixed and commit.  I would rename the jira to "Fix validation errors in 
ref guide (book,configuration,schema_design)" and let you open a new issue for 
the rest.  If you can get them all in the next few days, we can wait for all of 
them.



> Fix Validation Errors in Ref Guide
> --
>
> Key: HBASE-11155
> URL: https://issues.apache.org/jira/browse/HBASE-11155
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.98.2
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Attachments: HBASE-11155-book.xml.patch, 
> HBASE-11155-configuration.xml.patch, HBASE-11155-schema_design.xml.patch
>
>
> Before I do serious documentation work, I have to fix all of the validation 
> errors that are somehow not causing the Ref Guide to break the builds. I will 
> attach one patch per file -- that's the easiest way I know how to do it. I 
> will try not to make any content changes, only validation changes.





[jira] [Commented] (HBASE-10357) Failover RPC's for scans

2014-05-13 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992970#comment-13992970
 ] 

Enis Soztutar commented on HBASE-10357:
---

Mind putting this to RB? 

> Failover RPC's for scans
> 
>
> Key: HBASE-10357
> URL: https://issues.apache.org/jira/browse/HBASE-10357
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Reporter: Enis Soztutar
> Fix For: 0.99.0
>
> Attachments: 10357-1.txt, 10357-2.txt, 10357-3.2.txt, 10357-3.txt, 
> 10357-4.2.txt, 10357-4.txt
>
>
> This is extension of HBASE-10355 to add failover support for scans. 





[jira] [Commented] (HBASE-10861) Supporting API in ByteRange

2014-05-13 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996317#comment-13996317
 ] 

ramkrishna.s.vasudevan commented on HBASE-10861:


Currently the code is using a mutable version everywhere.  So should this be 
MutableBR by default, with the immutableBR used only when needed?

> Supporting API in ByteRange
> ---
>
> Key: HBASE-10861
> URL: https://issues.apache.org/jira/browse/HBASE-10861
> Project: HBase
>  Issue Type: Improvement
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Attachments: HBASE-10861.patch, HBASE-10861_2.patch
>
>
> We would need the following APIs:
> setLimit(int limit)
> getLimit()
> asReadOnly()
> These APIs would help in implementations that have Buffers offheap (for now 
> BRs backed by DBB).
> Anything more can be added when needed.
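
To make the three requested calls concrete, here is a minimal Java sketch. It
is not the ByteRange interface from the attached patches: the class name
LimitedRangeSketch and its get/put methods are hypothetical, the range is
backed by a plain on-heap array rather than a direct buffer, and the meaning
of "limit" is assumed to match java.nio.ByteBuffer (an exclusive upper bound
on accessible indexes).

```java
// Hypothetical illustration only; not the HBase ByteRange API.
final class LimitedRangeSketch {
    private final byte[] bytes;     // shared backing storage
    private final boolean readOnly; // true for views created by asReadOnly()
    private int limit;              // exclusive upper bound for get/put (assumed semantics)

    LimitedRangeSketch(byte[] bytes) {
        this(bytes, bytes.length, false);
    }

    private LimitedRangeSketch(byte[] bytes, int limit, boolean readOnly) {
        this.bytes = bytes;
        this.limit = limit;
        this.readOnly = readOnly;
    }

    int getLimit() {
        return limit;
    }

    LimitedRangeSketch setLimit(int newLimit) {
        if (newLimit < 0 || newLimit > bytes.length) {
            throw new IllegalArgumentException("limit out of range: " + newLimit);
        }
        this.limit = newLimit;
        return this;
    }

    byte get(int index) {
        checkIndex(index);
        return bytes[index];
    }

    LimitedRangeSketch put(int index, byte value) {
        if (readOnly) {
            throw new UnsupportedOperationException("read-only view");
        }
        checkIndex(index);
        bytes[index] = value;
        return this;
    }

    // Returns a view over the same bytes that rejects writes.
    LimitedRangeSketch asReadOnly() {
        return new LimitedRangeSketch(bytes, limit, true);
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= limit) {
            throw new IndexOutOfBoundsException("index " + index + ", limit " + limit);
        }
    }

    public static void main(String[] args) {
        LimitedRangeSketch range = new LimitedRangeSketch(new byte[] {1, 2, 3, 4}).setLimit(2);
        System.out.println(range.getLimit());            // prints 2
        System.out.println(range.asReadOnly().get(1));   // prints 2
    }
}
```

The read-only view shares storage with the original, so writes made through
the mutable range stay visible to readers. Whether the real API should copy
or share is a design question this sketch does not settle.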





[jira] [Updated] (HBASE-10573) Use Netty 4

2014-05-13 Thread Nicolas Liochon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Liochon updated HBASE-10573:


Attachment: 10573.v3.patch

> Use Netty 4
> ---
>
> Key: HBASE-10573
> URL: https://issues.apache.org/jira/browse/HBASE-10573
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 0.99.0, hbase-10191
>Reporter: Andrew Purtell
> Fix For: 0.99.0
>
> Attachments: 10573.patch, 10573.patch, 10573.v3.patch
>
>
> Pull in Netty 4 and sort out the consequences.





[jira] [Commented] (HBASE-9042) Can't get TestHCM#testClusterStatus to work

2014-05-13 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996562#comment-13996562
 ] 

Nicolas Liochon commented on HBASE-9042:


[~lars_francke] Lars, could you please try the patch in HBASE-10573? 
Thanks!

> Can't get TestHCM#testClusterStatus to work
> ---
>
> Key: HBASE-9042
> URL: https://issues.apache.org/jira/browse/HBASE-9042
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: Lars Francke
>Priority: Minor
> Attachments: HBASE-9042.1.txt
>
>
> I can't get HBase trunk to build. In particular TestHCM.testClusterStatus 
> always fails for me. I tried on my own Jenkins as well as my IDE (IntelliJ) 
> with the same result (two different machines, CentOS & Mac OS).
> mvn -U -PrunAllTests -Dmaven.test.redirectTestOutputToFile=true
> -Dit.test=noItTest clean install
> I've attached the full log. It fails on the last wait by exceeding the 
> timeout. This is reported:
> {code}
>  - Thread LEAK? -, OpenFileDescriptor=417 (was 440), MaxFileDescriptor=4096 
> (was 4096), SystemLoadAverage=227 (was 265), ProcessCount=243 (was 240) - 
> ProcessCount LEAK? -, AvailableMemoryMB=2196 (was 1991) - AvailableMemoryMB 
> LEAK? -, ConnectionCount=7 (was 6) - ConnectionCount LEAK? -
> {code}
> And the Thread dump (see attached file) has a bunch of things reported as 
> potentially hanging threads.
> From my MacBook's command line I got the test to pass using the same
> command but not in Jenkins or from IntelliJ.





[jira] [Commented] (HBASE-11143) ageOfLastShippedOp metric is confusing

2014-05-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995276#comment-13995276
 ] 

Lars Hofhansl commented on HBASE-11143:
---

Yep, I'd add the new metric to 0.96 and later.
Cool... I'll commit this to all branches in a bit (0.96+ will only get the new 
metric).

> ageOfLastShippedOp metric is confusing
> --
>
> Key: HBASE-11143
> URL: https://issues.apache.org/jira/browse/HBASE-11143
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.20
>
> Attachments: 11143-0.94-v2.txt, 11143-0.94.txt
>
>
> We are trying to report on replication lag and find that there is no good 
> single metric to do that.
> ageOfLastShippedOp is close, but unfortunately it is increased even when 
> there is nothing to ship on a particular RegionServer.
> I would like to discuss a few options here:
> Add a new metric: replicationQueueTime (or something) with the above meaning. 
> I.e. if we have something to ship we set the age of that last shipped edit, 
> if we fail we increment that last time (just like we do now). But if there is 
> nothing to replicate we set it to current time (and hence that metric is 
> reported as close to 0).
> Alternatively we could change ageOfLastShippedOp to mean 
> that. That might lead to surprises, but the current behavior is clearly weird 
> when there is nothing to replicate.
> Comments? [~jdcryans], [~stack].
> If the approach sounds good, I'll make a patch for all branches.
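
The proposed metric semantics can be sketched in a few lines of Java. This is
an illustration of the three cases described above, not the code in the
attached patches: the class ReplicationLagSketch and its method names are
hypothetical, and timestamps are passed in explicitly so the behavior is easy
to follow.

```java
// Hypothetical illustration of the proposed replicationQueueTime-style metric.
final class ReplicationLagSketch {
    // Write time of the last shipped edit, or the last moment the queue was seen empty.
    private long referenceTimeMs;

    ReplicationLagSketch(long startTimeMs) {
        this.referenceTimeMs = startTimeMs;
    }

    // Case 1: an edit was shipped; remember when that edit was originally written.
    void editShipped(long editWriteTimeMs) {
        referenceTimeMs = editWriteTimeMs;
    }

    // Case 2: a shipment failed; the reference time is left alone, so the lag keeps growing.
    void shipmentFailed() {
        // intentionally empty
    }

    // Case 3: nothing to replicate; reset to "now" so the reported lag drops to about 0.
    void nothingToShip(long nowMs) {
        referenceTimeMs = nowMs;
    }

    long lagMs(long nowMs) {
        return Math.max(0L, nowMs - referenceTimeMs);
    }

    public static void main(String[] args) {
        ReplicationLagSketch lag = new ReplicationLagSketch(1000L);
        lag.editShipped(1500L);
        System.out.println(lag.lagMs(2000L)); // prints 500
        lag.shipmentFailed();
        System.out.println(lag.lagMs(3000L)); // prints 1000
        lag.nothingToShip(3000L);
        System.out.println(lag.lagMs(3000L)); // prints 0
    }
}
```

The only difference from the behavior criticized in the description is case 3:
an idle RegionServer resets the reference time instead of letting the age
climb.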





[jira] [Commented] (HBASE-11129) Expose Scan conversion methods in TableMapReduceUtil as public methods

2014-05-13 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992863#comment-13992863
 ] 

Nick Dimiduk commented on HBASE-11129:
--

Passing the scanner on the jobconf is a private API, now that the serialization 
details are private methods. This implementation detail should be isolated 
within a single job -- either it picks up the 0.X.Y hbase-server jar or it has 
0.X.Z version, there's no mixing. We'd need to test it out, but I think making 
this change could be acceptable for a patch release.

Looking at TableInputFormat#setConf, either "hbase.mapreduce.scan" is 
respected, or "hbase.mapreduce.scan.*" params are used. What I propose does 
away with the former. This way, these configs become part of the public API.

> Expose Scan conversion methods in TableMapReduceUtil as public methods
> --
>
> Key: HBASE-11129
> URL: https://issues.apache.org/jira/browse/HBASE-11129
> Project: HBase
>  Issue Type: Task
>  Components: mapreduce
>Reporter: Ted Yu
>Assignee: Gustavo Anatoly
>Priority: Minor
> Attachments: HBASE-11129.patch
>
>
> Scan#readFields() from 0.92 has been removed.
> TableMapReduceUtil has the following package private methods:
> {code}
>   static String convertScanToString(Scan scan) throws IOException {
> {code}
> {code}
>   static Scan convertStringToScan(String base64) throws IOException {
> {code}
> We should consider exposing them as public methods so that users can interpret 
> Scan objects easily in mapreduce jobs.





[jira] [Updated] (HBASE-11135) Change region sequenceid generation so happens earlier in the append cycle rather than just before added to file

2014-05-13 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-11135:
--

Attachment: 11135v8.txt

> Change region sequenceid generation so happens earlier in the append cycle 
> rather than just before added to file
> 
>
> Key: HBASE-11135
> URL: https://issues.apache.org/jira/browse/HBASE-11135
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: stack
>Assignee: stack
> Attachments: 11135.wip.txt, 11135v2.txt, 11135v5.txt, 11135v5.txt, 
> 11135v5.txt, 11135v6.txt, 11135v7.txt, 11135v8.txt
>
>
> Currently we assign the region edit/sequence id just before we put it in the 
> WAL.  We do it in the single thread that feeds from the ring buffer.  Doing 
> it at this point, we can ensure order, that the edits will be in the file in 
> accordance w/ the ordering of the region sequence id.
> But the point at which the region sequence id is assigned to an edit is deep down in 
> the WAL system and there is a lag between our putting an edit into the WAL 
> system and the edit actually getting its edit/sequence id.
> This lag -- "late-binding" -- complicates the unification of mvcc and region 
> sequence id, especially around async WAL writes (and, related, for no-WAL 
> writes) -- the parent for this issue (For async, how you get the edit id in 
> our system when the threads have all gone home -- unless you make them wait?)
> Chatting w/ Jeffrey Zhong yesterday, we came up with a crazypants means of 
> getting the region sequence id near-immediately.  We'll run two ringbuffers.  
> The first will mesh all handler threads and the consumer will generate ids 
> (we will have order on other side of this first ring buffer), and then if 
> async or no sync, we will just let the threads return ... updating mvcc just 
> before we let them go.  All other calls will go up on to the second ring 
> buffer to be serviced as now (batching, distribution out among the sync'ing 
> threads).  The first rb will have no friction and should turn at fast rates 
> compared to the second.  There should not be noticeable slowdown nor do I 
> foresee this refactor interfering w/ our multi-WAL plans.





[jira] [Commented] (HBASE-6990) Pretty print TTL

2014-05-13 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996526#comment-13996526
 ] 

Jonathan Hsieh commented on HBASE-6990:
---

Let's make this super readable so we can use this for other units 
eventually as well.  A bunch of terms are added and we can probably get away 
with fewer.

"Human" is a poor class name  (the code in it doesn't represent a human).  
Maybe PrettyPrinter?

Unit is good to have in pretty printer.  

Keep the HColumnDescriptor stuff in HCD, and keep the PrettyPrinter stuff in 
its class. So move getUnit, the method that figures out units, into 
HColumnDescriptor (e.g. "TTL" is a TIME_INTERVAL unit).  Have the 
"Human.toHumanString" call take the value and the unit and not the HCD specific 
string.  It would look like "Human.toHumanString(String value, Unit unit)".

nit: Rename "toHumanString" to something else like format.  So the call in HCD 
might become "PrettyPrinter.format(value, unit)".  Seems clearer than 
"Human.toHumanString".

Duplicate? It's in the "Human" class also.
{code}
   /**
+   *  TTL related constat for human readable representation
+   */
+
+  public static final Long SECONDS_PER_DAY = 60 * 60 * 24L;
+
+
+  /**
 {code}

TTL is only related to column descriptor. I'm surprised there is no constant for 
that elsewhere?
{code}
+  /**
+   *  TTL related constants for human readable representation
+   */
+  public static final Long SECONDS_PER_DAY = 60 * 60 * 24L;
+  public static final String TTL = "TTL";
+
{code}

Also consider passing in the string buffer instead of creating a new one in 
each of the helper methods.

> Pretty print TTL
> 
>
> Key: HBASE-6990
> URL: https://issues.apache.org/jira/browse/HBASE-6990
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Esteban Gutierrez
>Priority: Minor
> Attachments: HBASE-6990.v0.patch, HBASE-6990.v1.patch, 
> HBASE-6990.v2.patch
>
>
> I've seen a lot of users getting confused by the TTL configuration and I 
> think that if we just pretty printed it it would solve most of the issues. 
> For example, let's say a user wanted to set a TTL of 90 days. That would be 
> 7776000. But let's say that it was typo'd to 77760000 instead, it gives you 
> 900 days!
> So when we print the TTL we could do something like "x days, x hours, x 
> minutes, x seconds (real_ttl_value)". This would also help people when they 
> use ms instead of seconds as they would see really big values in there.
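
As a rough illustration of the output format proposed above, here is a minimal
Java sketch. The class name TtlFormatSketch and its format method are
hypothetical and are not taken from the attached patches. It assumes the TTL
is a non-negative number of seconds and prints the raw value in parentheses,
as the description suggests.

```java
// Hypothetical illustration only; not the class from the HBASE-6990 patches.
final class TtlFormatSketch {
    private static final long SECONDS_PER_MINUTE = 60L;
    private static final long SECONDS_PER_HOUR = 60L * SECONDS_PER_MINUTE;
    private static final long SECONDS_PER_DAY = 24L * SECONDS_PER_HOUR;

    private TtlFormatSketch() {
    }

    // Breaks a TTL in seconds into days, hours, minutes and seconds,
    // and appends the raw value so the original setting stays visible.
    static String format(long ttlSeconds) {
        if (ttlSeconds < 0) {
            throw new IllegalArgumentException("TTL must be non-negative: " + ttlSeconds);
        }
        long days = ttlSeconds / SECONDS_PER_DAY;
        long rest = ttlSeconds % SECONDS_PER_DAY;
        long hours = rest / SECONDS_PER_HOUR;
        rest %= SECONDS_PER_HOUR;
        long minutes = rest / SECONDS_PER_MINUTE;
        long seconds = rest % SECONDS_PER_MINUTE;
        return days + " days, " + hours + " hours, " + minutes + " minutes, "
            + seconds + " seconds (" + ttlSeconds + ")";
    }

    public static void main(String[] args) {
        // Intended 90-day TTL: prints "90 days, 0 hours, 0 minutes, 0 seconds (7776000)"
        System.out.println(format(7776000L));
        // One extra zero: prints "900 days, 0 hours, 0 minutes, 0 seconds (77760000)"
        System.out.println(format(77760000L));
    }
}
```

With this kind of output, a TTL mistakenly given in milliseconds also stands
out at once, because the day count becomes a thousand times larger than
intended.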





[jira] [Updated] (HBASE-10573) Use Netty 4

2014-05-13 Thread Nicolas Liochon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Liochon updated HBASE-10573:


Fix Version/s: 0.99.0

> Use Netty 4
> ---
>
> Key: HBASE-10573
> URL: https://issues.apache.org/jira/browse/HBASE-10573
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 0.99.0, hbase-10191
>Reporter: Andrew Purtell
>Assignee: Nicolas Liochon
> Fix For: 0.99.0
>
> Attachments: 10573.patch, 10573.patch, 10573.v3.patch
>
>
> Pull in Netty 4 and sort out the consequences.





[jira] [Commented] (HBASE-10885) Support visibility expressions on Deletes

2014-05-13 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996785#comment-13996785
 ] 

Anoop Sam John commented on HBASE-10885:


Left some comments in RB. Review is not over. Will complete it soon.

> Support visibility expressions on Deletes
> -
>
> Key: HBASE-10885
> URL: https://issues.apache.org/jira/browse/HBASE-10885
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.98.1
>Reporter: Andrew Purtell
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.99.0, 0.98.3
>
> Attachments: HBASE-10885_1.patch, HBASE-10885_2.patch, 
> HBASE-10885_new_tag_type_1.patch, HBASE-10885_new_tag_type_2.patch
>
>
> Accumulo can specify visibility expressions for delete markers. During 
> compaction the cells covered by the tombstone are determined in part by 
> matching the visibility expression. This is useful for the use case of data 
> set coalescing, where entries from multiple data sets carrying different 
> labels are combined into one common large table. Later, a subset of entries 
> can be conveniently removed using visibility expressions.
> Currently doing the same in HBase would only be possible with a custom 
> coprocessor. Otherwise, a Delete will affect all cells covered by the 
> tombstone regardless of any visibility expression scoping. This is correct 
> behavior in that no data spill is possible, but certainly could be 
> surprising, and is only meant to be transitional. We decided not to support 
> visibility expressions on Deletes to control the complexity of the initial 
> implementation.





[jira] [Updated] (HBASE-11137) Add mapred.TableSnapshotInputFormat

2014-05-13 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-11137:
-

Attachment: HBASE-11137.01_rerun.patch

> Add mapred.TableSnapshotInputFormat
> ---
>
> Key: HBASE-11137
> URL: https://issues.apache.org/jira/browse/HBASE-11137
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce, Performance
>Affects Versions: 0.98.0, 0.96.2
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Attachments: HBASE-11137.00.patch, HBASE-11137.01.patch, 
> HBASE-11137.01_rerun.patch
>
>
> We should have feature parity between mapreduce and mapred implementations. 
> This is important for Hive.





[jira] [Commented] (HBASE-11135) Change region sequenceid generation so happens earlier in the append cycle rather than just before added to file

2014-05-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996811#comment-13996811
 ] 

stack commented on HBASE-11135:
---

Let me commit.  hadoopqa is sporadic.  I ran test suite locally and it passed:

{code}
...
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 50.138 sec

Results :

Tests run: 1187, Failures: 0, Errors: 0, Skipped: 6
...
{code}

> Change region sequenceid generation so happens earlier in the append cycle 
> rather than just before added to file
> 
>
> Key: HBASE-11135
> URL: https://issues.apache.org/jira/browse/HBASE-11135
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: stack
>Assignee: stack
> Attachments: 11135.wip.txt, 11135v2.txt, 11135v5.txt, 11135v5.txt, 
> 11135v5.txt, 11135v6.txt, 11135v7.txt, 11135v8.txt, 11135v8.txt
>
>
> Currently we assign the region edit/sequence id just before we put it in the 
> WAL.  We do it in the single thread that feeds from the ring buffer.  Doing 
> it at this point, we can ensure order, that the edits will be in the file in 
> accordance w/ the ordering of the region sequence id.
> But the point at which the region sequence id is assigned to an edit is deep down in 
> the WAL system and there is a lag between our putting an edit into the WAL 
> system and the edit actually getting its edit/sequence id.
> This lag -- "late-binding" -- complicates the unification of mvcc and region 
> sequence id, especially around async WAL writes (and, related, for no-WAL 
> writes) -- the parent for this issue (For async, how you get the edit id in 
> our system when the threads have all gone home -- unless you make them wait?)
> Chatting w/ Jeffrey Zhong yesterday, we came up with a crazypants means of 
> getting the region sequence id near-immediately.  We'll run two ringbuffers.  
> The first will mesh all handler threads and the consumer will generate ids 
> (we will have order on other side of this first ring buffer), and then if 
> async or no sync, we will just let the threads return ... updating mvcc just 
> before we let them go.  All other calls will go up on to the second ring 
> buffer to be serviced as now (batching, distribution out among the sync'ing 
> threads).  The first rb will have no friction and should turn at fast rates 
> compared to the second.  There should not be noticeable slowdown nor do I 
> foresee this refactor interfering w/ our multi-WAL plans.





[jira] [Updated] (HBASE-10861) Supporting API in ByteRange

2014-05-13 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-10861:
---

Status: Patch Available  (was: Open)

> Supporting API in ByteRange
> ---
>
> Key: HBASE-10861
> URL: https://issues.apache.org/jira/browse/HBASE-10861
> Project: HBase
>  Issue Type: Improvement
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Attachments: HBASE-10861.patch, HBASE-10861_2.patch, 
> HBASE-10861_3.patch
>
>
> We would need the following APIs:
> setLimit(int limit)
> getLimit()
> asReadOnly()
> These APIs would help in implementations that have Buffers offheap (for now 
> BRs backed by DBB).
> Anything more can be added when needed.





[jira] [Updated] (HBASE-11109) flush region sequence id may not be larger than all edits flushed

2014-05-13 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-11109:
--

Resolution: Invalid
Status: Resolved  (was: Patch Available)

This has been subsumed by HBASE-11135.  The tests that are in this patch went 
in as part of HBASE-11135.  The 'sync' fix under the update lock while flushing 
was replaced by an append+wait+on+sequenceid+update which should be less 
onerous.

> flush region sequence id may not be larger than all edits flushed
> -
>
> Key: HBASE-11109
> URL: https://issues.apache.org/jira/browse/HBASE-11109
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Affects Versions: 0.99.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 0.99.0
>
> Attachments: 11109.txt, 11109v2.txt, 11109v2.txt
>
>
> This was found by [~jeffreyz]  See parent issue.  We have this issue since we 
> put the ring buffer/disrupter into the WAL (HBASE-10156).
> An edit's region sequence id is set only after the edit has traversed the ring 
> buffer.  Flushing, we just up whatever the current region sequence id is.  
> Crossing the ring buffer may take some time and is done by background 
> threads.  The flusher may be taking the region sequence id though edits have 
> not yet made it across the ringbuffer: i.e. edits that are actually scoped by 
> the flush may have region sequence ids in excess of that of the flush 
> sequence id reported.
> The consequences are not exactly clear.  Would rather not have to find out so 
> lets fix this here.





[jira] [Updated] (HBASE-11135) Change region sequenceid generation so happens earlier in the append cycle rather than just before added to file

2014-05-13 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-11135:
--

Attachment: 11135v8.addendum.doc.txt

[~jeffreyz] I committed this addendum to make comments match the 
implementation.  Thanks for review.

> Change region sequenceid generation so happens earlier in the append cycle 
> rather than just before added to file
> 
>
> Key: HBASE-11135
> URL: https://issues.apache.org/jira/browse/HBASE-11135
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: stack
>Assignee: stack
> Attachments: 11135.wip.txt, 11135v2.txt, 11135v5.txt, 11135v5.txt, 
> 11135v5.txt, 11135v6.txt, 11135v7.txt, 11135v8.addendum.doc.txt, 11135v8.txt, 
> 11135v8.txt
>
>
> Currently we assign the region edit/sequence id just before we put it in the 
> WAL.  We do it in the single thread that feeds from the ring buffer.  Doing 
> it at this point, we can ensure order, that the edits will be in the file in 
> accordance w/ the ordering of the region sequence id.
> But the point at which the region sequence id is assigned to an edit is deep down in 
> the WAL system and there is a lag between our putting an edit into the WAL 
> system and the edit actually getting its edit/sequence id.
> This lag -- "late-binding" -- complicates the unification of mvcc and region 
> sequence id, especially around async WAL writes (and, related, for no-WAL 
> writes) -- the parent for this issue (For async, how you get the edit id in 
> our system when the threads have all gone home -- unless you make them wait?)
> Chatting w/ Jeffrey Zhong yesterday, we came up with a crazypants means of 
> getting the region sequence id near-immediately.  We'll run two ringbuffers.  
> The first will mesh all handler threads and the consumer will generate ids 
> (we will have order on other side of this first ring buffer), and then if 
> async or no sync, we will just let the threads return ... updating mvcc just 
> before we let them go.  All other calls will go up on to the second ring 
> buffer to be serviced as now (batching, distribution out among the sync'ing 
> threads).  The first rb will have no friction and should turn at fast rates 
> compared to the second.  There should not be noticeable slowdown nor do I 
> foresee this refactor interfering w/ our multi-WAL plans.





[jira] [Commented] (HBASE-10569) Co-locate meta and master

2014-05-13 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996691#comment-13996691
 ] 

Matteo Bertozzi commented on HBASE-10569:
-

I think the main motivation is the colocation of all the components that are 
usually involved in a single "transaction". The main example of this is the 
Assignment, which involves: Master, META, ZooKeeper.

#1 and #2 are part of a generic notification system, which will be used to 
propagate ACLs, Visibility, Quotas. (In theory the base of this system is also 
the one behind the ZK-less Assignment)

For the Horizontal scalability, I think that we are going to have multiple 
Masters, each one operating on its subsection of "meta" (and the notification 
system). This means that you will have concurrent assignments on different 
masters. 
The best case is where you can fit a full table (regions metadata) on a single 
master. The other case is where your table is split across multiple masters, which 
means that operations that need to work on the full set of regions (e.g. 
delete, disable, enable) need some sort of coordination to provide the full 
consistency that you'll get with a full table that fits on a single master. 

> Co-locate meta and master
> -
>
> Key: HBASE-10569
> URL: https://issues.apache.org/jira/browse/HBASE-10569
> Project: HBase
>  Issue Type: Improvement
>  Components: master, Region Assignment
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 0.99.0
>
> Attachments: Co-locateMetaAndMasterHBASE-10569.pdf, 
> hbase-10569_v1.patch, hbase-10569_v2.patch, hbase-10569_v3.1.patch, 
> hbase-10569_v3.patch, master_rs.pdf
>
>
> I was thinking of simplifying/improving the region assignments. The first step 
> is to co-locate the meta and the master as many people agreed on HBASE-5487.





[jira] [Commented] (HBASE-10936) Add zeroByte encoding test

2014-05-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996850#comment-13996850
 ] 

stack commented on HBASE-10936:
---

+1 for 0.96 if in 0.94, else -1 since just a new test... (minimizing commits to 
0.96 to bug fixes only)

> Add zeroByte encoding test
> --
>
> Key: HBASE-10936
> URL: https://issues.apache.org/jira/browse/HBASE-10936
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Lars Hofhansl
>Priority: Minor
> Fix For: 0.96.3, 0.94.20
>
> Attachments: 10936-0.94.txt, 10936-0.96.txt, 10936-0.98.txt
>
>






[jira] [Commented] (HBASE-11084) Handle deletions and column tracking using internal filters rather than using DeleteTrackers and ColumnTrackers

2014-05-13 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995946#comment-13995946
 ] 

Jean-Marc Spaggiari commented on HBASE-11084:
-

Can we detail pros of this approach?

> Handle deletions and column tracking using internal filters rather than using 
> DeleteTrackers and ColumnTrackers
> ---
>
> Key: HBASE-11084
> URL: https://issues.apache.org/jira/browse/HBASE-11084
> Project: HBase
>  Issue Type: Improvement
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.99.0
>
>
> See HBASE-11054 for discussion.  
> Currently the delete tracking is done by DeleteTracker and its subclasses.  
> Column tracking and its versions are handled inside the ColumnTrackers and 
> its subclasses. 
> This JIRA aims at providing internal filters and attaching them to 
> scans/gets, including minor and major compaction scans so that all the logic 
> of deletes and version counting goes into it rather than having trackers.  





[jira] [Commented] (HBASE-10942) support parallel request cancellation for multi-get

2014-05-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996636#comment-13996636
 ] 

Sergey Shelukhin commented on HBASE-10942:
--

1 - but it reduces the surface area. There's very little change to the path when 
this feature is disabled... I've been burned by this in various distributed 
systems, most recently Hive (which is much easier to upgrade) 

2 - can you give an example? What code would be possible to move? We can move it. 
It uses a thread pool right now...

> support parallel request cancellation for multi-get
> ---
>
> Key: HBASE-10942
> URL: https://issues.apache.org/jira/browse/HBASE-10942
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hbase-10070
>
> Attachments: HBASE-10942.01.patch, HBASE-10942.02.patch, 
> HBASE-10942.patch
>
>






[jira] [Commented] (HBASE-11154) Document how to use Reverse Scan API

2014-05-13 Thread Misty Stanley-Jones (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996896#comment-13996896
 ] 

Misty Stanley-Jones commented on HBASE-11154:
-

Using Oxygen. Some things probably won't break the build but aren't really 
valid. Sorry about the  error, I am very used to DB 4.5 and did not even 
question myself.

> Document how to use Reverse Scan API
> 
>
> Key: HBASE-11154
> URL: https://issues.apache.org/jira/browse/HBASE-11154
> Project: HBase
>  Issue Type: Task
>  Components: documentation, Scanners
>Affects Versions: 0.98.2
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Fix For: 0.99.0
>
> Attachments: HBASE-11154-2.patch, HBASE-11154.patch
>
>






[jira] [Commented] (HBASE-11148) Provide a distributed procedure to globally roll logs

2014-05-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995818#comment-13995818
 ] 

stack commented on HBASE-11148:
---

[~nidmhbase] Is there supposed to be a patch here sir?

> Provide a distributed procedure to globally roll logs
> -
>
> Key: HBASE-11148
> URL: https://issues.apache.org/jira/browse/HBASE-11148
> Project: HBase
>  Issue Type: New Feature
>Reporter: Jerry He
> Fix For: 0.99.0
>
>
> Propose a distributed procedure here to globally roll logs.
> Currently HBaseAdmin and HBase shell provide a way to roll the WAL on a 
> single RS.
> Some use cases may require that all the RS roll the logs at the same time and 
> in a coordinated way. Also there may be a requirement that some tasks be done 
> together with the roll log on each region server.





[jira] [Commented] (HBASE-11155) Fix Validation Errors in Ref Guide

2014-05-13 Thread Misty Stanley-Jones (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996906#comment-13996906
 ] 

Misty Stanley-Jones commented on HBASE-11155:
-

I'm not done with this yet and don't want review for it yet, thank you. :) I 
will fix the mistaken commit.

> Fix Validation Errors in Ref Guide
> --
>
> Key: HBASE-11155
> URL: https://issues.apache.org/jira/browse/HBASE-11155
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.98.2
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Attachments: HBASE-11155-book.xml.patch, 
> HBASE-11155-configuration.xml.patch, HBASE-11155-schema_design.xml.patch
>
>
> Before I do serious documentation work, I have to fix all of the validation 
> errors that are somehow not causing the Ref Guide to break the builds. I will 
> attach one patch per file -- that's the easiest way I know how to do it. I 
> will try not to make any content changes, only validation changes.





[jira] [Commented] (HBASE-7987) Snapshot Manifest file instead of multiple empty files

2014-05-13 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996424#comment-13996424
 ] 

Nick Dimiduk commented on HBASE-7987:
-

I guess the same concern applies to 0.96 and 0.94 as I'd like to support 
mapred.TableSnapshotInputFormat everywhere there's a mapreduce version. 
[~apurtell], [~stack], [~lhofhansl], [~enis]?

> Snapshot Manifest file instead of multiple empty files
> --
>
> Key: HBASE-7987
> URL: https://issues.apache.org/jira/browse/HBASE-7987
> Project: HBase
>  Issue Type: Improvement
>  Components: snapshots
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
> Fix For: 0.99.0
>
> Attachments: HBASE-7987-v0.patch, HBASE-7987-v1.patch, 
> HBASE-7987-v2.patch, HBASE-7987-v2.sketch, HBASE-7987-v3.patch, 
> HBASE-7987-v4.patch, HBASE-7987-v5.patch, HBASE-7987-v6.patch, 
> HBASE-7987.sketch
>
>
> Currently taking a snapshot means creating one empty file for each file in 
> the source table directory, plus copying the .regioninfo file for each 
> region, the table descriptor file and a snapshotInfo file.
> During the restore or snapshot verification we traverse the filesystem 
> (fs.listStatus()) to find the snapshot files, and we open the .regioninfo 
> files to get the information.
> To avoid hammering the NameNode and having lots of empty files, we can use a 
> manifest file that contains the list of files and information that we need.
> To keep the RS parallelism that we have, each RS can write its own manifest.
> {code}
> message SnapshotDescriptor {
>   required string name;
>   optional string table;
>   optional int64 creationTime;
>   optional Type type;
>   optional int32 version;
> }
> message SnapshotRegionManifest {
>   optional int32 version;
>   required RegionInfo regionInfo;
>   repeated FamilyFiles familyFiles;
>   message StoreFile {
>     required string name;
>     optional Reference reference;
>   }
>   message FamilyFiles {
>     required bytes familyName;
>     repeated StoreFile storeFiles;
>   }
> }
> {code}
> {code}
> /hbase/.snapshot/
> /hbase/.snapshot//snapshotInfo
> /hbase/.snapshot//
> /hbase/.snapshot///tableInfo
> /hbase/.snapshot///regionManifest(.n)
> {code}





[jira] [Commented] (HBASE-10965) Automate detection of presence of Filter#filterRow()

2014-05-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996819#comment-13996819
 ] 

Ted Yu commented on HBASE-10965:


In 0.94, Filter is an interface.
This means user filters coming from 0.94 would always override FilterBase.

> Automate detection of presence of Filter#filterRow()
> 
>
> Key: HBASE-10965
> URL: https://issues.apache.org/jira/browse/HBASE-10965
> Project: HBase
>  Issue Type: Task
>  Components: Filters
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 10965-v1.txt, 10965-v2.txt, 10965-v3.txt, 10965-v4.txt, 
> 10965-v6.txt, 10965-v7.txt
>
>
> There is potential inconsistency between the return value of 
> Filter#hasFilterRow() and presence of Filter#filterRow().
> Filters may override Filter#filterRow() while leaving return value of 
> Filter#hasFilterRow() being false (inherited from FilterBase).
> Downside to purely depending on hasFilterRow() telling us whether custom 
> filter overrides filterRow(List) or filterRow() is that the check below may 
> be rendered ineffective:
> {code}
>   if (nextKv == KV_LIMIT) {
>     if (this.filter != null && filter.hasFilterRow()) {
>       throw new IncompatibleFilterException(
>         "Filter whose hasFilterRow() returns true is incompatible "
>         + "with scan with limit!");
>     }
> {code}
> When a user forgets to override hasFilterRow(), the above check becomes 
> ineffective.
> Another limitation is that we cannot optimize FilterList#filterRow() through 
> short circuit when FilterList#hasFilterRow() returns false.
> See 
> https://issues.apache.org/jira/browse/HBASE-11093?focusedCommentId=13985149&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13985149
> This JIRA aims to remove the inconsistency by automatically detecting the 
> presence of overridden Filter#filterRow(). For FilterBase-derived classes, if 
> filterRow() is implemented and not inherited from FilterBase, it is 
> equivalent to having hasFilterRow() return true.
> With precise detection of presence of Filter#filterRow(), the following code 
> from HRegion is no longer needed while backward compatibility is kept.
> {code}
>   return filter != null && (!filter.hasFilterRow())
>   && filter.filterRow();
> {code}
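
The detection described above can be illustrated with plain Java reflection. This is only a minimal sketch, not the approach taken in the attached patches: SketchFilterBase, PlainFilter, RowDroppingFilter and FilterRowDetector are hypothetical stand-ins invented for illustration, not HBase classes. It treats a filter as "having" filterRow() when the method is declared somewhere below the base class.

```java
import java.lang.reflect.Method;

/** Hypothetical stand-in for a base filter class; NOT the real HBase FilterBase. */
abstract class SketchFilterBase {
    public boolean filterRow() { return false; }
    public boolean hasFilterRow() { return false; }
}

/** Inherits filterRow() unchanged. */
class PlainFilter extends SketchFilterBase {}

/** Overrides filterRow() but forgets to override hasFilterRow(). */
class RowDroppingFilter extends SketchFilterBase {
    @Override
    public boolean filterRow() { return true; }
}

final class FilterRowDetector {
    private FilterRowDetector() {}

    /** Returns true when filterRow() is declared by a subclass rather than inherited from the base. */
    static boolean overridesFilterRow(SketchFilterBase filter) {
        try {
            // getMethod resolves the most specific public declaration, inherited or not.
            Method m = filter.getClass().getMethod("filterRow");
            return m.getDeclaringClass() != SketchFilterBase.class;
        } catch (NoSuchMethodException e) {
            // Cannot happen: the base class always declares filterRow().
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(overridesFilterRow(new PlainFilter()));
        System.out.println(overridesFilterRow(new RowDroppingFilter()));
    }
}
```

A reflective lookup like this would normally be done once per filter class and cached, since repeating it on every row would be wasteful.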





[jira] [Updated] (HBASE-11162) RegionServer webui uses the default master info port irrespective of the user configuration.

2014-05-13 Thread Srikanth Srungarapu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Srikanth Srungarapu updated HBASE-11162:


Attachment: HBASE-11162.patch

> RegionServer webui uses the default master info port irrespective of the user 
> configuration.
> 
>
> Key: HBASE-11162
> URL: https://issues.apache.org/jira/browse/HBASE-11162
> Project: HBase
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 0.98.2
>Reporter: Srikanth Srungarapu
>Assignee: Srikanth Srungarapu
>Priority: Minor
> Fix For: 0.98.4
>
> Attachments: HBASE-11162.patch
>
>
> Under the regionserver ui software attributes section, the value for 
> attribute HBase Master is always using the default port (60010), even if the 
> user configured master info port to a different value.
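
The behaviour being asked for amounts to reading the configured value first and falling back to the default only when nothing is set. A minimal sketch of that lookup follows, with assumptions stated up front: java.util.Properties stands in for the HBase configuration object, the key name hbase.master.info.port is assumed rather than taken from this issue, and MasterInfoPortSketch is a hypothetical helper, not code from the attached patch.

```java
import java.util.Properties;

/** Hypothetical helper; Properties stands in for the real configuration object. */
final class MasterInfoPortSketch {
    /** Assumed key name for the master info port. */
    static final String MASTER_INFO_PORT_KEY = "hbase.master.info.port";
    /** Default port mentioned in the issue description. */
    static final int DEFAULT_MASTER_INFO_PORT = 60010;

    private MasterInfoPortSketch() {}

    /** Returns the configured master info port, or the default when the key is unset. */
    static int masterInfoPort(Properties conf) {
        String raw = conf.getProperty(MASTER_INFO_PORT_KEY);
        return raw == null ? DEFAULT_MASTER_INFO_PORT : Integer.parseInt(raw.trim());
    }

    /** Builds the host:port string a web UI could display for the master. */
    static String masterUiAddress(String host, Properties conf) {
        return host + ":" + masterInfoPort(conf);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(masterUiAddress("master-host", conf));
        conf.setProperty(MASTER_INFO_PORT_KEY, "16010");
        System.out.println(masterUiAddress("master-host", conf));
    }
}
```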





[jira] [Updated] (HBASE-11123) Upgrade instructions from 0.94 to 0.98

2014-05-13 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-11123:
--

   Resolution: Fixed
Fix Version/s: 0.99.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk.  Thanks for the patch [~misty]

> Upgrade instructions from 0.94 to 0.98
> --
>
> Key: HBASE-11123
> URL: https://issues.apache.org/jira/browse/HBASE-11123
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 0.98.2
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
>Priority: Minor
> Fix For: 0.99.0
>
> Attachments: HBASE-11123-1.patch, HBASE-11123.patch
>
>
> I cloned this from the 0.96 upgrade docs task. It was suggested that we need 
> upgrade instructions from 0.94 to 0.98. I will need source material to even 
> prioritize this. Assuming this is Minor.





[jira] [Commented] (HBASE-10915) Decouple region closing (HM and HRS) from ZK

2014-05-13 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993496#comment-13993496
 ] 

Mikhail Antonov commented on HBASE-10915:
-

Would appreciate feedback on latest patch.

> Decouple region closing (HM and HRS) from ZK
> 
>
> Key: HBASE-10915
> URL: https://issues.apache.org/jira/browse/HBASE-10915
> Project: HBase
>  Issue Type: Sub-task
>  Components: Consensus, Zookeeper
>Affects Versions: 0.99.0
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
> Attachments: HBASE-10915.patch, HBASE-10915.patch, HBASE-10915.patch, 
> HBASE-10915.patch, HBASE-10915.patch, HBASE-10915.patch, HBASE-10915.patch, 
> HBASE-10915.patch, HBASE-10915.patch, HBASE-10915.patch, HBASE-10915.patch, 
> HBASE-10915.patch, HBASE-10915.patch
>
>
> Decouple region closing from ZK. 
> Includes RS side (CloseRegionHandler), HM side (ClosedRegionHandler) and the 
> code using (HRegionServer, RSRpcServices etc).
> May need small changes in AssignmentManager.





[jira] [Commented] (HBASE-1197) IPC of large cells should transfer in chunks not via naive full copy

2014-05-13 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996950#comment-13996950
 ] 

Enis Soztutar commented on HBASE-1197:
--

This ties into blob store discussions which would enable many more 
interesting use cases for hbase. However, it is architectural-level work 
that can sit on top of hbase as well, like what Microsoft did in 
http://sigops.org/sosp/sosp11/current/2011-Cascais/printable/11-calder.pdf 

Even if we are able to store large cells in hfiles, the write amplification 
from compactions would be a big problem. We would like to use the block storage 
in hdfs directly to store large objects. Andrew, you had a coprocessor + custom 
compactions based solution for this, no? 

> IPC of large cells should transfer in chunks not via naive full copy
> 
>
> Key: HBASE-1197
> URL: https://issues.apache.org/jira/browse/HBASE-1197
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>
> Several instances of OOME when trying to serve up large cells to clients have 
> been observed. IPC should send large cell content in chunks instead of as one 
> large naive copy. 
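
The idea of sending a large value in chunks rather than through one full-size copy can be sketched in plain Java. This is an illustration only and does not reflect HBase's IPC code: ChunkedCellWriter and its chunk size parameter are hypothetical. The point shown is that each write hands the stream a bounded slice of the existing array, so no second buffer of the full value size is allocated by the writer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical helper showing chunked writes of one large value. */
final class ChunkedCellWriter {
    private ChunkedCellWriter() {}

    /**
     * Writes value to out in slices of at most chunkSize bytes.
     * Returns the number of slices written.
     */
    static int writeInChunks(byte[] value, OutputStream out, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        int off = 0;
        while (off < value.length) {
            // Bounded slice of the original array; no intermediate full copy is made here.
            int len = Math.min(chunkSize, value.length - off);
            out.write(value, off, len);
            off += len;
            chunks++;
        }
        out.flush();
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        byte[] value = new byte[10];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println(writeInChunks(value, sink, 4));
    }
}
```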





[jira] [Comment Edited] (HBASE-11135) Change region sequenceid generation so happens earlier in the append cycle rather than just before added to file

2014-05-13 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996701#comment-13996701
 ] 

Jeffrey Zhong edited comment on HBASE-11135 at 5/13/14 7:38 PM:


The v8 patch says the following. I think it's out of date. 

{noformat}
It will block waiting on this
+   * method if on initialization our edit/sequence id is {@link 
HLogKey#NO_SEQ_NO}.
{noformat}


was (Author: jeffreyz):
The v8 patch says following. I think it's out of date. 

{quote}
It will block waiting on this
+   * method if on initialization our edit/sequence id is {@link 
HLogKey#NO_SEQ_NO}.
{quote}

> Change region sequenceid generation so happens earlier in the append cycle 
> rather than just before added to file
> 
>
> Key: HBASE-11135
> URL: https://issues.apache.org/jira/browse/HBASE-11135
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: stack
>Assignee: stack
> Attachments: 11135.wip.txt, 11135v2.txt, 11135v5.txt, 11135v5.txt, 
> 11135v5.txt, 11135v6.txt, 11135v7.txt, 11135v8.txt, 11135v8.txt
>
>
> Currently we assign the region edit/sequence id just before we put it in the 
> WAL.  We do it in the single thread that feeds from the ring buffer.  Doing 
> it at this point, we can ensure order, that the edits will be in the file in 
> accordance w/ the ordering of the region sequence id.
> But the point at which region sequence id is assigned an edit is deep down in 
> the WAL system and there is a lag between our putting an edit into the WAL 
> system and the edit actually getting its edit/sequence id.
> This lag -- "late-binding" -- complicates the unification of mvcc and region 
> sequence id, especially around async WAL writes (and, related, for no-WAL 
> writes) -- the parent for this issue (For async, how you get the edit id in 
> our system when the threads have all gone home -- unless you make them wait?)
> Chatting w/ Jeffrey Zhong yesterday, we came up with a crazypants means of 
> getting the region sequence id near-immediately.  We'll run two ringbuffers.  
> The first will mesh all handler threads and the consumer will generate ids 
> (we will have order on other side of this first ring buffer), and then if 
> async or no sync, we will just let the threads return ... updating mvcc just 
> before we let them go.  All other calls will go up on to the second ring 
> buffer to be serviced as now (batching, distribution out among the sync'ing 
> threads).  The first rb will have no friction and should turn at fast rates 
> compared to the second.  There should not be noticeable slowdown nor do I 
> foresee this refactor interfering w/ our multi-WAL plans.





[jira] [Commented] (HBASE-10504) Define Replication Interface

2014-05-13 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996979#comment-13996979
 ] 

Enis Soztutar commented on HBASE-10504:
---

bq. So IIUC you are going a completely different way than what was discussed 
before, right? I don't mind it but I want to make sure that we're on the same 
track.
Indeed. If you want to replicate to SOLR for example, no need to add another 
proxy layer. You can have SOLR clients do the transformations, and send the 
updates directly. Having to mock a cluster with the pluggable sink approach 
requires implementing some hbase-specific details (zookeeper 
interactions, meta, rpc layer etc). 
bq. Might still be nice to offer some tools like removeNonReplicableEdits()
Agreed. I was thinking of doing a Filter interface for that. You can either 
pass a filter, or you can implement your own filter and invoke it. I have to 
dig a bit more for that. 
bq. the basic metrics the handling of a disabled peer, etc, for the other 
consumers.
In the POC patch, I keep the peer definition, but peers can have pluggable 
ReplicationConsumer. You can still enable/disable a peer, get the metrics for 
the peer coming from the ReplicationSource for that peer. We should allow the 
replicationConsumer to put up its own metrics with the peer name as well.

> Define Replication Interface
> 
>
> Key: HBASE-10504
> URL: https://issues.apache.org/jira/browse/HBASE-10504
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.99.0
>
> Attachments: hbase-10504_wip1.patch
>
>
> HBase has replication.  Fellas have been hijacking the replication apis to do 
> all kinds of perverse stuff like indexing hbase content (hbase-indexer 
> https://github.com/NGDATA/hbase-indexer) and our [~toffer] just showed up w/ 
> overrides that replicate via an alternate channel (over a secure thrift 
> channel between dcs over on HBASE-9360).  This issue is about surfacing these 
> APIs as public with guarantees to downstreamers similar to those we have on 
> our public client-facing APIs (and so we don't break them for downstreamers).
> Any input [~phunt] or [~gabriel.reid] or [~toffer]?
> Thanks.
>  





[jira] [Commented] (HBASE-11133) Add an option to skip snapshot verification after Export

2014-05-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995888#comment-13995888
 ] 

Hudson commented on HBASE-11133:


SUCCESS: Integrated in hbase-0.96 #398 (See 
[https://builds.apache.org/job/hbase-0.96/398/])
HBASE-11133 Add an option to skip snapshot verification after ExportSnapshot 
(mbertozzi: rev 1593772)
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java


> Add an option to skip snapshot verification after Export
> 
>
> Key: HBASE-11133
> URL: https://issues.apache.org/jira/browse/HBASE-11133
> Project: HBase
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 0.99.0
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>Priority: Trivial
> Fix For: 0.99.0, 0.96.3, 0.98.3
>
> Attachments: HBASE-11133-v0.patch, HBASE-11133-v1.patch
>
>
> Add a "-skip-dst-verify" option to skip snapshot verification after Export





[jira] [Updated] (HBASE-6990) Pretty print TTL

2014-05-13 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez updated HBASE-6990:
-

Attachment: HBASE-6990.v2.patch

New patch with fixes suggested by [~jmhsieh]

> Pretty print TTL
> 
>
> Key: HBASE-6990
> URL: https://issues.apache.org/jira/browse/HBASE-6990
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Esteban Gutierrez
>Priority: Minor
> Attachments: HBASE-6990.v0.patch, HBASE-6990.v1.patch, 
> HBASE-6990.v2.patch
>
>
> I've seen a lot of users getting confused by the TTL configuration and I 
> think that if we just pretty printed it it would solve most of the issues. 
> For example, let's say a user wanted to set a TTL of 90 days. That would be 
> 7776000. But let's say that it was typo'd to 77760000 instead, it gives you 
> 900 days!
> So when we print the TTL we could do something like "x days, x hours, x 
> minutes, x seconds (real_ttl_value)". This would also help people when they 
> use ms instead of seconds as they would see really big values in there.
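
The proposed output format is easy to sketch. The following is a hypothetical helper written for illustration, not the code in the attached patches; the class and method names are invented. It breaks a TTL in seconds into days, hours, minutes and seconds and appends the raw value in parentheses.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; not the implementation from the attached patches. */
final class TtlPrettyPrinter {
    private TtlPrettyPrinter() {}

    /** Formats a TTL given in seconds as "D days, H hours, M minutes, S seconds (raw)". */
    static String format(long ttlSeconds) {
        if (ttlSeconds < 0) {
            throw new IllegalArgumentException("TTL must be non-negative");
        }
        long days = TimeUnit.SECONDS.toDays(ttlSeconds);
        long hours = TimeUnit.SECONDS.toHours(ttlSeconds) % 24;
        long minutes = TimeUnit.SECONDS.toMinutes(ttlSeconds) % 60;
        long seconds = ttlSeconds % 60;
        // Locale.ROOT keeps the digits ASCII regardless of the JVM's default locale.
        return String.format(Locale.ROOT, "%d days, %d hours, %d minutes, %d seconds (%d)",
                days, hours, minutes, seconds, ttlSeconds);
    }

    public static void main(String[] args) {
        // 90 days expressed in seconds, as in the description above.
        System.out.println(format(7776000L));
        // The same value with one extra zero typed by mistake.
        System.out.println(format(77760000L));
    }
}
```

With output like this, a value with an extra digit shows up as 900 days instead of 90, and a TTL entered in milliseconds would stand out as an implausibly large day count.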





[jira] [Commented] (HBASE-10569) Co-locate meta and master

2014-05-13 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996641#comment-13996641
 ] 

Francis Liu commented on HBASE-10569:
-

Thanks for the doc. It'd be great if we could have the list of use cases we are 
trying to solve, so that they motivate the design decisions. 

During the discussion and my chat with Matteo and Jimmy here is what I got:

1. A method to guarantee security ACL changes are fully propagated when an ACL 
change is requested
2. Same as #1 but for quota
3. Remove master daemon to simplify deployment/ops
4. Have a designated set of servers system tables will be hosted on, to isolate 
them from user region workloads. 

Feel free to add if I missed anything. 

#3 and #4 directly motivate this patch. Though it seems there was an agreement 
to still have designated hosts as masters?

It seems to me #1 and #2 are use cases for a synchronous coordination framework 
(consensus discussion), which may or may not require system table colocation. 
Fault-tolerant coordination as a first-class primitive is sorely 
missing, and I believe it would enable us to avoid design choices which would 
impede horizontal scalability. 



> Co-locate meta and master
> -
>
> Key: HBASE-10569
> URL: https://issues.apache.org/jira/browse/HBASE-10569
> Project: HBase
>  Issue Type: Improvement
>  Components: master, Region Assignment
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 0.99.0
>
> Attachments: Co-locateMetaAndMasterHBASE-10569.pdf, 
> hbase-10569_v1.patch, hbase-10569_v2.patch, hbase-10569_v3.1.patch, 
> hbase-10569_v3.patch, master_rs.pdf
>
>
> I was thinking simplifying/improving the region assignments. The first step 
> is to co-locate the meta and the master as many people agreed on HBASE-5487.





[jira] [Commented] (HBASE-8763) [BRAINSTORM] Combine MVCC and SeqId

2014-05-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996821#comment-13996821
 ] 

stack commented on HBASE-8763:
--

I committed hbase-11135 so hopefully this patch is cleaner.  I opened 
HBASE-11160 for the case where we can hopefully let go of append having to wait 
on edit/sequence id updates (early-binding instead of late-binding).

> [BRAINSTORM] Combine MVCC and SeqId
> ---
>
> Key: HBASE-8763
> URL: https://issues.apache.org/jira/browse/HBASE-8763
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: Enis Soztutar
>Assignee: Jeffrey Zhong
>Priority: Critical
> Attachments: hbase-8736-poc.patch, hbase-8763-poc-v1.patch, 
> hbase-8763-v1.patch, hbase-8763_wip1.patch
>
>
> HBASE-8701 and a lot of recent issues include good discussions about mvcc + 
> seqId semantics. It seems that having mvcc and the seqId complicates the 
> comparator semantics a lot in regards to flush + WAL replay + compactions + 
> delete markers and out of order puts. 
> Thinking more about it I don't think we need a MVCC write number which is 
> different than the seqId. We can keep the MVCC semantics, read point and 
> smallest read points intact, but combine mvcc write number and seqId. This 
> will allow cleaner semantics + implementation + smaller data files. 
> We can do some brainstorming for 0.98. We still have to verify that this 
> would be semantically correct, it should be so by my current understanding.





[jira] [Commented] (HBASE-7987) Snapshot Manifest file instead of multiple empty files

2014-05-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992540#comment-13992540
 ] 

Hudson commented on HBASE-7987:
---

SUCCESS: Integrated in HBase-TRUNK #5128 (See 
[https://builds.apache.org/job/HBase-TRUNK/5128/])
HBASE-7987 Snapshot Manifest file instead of multiple empty files (mbertozzi: 
rev 1593139)
* /hbase/trunk/hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/MapReduceProtos.java
* /hbase/trunk/hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/SnapshotProtos.java
* /hbase/trunk/hbase-protocol/src/main/protobuf/MapReduce.proto
* /hbase/trunk/hbase-protocol/src/main/protobuf/Snapshot.proto
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/TableSnapshotScanner.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/Reference.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormat.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/CloneSnapshotHandler.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/DisabledTableSnapshotHandler.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/EnabledTableSnapshotHandler.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/MasterSnapshotVerifier.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/RestoreSnapshotHandler.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotFileCache.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotHFileCleaner.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotManager.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/TakeSnapshotHandler.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ProcedureManager.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionFileSystem.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileInfo.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/CopyRecoveredEditsTask.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ReferenceRegionHFilesTask.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ReferenceServerWALsTask.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/RestoreSnapshotHelper.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotDescriptionUtils.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotInfo.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotLogSplitter.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotManifest.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotManifestV1.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotManifestV2.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotReferenceUtil.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotTask.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/TableInfoCopyTask.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/TakeSnapshotUtils.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSVisitor.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestSnapshotFromClient.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestTableSnapshotScanner.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableSnapshotInputFormat.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestSnapshotFromMaster.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/snapshot/TestSnapshotFileCache.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/SnapshotTestingUtils.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/TestCopyRecoveredEditsTask.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/TestExportSnapshot.java
* /hbase/

[jira] [Updated] (HBASE-11151) move tracing modules from hbase-server to hbase-common

2014-05-13 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-11151:
--

   Resolution: Fixed
Fix Version/s: 0.99.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk.  Thanks for the cleanup [~iwasakims]

> move tracing modules from hbase-server to hbase-common
> --
>
> Key: HBASE-11151
> URL: https://issues.apache.org/jira/browse/HBASE-11151
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation, util
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
> Fix For: 0.99.0
>
> Attachments: HBASE-11151-0.patch
>
>
> Not only servers but also clients using tracing needs SpanReceiverHost.





[jira] [Commented] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2014-05-13 Thread Demai Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996599#comment-13996599
 ] 

Demai Ni commented on HBASE-7912:
-

[~stack], 

thanks for the comments. 

bq. This doc. with perhaps a little more commentary like it could go into the 
hbase refguide when this feature is committed?
In addition to the CLI PDF I attached in this jira, more complete documents 
can be found here:  [IBM BigInsights 
2.1.2|http://www-01.ibm.com/support/knowledgecenter/SSPT3X_2.1.2/com.ibm.swg.im.infosphere.biginsights.admin.doc/doc/admin_hbase_bkuprestore_overview.html],
 which was officially released in March 2014. We will open source all the 
features related to Backup/Restore from IBM BigInsights. We can move the 
documents to the 'backup' section of the HBase ref book as you suggested, 
certainly after incorporating the comments/suggestions from the community.

About testing, thanks to [~jinghe]'s comment. We already did functional and 
stress testing internally before release. For the current patches, since we 
made some changes per suggestions from the community, additional dev testing 
is being carried out. 

{quote}
bq. We’ll convert/replay the backed-up Hlogs into HFiles for fast incremental 
restore. 
This is interesting. It is done against a cluster or it is just a MR job/tool?
{quote}
~70% of the code logic is from WalPlayer, a MR job against the target cluster. 
The difference is, we don't rely on a live hbase cluster when converting the 
HLogs to HFiles, as the code can access the tableinfo offline. Currently the 
code is only useful for the backup/restore solution. We'd like to open another 
jira for the logic as a general tool/improvement of WalPlayer, and the new jira 
will have a dependency on [HBASE-8083 | https://issues.apache.org/jira/browse/HBASE-8073]. 

bq.What needs to go in first? What should we review first?
Actually, need you and other folks' suggestion here. 

From the dependency perspective, I'd like to have [Full backup HBase-10900| 
https://issues.apache.org/jira/browse/HBASE-10900] in first, and then 
[incremental backup 
HBase-11085|https://issues.apache.org/jira/browse/HBASE-11085], and once 
Jerry's [global log roll HBase-11148| 
https://issues.apache.org/jira/browse/HBASE-11148] gets accepted, I will put up 
a patch to update full and incremental to use it immediately.  Then, I would 
like to improve it with protobuf and abstract out zookeeper. 

If the community accepts the solution of the general framework provided by [Full 
backup HBase-10900| https://issues.apache.org/jira/browse/HBASE-10900] and  
[incremental backup 
HBase-11085|https://issues.apache.org/jira/browse/HBASE-11085], we will build 
the patches of other features on top of the framework. 

At this moment, I am thinking about opening another review board for the combined 
patches of [both incremental and full backup | 
https://issues.apache.org/jira/secure/attachment/12644215/HBASE-11085-trunk-v1-contains-HBASE-10900-trunk-v4.patch].
 

I understand a lot of code is involved here, and I am open to any suggestion to 
make the review easier for everyone. :-) 

Demai

> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Richard Ding
>Assignee: Richard Ding
> Attachments: HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> We have completed the implementation of our backup/restore solution, and 
> would like to share it with the community through this jira. 
> We leverage the existing hbase snapshot feature and provide a general 
> solution for common users. Our full backup uses snapshot to capture 
> metadata locally and exportsnapshot to move data to another cluster; 
> the incremental backup uses offline-WALPlayer to back up HLogs; we also 
> leverage global distributed log roll and flush to improve performance. Other 
> added value includes convert, merge, progress report, and CLI commands, so 
> that a common user can back up hbase data without in-depth knowledge of hbase. 
>  Our solution also contains some usability features for enterprise users. 
> The detailed design document and CLI commands will be attached to this jira. We 
> plan to use 10~12 subtasks to share each of the following features, and 
> document the detailed implementation in the subtasks: 
> * *Full Backup* : provide local and remote backup/restore for a list of tables
> * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
> backup)
> * *distributed* Logroll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: to build on top of full backup as daily/weekly backup 
> * *Convert*  incremental backup WAL files int

[jira] [Created] (HBASE-11149) Wire encryption is broken

2014-05-13 Thread Devaraj Das (JIRA)
Devaraj Das created HBASE-11149:
---

 Summary: Wire encryption is broken
 Key: HBASE-11149
 URL: https://issues.apache.org/jira/browse/HBASE-11149
 Project: HBase
  Issue Type: Bug
  Components: IPC/RPC
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.99.0
 Attachments: 11149-1.txt

Upon some testing with the QOP configuration (hbase.rpc.protection), I discovered 
that RPC doesn't work with the "integrity" and "privacy" values for the 
configuration key. I was using 0.98.x for testing, but I believe the issue is 
there in trunk as well (I haven't checked 0.96 and 0.94).
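
For context, the three values accepted by hbase.rpc.protection ("authentication", "integrity", "privacy") correspond to the standard SASL quality-of-protection tokens "auth", "auth-int" and "auth-conf". The following is a minimal sketch of that mapping in plain Java. The class and method names (QopSketch, toSaslQop, saslProps) are hypothetical and are not HBase's actual code; only javax.security.sasl.Sasl.QOP is a real JDK constant.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import javax.security.sasl.Sasl;

// Hypothetical helper, not HBase code: maps an hbase.rpc.protection value
// to the SASL QOP token that a SASL client or server would be configured with.
final class QopSketch {
  private QopSketch() {}

  static String toSaslQop(String rpcProtection) {
    switch (rpcProtection.trim().toLowerCase(Locale.ROOT)) {
      case "authentication": return "auth";      // authentication only
      case "integrity":      return "auth-int";  // plus integrity checks
      case "privacy":        return "auth-conf"; // plus encryption
      default:
        throw new IllegalArgumentException("Unknown QOP value: " + rpcProtection);
    }
  }

  // Builds the property map that would be handed to Sasl.createSaslClient/Server.
  static Map<String, String> saslProps(String rpcProtection) {
    Map<String, String> props = new HashMap<>();
    props.put(Sasl.QOP, toSaslQop(rpcProtection));
    return props;
  }
}
```

With "integrity" or "privacy" negotiated, payloads also have to pass through the SASL wrap/unwrap calls on both sides, which is the kind of path the report above says is not working.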



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-10861) Supporting API in ByteRange

2014-05-13 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-10861:
---

Status: Open  (was: Patch Available)

> Supporting API in ByteRange
> ---
>
> Key: HBASE-10861
> URL: https://issues.apache.org/jira/browse/HBASE-10861
> Project: HBase
>  Issue Type: Improvement
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Attachments: HBASE-10861.patch, HBASE-10861_2.patch
>
>
> We would need APIs that would 
> setLimit(int limit)
> getLimit()
> asReadOnly()
> These APIs would help in implementations that have Buffers offheap (for now 
> BRs backed by DBB).
> If anything more is needed could be added when needed.
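
To illustrate what the three requested methods could look like, here is a minimal sketch built on java.nio.ByteBuffer, which already offers limit(int), limit() and asReadOnlyBuffer(). LimitedRangeSketch is a hypothetical stand-in and is not the ByteRange interface from the attached patches; the method shapes are assumptions.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a ByteRange backed by a direct ByteBuffer (DBB).
final class LimitedRangeSketch {
  private final ByteBuffer buf;

  private LimitedRangeSketch(ByteBuffer buf) { this.buf = buf; }

  static LimitedRangeSketch allocateOffheap(int capacity) {
    return new LimitedRangeSketch(ByteBuffer.allocateDirect(capacity));
  }

  // Restricts how far reads and writes may go; must not exceed the capacity.
  LimitedRangeSketch setLimit(int limit) {
    buf.limit(limit); // throws IllegalArgumentException if out of bounds
    return this;
  }

  int getLimit() { return buf.limit(); }

  // Returns a view over the same memory that rejects all writes.
  LimitedRangeSketch asReadOnly() {
    return new LimitedRangeSketch(buf.asReadOnlyBuffer());
  }

  LimitedRangeSketch put(int index, byte b) { buf.put(index, b); return this; }

  byte get(int index) { return buf.get(index); }
}
```

The read-only view shares the underlying memory, so no offheap data is copied; a write through the view throws java.nio.ReadOnlyBufferException.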





[jira] [Assigned] (HBASE-10251) Restore API Compat for PerformanceEvaluation.generateValue()

2014-05-13 Thread Jean-Daniel Cryans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans reassigned HBASE-10251:
--

Assignee: Dima Spivak  (was: Jean-Daniel Cryans)

God I hate jira, so easy to misclick. Reassigning Dima.

> Restore API Compat for PerformanceEvaluation.generateValue()
> 
>
> Key: HBASE-10251
> URL: https://issues.apache.org/jira/browse/HBASE-10251
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.98.0, 0.98.1, 0.99.0, 0.98.2
>Reporter: Aleksandr Shulman
>Assignee: Dima Spivak
>  Labels: api_compatibility
> Fix For: 0.98.0, 0.98.1, 0.99.0, 0.98.2
>
> Attachments: HBASE-10251-v2.patch, HBASE_10251.patch
>
>
> Observed:
> A couple of my client tests fail to compile against trunk because the method 
> PerformanceEvaluation.generateValue was removed as part of HBASE-8496.
> This is an issue because it was used in a number of places, including unit 
> tests. Since we did not explicitly label this API as private, it's ambiguous 
> as to whether this could/should have been used by people writing apps against 
> 0.96. If they used it, then they would be broken upon upgrade to 0.98 and 
> trunk.
> Potential Solution:
> The method was renamed to generateData, but the logic is still the same. We 
> can reintroduce it as deprecated in 0.98, as compat shim over generateData. 
> The patch should be a few lines. We may also consider doing so in trunk, but 
> I'd be just as fine with leaving it out.
> More generally, this raises the question about what other code is in this 
> "grey-area", where it is public, is used outside of the package, but is not 
> explicitly labeled with an InterfaceAudience annotation.
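
The compat shim proposed in the description can be sketched as follows. This is an illustration only: PerfEvalShimSketch is a hypothetical stand-in for PerformanceEvaluation, and the method signatures and the DEFAULT_VALUE_LENGTH constant are assumptions, not taken from the attached patches.

```java
import java.util.Random;

// Hypothetical stand-in for PerformanceEvaluation; signatures are assumptions.
final class PerfEvalShimSketch {
  static final int DEFAULT_VALUE_LENGTH = 1000; // assumed default, for illustration

  private PerfEvalShimSketch() {}

  // The renamed method that carries the actual logic.
  static byte[] generateData(Random r, int length) {
    byte[] b = new byte[length];
    r.nextBytes(b);
    return b;
  }

  /**
   * Old entry point, kept so existing callers still compile.
   * @deprecated use {@link #generateData(Random, int)} instead.
   */
  @Deprecated
  static byte[] generateValue(Random r) {
    return generateData(r, DEFAULT_VALUE_LENGTH);
  }
}
```

Because the shim only delegates, callers of the old name get the same bytes as callers of the new one for the same Random seed, plus a deprecation warning at compile time.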





[jira] [Assigned] (HBASE-10251) Restore API Compat for PerformanceEvaluation.generateValue()

2014-05-13 Thread Jean-Daniel Cryans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans reassigned HBASE-10251:
--

Assignee: Jean-Daniel Cryans  (was: Dima Spivak)

> Restore API Compat for PerformanceEvaluation.generateValue()
> 
>
> Key: HBASE-10251
> URL: https://issues.apache.org/jira/browse/HBASE-10251
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.98.0, 0.98.1, 0.99.0, 0.98.2
>Reporter: Aleksandr Shulman
>Assignee: Jean-Daniel Cryans
>  Labels: api_compatibility
> Fix For: 0.98.0, 0.98.1, 0.99.0, 0.98.2
>
> Attachments: HBASE-10251-v2.patch, HBASE_10251.patch
>
>
> Observed:
> A couple of my client tests fail to compile against trunk because the method 
> PerformanceEvaluation.generateValue was removed as part of HBASE-8496.
> This is an issue because it was used in a number of places, including unit 
> tests. Since we did not explicitly label this API as private, it's ambiguous 
> as to whether this could/should have been used by people writing apps against 
> 0.96. If they used it, then they would be broken upon upgrade to 0.98 and 
> trunk.
> Potential Solution:
> The method was renamed to generateData, but the logic is still the same. We 
> can reintroduce it as deprecated in 0.98, as compat shim over generateData. 
> The patch should be a few lines. We may also consider doing so in trunk, but 
> I'd be just as fine with leaving it out.
> More generally, this raises the question about what other code is in this 
> "grey-area", where it is public, is used outside of the package, but is not 
> explicitly labeled with an InterfaceAudience annotation.





[jira] [Commented] (HBASE-11126) Add RegionObserver pre hooks that operate under row lock

2014-05-13 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996067#comment-13996067
 ] 

ramkrishna.s.vasudevan commented on HBASE-11126:


As mail notifications are not happening, I will assign it to myself.  Reassign if you 
think otherwise, @apurtell.

> Add RegionObserver pre hooks that operate under row lock
> 
>
> Key: HBASE-11126
> URL: https://issues.apache.org/jira/browse/HBASE-11126
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.99.0, 0.98.3
>Reporter: Andrew Purtell
>
> The coprocessor hooks were placed outside of row locks. This was meant to 
> sidestep performance issues arising from significant work done within hook 
> invocations. However as the security code increases in sophistication we are 
> now running into concurrency issues trying to use them as a result of that 
> early decision. Since the initial introduction of coprocessor upcalls there 
> has been some significant refactoring done around them and concurrency 
> control in core has become more complex. This is potentially an issue for 
> many coprocessor users.
> We should do either:
> - Move all existing RegionObserver pre* hooks to execute under row lock.
> - Introduce a new set of RegionObserver pre* hooks that execute under row 
> lock, named to indicate such.
> The second option is less likely to lead to surprises.
> All RegionObserver hook Javadoc should be updated with advice to the 
> coprocessor implementor not to take their own row locks in the hook. If the 
> current thread happens to already have a row lock and they try to take a lock 
> on another row, there is a deadlock risk.
> As always a drawback of adding hooks is the potential for performance impact. 
> We should benchmark the impact and decide if the second option above is a 
> viable choice or if the first option is required.
> Finally, we should introduce a higher level interface for managing the 
> registration of 'user' code for execution from the low level hooks. I filed 
> HBASE-11125 to discuss this further.
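
The second option in the description (a separate set of pre hooks that run under the row lock) can be sketched in plain Java as below. This is a toy model, not HBase code: RowLockHookSketch, Observer, prePut and prePutUnderRowLock are hypothetical names, and real region locking is considerably more involved.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch, not HBase code.
final class RowLockHookSketch {
  interface Observer {
    default void prePut(String row) {}             // existing style: no lock held
    default void prePutUnderRowLock(String row) {} // new style: row lock held
  }

  private final ConcurrentHashMap<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();
  private final List<Observer> observers = new CopyOnWriteArrayList<>();
  final List<String> applied = new CopyOnWriteArrayList<>();

  void register(Observer o) { observers.add(o); }

  boolean holdsRowLock(String row) {
    ReentrantLock l = rowLocks.get(row);
    return l != null && l.isHeldByCurrentThread();
  }

  void put(String row) {
    for (Observer o : observers) o.prePut(row); // outside the row lock
    ReentrantLock lock = rowLocks.computeIfAbsent(row, k -> new ReentrantLock());
    lock.lock();
    try {
      // Hooks here must not take locks on other rows (deadlock risk).
      for (Observer o : observers) o.prePutUnderRowLock(row);
      applied.add(row);
    } finally {
      lock.unlock();
    }
  }
}
```

Naming the locked variant explicitly keeps the existing hooks' behavior unchanged, which is why the description calls this option less surprising. The cost is one extra upcall per operation, which is what the proposed benchmark would measure.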





[jira] [Assigned] (HBASE-10569) Co-locate meta and master

2014-05-13 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack reassigned HBASE-10569:
-

Assignee: Jimmy Xiang  (was: stack)

Sorry, assigned  myself by mistack.

> Co-locate meta and master
> -
>
> Key: HBASE-10569
> URL: https://issues.apache.org/jira/browse/HBASE-10569
> Project: HBase
>  Issue Type: Improvement
>  Components: master, Region Assignment
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 0.99.0
>
> Attachments: Co-locateMetaAndMasterHBASE-10569.pdf, 
> hbase-10569_v1.patch, hbase-10569_v2.patch, hbase-10569_v3.1.patch, 
> hbase-10569_v3.patch, master_rs.pdf
>
>
> I was thinking of simplifying/improving the region assignments. The first step 
> is to co-locate the meta and the master as many people agreed on HBASE-5487.





[jira] [Resolved] (HBASE-11041) HBaseTestingUtil.createMultiRegions deals incorrectly with missing column family

2014-05-13 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-11041.
---

   Resolution: Cannot Reproduce
Fix Version/s: (was: 0.98.3)
   (was: 0.94.20)
   (was: 0.96.3)
   (was: 0.99.0)

> HBaseTestingUtil.createMultiRegions deals incorrectly with missing column 
> family
> 
>
> Key: HBASE-11041
> URL: https://issues.apache.org/jira/browse/HBASE-11041
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>
> Just found a test failing like this:
> {code}
> Error Message
> HTableDescriptor is read-only
> Stacktrace
> java.lang.UnsupportedOperationException: HTableDescriptor is read-only
>   at 
> org.apache.hadoop.hbase.client.UnmodifyableHTableDescriptor.addFamily(UnmodifyableHTableDescriptor.java:64)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.createMultiRegions(HBaseTestingUtility.java:1302)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.createMultiRegions(HBaseTestingUtility.java:1291)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.createMultiRegions(HBaseTestingUtility.java:1286)
>   at 
> org.apache.hadoop.hbase.master.TestDistributedLogSplitting.installTable(TestDistributedLogSplitting.java:485)
>   at 
> org.apache.hadoop.hbase.master.TestDistributedLogSplitting.testMasterStartsUpWithLogSplittingWork(TestDistributedLogSplitting.java:282)
> {code}
> The code that causes this looks like this:
> {code}
> HTableDescriptor htd = table.getTableDescriptor();
> if(!htd.hasFamily(columnFamily)) {
>   HColumnDescriptor hcd = new HColumnDescriptor(columnFamily);
>   htd.addFamily(hcd);
> }
> {code}
> But note that table.getTableDescriptor() returns an 
> UnmodifyableHTableDescriptor, so the add would *always* fail.
> The specific test that failed was 
> TestDistributedLogSplitting.testMasterStartsUpWithLogSplittingWork.
> Looks like the HMaster did not have the last table descriptor state, yet.
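
The failure mode in the quoted snippet, and one way to avoid it, can be shown with a small self-contained model. The classes below are hypothetical stand-ins for HTableDescriptor and UnmodifyableHTableDescriptor, not the real HBase types. The idea is to copy the read-only descriptor into a mutable one before adding the family, rather than mutating what getTableDescriptor() returned. In a real cluster the modified copy would still have to be submitted through the admin API, which this sketch does not model.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for a mutable table descriptor.
class DescriptorSketch {
  protected final Set<String> families = new LinkedHashSet<>();

  DescriptorSketch() {}

  // Copy constructor: the copy is mutable even if 'other' is read-only.
  DescriptorSketch(DescriptorSketch other) { families.addAll(other.families); }

  boolean hasFamily(String f) { return families.contains(f); }

  void addFamily(String f) { families.add(f); }
}

// Hypothetical stand-in for the read-only descriptor a client gets back.
final class ReadOnlyDescriptorSketch extends DescriptorSketch {
  ReadOnlyDescriptorSketch(DescriptorSketch other) { super(other); }

  @Override
  void addFamily(String f) {
    throw new UnsupportedOperationException("descriptor is read-only");
  }
}

final class EnsureFamilySketch {
  private EnsureFamilySketch() {}

  // Returns a mutable copy that contains the family; never mutates the input.
  static DescriptorSketch withFamily(DescriptorSketch current, String family) {
    DescriptorSketch copy = new DescriptorSketch(current);
    if (!copy.hasFamily(family)) {
      copy.addFamily(family);
    }
    return copy;
  }
}
```

Calling addFamily on the read-only stand-in throws UnsupportedOperationException, mirroring the stack trace above, while withFamily succeeds and leaves the original untouched.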





[jira] [Updated] (HBASE-11143) Improve replication metrics

2014-05-13 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-11143:
--

Attachment: 11143-0.94-v3.txt

One last small improvement: Log the number of bytes in the "Replicating ..." 
debug log message.

> Improve replication metrics
> ---
>
> Key: HBASE-11143
> URL: https://issues.apache.org/jira/browse/HBASE-11143
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.99.0, 0.94.20, 0.98.3
>
> Attachments: 11143-0.94-v2.txt, 11143-0.94-v3.txt, 11143-0.94.txt, 
> 11143-trunk.txt
>
>
> We are trying to report on replication lag and find that there is no good 
> single metric to do that.
> ageOfLastShippedOp is close, but unfortunately it is increased even when 
> there is nothing to ship on a particular RegionServer.
> I would like to discuss a few options here:
> Add a new metric: replicationQueueTime (or something) with the above meaning. 
> I.e. if we have something to ship we set the age of that last shipped edit, 
> if we fail we increment that last time (just like we do now). But if there is 
> nothing to replicate we set it to the current time (and hence that metric is 
> reported as close to 0).
> Alternatively, we could change the meaning of ageOfLastShippedOp to mean 
> that. That might lead to surprises, but the current behavior is clearly weird 
> when there is nothing to replicate.
> Comments? [~jdcryans], [~stack].
> If the approach sounds good, I'll make a patch for all branches.
> Edit: Also adds a new shippedKBs metric to track the amount of data that is 
> shipped via replication.
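
The proposed metric semantics can be sketched as a small state holder. This is an illustration of the description only, not the code in the attached patches; ReplicationLagSketch and its method names are hypothetical, and the clock is passed in explicitly so the behavior is easy to follow.

```java
// Hypothetical sketch of the proposed metric; names are illustrative only.
final class ReplicationLagSketch {
  // Write time of the last edit considered shipped, in epoch millis.
  private long lastShippedTs;

  ReplicationLagSketch(long now) { this.lastShippedTs = now; }

  // Something was shipped: remember the write time of the last shipped edit.
  void onShipped(long editWriteTime) { lastShippedTs = editWriteTime; }

  // Queue is empty: treat "now" as shipped so the reported age drops to ~0.
  void onNothingToReplicate(long now) { lastShippedTs = now; }

  // On a failed ship nothing is updated, so the reported age keeps growing.
  long ageMillis(long now) { return Math.max(0L, now - lastShippedTs); }
}
```

With these rules a growing value means edits are waiting or shipping is failing, and an idle RegionServer reports a value near zero instead of an ever-increasing age.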




