[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-23 Thread gaojinchao (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190945#comment-13190945
 ] 

gaojinchao commented on HBASE-5231:
---

I think we can do that. 
Regarding the log "Done. Calculated a load balance in", we can move it out of 
balanceCluster().
Move it to the code below?

+  for (Map<ServerName, List<HRegionInfo>> assignments : 
+      assignmentsByTable.values()) {
+    List<RegionPlan> partialPlans = 
+        this.balancer.balanceCluster(assignments);
+    if (partialPlans != null) plans.addAll(partialPlans);
   }
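
The suggested loop can be fleshed out roughly as follows. This is a sketch with stand-in types (the mail archive stripped the generics; `ServerName`, `HRegionInfo`, and `RegionPlan` are the HBase types involved), not the actual HMaster code, with the "Done. Calculated a load balance in" timing hoisted out of balanceCluster() as proposed:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Stand-ins for the HBase types stripped by the mail archive.
class ServerName {}
class HRegionInfo {}
class RegionPlan {}

// Single-method stand-in for the load balancer interface.
interface Balancer {
    List<RegionPlan> balanceCluster(Map<ServerName, List<HRegionInfo>> assignments);
}

public class PerTableBalanceSketch {
    // One balanceCluster() call per table, with the timing log hoisted out of
    // balanceCluster() so it covers the whole per-table pass.
    static List<RegionPlan> balance(
            Map<String, Map<ServerName, List<HRegionInfo>>> assignmentsByTable,
            Balancer balancer) {
        long start = System.currentTimeMillis();
        List<RegionPlan> plans = new ArrayList<>();
        for (Map<ServerName, List<HRegionInfo>> assignments : assignmentsByTable.values()) {
            List<RegionPlan> partialPlans = balancer.balanceCluster(assignments);
            if (partialPlans != null) plans.addAll(partialPlans);
        }
        System.out.println("Done. Calculated a load balance in "
                + (System.currentTimeMillis() - start) + "ms");
        return plans;
    }
}
```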

> Backport HBASE-3373 (per-table load balancing) to 0.92
> --
>
> Key: HBASE-5231
> URL: https://issues.apache.org/jira/browse/HBASE-5231
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
> Fix For: 0.92.1
>
> Attachments: 5231.txt
>
>
> This JIRA backports per-table load balancing to 0.92

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5179) Concurrent processing of processFailover and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

2012-01-23 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5179:
-

Fix Version/s: (was: 0.92.0)
   0.92.1

> Concurrent processing of processFailover and ServerShutdownHandler may cause 
> region to be assigned before log splitting is completed, causing data loss
> 
>
> Key: HBASE-5179
> URL: https://issues.apache.org/jira/browse/HBASE-5179
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.94.0, 0.92.1, 0.90.6
>
> Attachments: 5179-90.txt, 5179-90v10.patch, 5179-90v11.patch, 
> 5179-90v12.patch, 5179-90v13.txt, 5179-90v14.patch, 5179-90v15.patch, 
> 5179-90v16.patch, 5179-90v17.txt, 5179-90v2.patch, 5179-90v3.patch, 
> 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 5179-90v7.patch, 
> 5179-90v8.patch, 5179-90v9.patch, 5179-92v17.patch, 5179-v11-92.txt, 
> 5179-v11.txt, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, Errorlog, 
> hbase-5179.patch, hbase-5179v10.patch, hbase-5179v12.patch, 
> hbase-5179v17.patch, hbase-5179v5.patch, hbase-5179v6.patch, 
> hbase-5179v7.patch, hbase-5179v8.patch, hbase-5179v9.patch
>
>
> If the master's failover processing and ServerShutdownHandler's processing 
> happen concurrently, the following case can arise:
> 1. The master completes splitLogAfterStartup().
> 2. RegionserverA restarts, and ServerShutdownHandler starts processing it.
> 3. The master starts to rebuildUserRegions(), and RegionserverA is considered 
> a dead server.
> 4. The master starts to assign RegionserverA's regions because step 3 marked 
> it as a dead server.
> However, while step 4 (region assignment) runs, ServerShutdownHandler may 
> still be splitting the logs, which can cause data loss.
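
The guard this issue calls for can be sketched as below. The class and method names are hypothetical, purely to illustrate the invariant (no assignment of a dead server's regions while its WAL split is in flight); the actual patch touches the master failover path and ServerShutdownHandler:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the invariant described above: regions of a dead
// server must not be assigned until that server's log has been split.
public class LogSplitGuard {
    private final Set<String> serversWithUnsplitLogs = ConcurrentHashMap.newKeySet();

    // Steps 2-3: the server is considered dead, its log split is still pending.
    public void markDead(String serverName) {
        serversWithUnsplitLogs.add(serverName);
    }

    // ServerShutdownHandler finished splitting this server's log.
    public void logSplitDone(String serverName) {
        serversWithUnsplitLogs.remove(serverName);
    }

    // Checked before step 4 (assignment): returns false while a split is in
    // flight, so the region is retried later instead of being opened over
    // unrecovered edits.
    public boolean safeToAssign(String lastHostingServer) {
        return !serversWithUnsplitLogs.contains(lastHostingServer);
    }
}
```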

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5237) Addendum for HBASE-5160 and HBASE-4397

2012-01-23 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5237:
-

Fix Version/s: (was: 0.92.0)
   0.92.1

> Addendum for HBASE-5160 and HBASE-4397
> --
>
> Key: HBASE-5237
> URL: https://issues.apache.org/jira/browse/HBASE-5237
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.92.1, 0.90.6
>
> Attachments: HBASE-5237_0.90.patch, HBASE-5237_trunk.patch
>
>
> As part of HBASE-4397 there is one more scenario where the patch has to be 
> applied.
> {code}
> RegionPlan plan = getRegionPlan(state, forceNewPlan);
>   if (plan == null) {
> debugLog(state.getRegion(),
> "Unable to determine a plan to assign " + state);
> return; // Should get reassigned later when RIT times out.
>   }
> {code}
> I think
> {code}
> this.timeoutMonitor.setAllRegionServersOffline(true);
> {code}
> should also be done in this scenario.
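
Put together, the proposed change reads roughly like the sketch below. The types are stand-ins for the AssignmentManager internals quoted above, not the actual patch:

```java
// Stand-in for the region-in-transition state quoted above.
class RegionState {
    final String region;
    RegionState(String r) { region = r; }
}

// Stand-in for the timeout monitor whose flag the addendum sets.
class TimeoutMonitor {
    private boolean allRegionServersOffline;
    void setAllRegionServersOffline(boolean b) { allRegionServersOffline = b; }
    boolean isAllRegionServersOffline() { return allRegionServersOffline; }
}

public class AssignSketch {
    final TimeoutMonitor timeoutMonitor = new TimeoutMonitor();

    // Returns true if an assignment was attempted, false if it was deferred
    // to the RIT timeout path.
    boolean assign(RegionState state, Object plan) {
        if (plan == null) {
            System.out.println("Unable to determine a plan to assign " + state.region);
            // The addendum: also flag the timeout monitor here, so the region
            // is re-triggered when RIT times out instead of being left stuck.
            timeoutMonitor.setAllRegionServersOffline(true);
            return false; // Should get reassigned later when RIT times out.
        }
        return true;
    }
}
```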

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3796) Per-Store Entries in Compaction Queue

2012-01-23 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-3796:
-

Fix Version/s: (was: 0.92.0)
   0.92.1

> Per-Store Entries in Compaction Queue
> -
>
> Key: HBASE-3796
> URL: https://issues.apache.org/jira/browse/HBASE-3796
> Project: HBase
>  Issue Type: Bug
>Reporter: Nicolas Spiegelberg
>Assignee: Mikhail Bautin
>Priority: Minor
> Fix For: 0.92.1
>
> Attachments: HBASE-3796-fixed.patch, HBASE-3796.patch
>
>
> Although compaction is decided on a per-store basis, right now the 
> CompactSplitThread only deals at the Region level for queueing.  Store-level 
> compaction queue entries will give us more visibility into compaction 
> workload + allow us to stop summarizing priorities.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5139) Compute (weighted) median using AggregateProtocol

2012-01-23 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5139:
--

Attachment: 5139.addendum

Addendum that handles startRow being null for the case where the median is in 
the first region.

> Compute (weighted) median using AggregateProtocol
> -
>
> Key: HBASE-5139
> URL: https://issues.apache.org/jira/browse/HBASE-5139
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Zhihong Yu
>Assignee: Zhihong Yu
> Attachments: 5139-v2.txt, 5139.addendum
>
>
> Suppose cf:cq1 stores numeric values and optionally cf:cq2 stores weights. 
> This task finds out the median value among the values of cf:cq1 (See 
> http://www.stat.ucl.ac.be/ISdidactique/Rhelp/library/R.basic/html/weighted.median.html)
> This can be done in two passes.
> The first pass utilizes AggregateProtocol where the following tuple is 
> returned from each region:
> (partial-sum-of-values, partial-sum-of-weights)
> The start rowkey (supplied by coprocessor framework) would be used to sort 
> the tuples. This way we can determine which region (called R) contains the 
> (weighted) median. partial-sum-of-weights can be 0 if unweighted median is 
> sought
> The second pass involves scanning the table, beginning with startrow of 
> region R and computing partial (weighted) sum until the threshold of S/2 is 
> crossed. The (weighted) median is returned.
> However, this approach wouldn't work if there is mutation in the underlying 
> table between pass one and pass two.
> In that case, sequential scanning seems to be the solution which is slower 
> than the above approach.
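
The two-pass scheme above can be sketched in plain Java, with in-memory "regions" (rowkey-ordered slices of (value, weight) pairs) standing in for the per-region AggregateProtocol round-trips. A sketch of the algorithm, not the patch:

```java
import java.util.List;

// In-memory sketch of the two-pass weighted-median scheme described above.
public class WeightedMedianSketch {
    // Pass 1 uses only per-region partial weight sums to locate the region R
    // containing the median; pass 2 scans R row by row until the running
    // weight crosses S/2. Each region entry is a {value, weight} pair.
    static double weightedMedian(List<double[][]> regions) {
        double total = 0;
        for (double[][] region : regions)
            for (double[] vw : region) total += vw[1];
        double half = total / 2.0, running = 0;
        int r = 0;
        // Pass 1: find region R from partial sums alone.
        for (; r < regions.size(); r++) {
            double regionSum = 0;
            for (double[] vw : regions.get(r)) regionSum += vw[1];
            if (running + regionSum >= half) break;
            running += regionSum;
        }
        // Pass 2: scan region R, beginning at its start row.
        for (double[] vw : regions.get(r)) {
            running += vw[1];
            if (running >= half) return vw[0];
        }
        throw new IllegalStateException("empty input");
    }
}
```

With unit weights this degenerates to the plain (lower) median, matching the "partial-sum-of-weights can be 0 if an unweighted median is sought" remark, where 0 effectively means "count rows instead".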

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5139) Compute (weighted) median using AggregateProtocol

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191261#comment-13191261
 ] 

Hadoop QA commented on HBASE-5139:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12511514/5139.addendum
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/836//console

This message is automatically generated.

> Compute (weighted) median using AggregateProtocol
> -
>
> Key: HBASE-5139
> URL: https://issues.apache.org/jira/browse/HBASE-5139
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Zhihong Yu
>Assignee: Zhihong Yu
> Attachments: 5139-v2.txt, 5139.addendum
>
>
> Suppose cf:cq1 stores numeric values and optionally cf:cq2 stores weights. 
> This task finds out the median value among the values of cf:cq1 (See 
> http://www.stat.ucl.ac.be/ISdidactique/Rhelp/library/R.basic/html/weighted.median.html)
> This can be done in two passes.
> The first pass utilizes AggregateProtocol where the following tuple is 
> returned from each region:
> (partial-sum-of-values, partial-sum-of-weights)
> The start rowkey (supplied by coprocessor framework) would be used to sort 
> the tuples. This way we can determine which region (called R) contains the 
> (weighted) median. partial-sum-of-weights can be 0 if unweighted median is 
> sought
> The second pass involves scanning the table, beginning with startrow of 
> region R and computing partial (weighted) sum until the threshold of S/2 is 
> crossed. The (weighted) median is returned.
> However, this approach wouldn't work if there is mutation in the underlying 
> table between pass one and pass two.
> In that case, sequential scanning seems to be the solution which is slower 
> than the above approach.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-23 Thread Zhihong Yu (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190945#comment-13190945
 ] 

Zhihong Yu edited comment on HBASE-5231 at 1/23/12 5:16 PM:


I think we can do that. 
Regarding the log "Done. Calculated a load balance in", we can move it out of 
balanceCluster().
Move it to the code below?
{code}
+  for (Map<ServerName, List<HRegionInfo>> assignments : 
+      assignmentsByTable.values()) {
+    List<RegionPlan> partialPlans = 
+        this.balancer.balanceCluster(assignments);
+    if (partialPlans != null) plans.addAll(partialPlans);
   }
{code}

  was (Author: sunnygao):
I think we can do. 
regarding to log "Done. Calculated a load balance in" , we can move out 
"balanceCluster".
move to below code ?

+  for (Map<ServerName, List<HRegionInfo>> assignments : 
+      assignmentsByTable.values()) {
+    List<RegionPlan> partialPlans = 
+        this.balancer.balanceCluster(assignments);
+    if (partialPlans != null) plans.addAll(partialPlans);
   }
  
> Backport HBASE-3373 (per-table load balancing) to 0.92
> --
>
> Key: HBASE-5231
> URL: https://issues.apache.org/jira/browse/HBASE-5231
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
> Fix For: 0.92.1
>
> Attachments: 5231.txt
>
>
> This JIRA backports per-table load balancing to 0.92

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5258) Move coprocessors set out of RegionLoad

2012-01-23 Thread Zhihong Yu (Created) (JIRA)
Move coprocessors set out of RegionLoad
---

 Key: HBASE-5258
 URL: https://issues.apache.org/jira/browse/HBASE-5258
 Project: HBase
  Issue Type: Task
Reporter: Zhihong Yu


When I worked on HBASE-5256, I revisited the code related to ser/de of the 
coprocessors set in RegionLoad.

I think the rationale for embedding the coprocessors set is maximum 
flexibility: each region can load different coprocessors.
This flexibility, however, adds cost to region-server-to-Master communication 
and increases the footprint of the Master heap.

Would HServerLoad be a better place for this set?
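
A back-of-the-envelope sketch of the duplication in question; the helper below is hypothetical, purely to make the cost concrete: with the set embedded in RegionLoad, every region on a server serializes its own copy of the same coprocessor names, while hoisting it into HServerLoad would ship one copy per server.

```java
import java.util.Set;

// Hypothetical illustration of the serialization/heap cost discussed above.
public class CoprocessorFootprintSketch {
    // RegionLoad placement: one copy of the set serialized per region.
    static int perRegionEntries(int regionsOnServer, Set<String> coprocessors) {
        return regionsOnServer * coprocessors.size();
    }

    // HServerLoad placement: one copy of the set serialized per server.
    static int perServerEntries(Set<String> coprocessors) {
        return coprocessors.size();
    }
}
```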

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5255) Use singletons for OperationStatus to save memory

2012-01-23 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5255:
--

Attachment: 5255-v2.txt

Patch v2 makes exceptionMsg and code fields final.

> Use singletons for OperationStatus to save memory
> -
>
> Key: HBASE-5255
> URL: https://issues.apache.org/jira/browse/HBASE-5255
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.92.0, 0.90.5
>Reporter: Benoit Sigoure
>Assignee: Benoit Sigoure
>Priority: Minor
>  Labels: performance
> Fix For: 0.94.0, 0.92.1
>
> Attachments: 5255-v2.txt, 
> HBASE-5255-0.92-Use-singletons-to-remove-unnecessary-memory-allocati.patch, 
> HBASE-5255-trunk-Use-singletons-to-remove-unnecessary-memory-allocati.patch
>
>
> Every single {{Put}} causes the allocation of at least one 
> {{OperationStatus}}, yet {{OperationStatus}} is almost always stateless, so 
> these allocations are unnecessary and could be avoided.  Attached patch adds 
> a few singletons and uses them, with no public API change.  I didn't test the 
> patches, but you get the idea.
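
The singleton idea reads roughly like the sketch below (simplified names, not the actual patch). Stateless outcomes are shared constants, so only failures carrying a message still allocate, and patch v2's point about making `code` and `exceptionMsg` final is what makes sharing the instances safe:

```java
// Sketch of sharing stateless OperationStatus instances instead of
// allocating one per Put. Names simplified; not the actual HBase patch.
public class OperationStatusSketch {
    enum Code { SUCCESS, FAILURE, SANITY_CHECK_FAILURE }

    static final class OperationStatus {
        // Pre-built stateless instances, reused for every operation.
        static final OperationStatus SUCCESS = new OperationStatus(Code.SUCCESS, null);
        static final OperationStatus FAILURE = new OperationStatus(Code.FAILURE, null);

        private final Code code;           // final: instance is immutable,
        private final String exceptionMsg; // so sharing it is safe

        private OperationStatus(Code code, String exceptionMsg) {
            this.code = code;
            this.exceptionMsg = exceptionMsg;
        }

        // Stateful failures (carrying a message) still allocate.
        static OperationStatus failure(String msg) {
            return new OperationStatus(Code.FAILURE, msg);
        }

        Code getCode() { return code; }
        String getExceptionMsg() { return exceptionMsg; }
    }
}
```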

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191281#comment-13191281
 ] 

Zhihong Yu commented on HBASE-5231:
---

The above mentioned log marks the completion of balancing each table (or the 
whole cluster) where actual region movement is scheduled.
I feel we can leave it there for now.

> Backport HBASE-3373 (per-table load balancing) to 0.92
> --
>
> Key: HBASE-5231
> URL: https://issues.apache.org/jira/browse/HBASE-5231
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
> Fix For: 0.92.1
>
> Attachments: 5231.txt
>
>
> This JIRA backports per-table load balancing to 0.92

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-23 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5231:
--

Attachment: 5231-v2.txt

Patch v2 which I am going to integrate later today.

> Backport HBASE-3373 (per-table load balancing) to 0.92
> --
>
> Key: HBASE-5231
> URL: https://issues.apache.org/jira/browse/HBASE-5231
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
> Fix For: 0.92.1
>
> Attachments: 5231-v2.txt, 5231.txt
>
>
> This JIRA backports per-table load balancing to 0.92

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5240) HBase internalscanner.next javadoc doesn't imply whether or not results are appended or not

2012-01-23 Thread Alex Newman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-5240:
---

Status: Patch Available  (was: Open)

> HBase internalscanner.next javadoc doesn't imply whether or not results are 
> appended or not
> ---
>
> Key: HBASE-5240
> URL: https://issues.apache.org/jira/browse/HBASE-5240
> Project: HBase
>  Issue Type: Bug
>Reporter: Alex Newman
>Assignee: Alex Newman
> Attachments: 
> 0001-HBASE-5240.-HBase-internalscanner.next-javadoc-doesn.patch
>
>
> Just looking at 
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/InternalScanner.html.
>  We don't know whether or not the results are appended to results list, or if 
> we always clear it first.
> boolean   next(List<KeyValue> results)
>   Grab the next row's worth of values.
>  boolean  next(List<KeyValue> result, int limit)
>   Grab the next row's worth of values with a limit on the number of 
> values to return.
>  
> Method Detail
> next
> boolean next(List<KeyValue> results)
>  throws IOException
> Grab the next row's worth of values.
> Parameters:
> results - return output array 
> Returns:
> true if more rows exist after this one, false if scanner is done 
> Throws:
> IOException - e
> next
> boolean next(List<KeyValue> result,
>  int limit)
>  throws IOException
> Grab the next row's worth of values with a limit on the number of values 
> to return.
> Parameters:
> result - return output array
> limit - limit on row count to get 
> Returns:
> true if more rows exist after this one, false if scanner is done 
> Throws:
> IOException - e
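
One way to resolve the ambiguity is to state the contract in the javadoc itself. A sketch of such wording on a stand-in interface, assuming append semantics purely for illustration; the patch should document whichever behavior InternalScanner actually guarantees:

```java
import java.io.IOException;
import java.util.List;

// Stand-in for InternalScanner with the contract made explicit in javadoc.
// The "appended" wording is an assumption for illustration.
interface InternalScannerSketch<KV> {
    /**
     * Grab the next row's worth of values.
     * <p>Values are <b>appended</b> to {@code results}; this method never
     * clears the list, so callers reusing a list across rows must clear it
     * themselves between calls.
     * @param results output list; appended to, not cleared
     * @return true if more rows exist after this one, false if scanner is done
     * @throws IOException on underlying store error
     */
    boolean next(List<KV> results) throws IOException;
}
```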

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5240) HBase internalscanner.next javadoc doesn't imply whether or not results are appended or not

2012-01-23 Thread Alex Newman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-5240:
---

Attachment: 0001-HBASE-5240.-HBase-internalscanner.next-javadoc-doesn.patch

> HBase internalscanner.next javadoc doesn't imply whether or not results are 
> appended or not
> ---
>
> Key: HBASE-5240
> URL: https://issues.apache.org/jira/browse/HBASE-5240
> Project: HBase
>  Issue Type: Bug
>Reporter: Alex Newman
>Assignee: Alex Newman
> Attachments: 
> 0001-HBASE-5240.-HBase-internalscanner.next-javadoc-doesn.patch
>
>
> Just looking at 
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/InternalScanner.html.
>  We don't know whether or not the results are appended to results list, or if 
> we always clear it first.
> boolean   next(List<KeyValue> results)
>   Grab the next row's worth of values.
>  boolean  next(List<KeyValue> result, int limit)
>   Grab the next row's worth of values with a limit on the number of 
> values to return.
>  
> Method Detail
> next
> boolean next(List<KeyValue> results)
>  throws IOException
> Grab the next row's worth of values.
> Parameters:
> results - return output array 
> Returns:
> true if more rows exist after this one, false if scanner is done 
> Throws:
> IOException - e
> next
> boolean next(List<KeyValue> result,
>  int limit)
>  throws IOException
> Grab the next row's worth of values with a limit on the number of values 
> to return.
> Parameters:
> result - return output array
> limit - limit on row count to get 
> Returns:
> true if more rows exist after this one, false if scanner is done 
> Throws:
> IOException - e

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3796) Per-Store Entries in Compaction Queue

2012-01-23 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-3796:
--

Release Note:   (was: Sorry, it seems like I re-opened the wrong patch 
instead of HBASE-3976. Restoring the "Fixed" status.)

> Per-Store Entries in Compaction Queue
> -
>
> Key: HBASE-3796
> URL: https://issues.apache.org/jira/browse/HBASE-3796
> Project: HBase
>  Issue Type: Bug
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>Priority: Minor
> Fix For: 0.92.1
>
> Attachments: HBASE-3796-fixed.patch, HBASE-3796.patch
>
>
> Although compaction is decided on a per-store basis, right now the 
> CompactSplitThread only deals at the Region level for queueing.  Store-level 
> compaction queue entries will give us more visibility into compaction 
> workload + allow us to stop summarizing priorities.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HBASE-3796) Per-Store Entries in Compaction Queue

2012-01-23 Thread Mikhail Bautin (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin resolved HBASE-3796.
---

  Resolution: Fixed
Assignee: Nicolas Spiegelberg  (was: Mikhail Bautin)
Release Note: Sorry, it seems like I re-opened the wrong patch instead of 
HBASE-3976. Restoring the "Fixed" status.

> Per-Store Entries in Compaction Queue
> -
>
> Key: HBASE-3796
> URL: https://issues.apache.org/jira/browse/HBASE-3796
> Project: HBase
>  Issue Type: Bug
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>Priority: Minor
> Fix For: 0.92.1
>
> Attachments: HBASE-3796-fixed.patch, HBASE-3796.patch
>
>
> Although compaction is decided on a per-store basis, right now the 
> CompactSplitThread only deals at the Region level for queueing.  Store-level 
> compaction queue entries will give us more visibility into compaction 
> workload + allow us to stop summarizing priorities.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Reopened] (HBASE-3976) Disable Block Cache On Compactions

2012-01-23 Thread Mikhail Bautin (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin reopened HBASE-3976:
---

  Assignee: Mikhail Bautin  (was: Nicolas Spiegelberg)

Re-opening until we add a unit test and implement a proper fix.

> Disable Block Cache On Compactions
> --
>
> Key: HBASE-3976
> URL: https://issues.apache.org/jira/browse/HBASE-3976
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.90.3
>Reporter: Karthick Sankarachary
>Assignee: Mikhail Bautin
>Priority: Minor
> Attachments: HBASE-3976-V3.patch, HBASE-3976-unconditional.patch, 
> HBASE-3976.patch
>
>
> Is there a good reason to believe that caching blocks during compactions is 
> beneficial? Currently, if block cache is enabled on a certain family, then 
> every time it's compacted, we load all of its blocks into the (LRU) cache, at 
> the expense of the legitimately hot ones.
> As a matter of fact, this concern was raised earlier in HBASE-1597, which 
> rightly points out that, "we should not bog down the LRU with unneccessary 
> blocks" during compaction. Even though that issue has been marked as "fixed", 
> it looks like it ought to be reopened.
> Should we err on the side of caution and not cache blocks during compactions 
> period (as illustrated in the attached patch)? Or, can we be selectively 
> aggressive about what blocks do get cached during compaction (e.g., only 
> cache those blocks from the recent files)?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3796) Per-Store Entries in Compaction Queue

2012-01-23 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191303#comment-13191303
 ] 

Mikhail Bautin commented on HBASE-3796:
---

Sorry, it seems like I re-opened the wrong patch instead of HBASE-3976. 
Restoring the "Fixed" status.

> Per-Store Entries in Compaction Queue
> -
>
> Key: HBASE-3796
> URL: https://issues.apache.org/jira/browse/HBASE-3796
> Project: HBase
>  Issue Type: Bug
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>Priority: Minor
> Fix For: 0.92.1
>
> Attachments: HBASE-3796-fixed.patch, HBASE-3796.patch
>
>
> Although compaction is decided on a per-store basis, right now the 
> CompactSplitThread only deals at the Region level for queueing.  Store-level 
> compaction queue entries will give us more visibility into compaction 
> workload + allow us to stop summarizing priorities.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4920) We need a mascot, a totem

2012-01-23 Thread Marcy Davis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191306#comment-13191306
 ] 

Marcy Davis commented on HBASE-4920:


I will have my friend play around with the Orca image some more based on 
everyone's comments. @Lars Hofhansl, do you have an image of an octopus you 
want to suggest?

> We need a mascot, a totem
> -
>
> Key: HBASE-4920
> URL: https://issues.apache.org/jira/browse/HBASE-4920
> Project: HBase
>  Issue Type: Task
>Reporter: stack
> Attachments: HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 
> 2011-11-30 at 4.06.17 PM.png, photo (2).JPG
>
>
> We need a totem for our t-shirt that is yet to be printed.  O'Reilly owns the 
> Clydesdale.  We need something else.
> We could have a fluffy little duck that quacks 'hbase!' when you squeeze it 
> and we could order boxes of them from some off-shore sweatshop that 
> subcontracts to a contractor who employs child labor only.
> Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from 
> Salesforce showed me, that was a bit too spiritual for me to be seen quoting 
> here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in 
> translation, bigdata).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5255) Use singletons for OperationStatus to save memory

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191307#comment-13191307
 ] 

Hadoop QA commented on HBASE-5255:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12511518/5255-v2.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 84 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/837//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/837//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/837//console

This message is automatically generated.

> Use singletons for OperationStatus to save memory
> -
>
> Key: HBASE-5255
> URL: https://issues.apache.org/jira/browse/HBASE-5255
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.92.0, 0.90.5
>Reporter: Benoit Sigoure
>Assignee: Benoit Sigoure
>Priority: Minor
>  Labels: performance
> Fix For: 0.94.0, 0.92.1
>
> Attachments: 5255-v2.txt, 
> HBASE-5255-0.92-Use-singletons-to-remove-unnecessary-memory-allocati.patch, 
> HBASE-5255-trunk-Use-singletons-to-remove-unnecessary-memory-allocati.patch
>
>
> Every single {{Put}} causes the allocation of at least one 
> {{OperationStatus}}, yet {{OperationStatus}} is almost always stateless, so 
> these allocations are unnecessary and could be avoided.  Attached patch adds 
> a few singletons and uses them, with no public API change.  I didn't test the 
> patches, but you get the idea.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4920) We need a mascot, a totem

2012-01-23 Thread Marcy Davis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcy Davis updated HBASE-4920:
---

Attachment: apache hbase orca logo_Proof 3.pdf

Here are a few other Orca design options (2 in black and white). 

> We need a mascot, a totem
> -
>
> Key: HBASE-4920
> URL: https://issues.apache.org/jira/browse/HBASE-4920
> Project: HBase
>  Issue Type: Task
>Reporter: stack
> Attachments: HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 
> 2011-11-30 at 4.06.17 PM.png, apache hbase orca logo_Proof 3.pdf, photo 
> (2).JPG
>
>
> We need a totem for our t-shirt that is yet to be printed.  O'Reilly owns the 
> Clydesdale.  We need something else.
> We could have a fluffy little duck that quacks 'hbase!' when you squeeze it 
> and we could order boxes of them from some off-shore sweatshop that 
> subcontracts to a contractor who employs child labor only.
> Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from 
> Salesforce showed me, that was a bit too spiritual for me to be seen quoting 
> here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in 
> translation, bigdata).





[jira] [Commented] (HBASE-5210) HFiles are missing from an incremental load

2012-01-23 Thread Jimmy Xiang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191310#comment-13191310
 ] 

Jimmy Xiang commented on HBASE-5210:


Any fix in getRandomFilename will only reduce the chance of a file name 
collision.  Since this is a rare case, I think it may be better to just fail the 
task if it fails to commit the files in moveTaskOutputs(), without overwriting 
the existing files.  In HDFS 0.23, rename() takes an option not to overwrite.  
With Hadoop 0.20, we can just do our best to check for any conflicts before 
committing the files.
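A minimal local-filesystem sketch of that no-overwrite commit (java.nio is only 
an analogue here; the real fix would live against Hadoop's FileSystem API, and 
the SafeCommit/commit names are made up for illustration):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SafeCommit {
    /** Move src into destDir without ever overwriting an existing file,
     *  renaming on conflict. (Local-FS analogue of the suggested
     *  moveTaskOutputs change; names here are illustrative only.) */
    static Path commit(Path src, Path destDir, String name) throws IOException {
        Path dest = destDir.resolve(name);
        int attempt = 0;
        while (Files.exists(dest)) {
            // conflict: choose a fresh name instead of clobbering
            dest = destDir.resolve(name + "." + (++attempt));
        }
        // without StandardCopyOption.REPLACE_EXISTING, move() fails rather
        // than overwrite -- much like the no-overwrite rename() in HDFS 0.23
        return Files.move(src, dest);
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempDirectory("committed");
        Path t1 = Files.createTempDirectory("task1");
        Path t2 = Files.createTempDirectory("task2");
        // both reduce tasks happened to generate the same random file name
        Files.write(t1.resolve("3f2a"), "task1-data".getBytes());
        Files.write(t2.resolve("3f2a"), "task2-data".getBytes());
        System.out.println(commit(t1.resolve("3f2a"), out, "3f2a").getFileName());
        System.out.println(commit(t2.resolve("3f2a"), out, "3f2a").getFileName());
        // prints 3f2a then 3f2a.1 -- neither task's file is lost
    }
}
```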

> HFiles are missing from an incremental load
> ---
>
> Key: HBASE-5210
> URL: https://issues.apache.org/jira/browse/HBASE-5210
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.2
> Environment: HBase 0.90.2 with Hadoop-0.20.2 (with durable sync).  
> RHEL 2.6.18-164.15.1.el5.  4 node cluster (1 master, 3 slaves)
>Reporter: Lawrence Simpson
> Attachments: HBASE-5210-crazy-new-getRandomFilename.patch
>
>
> We run an overnight map/reduce job that loads data from an external source 
> and adds that data to an existing HBase table.  The input files have been 
> loaded into hdfs.  The map/reduce job uses the HFileOutputFormat (and the 
> TotalOrderPartitioner) to create HFiles which are subsequently added to the 
> HBase table.  On at least two separate occasions (that we know of), a range 
> of output would be missing for a given day.  The range of keys for the 
> missing values corresponded to those of a particular region.  This implied 
> that a complete HFile somehow went missing from the job.  Further 
> investigation revealed the following:
>  * Two different reducers (running in separate JVMs and thus separate class 
> loaders)
>  * in the same server can end up using the same file names for their
>  * HFiles.  The scenario is as follows:
>  *1.  Both reducers start near the same time.
>  *2.  The first reducer reaches the point where it wants to write its 
> first file.
>  *3.  It uses the StoreFile class which contains a static Random 
> object 
>  *which is initialized by default using a timestamp.
>  *4.  The file name is generated using the random number generator.
>  *5.  The file name is checked against other existing files.
>  *6.  The file is written into temporary files in a directory named
>  *after the reducer attempt.
>  *7.  The second reduce task reaches the same point, but its 
> StoreClass
>  *(which is now in the file system's cache) gets loaded within the
>  *time resolution of the OS and thus initializes its Random()
>  *object with the same seed as the first task.
>  *8.  The second task also checks for an existing file with the name
>  *generated by the random number generator and finds no conflict
>  *because each task is writing files in its own temporary folder.
>  *9.  The first task finishes and gets its temporary files committed
>  *to the "real" folder specified for output of the HFiles.
>  * 10.The second task then reaches its own conclusion and commits its
>  *files (moveTaskOutputs).  The released Hadoop code just 
> overwrites
>  *any files with the same name.  No warning messages or anything.
>  *The first task's HFiles just go missing.
>  * 
>  *  Note:  The reducers here are NOT different attempts at the same 
>  *reduce task.  They are different reduce tasks so data is
>  *really lost.
> I am currently testing a fix in which I have added code to the Hadoop 
> FileOutputCommitter.moveTaskOutputs method to check for a conflict with
> an existing file in the final output folder and to rename the HFile if
> needed.  This may not be appropriate for all uses of FileOutputFormat.
> So I have put this into a new class which is then used by a subclass of
> HFileOutputFormat.  Subclassing of FileOutputCommitter itself was a bit 
> more of a problem due to private declarations.
> I don't know if my approach is the best fix for the problem.  If someone
> more knowledgeable than myself deems that it is, I will be happy to share
> what I have done and by that time I may have some information on the
> results.
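The time-seed collision in steps 3 and 7 above can be reproduced with plain 
java.util.Random (randomName is a simplified stand-in for the real file-name 
generation, not HBase code):

```java
import java.util.Random;

public class SeedCollision {
    /** Name generation roughly as a time-seeded generator would do it: a
     *  random long rendered in base 36 (simplified stand-in; the real code
     *  also checks the name against existing files). */
    static String randomName(Random rng) {
        return Long.toString(Math.abs(rng.nextLong()), 36);
    }

    public static void main(String[] args) {
        // Two JVMs whose StoreFile class is loaded within the same clock tick
        // construct time-seeded Randoms with identical seeds -- and therefore
        // generate identical "random" file names.
        long sameTick = 1327340000000L;   // hypothetical shared timestamp
        Random task1 = new Random(sameTick);
        Random task2 = new Random(sameTick);
        System.out.println(randomName(task1).equals(randomName(task2))); // true
    }
}
```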





[jira] [Updated] (HBASE-5255) Use singletons for OperationStatus to save memory

2012-01-23 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5255:
--

Attachment: 5255-92.txt

Patch for 0.92 branch

> Use singletons for OperationStatus to save memory
> -
>
> Key: HBASE-5255
> URL: https://issues.apache.org/jira/browse/HBASE-5255
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.92.0, 0.90.5
>Reporter: Benoit Sigoure
>Assignee: Benoit Sigoure
>Priority: Minor
>  Labels: performance
> Fix For: 0.94.0, 0.92.1
>
> Attachments: 5255-92.txt, 5255-v2.txt, 
> HBASE-5255-0.92-Use-singletons-to-remove-unnecessary-memory-allocati.patch, 
> HBASE-5255-trunk-Use-singletons-to-remove-unnecessary-memory-allocati.patch
>
>





[jira] [Commented] (HBASE-5257) Allow filter to be evaluated after version handling

2012-01-23 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191313#comment-13191313
 ] 

Lars Hofhansl commented on HBASE-5257:
--

@Ted: Linked the issues instead.

As for this issue... For maximum flexibility, and to avoid introducing wire 
incompatibility, I propose a small code change in ScanQueryMatcher and a new 
VersionFilterWrapper that takes two Filters (both, of course, can be 
FilterLists): the first is evaluated before the column trackers, the second is 
run after the column trackers.
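A rough sketch of the proposed wrapper shape, using a simplified stand-in for 
the real Filter interface (SimpleFilter and the string-valued KVs are 
illustrative assumptions, not the HBase API):

```java
public class VersionFilterWrapper {
    /** Simplified stand-in for HBase's Filter: true means keep the KeyValue.
     *  (Illustrative only -- the real Filter interface is much richer.) */
    interface SimpleFilter {
        boolean accept(String kv);
    }

    private final SimpleFilter preVersionFilter;   // runs before the column trackers
    private final SimpleFilter postVersionFilter;  // runs after version handling

    VersionFilterWrapper(SimpleFilter pre, SimpleFilter post) {
        this.preVersionFilter = pre;
        this.postVersionFilter = post;
    }

    /** Called where ScanQueryMatcher checks filters today. */
    boolean acceptBeforeVersions(String kv) {
        return preVersionFilter.accept(kv);
    }

    /** Called only for KVs that survived the version/column trackers. */
    boolean acceptAfterVersions(String kv) {
        return postVersionFilter.accept(kv);
    }

    public static void main(String[] args) {
        VersionFilterWrapper w = new VersionFilterWrapper(
                kv -> kv.startsWith("row1"),  // cheap pre-version check
                kv -> kv.endsWith("v1"));     // applied after version handling
        System.out.println(w.acceptBeforeVersions("row1/cf:q/v2")); // true
        System.out.println(w.acceptAfterVersions("row1/cf:q/v2"));  // false
    }
}
```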


> Allow filter to be evaluated after version handling
> ---
>
> Key: HBASE-5257
> URL: https://issues.apache.org/jira/browse/HBASE-5257
> Project: HBase
>  Issue Type: Improvement
>Reporter: Lars Hofhansl
>
> There are various usecases and filter types where evaluating the filter 
> before version are handled either do not make sense, or make filter handling 
> more complicated.
> Also see this comment in ScanQueryMatcher:
> {code}
> /**
>  * Filters should be checked before checking column trackers. If we do
>  * otherwise, as was previously being done, ColumnTracker may increment 
> its
>  * counter for even that KV which may be discarded later on by Filter. 
> This
>  * would lead to incorrect results in certain cases.
>  */
> {code}
> So we had Filters after the column trackers (which do the version checking), 
> and then moved it.
> Should be at the discretion of the Filter.
> Could either add a new method to FilterBase (maybe excludeVersions() or 
> something). Or have a new Filter wrapper (like WhileMatchFilter), that should 
> only be used as outmost filter and indicates the same (maybe 
> ExcludeVersionsFilter).
> See latest comments on HBASE-5229 for motivation.





[jira] [Commented] (HBASE-5210) HFiles are missing from an incremental load

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191314#comment-13191314
 ] 

Zhihong Yu commented on HBASE-5210:
---

I prefer Lawrence's approach.
The only consideration is that it would take relatively long for the proposed 
change in FileOutputCommitter.moveTaskOutputs() to be published, reviewed and 
pushed upstream.

> HFiles are missing from an incremental load
> ---
>
> Key: HBASE-5210
> URL: https://issues.apache.org/jira/browse/HBASE-5210
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.2
> Environment: HBase 0.90.2 with Hadoop-0.20.2 (with durable sync).  
> RHEL 2.6.18-164.15.1.el5.  4 node cluster (1 master, 3 slaves)
>Reporter: Lawrence Simpson
> Attachments: HBASE-5210-crazy-new-getRandomFilename.patch
>
>





[jira] [Updated] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5230:
---

Attachment: D1353.3.patch

mbautin updated the revision "[jira] [HBASE-5230] Extend TestCacheOnWrite to 
ensure we don't cache data blocks on compaction".
Reviewers: nspiegelberg, tedyu, Liyin, stack, JIRA

  Rebasing on trunk changes.

REVISION DETAIL
  https://reviews.facebook.net/D1353

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java


> Unit test to ensure compactions don't cache data on write
> -
>
> Key: HBASE-5230
> URL: https://issues.apache.org/jira/browse/HBASE-5230
> Project: HBase
>  Issue Type: Test
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Minor
> Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch
>
>
> Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
> write during compactions even if cache-on-write is generally enabled). This is 
> because we have very different implementations of HBASE-3976 without 
> HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) and with CacheConfig 
> (presumably it's there, but not sure if it even works, since the patch in 
> HBASE-3976 may not have been committed). We need to create a unit test to 
> verify that we don't cache data blocks on write during compactions, and 
> resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-4920) We need a mascot, a totem

2012-01-23 Thread Enis Soztutar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191317#comment-13191317
 ] 

Enis Soztutar commented on HBASE-4920:
--

Orca +1.  Hadoop -> Elephant, HBase -> Orca makes sense in my view.  I liked 
design option 2 as well; can your friend put together the logo with the HBase 
text, to see how they look together?

> We need a mascot, a totem
> -
>
> Key: HBASE-4920
> URL: https://issues.apache.org/jira/browse/HBASE-4920
> Project: HBase
>  Issue Type: Task
>Reporter: stack
> Attachments: HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 
> 2011-11-30 at 4.06.17 PM.png, apache hbase orca logo_Proof 3.pdf, photo 
> (2).JPG
>
>





[jira] [Updated] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-5230:
--

Attachment: Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch

Attaching the most recent patch (rebased on trunk changes -- maybe even 
identical).

> Unit test to ensure compactions don't cache data on write
> -
>
> Key: HBASE-5230
> URL: https://issues.apache.org/jira/browse/HBASE-5230
> Project: HBase
>  Issue Type: Test
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Minor
> Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch
>
>





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191320#comment-13191320
 ] 

Mikhail Bautin commented on HBASE-5230:
---

@Ted: the unit test failure at 
https://builds.apache.org/job/PreCommit-HBASE-Build/824//testReport/org.apache.hadoop.hbase.regionserver/TestAtomicOperation/testRowMutationMultiThreads/
 seems unrelated. Is this patch OK to be committed? (We can wait for another 
run of unit tests if necessary, I've just re-uploaded the patch.)

> Unit test to ensure compactions don't cache data on write
> -
>
> Key: HBASE-5230
> URL: https://issues.apache.org/jira/browse/HBASE-5230
> Project: HBase
>  Issue Type: Test
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Minor
> Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch
>
>





[jira] [Commented] (HBASE-5240) HBase internalscanner.next javadoc doesn't imply whether or not results are appended or not

2012-01-23 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191328#comment-13191328
 ] 

jirapos...@reviews.apache.org commented on HBASE-5240:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3594/
---

Review request for hbase.


Summary
---

Just looking at 
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/InternalScanner.html.
We don't know whether the results are appended to the results list, or if we 
always clear it first.

boolean next(List<KeyValue> results)
Grab the next row's worth of values.
boolean next(List<KeyValue> result, int limit)
Grab the next row's worth of values with a limit on the number of values to 
return.

Method Detail
next

boolean next(List<KeyValue> results)
             throws IOException

Grab the next row's worth of values.

Parameters:
results - return output array 
Returns:
true if more rows exist after this one, false if scanner is done 
Throws:
IOException - e

next

boolean next(List<KeyValue> result,
             int limit)
             throws IOException

Grab the next row's worth of values with a limit on the number of values to 
return.

Parameters:
result - return output array
limit - limit on row count to get 
Returns:
true if more rows exist after this one, false if scanner is done 
Throws:
IOException - e


This addresses bug HBASE-5240.
https://issues.apache.org/jira/browse/HBASE-5240


Diffs
-

  src/main/java/org/apache/hadoop/hbase/regionserver/InternalScanner.java 
0f5f36c 

Diff: https://reviews.apache.org/r/3594/diff


Testing
---


Thanks,

Alex
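Until the javadoc states the append-vs-clear contract, a defensive caller can 
make the ambiguity moot by clearing the list per row. A sketch with a stand-in 
for InternalScanner (RowSource is a hypothetical interface, not the HBase one):

```java
import java.util.ArrayList;
import java.util.List;

public class ScannerLoop {
    /** Stand-in for InternalScanner.next(List): appends this row's values and
     *  returns whether more rows remain (assumed append semantics -- exactly
     *  what the javadoc fails to state). */
    interface RowSource {
        boolean next(List<String> results);
    }

    /** Defensive caller: clears the buffer per row so it never matters
     *  whether next() appends to the list or replaces its contents. */
    static List<List<String>> drain(RowSource scanner) {
        List<List<String>> rows = new ArrayList<>();
        List<String> buf = new ArrayList<>();
        boolean more = true;
        while (more) {
            buf.clear();                       // safe under either contract
            more = scanner.next(buf);
            if (!buf.isEmpty()) rows.add(new ArrayList<>(buf));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> data = new ArrayList<>(List.of("a", "b", "c"));
        RowSource src = results -> {
            if (!data.isEmpty()) results.add(data.remove(0));
            return !data.isEmpty();
        };
        System.out.println(drain(src)); // [[a], [b], [c]]
    }
}
```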



> HBase internalscanner.next javadoc doesn't imply whether or not results are 
> appended or not
> ---
>
> Key: HBASE-5240
> URL: https://issues.apache.org/jira/browse/HBASE-5240
> Project: HBase
>  Issue Type: Bug
>Reporter: Alex Newman
>Assignee: Alex Newman
> Attachments: 
> 0001-HBASE-5240.-HBase-internalscanner.next-javadoc-doesn.patch
>
>





[jira] [Commented] (HBASE-5210) HFiles are missing from an incremental load

2012-01-23 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191327#comment-13191327
 ] 

Todd Lipcon commented on HBASE-5210:


Why not change the output file name to be based on the task attempt ID? There 
is already a unique id for each task available...
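A sketch of that idea (hfileName and the id parsing are illustrative 
assumptions, not HBase code): derive the file name from the task attempt id, 
which the framework already guarantees to be unique, so two reducers can never 
collide:

```java
public class AttemptNames {
    /** Hypothetical helper: derive a collision-free HFile name from the task
     *  attempt id instead of a time-seeded Random. */
    static String hfileName(String taskAttemptId, int fileIndex) {
        // e.g. attempt_201201230001_0042_r_000007_0 -> 000007_0_1
        String[] parts = taskAttemptId.split("_");
        String task = parts[parts.length - 2];     // task number within the job
        String attempt = parts[parts.length - 1];  // attempt number
        return task + "_" + attempt + "_" + fileIndex;
    }

    public static void main(String[] args) {
        System.out.println(hfileName("attempt_201201230001_0042_r_000007_0", 1));
        // -> 000007_0_1
    }
}
```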

> HFiles are missing from an incremental load
> ---
>
> Key: HBASE-5210
> URL: https://issues.apache.org/jira/browse/HBASE-5210
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.2
> Environment: HBase 0.90.2 with Hadoop-0.20.2 (with durable sync).  
> RHEL 2.6.18-164.15.1.el5.  4 node cluster (1 master, 3 slaves)
>Reporter: Lawrence Simpson
> Attachments: HBASE-5210-crazy-new-getRandomFilename.patch
>
>





[jira] [Created] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-23 Thread Liyin Tang (Created) (JIRA)
Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
---

 Key: HBASE-5259
 URL: https://issues.apache.org/jira/browse/HBASE-5259
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang


Assuming HBase and MapReduce run in the same cluster, TableInputFormat 
overrides the split function, which divides all the regions from one particular 
table into a series of mapper tasks, so each mapper task can process a region 
or one part of a region. Ideally, a mapper task should run on the same machine 
on which the region server hosts the corresponding region. That is the 
motivation for TableInputFormat setting the RegionLocation: so the MapReduce 
framework can respect node locality.

The code simply sets the host name of the region server as the HRegionLocation. 
However, the host name of the region server may have a different format from 
the host name of the task tracker (mapper task). The task tracker always gets 
its hostname by reverse DNS lookup, and the DNS service may return a different 
host name format. For example, the host name of the region server may be 
correctly set as a.b.c.d while the reverse DNS lookup returns a.b.c.d. (with an 
additional dot at the end).

So the solution is to set the RegionLocation by reverse DNS lookup as well. No 
matter what host name format the DNS system uses, TableInputFormat has the 
responsibility to keep the host name format consistent with the MapReduce 
framework.
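A sketch of that normalization (normalizeHost and stripTrailingDot are made-up 
helpers illustrating the idea, not the committed patch): resolve the region 
server host through the same reverse DNS path the task tracker uses, then trim 
the trailing dot some resolvers append:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class SplitLocation {
    /** Strip the trailing dot some DNS services append to a fully
     *  qualified name (a.b.c.d. -> a.b.c.d). */
    static String stripTrailingDot(String name) {
        return name.endsWith(".") ? name.substring(0, name.length() - 1) : name;
    }

    /** Run the region server host through a reverse DNS lookup so the split
     *  location matches whatever format the task tracker reports. */
    static String normalizeHost(String regionServerHost) {
        try {
            InetAddress addr = InetAddress.getByName(regionServerHost);
            return stripTrailingDot(addr.getCanonicalHostName());
        } catch (UnknownHostException e) {
            return regionServerHost; // unresolved: fall back to the raw name
        }
    }

    public static void main(String[] args) {
        System.out.println(stripTrailingDot("a.b.c.d."));  // a.b.c.d
        System.out.println(normalizeHost("localhost"));
    }
}
```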











[jira] [Commented] (HBASE-4920) We need a mascot, a totem

2012-01-23 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191339#comment-13191339
 ] 

Jonathan Hsieh commented on HBASE-4920:
---

I feel the "cyber" look and the hard edges of the wordmark don't quite fit 
with the roundness of the image, but I like the general idea (maybe a "sharper" 
style for the same idea).

@Stack http://en.wikipedia.org/wiki/Vancouver_Canucks

@Lars Actual pacific northwest native american totem poles with octopus/squid
http://users.imag.net/~sry.jkramer/nativetotems/common.htm
http://www.flickr.com/photos/lostviking/3419653151/

Giant squids are pretty close to the International Orange (Engineering) color.
http://blogs.sfweekly.com/thesnitch/2008/01/breaking_giant_cartoon_squid_a.php

> We need a mascot, a totem
> -
>
> Key: HBASE-4920
> URL: https://issues.apache.org/jira/browse/HBASE-4920
> Project: HBase
>  Issue Type: Task
>Reporter: stack
> Attachments: HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 
> 2011-11-30 at 4.06.17 PM.png, apache hbase orca logo_Proof 3.pdf, photo 
> (2).JPG
>
>
> We need a totem for our t-shirt that is yet to be printed.  O'Reilly owns the 
> Clydesdale.  We need something else.
> We could have a fluffy little duck that quacks 'hbase!' when you squeeze it 
> and we could order boxes of them from some off-shore sweatshop that 
> subcontracts to a contractor who employs child labor only.
> Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from 
> Salesforce showed me, that was a bit too spiritual for me to be seen quoting 
> here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in 
> translation, bigdata).





[jira] [Issue Comment Edited] (HBASE-4920) We need a mascot, a totem

2012-01-23 Thread Jonathan Hsieh (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191339#comment-13191339
 ] 

Jonathan Hsieh edited comment on HBASE-4920 at 1/23/12 6:42 PM:


I feel the "cyber" look and the hard edges of the wordmark don't quite fit 
with the roundness of the image, but I like the general idea (maybe a "sharper" 
style for the same idea).

@Stack http://en.wikipedia.org/wiki/Vancouver_Canucks

@Lars Actual Pacific Northwest Native American totem poles with octopus/squid
http://users.imag.net/~sry.jkramer/nativetotems/common.htm
http://www.flickr.com/photos/lostviking/3419653151/

Giant squids are pretty close to the International Orange (Engineering) color.
http://blogs.sfweekly.com/thesnitch/2008/01/breaking_giant_cartoon_squid_a.php

  was (Author: jmhsieh):
I feel the "cyber" look and the hard edges of the wordmark doesn't quite 
fit with the roundness of the image but like the general idea (maybe a "shaper" 
style for the same idea).

@Stack http://en.wikipedia.org/wiki/Vancouver_Canucks

@Lars Actual pacific northwest native american totem poles with octopus/squid
http://users.imag.net/~sry.jkramer/nativetotems/common.htm
http://www.flickr.com/photos/lostviking/3419653151/

Giant squids are pretty close to the International Orange (Engineering) color.
http://blogs.sfweekly.com/thesnitch/2008/01/breaking_giant_cartoon_squid_a.php
  
> We need a mascot, a totem
> -
>
> Key: HBASE-4920
> URL: https://issues.apache.org/jira/browse/HBASE-4920
> Project: HBase
>  Issue Type: Task
>Reporter: stack
> Attachments: HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 
> 2011-11-30 at 4.06.17 PM.png, apache hbase orca logo_Proof 3.pdf, photo 
> (2).JPG
>
>
> We need a totem for our t-shirt that is yet to be printed.  O'Reilly owns the 
> Clydesdale.  We need something else.
> We could have a fluffy little duck that quacks 'hbase!' when you squeeze it 
> and we could order boxes of them from some off-shore sweatshop that 
> subcontracts to a contractor who employs child labor only.
> Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from 
> Salesforce showed me, that was a bit too spiritual for me to be seen quoting 
> here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in 
> translation, bigdata).





[jira] [Commented] (HBASE-5240) HBase internalscanner.next javadoc doesn't imply whether or not results are appended or not

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191341#comment-13191341
 ] 

Hadoop QA commented on HBASE-5240:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12511522/0001-HBASE-5240.-HBase-internalscanner.next-javadoc-doesn.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 156 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestAtomicOperation
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/838//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/838//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/838//console

This message is automatically generated.

> HBase internalscanner.next javadoc doesn't imply whether or not results are 
> appended or not
> ---
>
> Key: HBASE-5240
> URL: https://issues.apache.org/jira/browse/HBASE-5240
> Project: HBase
>  Issue Type: Bug
>Reporter: Alex Newman
>Assignee: Alex Newman
> Attachments: 
> 0001-HBASE-5240.-HBase-internalscanner.next-javadoc-doesn.patch
>
>
> Just looking at 
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/InternalScanner.html.
>  We don't know whether or not the results are appended to results list, or if 
> we always clear it first.
> boolean   next(List<KeyValue> results)
>   Grab the next row's worth of values.
>  boolean  next(List<KeyValue> result, int limit)
>   Grab the next row's worth of values with a limit on the number of 
> values to return.
>  
> Method Detail
> next
> boolean next(List<KeyValue> results)
>  throws IOException
> Grab the next row's worth of values.
> Parameters:
> results - return output array 
> Returns:
> true if more rows exist after this one, false if scanner is done 
> Throws:
> IOException - e
> next
> boolean next(List<KeyValue> result,
>  int limit)
>  throws IOException
> Grab the next row's worth of values with a limit on the number of values 
> to return.
> Parameters:
> result - return output array
> limit - limit on row count to get 
> Returns:
> true if more rows exist after this one, false if scanner is done 
> Throws:
> IOException - e
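Until the javadoc pins the contract down, a defensive caller can clear the list itself between calls. The sketch below uses a stand-in interface rather than the real InternalScanner, to show why the ambiguity matters when an implementation appends:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a scanner whose next() appends to the results list without
// clearing it. A caller that does not clear between calls would see
// duplicated rows, which is exactly the ambiguity the javadoc leaves open.
public class ScannerContractDemo {

    interface Scanner { boolean next(List<String> results); }

    static class FakeScanner implements Scanner {
        private int row = 0;
        public boolean next(List<String> results) {
            results.add("row-" + row);   // appends, never clears
            return ++row < 3;            // three rows total
        }
    }

    // Defensive caller: clears the buffer before every next() call.
    public static List<String> drain(Scanner s) {
        List<String> all = new ArrayList<>();
        List<String> buf = new ArrayList<>();
        boolean more = true;
        while (more) {
            buf.clear();                 // safe under either contract
            more = s.next(buf);
            all.addAll(buf);
        }
        return all;
    }

    public static void main(String[] args) {
        System.out.println(drain(new FakeScanner())); // [row-0, row-1, row-2]
    }
}
```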





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191345#comment-13191345
 ] 

Zhihong Yu commented on HBASE-5230:
---

TestAtomicOperation passed locally.
The patch should be good to go.

> Unit test to ensure compactions don't cache data on write
> -
>
> Key: HBASE-5230
> URL: https://issues.apache.org/jira/browse/HBASE-5230
> Project: HBase
>  Issue Type: Test
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Minor
> Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch
>
>
> Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
> write during compactions even if cache-on-write is generally 
> enabled). This is because we have very different implementations of 
> HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
> and with CacheConfig (presumably it's there but not sure if it even works, 
> since the patch in HBASE-3976 may not have been committed). We need to create 
> a unit test to verify that we don't cache data blocks on write during 
> compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Updated] (HBASE-5222) Stopping replication via the "stop_replication" command in hbase shell on a slave cluster isn't acknowledged in the replication sink

2012-01-23 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5222:
--

Component/s: shell
 replication

> Stopping replication via the "stop_replication" command in hbase shell on a 
> slave cluster isn't acknowledged in the replication sink
> 
>
> Key: HBASE-5222
> URL: https://issues.apache.org/jira/browse/HBASE-5222
> Project: HBase
>  Issue Type: Bug
>  Components: replication, shell
>Affects Versions: 0.90.4
>Reporter: Josh Wymer
>
> After running "stop_replication" in the hbase shell on our slave cluster we 
> saw replication continue for weeks. Turns out that the replication sink is 
> missing a check to get the replication state and therefore continued to write.





[jira] [Commented] (HBASE-5210) HFiles are missing from an incremental load

2012-01-23 Thread Jimmy Xiang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191351#comment-13191351
 ] 

Jimmy Xiang commented on HBASE-5210:


I like this one.  It's really simple and clean.

> HFiles are missing from an incremental load
> ---
>
> Key: HBASE-5210
> URL: https://issues.apache.org/jira/browse/HBASE-5210
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.2
> Environment: HBase 0.90.2 with Hadoop-0.20.2 (with durable sync).  
> RHEL 2.6.18-164.15.1.el5.  4 node cluster (1 master, 3 slaves)
>Reporter: Lawrence Simpson
> Attachments: HBASE-5210-crazy-new-getRandomFilename.patch
>
>
> We run an overnight map/reduce job that loads data from an external source 
> and adds that data to an existing HBase table.  The input files have been 
> loaded into hdfs.  The map/reduce job uses the HFileOutputFormat (and the 
> TotalOrderPartitioner) to create HFiles which are subsequently added to the 
> HBase table.  On at least two separate occasions (that we know of), a range 
> of output would be missing for a given day.  The range of keys for the 
> missing values corresponded to those of a particular region.  This implied 
> that a complete HFile somehow went missing from the job.  Further 
> investigation revealed the following:
>  * Two different reducers (running in separate JVMs and thus separate class 
> loaders)
>  * in the same server can end up using the same file names for their
>  * HFiles.  The scenario is as follows:
>  *1.  Both reducers start near the same time.
>  *2.  The first reducer reaches the point where it wants to write its 
> first file.
>  *3.  It uses the StoreFile class which contains a static Random 
> object 
>  *which is initialized by default using a timestamp.
>  *4.  The file name is generated using the random number generator.
>  *5.  The file name is checked against other existing files.
>  *6.  The file is written into temporary files in a directory named
>  *after the reducer attempt.
>  *7.  The second reduce task reaches the same point, but its 
> StoreClass
>  *(which is now in the file system's cache) gets loaded within the
>  *time resolution of the OS and thus initializes its Random()
>  *object with the same seed as the first task.
>  *8.  The second task also checks for an existing file with the name
>  *generated by the random number generator and finds no conflict
>  *because each task is writing files in its own temporary folder.
>  *9.  The first task finishes and gets its temporary files committed
>  *to the "real" folder specified for output of the HFiles.
>  * 10.The second task then reaches its own conclusion and commits its
>  *files (moveTaskOutputs).  The released Hadoop code just 
> overwrites
>  *any files with the same name.  No warning messages or anything.
>  *The first task's HFiles just go missing.
>  * 
>  *  Note:  The reducers here are NOT different attempts at the same 
>  *reduce task.  They are different reduce tasks so data is
>  *really lost.
> I am currently testing a fix in which I have added code to the Hadoop 
> FileOutputCommitter.moveTaskOutputs method to check for a conflict with
> an existing file in the final output folder and to rename the HFile if
> needed.  This may not be appropriate for all uses of FileOutputFormat.
> So I have put this into a new class which is then used by a subclass of
> HFileOutputFormat.  Subclassing of FileOutputCommitter itself was a bit 
> more of a problem due to private declarations.
> I don't know if my approach is the best fix for the problem.  If someone
> more knowledgeable than myself deems that it is, I will be happy to share
> what I have done and by that time I may have some information on the
> results.
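The collision scenario quoted above comes down to two timestamp-seeded Random objects producing the same sequence. A sketch of the fix direction, assuming the task attempt id is available at naming time (the method name and parameter are illustrative, not the actual HBASE-5210 patch):

```java
import java.security.SecureRandom;

// Sketch: make generated HFile names unique per reduce task instead of
// relying on a timestamp-seeded static Random, which two JVMs starting
// within the OS clock resolution can end up sharing.
public class UniqueHFileName {

    private static final SecureRandom RNG = new SecureRandom();

    // Prefixing with the task attempt id guarantees two different reduce
    // tasks can never produce the same name, whatever their RNG state.
    public static String getUniqueFilename(String taskAttemptId) {
        return taskAttemptId + "-" + Long.toHexString(RNG.nextLong());
    }

    public static void main(String[] args) {
        System.out.println(getUniqueFilename("attempt_201201230001_r_000001"));
    }
}
```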





[jira] [Created] (HBASE-5260) [book] troubleshooting.xml - Troubleshooting/Network/Loopback IP using incorrect XML element to config entry

2012-01-23 Thread Doug Meil (Created) (JIRA)
[book] troubleshooting.xml - Troubleshooting/Network/Loopback IP using 
incorrect XML element to config entry


 Key: HBASE-5260
 URL: https://issues.apache.org/jira/browse/HBASE-5260
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Trivial


troubleshooting.xml
* the Troubleshooting/Network/Loopback IP entry is using the incorrect XML 
element to link to the Config section.  It's using "link" instead of an "xref", 
so the description is "???"   Oddly enough, though, the link actually works.
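For reference, a minimal sketch of the two DocBook forms (the `configuration` id is illustrative; the book's actual section id may differ):

```xml
<!-- Renders "???" in some stylesheets: <link> supplies no generated text -->
<link linkend="configuration"/>

<!-- <xref> generates its link text from the target section's title -->
<xref linkend="configuration"/>
```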





[jira] [Updated] (HBASE-5260) [book] troubleshooting.xml - Troubleshooting/Network/Loopback IP using incorrect XML element to config entry

2012-01-23 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5260:
-

Status: Patch Available  (was: Open)

> [book] troubleshooting.xml - Troubleshooting/Network/Loopback IP using 
> incorrect XML element to config entry
> 
>
> Key: HBASE-5260
> URL: https://issues.apache.org/jira/browse/HBASE-5260
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Trivial
> Attachments: troubleshooting_hbase_5260.xml.patch
>
>
> troubleshooting.xml
> * the Troubleshooting/Network/Loopback IP entry is using the incorrect XML 
> element to link to the Config section.  It's using "link" instead of an 
> "xref", so the description is "???"   Oddly enough, though, the link actually 
> works.





[jira] [Updated] (HBASE-5260) [book] troubleshooting.xml - Troubleshooting/Network/Loopback IP using incorrect XML element to config entry

2012-01-23 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5260:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> [book] troubleshooting.xml - Troubleshooting/Network/Loopback IP using 
> incorrect XML element to config entry
> 
>
> Key: HBASE-5260
> URL: https://issues.apache.org/jira/browse/HBASE-5260
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Trivial
> Attachments: troubleshooting_hbase_5260.xml.patch
>
>
> troubleshooting.xml
> * the Troubleshooting/Network/Loopback IP entry is using the incorrect XML 
> element to link to the Config section.  It's using "link" instead of an 
> "xref", so the description is "???"   Oddly enough, though, the link actually 
> works.





[jira] [Updated] (HBASE-5260) [book] troubleshooting.xml - Troubleshooting/Network/Loopback IP using incorrect XML element to config entry

2012-01-23 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5260:
-

Attachment: troubleshooting_hbase_5260.xml.patch

> [book] troubleshooting.xml - Troubleshooting/Network/Loopback IP using 
> incorrect XML element to config entry
> 
>
> Key: HBASE-5260
> URL: https://issues.apache.org/jira/browse/HBASE-5260
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Trivial
> Attachments: troubleshooting_hbase_5260.xml.patch
>
>
> troubleshooting.xml
> * the Troubleshooting/Network/Loopback IP entry is using the incorrect XML 
> element to link to the Config section.  It's using "link" instead of an 
> "xref", so the description is "???"   Oddly enough, though, the link actually 
> works.





[jira] [Commented] (HBASE-5255) Use singletons for OperationStatus to save memory

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191356#comment-13191356
 ] 

Hadoop QA commented on HBASE-5255:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12511526/5255-92.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 84 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/839//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/839//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/839//console

This message is automatically generated.

> Use singletons for OperationStatus to save memory
> -
>
> Key: HBASE-5255
> URL: https://issues.apache.org/jira/browse/HBASE-5255
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.90.5, 0.92.0
>Reporter: Benoit Sigoure
>Assignee: Benoit Sigoure
>Priority: Minor
>  Labels: performance
> Fix For: 0.94.0, 0.92.1
>
> Attachments: 5255-92.txt, 5255-v2.txt, 
> HBASE-5255-0.92-Use-singletons-to-remove-unnecessary-memory-allocati.patch, 
> HBASE-5255-trunk-Use-singletons-to-remove-unnecessary-memory-allocati.patch
>
>
> Every single {{Put}} causes the allocation of at least one 
> {{OperationStatus}}, yet {{OperationStatus}} is almost always stateless, so 
> these allocations are unnecessary and could be avoided.  Attached patch adds 
> a few singletons and uses them, with no public API change.  I didn't test the 
> patches, but you get the idea.
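The singleton idea reads roughly like the standalone sketch below; the names mirror, but are not, the actual HBase class:

```java
// Sketch of the singleton approach: stateless operation results are
// shared constants, so a batch of Puts allocates no status objects.
public class OperationStatusDemo {

    enum Code { SUCCESS, FAILURE }

    static final class OperationStatus {
        // One shared instance per stateless outcome.
        static final OperationStatus SUCCESS = new OperationStatus(Code.SUCCESS);
        static final OperationStatus FAILURE = new OperationStatus(Code.FAILURE);

        private final Code code;
        private OperationStatus(Code code) { this.code = code; }
        Code getCode() { return code; }
    }

    public static void main(String[] args) {
        // Two "puts" report success through the very same object.
        OperationStatus a = OperationStatus.SUCCESS;
        OperationStatus b = OperationStatus.SUCCESS;
        System.out.println(a == b);  // true
    }
}
```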





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191357#comment-13191357
 ] 

Hadoop QA commented on HBASE-5230:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12511528/Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 8 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 84 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.coprocessor.TestClassLoading
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/840//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/840//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/840//console

This message is automatically generated.

> Unit test to ensure compactions don't cache data on write
> -
>
> Key: HBASE-5230
> URL: https://issues.apache.org/jira/browse/HBASE-5230
> Project: HBase
>  Issue Type: Test
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Minor
> Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch
>
>
> Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
> write during compactions even if cache-on-write is generally 
> enabled). This is because we have very different implementations of 
> HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
> and with CacheConfig (presumably it's there but not sure if it even works, 
> since the patch in HBASE-3976 may not have been committed). We need to create 
> a unit test to verify that we don't cache data blocks on write during 
> compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Updated] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-23 Thread Mubarak Seyed (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mubarak Seyed updated HBASE-4720:
-

Attachment: HBASE-4720.trunk.v6.patch

The attached file (HBASE-4720.trunk.v6.patch) is the updated patch. Thanks.

> Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
> client/server 
> 
>
> Key: HBASE-4720
> URL: https://issues.apache.org/jira/browse/HBASE-4720
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daniel Lord
>Assignee: Mubarak Seyed
> Fix For: 0.94.0
>
> Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
> HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
> HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
> HBASE-4720.v3.patch
>
>
> I have several large application/HBase clusters where an application node 
> will occasionally need to talk to HBase from a different cluster.  In order 
> to help ensure some of my consistency guarantees I have a sentinel table that 
> is updated atomically as users interact with the system.  This works quite 
> well for the "regular" hbase client but the REST client does not implement 
> the checkAndPut and checkAndDelete operations.  This exposes the application 
> to some race conditions that have to be worked around.  It would be ideal if 
> the same checkAndPut/checkAndDelete operations could be supported by the REST 
> client.





[jira] [Commented] (HBASE-4397) -ROOT-, .META. tables stay offline for too long in recovery phase after all RSs are shutdown at the same time

2012-01-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191363#comment-13191363
 ] 

Hudson commented on HBASE-4397:
---

Integrated in HBase-0.92 #257 (See 
[https://builds.apache.org/job/HBase-0.92/257/])
HBASE-5237 Addendum for HBASE-5160 and HBASE-4397(Ram)

ramkrishna : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


> -ROOT-, .META. tables stay offline for too long in recovery phase after all 
> RSs are shutdown at the same time
> -
>
> Key: HBASE-4397
> URL: https://issues.apache.org/jira/browse/HBASE-4397
> Project: HBase
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
> Fix For: 0.94.0, 0.92.0
>
> Attachments: HBASE-4397-0.92.patch
>
>
> 1. Shutdown all RSs.
> 2. Bring all RS back online.
> The "-ROOT-", ".META." stay in offline state until timeout monitor force 
> assignment 30 minutes later. That is because HMaster can't find a RS to 
> assign the tables to in assign operation.
> 2011-09-13 13:25:52,743 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
> Failed assignment of -ROOT-,,0.70236052 to sea-lab-4,60020,1315870341387, 
> trying to assign elsewhere instead; retry=0
> java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:373)
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:345)
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1002)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:854)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:148)
> at $Proxy9.openRegion(Unknown Source)
> at 
> org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:407)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1408)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1153)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1128)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1123)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManager.java:1788)
> at 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.verifyAndAssignRoot(ServerShutdownHandler.java:100)
> at 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.verifyAndAssignRootWithRetries(ServerShutdownHandler.java:118)
> at 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:181)
> at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:167)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2011-09-13 13:25:52,743 WARN 
> org.apache.hadoop.hbase.master.AssignmentManager: Unable to find a viable 
> location to assign region -ROOT-,,0.70236052
> Possible fixes:
> 1. Have serverManager handle "server online" event similar to how 
> RegionServerTracker.java calls servermanager.expireServer in the case server 
> goes down.
> 2. Make timeoutMonitor handle the situation better. This is a special 
> situation in the cluster. 30 minutes timeout can be skipped.





[jira] [Commented] (HBASE-5243) LogSyncerThread not getting shutdown waiting for the interrupted flag

2012-01-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191360#comment-13191360
 ] 

Hudson commented on HBASE-5243:
---

Integrated in HBase-0.92 #257 (See 
[https://builds.apache.org/job/HBase-0.92/257/])
HBASE-5243 LogSyncerThread not getting shutdown waiting for the interrupted 
flag(Ram).

ramkrishna : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java


> LogSyncerThread not getting shutdown waiting for the interrupted flag
> -
>
> Key: HBASE-5243
> URL: https://issues.apache.org/jira/browse/HBASE-5243
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.6, 0.92.1
>
> Attachments: HBASE-5243_0.90.patch, HBASE-5243_0.90_1.patch, 
> HBASE-5243_trunk.patch
>
>
> In the LogSyncer run() we keep looping until the this.isInterrupted flag is set.
> But in some cases the DFSClient consumes the InterruptedException, so
> we run into an infinite loop in some shutdown cases.
> I would suggest that, since we are the ones who try to close down the
> LogSyncerThread, we can introduce a variable like
> close or shutdown, and based on the state of this flag along with
> isInterrupted() we can make the thread stop.
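A sketch of the suggested direction (illustrative names only, not the actual HBASE-5243 patch): a syncer-style thread checks an explicit volatile shutdown flag in addition to its interrupt status, so a lower layer swallowing the InterruptedException cannot leave it looping forever.

```java
// Sketch only: class and method names are illustrative, not HBase's code.
public class LogSyncerSketch extends Thread {

    // volatile so the flag set by the closing thread is visible to the loop
    private volatile boolean closeRequested = false;

    @Override
    public void run() {
        // Check the explicit flag as well as the interrupt status; relying on
        // isInterrupted() alone fails if a lower layer (e.g. a DFS client)
        // consumes the InterruptedException and clears the status.
        while (!closeRequested && !isInterrupted()) {
            try {
                Thread.sleep(10); // stand-in for the periodic sync work
            } catch (InterruptedException e) {
                // Simulate the lower layer swallowing the interrupt:
                // deliberately do not restore the interrupt status here.
            }
        }
    }

    // Shutdown path: set the flag, interrupt to wake a sleeping thread
    // promptly, then wait for the loop to observe the flag and exit.
    public void requestClose() throws InterruptedException {
        closeRequested = true;
        interrupt();
        join(5000);
    }
}
```

The caller sets the flag before interrupting, so even if the interrupt is consumed downstream the loop still terminates on its next check.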





[jira] [Commented] (HBASE-5160) Backport HBASE-4397 - -ROOT-, .META. tables stay offline for too long in recovery phase after all RSs are shutdown at the same time

2012-01-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191362#comment-13191362
 ] 

Hudson commented on HBASE-5160:
---

Integrated in HBase-0.92 #257 (See 
[https://builds.apache.org/job/HBase-0.92/257/])
HBASE-5237 Addendum for HBASE-5160 and HBASE-4397(Ram)

ramkrishna : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


> Backport HBASE-4397 - -ROOT-, .META. tables stay offline for too long in 
> recovery phase after all RSs are shutdown at the same time
> ---
>
> Key: HBASE-5160
> URL: https://issues.apache.org/jira/browse/HBASE-5160
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
> Fix For: 0.90.6
>
> Attachments: HBASE-5160-AssignmentManager.patch, HBASE-5160_2.patch
>
>
> Backporting to 0.90.6 considering the importance of the issue.





[jira] [Commented] (HBASE-5235) HLogSplitter writer thread's streams not getting closed when any of the writer threads has exceptions.

2012-01-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191359#comment-13191359
 ] 

Hudson commented on HBASE-5235:
---

Integrated in HBase-0.92 #257 (See 
[https://builds.apache.org/job/HBase-0.92/257/])
HBASE-5235 HLogSplitter writer thread's streams not getting closed when any 
of the writer threads has exceptions. (Ram)

ramkrishna : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java


> HLogSplitter writer thread's streams not getting closed when any of the 
> writer threads has exceptions.
> --
>
> Key: HBASE-5235
> URL: https://issues.apache.org/jira/browse/HBASE-5235
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5, 0.92.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.6, 0.92.1
>
> Attachments: HBASE-5235_0.90.patch, HBASE-5235_0.90_1.patch, 
> HBASE-5235_0.90_2.patch, HBASE-5235_trunk.patch
>
>
> Please find the analysis below; correct me if I am wrong.
> {code}
> 2012-01-15 05:14:02,374 FATAL 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: WriterThread-9 Got 
> while writing log entry to log
> java.io.IOException: All datanodes 10.18.40.200:50010 are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3373)
>   at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2811)
>   at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3026)
> {code}
> Here we have an exception in one of the writer threads. If there is an 
> exception, we hold it in an atomic variable:
> {code}
>   private void writerThreadError(Throwable t) {
> thrown.compareAndSet(null, t);
>   }
> {code}
> In the finally block of splitLog we try to close the streams.
> {code}
>   for (WriterThread t: writerThreads) {
> try {
>   t.join();
> } catch (InterruptedException ie) {
>   throw new IOException(ie);
> }
> checkForErrors();
>   }
>   LOG.info("Split writers finished");
>   
>   return closeStreams();
> {code}
> Inside checkForErrors
> {code}
>   private void checkForErrors() throws IOException {
> Throwable thrown = this.thrown.get();
> if (thrown == null) return;
> if (thrown instanceof IOException) {
>   throw (IOException)thrown;
> } else {
>   throw new RuntimeException(thrown);
> }
>   }
> {code}
> So once we throw the exception, the DFSStreamer threads are not getting closed.
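A minimal sketch of the fix direction (class and method names are illustrative, not HBase's actual HLogSplitter code): keep the first-error AtomicReference behavior, but close the output streams in a finally block so the IOException rethrown by checkForErrors() can no longer leak open streams.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Sketch only: illustrative names, not the actual HBASE-5235 patch.
public class SplitSketch {
    private final AtomicReference<Throwable> thrown = new AtomicReference<>();

    // Writer threads report errors here; only the first one is kept.
    void writerThreadError(Throwable t) {
        thrown.compareAndSet(null, t);
    }

    // Rethrows the first recorded writer error, if any.
    void checkForErrors() throws IOException {
        Throwable t = thrown.get();
        if (t == null) return;
        if (t instanceof IOException) throw (IOException) t;
        throw new RuntimeException(t);
    }

    // Returns normally only if no writer failed; either way, the streams
    // are closed in the finally block instead of being leaked.
    void finishSplit(List<Closeable> streams) throws IOException {
        try {
            checkForErrors();
        } finally {
            for (Closeable c : streams) {
                try {
                    c.close();
                } catch (IOException e) {
                    // log and continue closing the remaining streams
                }
            }
        }
    }
}
```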





[jira] [Commented] (HBASE-5237) Addendum for HBASE-5160 and HBASE-4397

2012-01-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191361#comment-13191361
 ] 

Hudson commented on HBASE-5237:
---

Integrated in HBase-0.92 #257 (See 
[https://builds.apache.org/job/HBase-0.92/257/])
HBASE-5237 Addendum for HBASE-5160 and HBASE-4397(Ram)

ramkrishna : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


> Addendum for HBASE-5160 and HBASE-4397
> --
>
> Key: HBASE-5237
> URL: https://issues.apache.org/jira/browse/HBASE-5237
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.6, 0.92.1
>
> Attachments: HBASE-5237_0.90.patch, HBASE-5237_trunk.patch
>
>
> As part of HBASE-4397 there is one more scenario where the patch has to be 
> applied.
> {code}
> RegionPlan plan = getRegionPlan(state, forceNewPlan);
>   if (plan == null) {
> debugLog(state.getRegion(),
> "Unable to determine a plan to assign " + state);
> return; // Should get reassigned later when RIT times out.
>   }
> {code}
> I think in this scenario also 
> {code}
> this.timeoutMonitor.setAllRegionServersOffline(true);
> {code}
> this should be done.





[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191365#comment-13191365
 ] 

Hudson commented on HBASE-5231:
---

Integrated in HBase-0.92 #257 (See 
[https://builds.apache.org/job/HBase-0.92/257/])
HBASE-5231  Backport HBASE-3373 (per-table load balancing) to 0.92

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/DefaultLoadBalancer.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java


> Backport HBASE-3373 (per-table load balancing) to 0.92
> --
>
> Key: HBASE-5231
> URL: https://issues.apache.org/jira/browse/HBASE-5231
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
> Fix For: 0.92.1
>
> Attachments: 5231-v2.txt, 5231.txt
>
>
> This JIRA backports per-table load balancing to 0.90





[jira] [Commented] (HBASE-3373) Allow regions to be load-balanced by table

2012-01-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191364#comment-13191364
 ] 

Hudson commented on HBASE-3373:
---

Integrated in HBase-0.92 #257 (See 
[https://builds.apache.org/job/HBase-0.92/257/])
HBASE-5231  Backport HBASE-3373 (per-table load balancing) to 0.92

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/DefaultLoadBalancer.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java


> Allow regions to be load-balanced by table
> --
>
> Key: HBASE-3373
> URL: https://issues.apache.org/jira/browse/HBASE-3373
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
>Assignee: Zhihong Yu
> Fix For: 0.94.0
>
> Attachments: 3373.txt, HbaseBalancerTest2.java
>
>
> From our experience, a cluster can be well balanced overall and yet one 
> table's regions may be badly concentrated on a few region servers.
> For example, one table has 839 regions (380 regions at time of table 
> creation) out of which 202 are on one server.
> It would be desirable for load balancer to distribute regions for specified 
> tables evenly across the cluster. Each of such tables has number of regions 
> many times the cluster size.





[jira] [Created] (HBASE-5261) Update HBase for Java 7

2012-01-23 Thread Mikhail Bautin (Created) (JIRA)
Update HBase for Java 7
---

 Key: HBASE-5261
 URL: https://issues.apache.org/jira/browse/HBASE-5261
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin


We need to make sure that HBase compiles and works with JDK 7. Once we verify 
it is reasonably stable, we can explore utilizing the G1 garbage collector. 
When all deployments are ready to move to JDK 7, we can start using new 
language features, but in the transition period we will need to maintain a 
codebase that compiles both with JDK 6 and JDK 7.





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191372#comment-13191372
 ] 

Mikhail Bautin commented on HBASE-5230:
---

The above failed tests passed locally:

Running org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 216.187 sec
Running org.apache.hadoop.hbase.mapreduce.TestImportTsv
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 78.841 sec
Running org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 97.529 sec
Running org.apache.hadoop.hbase.mapred.TestTableMapReduce
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 64.111 sec
Running org.apache.hadoop.hbase.coprocessor.TestClassLoading
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 28.787 sec

Results :

Tests run: 24, Failures: 0, Errors: 0, Skipped: 0


> Unit test to ensure compactions don't cache data on write
> -
>
> Key: HBASE-5230
> URL: https://issues.apache.org/jira/browse/HBASE-5230
> Project: HBase
>  Issue Type: Test
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Minor
> Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch
>
>
> Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
> write during compactions even if cache-on-write is generally enabled). This 
> is because we have very different implementations of 
> HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
> and with CacheConfig (presumably it's there but not sure if it even works, 
> since the patch in HBASE-3976 may not have been committed). We need to create 
> a unit test to verify that we don't cache data blocks on write during 
> compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-23 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191377#comment-13191377
 ] 

stack commented on HBASE-5231:
--

It looks like a method named getAssignmentsByTable will only do this if a 
particular configuration is set; else it will do assignments the old way.  
Seems like an odd name for this method.  I'd have thought it would have 
remained getAssignments and then in getAssignments we'd switch on whether to do 
by table or not.

Does this change the default? I can't tell.


> Backport HBASE-3373 (per-table load balancing) to 0.92
> --
>
> Key: HBASE-5231
> URL: https://issues.apache.org/jira/browse/HBASE-5231
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
> Fix For: 0.92.1
>
> Attachments: 5231-v2.txt, 5231.txt
>
>
> This JIRA backports per-table load balancing to 0.90





[jira] [Updated] (HBASE-4141) Fix LRU stats message

2012-01-23 Thread Vikram Srivastava (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Srivastava updated HBASE-4141:
-

Attachment: LruBlockCache_HBASE_4141.patch

Fixed the brackets. Currently the comma would not be printed if the value is 
zero.

> Fix LRU stats message
> -
>
> Key: HBASE-4141
> URL: https://issues.apache.org/jira/browse/HBASE-4141
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Lars George
>Priority: Trivial
>  Labels: newbie
> Attachments: LruBlockCache_HBASE_4141.patch
>
>
> Currently the DEBUG message looks like this:
> {noformat}
> 2011-07-26 04:21:52,344 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
> LRU Stats: total=3.24 MB, free=391.76 MB, max=395 MB, blocks=0, 
> accesses=118458, hits=0, hitRatio=0.00%%, cachingAccesses=0, cachingHits=0, 
> cachingHitsRatio=�%, evictions=0, evicted=0, evictedPerRun=NaN
> {noformat}
> Note the double percent on "hitRatio", and the stray character at 
> "cachingHitsRatio".
> The former is added by the code in LruBlockCache.java:
> {code}
> ...
> "hitRatio=" +
>   (stats.getHitCount() == 0 ? "0" : 
> (StringUtils.formatPercent(stats.getHitRatio(), 2) + "%, ")) +
> ...
> {code}
> The StringUtils already adds a percent sign, so the trailing one here can be 
> dropped.
> The latter, I presume, is caused by the value not being between 0.0 and 1.0. 
> This should be checked, and "NaN" or similar displayed instead, as is done 
> for other values.
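A hedged sketch of the formatting fix (a hypothetical helper, not the actual LruBlockCache code): return "0" when there were no accesses, render out-of-range ratios as "NaN", and let the format string supply the single percent sign.

```java
import java.util.Locale;

// Hypothetical helper illustrating the two fixes described above.
public class RatioFormat {
    static String formatRatio(long hits, long accesses) {
        if (accesses == 0) {
            return "0"; // no accesses yet; mirrors the existing "0" shortcut
        }
        double ratio = (double) hits / accesses;
        if (Double.isNaN(ratio) || ratio < 0.0 || ratio > 1.0) {
            return "NaN"; // out-of-range value rendered explicitly
        }
        // The format string supplies the single '%'; the bug was the caller
        // appending a second one to a string that already contained it.
        return String.format(Locale.ROOT, "%.2f%%", ratio * 100.0);
    }
}
```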





[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191388#comment-13191388
 ] 

Hadoop QA commented on HBASE-4720:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12511535/HBASE-4720.trunk.v6.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 85 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/841//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/841//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/841//console

This message is automatically generated.

> Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
> client/server 
> 
>
> Key: HBASE-4720
> URL: https://issues.apache.org/jira/browse/HBASE-4720
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daniel Lord
>Assignee: Mubarak Seyed
> Fix For: 0.94.0
>
> Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
> HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
> HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
> HBASE-4720.v3.patch
>
>
> I have several large application/HBase clusters where an application node 
> will occasionally need to talk to HBase from a different cluster.  In order 
> to help ensure some of my consistency guarantees I have a sentinel table that 
> is updated atomically as users interact with the system.  This works quite 
> well for the "regular" hbase client but the REST client does not implement 
> the checkAndPut and checkAndDelete operations.  This exposes the application 
> to some race conditions that have to be worked around.  It would be ideal if 
> the same checkAndPut/checkAndDelete operations could be supported by the REST 
> client.





[jira] [Assigned] (HBASE-5261) Update HBase for Java 7

2012-01-23 Thread Mikhail Bautin (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin reassigned HBASE-5261:
-

Assignee: Mikhail Bautin

> Update HBase for Java 7
> ---
>
> Key: HBASE-5261
> URL: https://issues.apache.org/jira/browse/HBASE-5261
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>
> We need to make sure that HBase compiles and works with JDK 7. Once we verify 
> it is reasonably stable, we can explore utilizing the G1 garbage collector. 
> When all deployments are ready to move to JDK 7, we can start using new 
> language features, but in the transition period we will need to maintain a 
> codebase that compiles both with JDK 6 and JDK 7.





[jira] [Created] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance

2012-01-23 Thread Mikhail Bautin (Created) (JIRA)
Structured event log for HBase for monitoring and auto-tuning performance
-

 Key: HBASE-5262
 URL: https://issues.apache.org/jira/browse/HBASE-5262
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin


Creating this JIRA to open a discussion about a structured (machine-readable) 
log that will record events such as compaction start/end times, compaction 
input/output files, their sizes, the same for flushes, etc. This can be stored 
e.g. in a new system table in HBase itself. The data from this log can then be 
analyzed and used to optimize compactions at run time, or otherwise auto-tune 
HBase configuration to reduce the number of knobs the user has to configure.





[jira] [Commented] (HBASE-5257) Allow filter to be evaluated after version handling

2012-01-23 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191397#comment-13191397
 ] 

Lars Hofhansl commented on HBASE-5257:
--

Running filters after the column trackers only works for Filters that do 
nothing in filterRowKey and filterRow.

> Allow filter to be evaluated after version handling
> ---
>
> Key: HBASE-5257
> URL: https://issues.apache.org/jira/browse/HBASE-5257
> Project: HBase
>  Issue Type: Improvement
>Reporter: Lars Hofhansl
>
> There are various use cases and filter types where evaluating the filter 
> before versions are handled either does not make sense or makes filter 
> handling more complicated.
> Also see this comment in ScanQueryMatcher:
> {code}
> /**
>  * Filters should be checked before checking column trackers. If we do
>  * otherwise, as was previously being done, ColumnTracker may increment 
> its
>  * counter for even that KV which may be discarded later on by Filter. 
> This
>  * would lead to incorrect results in certain cases.
>  */
> {code}
> So we had Filters after the column trackers (which do the version checking), 
> and then moved them.
> This should be at the discretion of the Filter.
> Could either add a new method to FilterBase (maybe excludeVersions() or 
> something). Or have a new Filter wrapper (like WhileMatchFilter), that should 
> only be used as the outermost filter and indicates the same (maybe 
> ExcludeVersionsFilter).
> See latest comments on HBASE-5229 for motivation.





[jira] [Updated] (HBASE-5189) Add metrics to keep track of region-splits in RS

2012-01-23 Thread Mubarak Seyed (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mubarak Seyed updated HBASE-5189:
-

Attachment: HBASE-5189.trunk.v2.patch

If we move getMetrics().incrementSplitFailureCount() before rollback(), and 
rollback() returns false or throws a RuntimeException, then we don't need to 
increment the split failure count, as the RS is going to abort itself.

The one place that needs to call getMetrics().incrementSplitFailureCount() is 
the catch block

{code}
} catch (IOException ex) {
  LOG.error("Split failed " + this, RemoteExceptionHandler
  .checkIOException(ex));
  this.server.getMetrics().incrementSplitFailureCount();
  server.checkFileSystem();
{code}

as rollback() throws IOException.

The attached patch (HBASE-5189.trunk.v2.patch) reflects this change.
Thanks.
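The counting pattern under discussion can be sketched as follows (the names and the use of Runnable/RuntimeException are illustrative stand-ins for the real split code and its IOException catch path):

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: counts a success on the normal path and a failure only in
// the catch block, matching the reasoning in the comment above.
public class SplitMetricsSketch {
    final AtomicLong splitSuccessCount = new AtomicLong();
    final AtomicLong splitFailureCount = new AtomicLong();

    void executeSplit(Runnable split) {
        try {
            split.run();
            splitSuccessCount.incrementAndGet();
        } catch (RuntimeException ex) {
            // In the real code this corresponds to the IOException catch
            // block after rollback(); count the failure here, then the
            // caller continues with cleanup (checkFileSystem etc.).
            splitFailureCount.incrementAndGet();
        }
    }
}
```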

> Add metrics to keep track of region-splits in RS
> 
>
> Key: HBASE-5189
> URL: https://issues.apache.org/jira/browse/HBASE-5189
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics, regionserver
>Affects Versions: 0.90.5, 0.92.0
>Reporter: Mubarak Seyed
>Assignee: Mubarak Seyed
>Priority: Minor
>  Labels: noob
> Attachments: HBASE-5189.trunk.v1.patch, HBASE-5189.trunk.v2.patch
>
>
> For a write-heavy workload with a region size of 1 GB, the region-split rate 
> is considerably high. We normally grep the NN log (grep "mkdir*.split" 
> NN.log | sort | uniq -c) to get the count.
> I would like to have a counter incremented each time region-split execution 
> succeeds and this counter exposed via the metrics stuff in HBase.
> - regionSplitSuccessCount
> - regionSplitFailureCount (will help us to correlate the timestamp range in 
> RS logs across all RS)





[jira] [Commented] (HBASE-4131) Make the Replication Service pluggable via a standard interface definition

2012-01-23 Thread Jeff Whiting (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191408#comment-13191408
 ] 

Jeff Whiting commented on HBASE-4131:
-

This work is great.  However, we need this in 0.92 (and maybe 0.90).  I'm 
thinking it shouldn't be too big of a deal to backport this, as it doesn't 
change any replication functionality but just makes it pluggable. 

I'll do the footwork of making the patches for the older versions and creating 
a new jira for the backport. Do you think it is feasible to get this 
backported?

> Make the Replication Service pluggable via a standard interface definition
> --
>
> Key: HBASE-4131
> URL: https://issues.apache.org/jira/browse/HBASE-4131
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Fix For: 0.94.0
>
> Attachments: 4131-backedout.txt, replicationInterface1.txt, 
> replicationInterface2.txt, replicationInterface3.txt, 
> replicationInterface4.txt
>
>
> The current HBase code supports a replication service that can be used to 
> sync data from one HBase cluster to another. It would be nice to make it 
> a pluggable interface so that other cross-data-center replication services 
> can be used in conjunction with HBase.





[jira] [Commented] (HBASE-5189) Add metrics to keep track of region-splits in RS

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191421#comment-13191421
 ] 

Zhihong Yu commented on HBASE-5189:
---

Patch v2 makes sense.

> Add metrics to keep track of region-splits in RS
> 
>
> Key: HBASE-5189
> URL: https://issues.apache.org/jira/browse/HBASE-5189
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics, regionserver
>Affects Versions: 0.90.5, 0.92.0
>Reporter: Mubarak Seyed
>Assignee: Mubarak Seyed
>Priority: Minor
>  Labels: noob
> Attachments: HBASE-5189.trunk.v1.patch, HBASE-5189.trunk.v2.patch
>
>
> For a write-heavy workload with a region size of 1 GB, the region-split rate 
> is considerably high. We normally grep the NN log (grep "mkdir*.split" 
> NN.log | sort | uniq -c) to get the count.
> I would like to have a counter incremented each time region-split execution 
> succeeds and this counter exposed via the metrics stuff in HBase.
> - regionSplitSuccessCount
> - regionSplitFailureCount (will help us to correlate the timestamp range in 
> RS logs across all RS)





[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191432#comment-13191432
 ] 

Zhihong Yu commented on HBASE-5231:
---

"hbase.master.loadbalance.bytable" controls whether per-table assignment is 
used.
If per-table assignment is off, the original getAssignments() would be called.

Both getAssignmentsByTable() and getAssignments() are package private.
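Based on the property name quoted above, enabling the per-table path would presumably look like this in hbase-site.xml (the value shown is an assumption; per the comment, leaving it unset keeps the original whole-cluster balancing):

```xml
<!-- Assumed config sketch: property name taken from the comment above. -->
<property>
  <name>hbase.master.loadbalance.bytable</name>
  <value>true</value>
</property>
```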

> Backport HBASE-3373 (per-table load balancing) to 0.92
> --
>
> Key: HBASE-5231
> URL: https://issues.apache.org/jira/browse/HBASE-5231
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
> Fix For: 0.92.1
>
> Attachments: 5231-v2.txt, 5231.txt
>
>
> This JIRA backports per-table load balancing to 0.90





[jira] [Commented] (HBASE-5258) Move coprocessors set out of RegionLoad

2012-01-23 Thread Eugene Koontz (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191443#comment-13191443
 ] 

Eugene Koontz commented on HBASE-5258:
--

Hi Ted,
Do you have an estimate of how much network traffic or heap footprint 
this would save?
Just curious, not an objection.
-Eugene

> Move coprocessors set out of RegionLoad
> ---
>
> Key: HBASE-5258
> URL: https://issues.apache.org/jira/browse/HBASE-5258
> Project: HBase
>  Issue Type: Task
>Reporter: Zhihong Yu
>
> When I worked on HBASE-5256, I revisited the code related to Ser/De of 
> coprocessors set in RegionLoad.
> I think the rationale for embedding coprocessors set is for maximum 
> flexibility where each region can load different coprocessors.
> This flexibility is causing extra cost in the region server to Master 
> communication and increasing the footprint of Master heap.
> Would HServerLoad be a better place for this set ?





[jira] [Commented] (HBASE-5258) Move coprocessors set out of RegionLoad

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191461#comment-13191461
 ] 

Zhihong Yu commented on HBASE-5258:
---

Since each coprocessor is represented by a string, the potential savings can be 
considerable, especially if many regions are hosted on each region server.
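As a back-of-envelope illustration only (every number below is an assumption, not a measurement), moving the set from per-region to per-server load reports scales the cost down by roughly the region count:

```java
// Rough arithmetic sketch for the savings claim; all inputs are assumed.
public class CoprocessorFootprintSketch {
    public static void main(String[] args) {
        int regionsPerServer = 500;     // assumed
        int coprocessorsPerRegion = 2;  // assumed
        int bytesPerName = 64;          // assumed average serialized name size

        int perRegion = coprocessorsPerRegion * bytesPerName;  // 128 bytes
        int inRegionLoad = perRegion * regionsPerServer;       // repeated per region
        int inServerLoad = perRegion;                          // once per server

        System.out.println((inRegionLoad - inServerLoad) + " bytes saved per report");
    }
}
```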


> Move coprocessors set out of RegionLoad
> ---
>
> Key: HBASE-5258
> URL: https://issues.apache.org/jira/browse/HBASE-5258
> Project: HBase
>  Issue Type: Task
>Reporter: Zhihong Yu
>
> When I worked on HBASE-5256, I revisited the code related to Ser/De of 
> coprocessors set in RegionLoad.
> I think the rationale for embedding coprocessors set is for maximum 
> flexibility where each region can load different coprocessors.
> This flexibility is causing extra cost in the region server to Master 
> communication and increasing the footprint of Master heap.
> Would HServerLoad be a better place for this set ?





[jira] [Updated] (HBASE-5243) LogSyncerThread not getting shutdown waiting for the interrupted flag

2012-01-23 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5243:
--

Attachment: 5243-92.addendum

The addendum fixes the broken 0.92 build

> LogSyncerThread not getting shutdown waiting for the interrupted flag
> -
>
> Key: HBASE-5243
> URL: https://issues.apache.org/jira/browse/HBASE-5243
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.6, 0.92.1
>
> Attachments: 5243-92.addendum, HBASE-5243_0.90.patch, 
> HBASE-5243_0.90_1.patch, HBASE-5243_trunk.patch
>
>
> In the LogSyncer run() we keep looping until the this.isInterrupted flag is set.
> But in some cases the DFSClient consumes the InterruptedException, so
> we run into an infinite loop in some shutdown cases.
> I would suggest that, since we are the ones who try to close down the
> LogSyncerThread, we can introduce a variable like
> close or shutdown and, based on the state of this flag along with
> isInterrupted(), make the thread stop.
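A minimal sketch of the proposed fix (names invented; the real change is in HLog's LogSyncer): the loop exits on either the interrupt status or an explicit close flag, so an interrupt swallowed downstream cannot leave the thread spinning.

```java
// Sketch only: a volatile close flag backs up the interrupt status, covering
// the case where a library call (e.g. inside DFSClient) eats the interrupt.
public class LogSyncerSketch {

    static class LogSyncer extends Thread {
        private volatile boolean closeRequested = false;

        void requestClose() {
            closeRequested = true;
            this.interrupt();
        }

        @Override
        public void run() {
            // Either condition ends the loop; closeRequested is the safety net.
            while (!isInterrupted() && !closeRequested) {
                try {
                    Thread.sleep(10); // stand-in for the periodic sync work
                } catch (InterruptedException e) {
                    // Restore the flag so the loop condition can observe it.
                    Thread.currentThread().interrupt();
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        LogSyncer syncer = new LogSyncer();
        syncer.start();
        syncer.requestClose();
        syncer.join(1000);
        System.out.println(syncer.isAlive() ? "stuck" : "stopped");
    }
}
```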





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191488#comment-13191488
 ] 

Phabricator commented on HBASE-5230:


nspiegelberg has commented on the revision "[jira] [HBASE-5230] Extend 
TestCacheOnWrite to ensure we don't cache data blocks on compaction".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:764-766 
currently, there is no intelligence to estimate the resulting compacted 
filesize and cache compactions up to a max size, correct?
  
src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java:879-880
 use this static function to write a toString method?
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4502 
this comment should be changed

  // read the row, this should be a cache miss because we don't cache on 
compaction
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java:254 add

 // TODO: need to change this test if we add a cache size threshold for 
compactions
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java:295 
Assert.assertNull() is nice for clarity

REVISION DETAIL
  https://reviews.facebook.net/D1353


> Unit test to ensure compactions don't cache data on write
> -
>
> Key: HBASE-5230
> URL: https://issues.apache.org/jira/browse/HBASE-5230
> Project: HBase
>  Issue Type: Test
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Minor
> Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch
>
>
> Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
> write during compactions even if cache-on-write is generally 
> enabled). This is because we have very different implementations of 
> HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
> and with CacheConfig (presumably it's there but not sure if it even works, 
> since the patch in HBASE-3976 may not have been committed). We need to create 
> a unit test to verify that we don't cache data blocks on write during 
> compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-5243) LogSyncerThread not getting shutdown waiting for the interrupted flag

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191486#comment-13191486
 ] 

Zhihong Yu commented on HBASE-5243:
---

Applied addendum to 0.92 branch

> LogSyncerThread not getting shutdown waiting for the interrupted flag
> -
>
> Key: HBASE-5243
> URL: https://issues.apache.org/jira/browse/HBASE-5243
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.6, 0.92.1
>
> Attachments: 5243-92.addendum, HBASE-5243_0.90.patch, 
> HBASE-5243_0.90_1.patch, HBASE-5243_trunk.patch
>
>
> In the LogSyncer run() we keep looping until the this.isInterrupted flag is set.
> But in some cases the DFSClient consumes the InterruptedException, so
> we run into an infinite loop in some shutdown cases.
> I would suggest that, since we are the ones who try to close down the
> LogSyncerThread, we can introduce a variable like
> close or shutdown and, based on the state of this flag along with
> isInterrupted(), make the thread stop.





[jira] [Commented] (HBASE-5243) LogSyncerThread not getting shutdown waiting for the interrupted flag

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191494#comment-13191494
 ] 

Hadoop QA commented on HBASE-5243:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12511556/5243-92.addendum
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The patch appears to cause mvn compile goal to fail.

-1 findbugs.  The patch appears to cause Findbugs (version 1.3.9) to fail.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/842//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/842//console

This message is automatically generated.

> LogSyncerThread not getting shutdown waiting for the interrupted flag
> -
>
> Key: HBASE-5243
> URL: https://issues.apache.org/jira/browse/HBASE-5243
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.6, 0.92.1
>
> Attachments: 5243-92.addendum, HBASE-5243_0.90.patch, 
> HBASE-5243_0.90_1.patch, HBASE-5243_trunk.patch
>
>
> In the LogSyncer run() we keep looping until the this.isInterrupted flag is set.
> But in some cases the DFSClient consumes the InterruptedException, so
> we run into an infinite loop in some shutdown cases.
> I would suggest that, since we are the ones who try to close down the
> LogSyncerThread, we can introduce a variable like
> close or shutdown and, based on the state of this flag along with
> isInterrupted(), make the thread stop.





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191497#comment-13191497
 ] 

Zhihong Yu commented on HBASE-5230:
---

This patch should be applied to 0.92, right ?
A patch for 0.92 would be desirable.

> Unit test to ensure compactions don't cache data on write
> -
>
> Key: HBASE-5230
> URL: https://issues.apache.org/jira/browse/HBASE-5230
> Project: HBase
>  Issue Type: Test
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Minor
> Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch
>
>
> Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
> write during compactions even if cache-on-write is generally 
> enabled). This is because we have very different implementations of 
> HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
> and with CacheConfig (presumably it's there but not sure if it even works, 
> since the patch in HBASE-3976 may not have been committed). We need to create 
> a unit test to verify that we don't cache data blocks on write during 
> compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-5255) Use singletons for OperationStatus to save memory

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191522#comment-13191522
 ] 

Zhihong Yu commented on HBASE-5255:
---

Integrated to 0.92 and TRUNK.

> Use singletons for OperationStatus to save memory
> -
>
> Key: HBASE-5255
> URL: https://issues.apache.org/jira/browse/HBASE-5255
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.90.5, 0.92.0
>Reporter: Benoit Sigoure
>Assignee: Benoit Sigoure
>Priority: Minor
>  Labels: performance
> Fix For: 0.94.0, 0.92.1
>
> Attachments: 5255-92.txt, 5255-v2.txt, 
> HBASE-5255-0.92-Use-singletons-to-remove-unnecessary-memory-allocati.patch, 
> HBASE-5255-trunk-Use-singletons-to-remove-unnecessary-memory-allocati.patch
>
>
> Every single {{Put}} causes the allocation of at least one 
> {{OperationStatus}}, yet {{OperationStatus}} is almost always stateless, so 
> these allocations are unnecessary and could be avoided.  Attached patch adds 
> a few singletons and uses them, with no public API change.  I didn't test the 
> patches, but you get the idea.
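The singleton idea can be sketched as below (class shape invented; the real OperationStatus lives in org.apache.hadoop.hbase.regionserver): because the object is immutable and usually stateless, one shared instance per outcome replaces a fresh allocation per Put.

```java
// Hedged sketch of the HBASE-5255 approach, not the actual patch.
public class OperationStatusDemo {

    // Immutable, stateless status: safe to share a single instance.
    static final class OperationStatus {
        private final String code;
        private OperationStatus(String code) { this.code = code; }
        String getCode() { return code; }
    }

    // Reused for every operation instead of allocating per Put.
    static final OperationStatus SUCCESS = new OperationStatus("SUCCESS");
    static final OperationStatus FAILURE = new OperationStatus("FAILURE");

    public static void main(String[] args) {
        OperationStatus a = SUCCESS;
        OperationStatus b = SUCCESS;
        // Identity equality: both callers got the same shared instance,
        // so no per-operation allocation happened.
        System.out.println(a == b);
    }
}
```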





[jira] [Assigned] (HBASE-5209) HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup

2012-01-23 Thread David S. Wang (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David S. Wang reassigned HBASE-5209:


Assignee: David S. Wang

> HConnection/HMasterInterface should allow for way to get hostname of 
> currently active master in multi-master HBase setup
> 
>
> Key: HBASE-5209
> URL: https://issues.apache.org/jira/browse/HBASE-5209
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.94.0, 0.90.5, 0.92.0
>Reporter: Aditya Acharya
>Assignee: David S. Wang
>
> I have a multi-master HBase set up, and I'm trying to programmatically 
> determine which of the masters is currently active. But the API does not 
> allow me to do this. There is a getMaster() method in the HConnection class, 
> but it returns an HMasterInterface, whose methods do not allow me to find out 
> which master won the last race. The API should have a 
> getActiveMasterHostname() or something to that effect.
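The requested call site might look like the sketch below. The method name is taken from the report; it does not exist in 0.92, and the interface here is a local stub, not the real HMasterInterface:

```java
// Sketch of the proposed API addition from the report (hypothetical).
public class ActiveMasterDemo {

    interface MasterHandle {
        String getActiveMasterHostname(); // proposed method, not in 0.92
    }

    public static void main(String[] args) {
        // Stand-in for what HConnection.getMaster() might return if the
        // interface were extended to expose the election winner.
        MasterHandle master = () -> "master2.example.com";
        System.out.println(master.getActiveMasterHostname());
    }
}
```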





[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191564#comment-13191564
 ] 

Zhihong Yu commented on HBASE-4720:
---

Patch v6 looks good.
Will integrate if Andrew doesn't have further comment.

> Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
> client/server 
> 
>
> Key: HBASE-4720
> URL: https://issues.apache.org/jira/browse/HBASE-4720
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daniel Lord
>Assignee: Mubarak Seyed
> Fix For: 0.94.0
>
> Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
> HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
> HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
> HBASE-4720.v3.patch
>
>
> I have several large application/HBase clusters where an application node 
> will occasionally need to talk to HBase from a different cluster.  In order 
> to help ensure some of my consistency guarantees I have a sentinel table that 
> is updated atomically as users interact with the system.  This works quite 
> well for the "regular" hbase client but the REST client does not implement 
> the checkAndPut and checkAndDelete operations.  This exposes the application 
> to some race conditions that have to be worked around.  It would be ideal if 
> the same checkAndPut/checkAndDelete operations could be supported by the REST 
> client.





[jira] [Created] (HBASE-5263) Preserving cached data on compactions through cache-on-write

2012-01-23 Thread Mikhail Bautin (Created) (JIRA)
Preserving cached data on compactions through cache-on-write


 Key: HBASE-5263
 URL: https://issues.apache.org/jira/browse/HBASE-5263
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor


We are tackling HBASE-3976 and HBASE-5230 to make sure we don't trash the block 
cache on compactions if cache-on-write is enabled. However, it would be ideal 
to reduce the effect compactions have on the cached data. For every block we 
are writing for a compacted file we can decide whether it needs to be cached 
based on whether the original blocks containing the same data were already in 
cache. More precisely, for every HFile reader in a compaction we can maintain a 
boolean flag saying whether the current key-value came from a disk IO or the 
block cache. In the HFile writer for the compaction's output we can maintain a 
flag that is set if any of the key-values in the block being written came from 
a cached block, use that flag at the end of a block to decide whether to 
cache-on-write the block, and reset the flag to false on a block boundary. If 
such an inclusive approach would still trash the cache, we could restrict the 
total number of blocks to be cached per an output HFile, switch to an "and" 
logic instead of "or" logic for deciding whether to cache an output file block, 
or only cache a certain percentage of output file blocks that contain some of 
the previously cached data. 

Thanks to Nicolas for this elegant online algorithm idea!
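The per-block "or" logic described above can be sketched in a few lines (all names invented; real compactions operate on HFile blocks, not the toy pairs used here): carry a flag across the key-values of the output block, decide at the block boundary, then reset.

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of the online cache-on-write decision: cache an output block
// iff any key-value in it came from an already-cached input block.
public class CompactionCacheOnWriteSketch {
    public static void main(String[] args) {
        // (key, cameFromBlockCache) pairs, in compaction scan order.
        Object[][] kvs = {
            {"k1", true}, {"k2", false},   // output block 0
            {"k3", false}, {"k4", false},  // output block 1
        };
        int kvsPerBlock = 2; // stand-in for the real block-size boundary

        List<Boolean> cacheDecisions = new ArrayList<>();
        boolean anyFromCache = false; // writer-side flag, reset per block
        for (int i = 0; i < kvs.length; i++) {
            anyFromCache |= (Boolean) kvs[i][1];
            if ((i + 1) % kvsPerBlock == 0) {      // block boundary reached
                cacheDecisions.add(anyFromCache);  // "or" logic decision
                anyFromCache = false;              // reset for next block
            }
        }
        System.out.println(cacheDecisions); // block 0 cached, block 1 not
    }
}
```

Switching to the "and" logic mentioned above would just replace `|=` with `&=` (seeded to true at each block start).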






[jira] [Commented] (HBASE-5258) Move coprocessors set out of RegionLoad

2012-01-23 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191589#comment-13191589
 ] 

Andrew Purtell commented on HBASE-5258:
---

bq. This flexibility is causing extra cost in the region server to Master 
communication and increasing the footprint of Master heap.

No doubt it is redundant to have each region report its coprocessors given how 
the framework currently works: all regions for a table will have an identical 
set of coprocessors loaded, or something bad is happening.

bq. Would HServerLoad be a better place for this set ?

I have no major objection.

However, maybe we want a way to know if something bad happened on a region and 
a coprocessor on it went away? One could comb logs but that is hardly a 
convenient way to get online state.



> Move coprocessors set out of RegionLoad
> ---
>
> Key: HBASE-5258
> URL: https://issues.apache.org/jira/browse/HBASE-5258
> Project: HBase
>  Issue Type: Task
>Reporter: Zhihong Yu
>
> When I worked on HBASE-5256, I revisited the code related to Ser/De of 
> coprocessors set in RegionLoad.
> I think the rationale for embedding coprocessors set is for maximum 
> flexibility where each region can load different coprocessors.
> This flexibility is causing extra cost in the region server to Master 
> communication and increasing the footprint of Master heap.
> Would HServerLoad be a better place for this set ?





[jira] [Commented] (HBASE-5258) Move coprocessors set out of RegionLoad

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191594#comment-13191594
 ] 

Zhihong Yu commented on HBASE-5258:
---

To my knowledge, for a mis-behaving coprocessor we either remove the buggy 
coprocessor or abort.
I wonder what scenario would lead to imbalanced coprocessors on a region.

> Move coprocessors set out of RegionLoad
> ---
>
> Key: HBASE-5258
> URL: https://issues.apache.org/jira/browse/HBASE-5258
> Project: HBase
>  Issue Type: Task
>Reporter: Zhihong Yu
>
> When I worked on HBASE-5256, I revisited the code related to Ser/De of 
> coprocessors set in RegionLoad.
> I think the rationale for embedding coprocessors set is for maximum 
> flexibility where each region can load different coprocessors.
> This flexibility is causing extra cost in the region server to Master 
> communication and increasing the footprint of Master heap.
> Would HServerLoad be a better place for this set ?





[jira] [Updated] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5230:
---

Attachment: D1353.4.patch

mbautin updated the revision "[jira] [HBASE-5230] Extend TestCacheOnWrite to 
ensure we don't cache data blocks on compaction".
Reviewers: nspiegelberg, tedyu, Liyin, stack, JIRA

  Addressing Nicolas's comments. Re-running all unit tests.

REVISION DETAIL
  https://reviews.facebook.net/D1353

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java


> Unit test to ensure compactions don't cache data on write
> -
>
> Key: HBASE-5230
> URL: https://issues.apache.org/jira/browse/HBASE-5230
> Project: HBase
>  Issue Type: Test
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Minor
> Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
> D1353.4.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch
>
>
> Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
> write during compactions even if cache-on-write is generally 
> enabled). This is because we have very different implementations of 
> HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
> and with CacheConfig (presumably it's there but not sure if it even works, 
> since the patch in HBASE-3976 may not have been committed). We need to create 
> a unit test to verify that we don't cache data blocks on write during 
> compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191606#comment-13191606
 ] 

Phabricator commented on HBASE-5230:


mbautin has commented on the revision "[jira] [HBASE-5230] Extend 
TestCacheOnWrite to ensure we don't cache data blocks on compaction".

  Responses to comments inline.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java:746 Done.
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:756 Done.
  
src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java:880
 Replaced this with a method that "prints out" the metrics into a StringBuilder 
and returns a string.
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java:253 
Since this is for data blocks, I renamed this to 
testNotCachingDataBlocksDuringCompaction.
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:764-766 That is 
correct, to my best knowledge. Eventually, I think we would like to 
intelligently decide whether to cache-on-write a block based on whether the 
data in question is already in the block cache as part of uncompacted files: 
https://issues.apache.org/jira/browse/HBASE-5263

  
src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java:879-880
 In my debugging I have not yet come across a case when I would have found a 
SchemaMetrics.toString method useful. Also, adapting this static method to 
implement toString would be tricky, since it relies on getMetricsSnapshot() 
that takes a "snapshot" of _all_ metrics, not just those for a particular 
table/CF combination corresponding to one SchemaMetrics instance. Therefore, I 
would prefer to leave SchemaMetrics.toString() out for now.
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4502 
Done.
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java:254 
Added.
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java:295 Done.

REVISION DETAIL
  https://reviews.facebook.net/D1353


> Unit test to ensure compactions don't cache data on write
> -
>
> Key: HBASE-5230
> URL: https://issues.apache.org/jira/browse/HBASE-5230
> Project: HBase
>  Issue Type: Test
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Minor
> Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
> D1353.4.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch
>
>
> Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
> write during compactions even if cache-on-write is generally 
> enabled). This is because we have very different implementations of 
> HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
> and with CacheConfig (presumably it's there but not sure if it even works, 
> since the patch in HBASE-3976 may not have been committed). We need to create 
> a unit test to verify that we don't cache data blocks on write during 
> compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Created] (HBASE-5264) Add 0.92.0 upgrade guide

2012-01-23 Thread stack (Created) (JIRA)
Add 0.92.0 upgrade guide


 Key: HBASE-5264
 URL: https://issues.apache.org/jira/browse/HBASE-5264
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: 5264.txt

Add an upgrade guide for going from 0.90 to 0.92.





[jira] [Updated] (HBASE-5130) A map-reduce wrapper for HBase test suite ("mr-test-runner")

2012-01-23 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-5130:
--

Description: We have a tool we call "mrunit" (but will call 
"mr-test-runner" in the open-source version) that runs HBase unit tests on a 
map-reduce cluster. We need to modify it to use distributed cache to deploy the 
code on the cluster instead of our internal deployment tool, and open-source 
it.  (was: We have a tool we call "mrunit" that runs HBase unit tests on a 
map-reduce cluster. We need to modify it to use distributed cache to deploy the 
code on the cluster instead of our internal deployment tool, and open-source 
it.)
Summary: A map-reduce wrapper for HBase test suite ("mr-test-runner")  
(was: A map-reduce wrapper for HBase test suite ("mrunit"))

> A map-reduce wrapper for HBase test suite ("mr-test-runner")
> 
>
> Key: HBASE-5130
> URL: https://issues.apache.org/jira/browse/HBASE-5130
> Project: HBase
>  Issue Type: Test
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>
> We have a tool we call "mrunit" (but will call "mr-test-runner" in the 
> open-source version) that runs HBase unit tests on a map-reduce cluster. We 
> need to modify it to use distributed cache to deploy the code on the cluster 
> instead of our internal deployment tool, and open-source it.





[jira] [Resolved] (HBASE-5264) Add 0.92.0 upgrade guide

2012-01-23 Thread stack (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-5264.
--

   Resolution: Fixed
Fix Version/s: 0.94.0

Committed TRUNK

> Add 0.92.0 upgrade guide
> 
>
> Key: HBASE-5264
> URL: https://issues.apache.org/jira/browse/HBASE-5264
> Project: HBase
>  Issue Type: Task
>Reporter: stack
> Fix For: 0.94.0
>
> Attachments: 5264.txt
>
>
> Add an upgrade guide for going from 0.90 to 0.92.





[jira] [Updated] (HBASE-5264) Add 0.92.0 upgrade guide

2012-01-23 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5264:
-

Attachment: 5264.txt

A patch J-D and I hacked up.

> Add 0.92.0 upgrade guide
> 
>
> Key: HBASE-5264
> URL: https://issues.apache.org/jira/browse/HBASE-5264
> Project: HBase
>  Issue Type: Task
>Reporter: stack
> Fix For: 0.94.0
>
> Attachments: 5264.txt
>
>
> Add an upgrade guide for going from 0.90 to 0.92.





[jira] [Updated] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-5230:
--

Attachment: Don-t-cache-data-blocks-on-compaction-2012-01-23_15_27_23.patch

A new patch addressing Nicolas's comments.

> Unit test to ensure compactions don't cache data on write
> -
>
> Key: HBASE-5230
> URL: https://issues.apache.org/jira/browse/HBASE-5230
> Project: HBase
>  Issue Type: Test
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Minor
> Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
> D1353.4.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-23_15_27_23.patch
>
>
> Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
> write during compactions even if cache-on-write is generally enabled). This 
> is because we have very different implementations of HBASE-3976 without 
> HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) and with 
> CacheConfig (presumably it's there, but not sure it even works, since the 
> patch in HBASE-3976 may not have been committed). We need to create a unit 
> test to verify that we don't cache data blocks on write during compactions, 
> and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Updated] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

2012-01-23 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5179:
--

Attachment: 5179-90v18.txt

Patch v18 addresses Stack's comments.

The sleep() isn't for unit tests. I lowered the wait interval to 500ms.

I created waitUntilNoLogDir(HServerAddress serverAddress) so that -ROOT- and 
.META. servers can reuse the logic.

Renamed logDirExists() to getLogDirIfExists()
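
The fixed-interval polling described above (wait until a server's log
directory is gone, checking on a 500 ms interval) can be sketched as a generic
poll loop. The helper below is illustrative only: `waitUntil` and its
parameters are invented names, not HBase API; the real method polls HDFS for
the server's log directory instead of an arbitrary condition.

```java
import java.util.function.BooleanSupplier;

// Generic poll-until-true loop, illustrative only: "waitUntil" and its
// parameters are invented names, not HBase API. The method described in the
// comment above polls for a server's log directory at a 500 ms interval.
public class WaitUtil {
    static boolean waitUntil(BooleanSupplier condition, long timeoutMs,
                             long intervalMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            Thread.sleep(intervalMs);
        }
        return condition.getAsBoolean(); // one last check at the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Condition becomes true after ~100 ms; poll every 50 ms.
        System.out.println(waitUntil(
                () -> System.currentTimeMillis() - start > 100, 2000, 50));
    }
}
```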

> Concurrent processing of processFaileOver and ServerShutdownHandler may cause 
> region to be assigned before log splitting is completed, causing data loss
> 
>
> Key: HBASE-5179
> URL: https://issues.apache.org/jira/browse/HBASE-5179
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5179-90.txt, 5179-90v10.patch, 5179-90v11.patch, 
> 5179-90v12.patch, 5179-90v13.txt, 5179-90v14.patch, 5179-90v15.patch, 
> 5179-90v16.patch, 5179-90v17.txt, 5179-90v18.txt, 5179-90v2.patch, 
> 5179-90v3.patch, 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 
> 5179-90v7.patch, 5179-90v8.patch, 5179-90v9.patch, 5179-92v17.patch, 
> 5179-v11-92.txt, 5179-v11.txt, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, 
> Errorlog, hbase-5179.patch, hbase-5179v10.patch, hbase-5179v12.patch, 
> hbase-5179v17.patch, hbase-5179v5.patch, hbase-5179v6.patch, 
> hbase-5179v7.patch, hbase-5179v8.patch, hbase-5179v9.patch
>
>
> If the master's failover processing and ServerShutdownHandler's processing 
> happen concurrently, the following case may occur:
> 1. The master completes splitLogAfterStartup().
> 2. RegionserverA restarts, and ServerShutdownHandler starts processing it.
> 3. The master starts rebuildUserRegions, and RegionserverA is considered a 
> dead server.
> 4. The master starts to assign the regions of RegionserverA because it is a 
> dead server per step 3.
> However, while step 4 (region assignment) is running, ServerShutdownHandler 
> may still be splitting logs; therefore, data loss can occur.





[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191630#comment-13191630
 ] 

Hadoop QA commented on HBASE-5179:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12511589/5179-90v18.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/844//console

This message is automatically generated.

> Concurrent processing of processFaileOver and ServerShutdownHandler may cause 
> region to be assigned before log splitting is completed, causing data loss
> 
>
> Key: HBASE-5179
> URL: https://issues.apache.org/jira/browse/HBASE-5179
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5179-90.txt, 5179-90v10.patch, 5179-90v11.patch, 
> 5179-90v12.patch, 5179-90v13.txt, 5179-90v14.patch, 5179-90v15.patch, 
> 5179-90v16.patch, 5179-90v17.txt, 5179-90v18.txt, 5179-90v2.patch, 
> 5179-90v3.patch, 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 
> 5179-90v7.patch, 5179-90v8.patch, 5179-90v9.patch, 5179-92v17.patch, 
> 5179-v11-92.txt, 5179-v11.txt, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, 
> Errorlog, hbase-5179.patch, hbase-5179v10.patch, hbase-5179v12.patch, 
> hbase-5179v17.patch, hbase-5179v5.patch, hbase-5179v6.patch, 
> hbase-5179v7.patch, hbase-5179v8.patch, hbase-5179v9.patch
>
>
> If the master's failover processing and ServerShutdownHandler's processing 
> happen concurrently, the following case may occur:
> 1. The master completes splitLogAfterStartup().
> 2. RegionserverA restarts, and ServerShutdownHandler starts processing it.
> 3. The master starts rebuildUserRegions, and RegionserverA is considered a 
> dead server.
> 4. The master starts to assign the regions of RegionserverA because it is a 
> dead server per step 3.
> However, while step 4 (region assignment) is running, ServerShutdownHandler 
> may still be splitting logs; therefore, data loss can occur.





[jira] [Commented] (HBASE-5258) Move coprocessors set out of RegionLoad

2012-01-23 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191639#comment-13191639
 ] 

Andrew Purtell commented on HBASE-5258:
---

bq. for a mis-behaving coprocessor we either remove the buggy coprocessor

... from the coprocessor host for the given region (in the case of 
RegionCoprocessorHost) only...


> Move coprocessors set out of RegionLoad
> ---
>
> Key: HBASE-5258
> URL: https://issues.apache.org/jira/browse/HBASE-5258
> Project: HBase
>  Issue Type: Task
>Reporter: Zhihong Yu
>
> When I worked on HBASE-5256, I revisited the code related to Ser/De of the 
> coprocessors set in RegionLoad.
> I think the rationale for embedding the coprocessors set is maximum 
> flexibility, where each region can load different coprocessors.
> This flexibility causes extra cost in the region-server-to-Master 
> communication and increases the footprint of the Master heap.
> Would HServerLoad be a better place for this set?





[jira] [Commented] (HBASE-5243) LogSyncerThread not getting shutdown waiting for the interrupted flag

2012-01-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191638#comment-13191638
 ] 

Hudson commented on HBASE-5243:
---

Integrated in HBase-0.92 #258 (See 
[https://builds.apache.org/job/HBase-0.92/258/])
HBASE-5243 Addendum moves the close() method to right place

tedyu : 
Files : 
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java


> LogSyncerThread not getting shutdown waiting for the interrupted flag
> -
>
> Key: HBASE-5243
> URL: https://issues.apache.org/jira/browse/HBASE-5243
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.6, 0.92.1
>
> Attachments: 5243-92.addendum, HBASE-5243_0.90.patch, 
> HBASE-5243_0.90_1.patch, HBASE-5243_trunk.patch
>
>
> In the LogSyncer run() we keep looping until the this.isInterrupted flag is 
> set. But in some cases the DFSClient consumes the InterruptedException, so 
> we run into an infinite loop in some shutdown cases.
> I would suggest that, since we are the ones who try to close down the 
> LogSyncerThread, we can introduce a variable like close or shutdown and, 
> based on the state of this flag along with isInterrupted(), make the thread 
> stop.





[jira] [Commented] (HBASE-5258) Move coprocessors set out of RegionLoad

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191651#comment-13191651
 ] 

Zhihong Yu commented on HBASE-5258:
---

Since the combination of coprocessors on a region server is limited, I was 
suggesting that the report of uneven coprocessor presence be embedded in 
HServerLoad.

> Move coprocessors set out of RegionLoad
> ---
>
> Key: HBASE-5258
> URL: https://issues.apache.org/jira/browse/HBASE-5258
> Project: HBase
>  Issue Type: Task
>Reporter: Zhihong Yu
>
> When I worked on HBASE-5256, I revisited the code related to Ser/De of the 
> coprocessors set in RegionLoad.
> I think the rationale for embedding the coprocessors set is maximum 
> flexibility, where each region can load different coprocessors.
> This flexibility causes extra cost in the region-server-to-Master 
> communication and increases the footprint of the Master heap.
> Would HServerLoad be a better place for this set?





[jira] [Commented] (HBASE-5258) Move coprocessors set out of RegionLoad

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191646#comment-13191646
 ] 

Zhihong Yu commented on HBASE-5258:
---

I agree with the last comment @ 23/Jan/12 22:58
My understanding of the feature is that users should validate coprocessors by 
choosing the Abort policy for buggy coprocessors in the pre-deployment stage. 
In production, the chance of a buggy coprocessor being dropped from individual 
region(s) should be low.

The ability to query imbalanced coprocessors on a region server should be an 
on-demand feature.

> Move coprocessors set out of RegionLoad
> ---
>
> Key: HBASE-5258
> URL: https://issues.apache.org/jira/browse/HBASE-5258
> Project: HBase
>  Issue Type: Task
>Reporter: Zhihong Yu
>
> When I worked on HBASE-5256, I revisited the code related to Ser/De of the 
> coprocessors set in RegionLoad.
> I think the rationale for embedding the coprocessors set is maximum 
> flexibility, where each region can load different coprocessors.
> This flexibility causes extra cost in the region-server-to-Master 
> communication and increases the footprint of the Master heap.
> Would HServerLoad be a better place for this set?





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191657#comment-13191657
 ] 

Hadoop QA commented on HBASE-5230:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12511585/Don-t-cache-data-blocks-on-compaction-2012-01-23_15_27_23.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 8 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 84 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/843//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/843//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/843//console

This message is automatically generated.

> Unit test to ensure compactions don't cache data on write
> -
>
> Key: HBASE-5230
> URL: https://issues.apache.org/jira/browse/HBASE-5230
> Project: HBase
>  Issue Type: Test
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Minor
> Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
> D1353.4.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-23_15_27_23.patch
>
>
> Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
> write during compactions even if cache-on-write is generally enabled). This 
> is because we have very different implementations of HBASE-3976 without 
> HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) and with 
> CacheConfig (presumably it's there, but not sure it even works, since the 
> patch in HBASE-3976 may not have been committed). We need to create a unit 
> test to verify that we don't cache data blocks on write during compactions, 
> and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

2012-01-23 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191672#comment-13191672
 ] 

stack commented on HBASE-5179:
--

bq. The sleep() isn't for unit tests. I lowered the wait interval to 500ms.

But this code is exercised in tests.

My thinking on this patch is that if you fellas have confidence in it, commit 
it to 0.90 (but if it destabilizes 0.90 branch, I'm going to come looking for 
you all with a hammer!).

Let's open a new issue for TRUNK and work up a trunk-applicable version of this 
patch.  I'll help there so the trunk commit includes unit tests.



> Concurrent processing of processFaileOver and ServerShutdownHandler may cause 
> region to be assigned before log splitting is completed, causing data loss
> 
>
> Key: HBASE-5179
> URL: https://issues.apache.org/jira/browse/HBASE-5179
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5179-90.txt, 5179-90v10.patch, 5179-90v11.patch, 
> 5179-90v12.patch, 5179-90v13.txt, 5179-90v14.patch, 5179-90v15.patch, 
> 5179-90v16.patch, 5179-90v17.txt, 5179-90v18.txt, 5179-90v2.patch, 
> 5179-90v3.patch, 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 
> 5179-90v7.patch, 5179-90v8.patch, 5179-90v9.patch, 5179-92v17.patch, 
> 5179-v11-92.txt, 5179-v11.txt, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, 
> Errorlog, hbase-5179.patch, hbase-5179v10.patch, hbase-5179v12.patch, 
> hbase-5179v17.patch, hbase-5179v5.patch, hbase-5179v6.patch, 
> hbase-5179v7.patch, hbase-5179v8.patch, hbase-5179v9.patch
>
>
> If the master's failover processing and ServerShutdownHandler's processing 
> happen concurrently, the following case may occur:
> 1. The master completes splitLogAfterStartup().
> 2. RegionserverA restarts, and ServerShutdownHandler starts processing it.
> 3. The master starts rebuildUserRegions, and RegionserverA is considered a 
> dead server.
> 4. The master starts to assign the regions of RegionserverA because it is a 
> dead server per step 3.
> However, while step 4 (region assignment) is running, ServerShutdownHandler 
> may still be splitting logs; therefore, data loss can occur.





[jira] [Created] (HBASE-5265) Fix 'revoke' shell command

2012-01-23 Thread Andrew Purtell (Created) (JIRA)
Fix 'revoke' shell command
--

 Key: HBASE-5265
 URL: https://issues.apache.org/jira/browse/HBASE-5265
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.94.0, 0.92.1


The 'revoke' shell command needs to be reworked for the AccessControlProtocol 
implementation that was finalized for 0.92. The permissions being removed must 
exactly match what was previously granted. No wildcard matching is done server 
side.

Allow two forms of the command in the shell for convenience:

Revocation of a specific grant:
{code}
revoke <user>, <table>, <column family> [ , <column qualifier> ]
{code}

Have the shell automatically do so for all permissions on a table for a given 
user:
{code}
revoke <user>, <table>
{code}






[jira] [Commented] (HBASE-5256) Use WritableUtils.readVInt() in RegionLoad.readFields()

2012-01-23 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191676#comment-13191676
 ] 

Lars Hofhansl commented on HBASE-5256:
--

Should we recommend this only for new metrics (in order to avoid more wire 
incompatibilities)?


> Use WritableUtils.readVInt() in RegionLoad.readFields()
> ---
>
> Key: HBASE-5256
> URL: https://issues.apache.org/jira/browse/HBASE-5256
> Project: HBase
>  Issue Type: Task
>Reporter: Zhihong Yu
> Fix For: 0.94.0
>
>
> Currently in.readInt() is used in RegionLoad.readFields().
> More metrics will be added to RegionLoad in the future, so we should use 
> WritableUtils.readVInt() to reduce the amount of data exchanged between the 
> Master and region servers.
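
For illustration, a simplified varint encoding shows why variable-length
integers shrink the wire size for small values. This is a sketch of the
general technique only, not Hadoop's exact `WritableUtils.writeVInt()` byte
format, and the class name is invented.

```java
import java.io.ByteArrayOutputStream;

// Simplified unsigned varint encoding (7 data bits per byte, high bit set
// when more bytes follow). This is NOT Hadoop's exact WritableUtils byte
// format; it only illustrates why variable-length ints shrink small values
// compared to a fixed 4-byte in.readInt()/out.writeInt().
public class VarIntDemo {
    static byte[] encode(long v) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((v & ~0x7FL) != 0) {
            out.write((int) ((v & 0x7FL) | 0x80L)); // more bytes follow
            v >>>= 7;
        }
        out.write((int) v); // final byte, high bit clear
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(encode(42).length);        // 1 byte vs 4 fixed
        System.out.println(encode(1_000_000).length); // 3 bytes vs 4 fixed
    }
}
```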





[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191678#comment-13191678
 ] 

Zhihong Yu commented on HBASE-5179:
---

What do we do with 5179-92v17.patch?
The test harness in 0.92 may not be ready for the (future) trunk patch to be 
applied.

> Concurrent processing of processFaileOver and ServerShutdownHandler may cause 
> region to be assigned before log splitting is completed, causing data loss
> 
>
> Key: HBASE-5179
> URL: https://issues.apache.org/jira/browse/HBASE-5179
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5179-90.txt, 5179-90v10.patch, 5179-90v11.patch, 
> 5179-90v12.patch, 5179-90v13.txt, 5179-90v14.patch, 5179-90v15.patch, 
> 5179-90v16.patch, 5179-90v17.txt, 5179-90v18.txt, 5179-90v2.patch, 
> 5179-90v3.patch, 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 
> 5179-90v7.patch, 5179-90v8.patch, 5179-90v9.patch, 5179-92v17.patch, 
> 5179-v11-92.txt, 5179-v11.txt, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, 
> Errorlog, hbase-5179.patch, hbase-5179v10.patch, hbase-5179v12.patch, 
> hbase-5179v17.patch, hbase-5179v5.patch, hbase-5179v6.patch, 
> hbase-5179v7.patch, hbase-5179v8.patch, hbase-5179v9.patch
>
>
> If the master's failover processing and ServerShutdownHandler's processing 
> happen concurrently, the following case may occur:
> 1. The master completes splitLogAfterStartup().
> 2. RegionserverA restarts, and ServerShutdownHandler starts processing it.
> 3. The master starts rebuildUserRegions, and RegionserverA is considered a 
> dead server.
> 4. The master starts to assign the regions of RegionserverA because it is a 
> dead server per step 3.
> However, while step 4 (region assignment) is running, ServerShutdownHandler 
> may still be splitting logs; therefore, data loss can occur.





[jira] [Updated] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

2012-01-23 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5179:
--

Comment: was deleted

(was: -1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12511589/5179-90v18.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/844//console

This message is automatically generated.)

> Concurrent processing of processFaileOver and ServerShutdownHandler may cause 
> region to be assigned before log splitting is completed, causing data loss
> 
>
> Key: HBASE-5179
> URL: https://issues.apache.org/jira/browse/HBASE-5179
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5179-90.txt, 5179-90v10.patch, 5179-90v11.patch, 
> 5179-90v12.patch, 5179-90v13.txt, 5179-90v14.patch, 5179-90v15.patch, 
> 5179-90v16.patch, 5179-90v17.txt, 5179-90v18.txt, 5179-90v2.patch, 
> 5179-90v3.patch, 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 
> 5179-90v7.patch, 5179-90v8.patch, 5179-90v9.patch, 5179-92v17.patch, 
> 5179-v11-92.txt, 5179-v11.txt, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, 
> Errorlog, hbase-5179.patch, hbase-5179v10.patch, hbase-5179v12.patch, 
> hbase-5179v17.patch, hbase-5179v5.patch, hbase-5179v6.patch, 
> hbase-5179v7.patch, hbase-5179v8.patch, hbase-5179v9.patch
>
>
> If the master's failover processing and ServerShutdownHandler's processing 
> happen concurrently, the following case may occur:
> 1. The master completes splitLogAfterStartup().
> 2. RegionserverA restarts, and ServerShutdownHandler starts processing it.
> 3. The master starts rebuildUserRegions, and RegionserverA is considered a 
> dead server.
> 4. The master starts to assign the regions of RegionserverA because it is a 
> dead server per step 3.
> However, while step 4 (region assignment) is running, ServerShutdownHandler 
> may still be splitting logs; therefore, data loss can occur.





[jira] [Commented] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance

2012-01-23 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191682#comment-13191682
 ] 

Lars Hofhansl commented on HBASE-5262:
--

You thinking JSON or something?

> Structured event log for HBase for monitoring and auto-tuning performance
> -
>
> Key: HBASE-5262
> URL: https://issues.apache.org/jira/browse/HBASE-5262
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>
> Creating this JIRA to open a discussion about a structured (machine-readable) 
> log that will record events such as compaction start/end times, compaction 
> input/output files, their sizes, the same for flushes, etc. This can be 
> stored e.g. in a new system table in HBase itself. The data from this log can 
> then be analyzed and used to optimize compactions at run time, or otherwise 
> auto-tune HBase configuration to reduce the number of knobs the user has to 
> configure.





[jira] [Commented] (HBASE-5256) Use WritableUtils.readVInt() in RegionLoad.readFields()

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191683#comment-13191683
 ] 

Zhihong Yu commented on HBASE-5256:
---

Since the version of RegionLoad would be bumped, I think this change should be 
applied to all integer/long metrics.

> Use WritableUtils.readVInt() in RegionLoad.readFields()
> ---
>
> Key: HBASE-5256
> URL: https://issues.apache.org/jira/browse/HBASE-5256
> Project: HBase
>  Issue Type: Task
>Reporter: Zhihong Yu
> Fix For: 0.94.0
>
>
> Currently in.readInt() is used in RegionLoad.readFields().
> More metrics will be added to RegionLoad in the future, so we should use 
> WritableUtils.readVInt() to reduce the amount of data exchanged between the 
> Master and region servers.





[jira] [Commented] (HBASE-5261) Update HBase for Java 7

2012-01-23 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191686#comment-13191686
 ] 

Lars Hofhansl commented on HBASE-5261:
--

I think we need to be careful to maintain compatibility with JDK 6 for a 
long time. For many enterprises, switching to JDK 7 is a major effort.


> Update HBase for Java 7
> ---
>
> Key: HBASE-5261
> URL: https://issues.apache.org/jira/browse/HBASE-5261
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>
> We need to make sure that HBase compiles and works with JDK 7. Once we verify 
> it is reasonably stable, we can explore utilizing the G1 garbage collector. 
> When all deployments are ready to move to JDK 7, we can start using new 
> language features, but in the transition period we will need to maintain a 
> codebase that compiles both with JDK 6 and JDK 7.





[jira] [Created] (HBASE-5266) Add documentation for ColumnRangeFilter

2012-01-23 Thread Lars Hofhansl (Created) (JIRA)
Add documentation for ColumnRangeFilter
---

 Key: HBASE-5266
 URL: https://issues.apache.org/jira/browse/HBASE-5266
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.94.0


There are only a few lines of documentation for ColumnRangeFilter.
Given the usefulness of this filter for efficient intra-row scanning (see 
HBASE-5229 and HBASE-4256), we should make this filter more prominent in the 
documentation.
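For context, ColumnRangeFilter keeps only the columns of a row whose qualifiers fall within a given range, with each bound independently inclusive or exclusive. The following stand-alone sketch mimics that selection over a sorted column map (it is an illustration of the semantics, not the HBase server-side implementation):

```java
import java.util.*;

// Sketch of ColumnRangeFilter semantics: keep only columns whose
// qualifier lies between minColumn and maxColumn, with each bound
// optionally inclusive. HBase applies this server-side during a scan;
// NavigableMap.subMap gives the same selection here.
public class ColumnRangeSketch {
    static NavigableMap<String, String> columnRange(NavigableMap<String, String> row,
            String minColumn, boolean minInclusive,
            String maxColumn, boolean maxInclusive) {
        return row.subMap(minColumn, minInclusive, maxColumn, maxInclusive);
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        row.put("a1", "v1"); row.put("b2", "v2"); row.put("c3", "v3");
        // Qualifiers in ["b", "c") -> only "b2" survives.
        NavigableMap<String, String> kept = columnRange(row, "b", true, "c", false);
        if (!kept.keySet().equals(Set.of("b2"))) throw new AssertionError();
        System.out.println(kept); // {b2=v2}
    }
}
```

Because the qualifier range lets a scan seek directly to the first matching column, this is what makes the filter efficient for intra-row scanning of very wide rows.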






[jira] [Commented] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance

2012-01-23 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191689#comment-13191689
 ] 

Mikhail Bautin commented on HBASE-5262:
---

JSON could be the encoding for the "value" part of each log entry. However, if 
we decide to store this type of information in HBase itself, we will need to 
think through the schema from the point of view of at least a couple of different 
use cases, e.g. analyzing compaction performance, auto-tuning the compaction 
algorithm, maybe auto-tuning some block cache settings, etc.
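As a concrete illustration of the JSON-encoded "value" idea, here is a sketch of what one structured compaction entry might look like. The field names are illustrative assumptions, not an agreed schema, and a real implementation would likely use a JSON library rather than hand-built strings:

```java
// Hypothetical structured event-log entry for a compaction, with the
// "value" part encoded as JSON as suggested in the discussion.
// All field names are illustrative assumptions, not an agreed schema.
public class EventLogSketch {
    static String compactionEntry(String region, long startMs, long endMs,
            int inputFiles, long inputBytes, long outputBytes) {
        return String.format(
            "{\"event\":\"compaction\",\"region\":\"%s\",\"startMs\":%d,"
            + "\"endMs\":%d,\"inputFiles\":%d,\"inputBytes\":%d,\"outputBytes\":%d}",
            region, startMs, endMs, inputFiles, inputBytes, outputBytes);
    }

    public static void main(String[] args) {
        String entry = compactionEntry("testtable,,1327300000000", 1000L, 5000L,
                4, 1L << 20, 1L << 19);
        if (!entry.contains("\"event\":\"compaction\"")) throw new AssertionError();
        System.out.println(entry);
    }
}
```

Keeping every entry machine-readable like this is what would let an analyzer compute, say, compaction write amplification (outputBytes vs. inputBytes over time) and feed it back into tuning decisions.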

> Structured event log for HBase for monitoring and auto-tuning performance
> -
>
> Key: HBASE-5262
> URL: https://issues.apache.org/jira/browse/HBASE-5262
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>
> Creating this JIRA to open a discussion about a structured (machine-readable) 
> log that will record events such as compaction start/end times, compaction 
> input/output files, their sizes, the same for flushes, etc. This can be 
> stored e.g. in a new system table in HBase itself. The data from this log can 
> then be analyzed and used to optimize compactions at run time, or otherwise 
> auto-tune HBase configuration to reduce the number of knobs the user has to 
> configure.




