[jira] [Commented] (HBASE-10154) Add a unit test for Canary tool

2013-12-12 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847236#comment-13847236
 ] 

takeshi.miao commented on HBASE-10154:
--

For a later version, I plan to:
1. Add more tests covering different combinations of options.
2. Use the _'ExtendedSink' interface_ for a mock implementation that verifies 
the Canary output results.
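The mock-sink idea in item 2 could look roughly like the sketch below. It uses a hypothetical minimal Sink interface (the real Canary ExtendedSink has its own method set, which I am not reproducing here): the test swaps in a recording sink and asserts on what the tool reported instead of parsing stdout.

```java
import java.util.ArrayList;
import java.util.List;

public class MockSinkSketch {
  // Hypothetical minimal sink contract; the real Canary ExtendedSink differs.
  interface Sink {
    void publishReadFailure(String region);
    void publishReadTiming(String region, long msTime);
  }

  // Recording implementation used for verification in tests.
  static class RecordingSink implements Sink {
    final List<String> failures = new ArrayList<>();
    final List<String> timings = new ArrayList<>();
    public void publishReadFailure(String region) { failures.add(region); }
    public void publishReadTiming(String region, long msTime) {
      timings.add(region + ":" + msTime);
    }
  }

  // Stand-in for the tool under test: probes a region and reports to the sink.
  static void probe(Sink sink, String region, boolean ok, long ms) {
    if (ok) sink.publishReadTiming(region, ms);
    else sink.publishReadFailure(region);
  }

  public static void main(String[] args) {
    RecordingSink sink = new RecordingSink();
    probe(sink, "region-a", true, 12);
    probe(sink, "region-b", false, 0);
    // The test asserts on what the tool published, not on console output.
    System.out.println("failures=" + sink.failures + " timings=" + sink.timings);
  }
}
```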

> Add a unit test for Canary tool
> ---
>
> Key: HBASE-10154
> URL: https://issues.apache.org/jira/browse/HBASE-10154
> Project: HBase
>  Issue Type: Improvement
>  Components: monitoring, test
>Reporter: takeshi.miao
>Assignee: takeshi.miao
>Priority: Minor
> Attachments: HBASE-10154-trunk-v01.patch
>
>
> Due to HBASE-10108, I am working on a unit test for 
> o.h.hbase.tool.Canary to prevent this kind of issue.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HBASE-10154) Add a unit test for Canary tool

2013-12-12 Thread takeshi.miao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

takeshi.miao updated HBASE-10154:
-

Attachment: HBASE-10154-trunk-v01.patch

Uploading a draft version; please let me know if you have any suggestions.

> Add a unit test for Canary tool
> ---
>
> Key: HBASE-10154
> URL: https://issues.apache.org/jira/browse/HBASE-10154
> Project: HBase
>  Issue Type: Improvement
>  Components: monitoring, test
>Reporter: takeshi.miao
>Assignee: takeshi.miao
>Priority: Minor
> Attachments: HBASE-10154-trunk-v01.patch
>
>
> Due to HBASE-10108, I am working on a unit test for 
> o.h.hbase.tool.Canary to prevent this kind of issue.





[jira] [Created] (HBASE-10154) Add a unit test for Canary tool

2013-12-12 Thread takeshi.miao (JIRA)
takeshi.miao created HBASE-10154:


 Summary: Add a unit test for Canary tool
 Key: HBASE-10154
 URL: https://issues.apache.org/jira/browse/HBASE-10154
 Project: HBase
  Issue Type: Improvement
  Components: monitoring, test
Reporter: takeshi.miao
Assignee: takeshi.miao
Priority: Minor


Due to HBASE-10108, I am working on a unit test for 
o.h.hbase.tool.Canary to prevent this kind of issue.





[jira] [Commented] (HBASE-10106) Remove some unnecessary code from TestOpenTableInCoprocessor

2013-12-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847219#comment-13847219
 ] 

Hudson commented on HBASE-10106:


SUCCESS: Integrated in HBase-0.94-security #358 (See 
[https://builds.apache.org/job/HBase-0.94-security/358/])
HBASE-10106: Remove some unnecessary code from TestOpenTableInCoprocessor 
(Benoit Sigoure) (jyates: rev 1550567)
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/coprocessor/TestOpenTableInCoprocessor.java


> Remove some unnecessary code from TestOpenTableInCoprocessor
> 
>
> Key: HBASE-10106
> URL: https://issues.apache.org/jira/browse/HBASE-10106
> Project: HBase
>  Issue Type: Test
>Affects Versions: 0.98.0, 0.96.0, 0.94.15, 0.99.0
>Reporter: Benoit Sigoure
>Assignee: Benoit Sigoure
>Priority: Trivial
> Attachments: hbase-10106-0.94.patch, hbase-10106.txt
>
>
> {code}
> diff --git 
> a/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestOpenTableInCoprocessor.java
>  
> b/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestOpenTableInCoprocessor.java
> index 7bc2a78..67b97ce 100644
> --- 
> a/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestOpenTableInCoprocessor.java
> +++ 
> b/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestOpenTableInCoprocessor.java
> @@ -69,8 +69,6 @@ public class TestOpenTableInCoprocessor {
>  public void prePut(final ObserverContext 
> e, final Put put,
>  final WALEdit edit, final Durability durability) throws IOException {
>HTableInterface table = e.getEnvironment().getTable(otherTable);
> -  Put p = new Put(new byte[] { 'a' });
> -  p.add(family, null, new byte[] { 'a' });
>table.put(put);
>table.flushCommits();
>completed[0] = true;
> {code}





[jira] [Commented] (HBASE-10120) start-hbase.sh doesn't respect --config in non-distributed mode

2013-12-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847220#comment-13847220
 ] 

Hudson commented on HBASE-10120:


SUCCESS: Integrated in HBase-0.94-security #358 (See 
[https://builds.apache.org/job/HBase-0.94-security/358/])
HBASE-10120 start-hbase.sh doesn't respect --config in non-distributed mode 
(ndimiduk: rev 1550041)
* /hbase/branches/0.94/bin/start-hbase.sh


> start-hbase.sh doesn't respect --config in non-distributed mode
> ---
>
> Key: HBASE-10120
> URL: https://issues.apache.org/jira/browse/HBASE-10120
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
>Priority: Trivial
> Fix For: 0.98.0, 0.94.15, 0.96.2, 0.99.0
>
> Attachments: HBASE-10120.00.patch
>
>
> A custom value for --config is not passed along to hbase-daemon.sh by 
> start-hbase.sh when invoked in local mode (hbase.cluster.distributed=false). 
> When --config is specified, variables from hbase-env.sh are applied to the 
> runtime, but not hbase-site.xml.





[jira] [Commented] (HBASE-10119) Allow HBase coprocessors to clean up when they fail

2013-12-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847218#comment-13847218
 ] 

Hudson commented on HBASE-10119:


SUCCESS: Integrated in HBase-0.94-security #358 (See 
[https://builds.apache.org/job/HBase-0.94-security/358/])
HBASE-10119. Allow HBase coprocessors to clean up when they fail (Benoit 
Sigoure) (apurtell: rev 1550030)
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java


> Allow HBase coprocessors to clean up when they fail
> ---
>
> Key: HBASE-10119
> URL: https://issues.apache.org/jira/browse/HBASE-10119
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.96.0
>Reporter: Benoit Sigoure
>Assignee: Benoit Sigoure
> Fix For: 0.98.0, 0.96.1, 0.94.15, 0.99.0
>
> Attachments: HBASE-10119-0.94.patch, HBASE-10119.patch
>
>
> In the thread [Giving a chance to buggy coprocessors to clean 
> up|http://osdir.com/ml/general/2013-12/msg17334.html] I brought up the issue 
> that coprocessors currently don't have a chance to release their own 
> resources (be they internal resources within the JVM, or external resources 
> elsewhere) when they get forcefully removed due to an uncaught exception 
> escaping.
> It would be nice to fix that, either by adding an API called by the 
> {{CoprocessorHost}} when killing a faulty coprocessor, or by guaranteeing 
> that the coprocessor's {{stop()}} method will be invoked then.
> This feature request is actually pretty important due to bug HBASE-9046, 
> which means that it's not possible to properly clean up a coprocessor without 
> restarting the RegionServer (!!).
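The guarantee asked for in the description, that a forcefully removed coprocessor still gets a chance to release its resources, can be sketched as below. This is a simplified, hypothetical host loop, not the real CoprocessorHost API: on an uncaught exception the host invokes stop() before removing the faulty coprocessor.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class CoprocessorCleanupSketch {
  // Hypothetical minimal coprocessor contract; the real HBase API differs.
  interface Coprocessor {
    void prePut() throws Exception;
    void stop();
  }

  // Simplified host: on an uncaught exception the coprocessor is removed,
  // but stop() is invoked first so it can release its resources.
  static class Host {
    final List<Coprocessor> loaded = new ArrayList<>();
    void load(Coprocessor c) { loaded.add(c); }
    void invokePrePut() {
      for (Iterator<Coprocessor> it = loaded.iterator(); it.hasNext(); ) {
        Coprocessor c = it.next();
        try {
          c.prePut();
        } catch (Exception e) {
          try {
            c.stop();    // give the faulty coprocessor a chance to clean up
          } finally {
            it.remove(); // then forcefully remove it from the host
          }
        }
      }
    }
  }

  public static void main(String[] args) {
    final boolean[] stopped = { false };
    Host host = new Host();
    host.load(new Coprocessor() {
      public void prePut() throws Exception { throw new Exception("buggy"); }
      public void stop() { stopped[0] = true; }
    });
    host.invokePrePut();
    System.out.println("stopped=" + stopped[0] + " loaded=" + host.loaded.size());
  }
}
```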





[jira] [Commented] (HBASE-10136) Alter table conflicts with concurrent snapshot attempt on that table

2013-12-12 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847214#comment-13847214
 ] 

Enis Soztutar commented on HBASE-10136:
---

Matteo, you are right in your analysis. The table lock is released before the 
regions have finished opening, because of how region reopening and the master 
handlers are implemented. The cleanest fix, I think, is to fix the master itself 
(HBASE-5487).

> Alter table conflicts with concurrent snapshot attempt on that table
> 
>
> Key: HBASE-10136
> URL: https://issues.apache.org/jira/browse/HBASE-10136
> Project: HBase
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 0.96.0, 0.98.1, 0.99.0
>Reporter: Aleksandr Shulman
>Assignee: Matteo Bertozzi
>  Labels: online_schema_change
>
> Expected behavior:
> A user can issue a request for a snapshot of a table while that table is 
> undergoing an online schema change and expect that snapshot request to 
> complete correctly. Also, the same is true if a user issues a online schema 
> change request while a snapshot attempt is ongoing.
> Observed behavior:
> Snapshot attempts time out when there is an ongoing online schema change 
> because the region is closed and opened during the snapshot. 
> As a side-note, I would expect that the attempt should fail quickly as 
> opposed to timing out. 
> Further, what I have seen is that subsequent attempts to snapshot the table 
> fail because of some state/cleanup issues. This is also concerning.
> Immediate error:
> {code}type=FLUSH }' is still in progress!
> 2013-12-11 15:58:32,883 DEBUG [Thread-385] client.HBaseAdmin(2696): (#11) 
> Sleeping: 1ms while waiting for snapshot completion.
> 2013-12-11 15:58:42,884 DEBUG [Thread-385] client.HBaseAdmin(2704): Getting 
> current status of snapshot from master...
> 2013-12-11 15:58:42,887 DEBUG [FifoRpcScheduler.handler1-thread-3] 
> master.HMaster(2891): Checking to see if snapshot from request:{ ss=snapshot0 
> table=changeSchemaDuringSnapshot1386806258640 type=FLUSH } is done
> 2013-12-11 15:58:42,887 DEBUG [FifoRpcScheduler.handler1-thread-3] 
> snapshot.SnapshotManager(374): Snapshoting '{ ss=snapshot0 
> table=changeSchemaDuringSnapshot1386806258640 type=FLUSH }' is still in 
> progress!
> Snapshot failure occurred
> org.apache.hadoop.hbase.snapshot.SnapshotCreationException: Snapshot 
> 'snapshot0' wasn't completed in expectedTime:6 ms
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.snapshot(HBaseAdmin.java:2713)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.snapshot(HBaseAdmin.java:2638)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.snapshot(HBaseAdmin.java:2602)
>   at 
> org.apache.hadoop.hbase.client.TestAdmin$BackgroundSnapshotThread.run(TestAdmin.java:1974){code}
> Likely root cause of error:
> {code}Exception in SnapshotSubprocedurePool
> java.util.concurrent.ExecutionException: 
> org.apache.hadoop.hbase.NotServingRegionException: 
> changeSchemaDuringSnapshot1386806258640,,1386806258720.ea776db51749e39c956d771a7d17a0f3.
>  is closing
>   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>   at 
> org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedurePool.waitForOutstandingTasks(RegionServerSnapshotManager.java:314)
>   at 
> org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.flushSnapshot(FlushSnapshotSubprocedure.java:118)
>   at 
> org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.insideBarrier(FlushSnapshotSubprocedure.java:137)
>   at 
> org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:181)
>   at 
> org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:1)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.hadoop.hbase.NotServingRegionException: 
> changeSchemaDuringSnapshot1386806258640,,1386806258720.ea776db51749e39c956d771a7d17a0f3.
>  is closing
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5327)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5289)
>   at 
> org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:79
> {code}

[jira] [Commented] (HBASE-10076) Backport MapReduce over snapshot files [0.94]

2013-12-12 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847210#comment-13847210
 ] 

Enis Soztutar commented on HBASE-10076:
---

bq. Looks like in 0.94 its just added based on the config
In trunk, initTableSnapshotMapper() calls: 
{code}
initTableMapperJob(snapshotName, scan, mapper, outputKeyClass,
outputValueClass, job, addDependencyJars, false, 
TableSnapshotInputFormat.class);
{code}
That false is the initCredentials argument to the overloaded function, no? I 
don't have Eclipse open, so I cannot check :) 

bq. how can we test that the locality selection is correct? Its not really 
covered anywhere in this patch or the original
It uses HDFSBlocksDistribution, which I thought is tested on its own. I did not 
check whether there is actual coverage, though. It should be possible to mock 
that up, I guess. 

> Backport MapReduce over snapshot files [0.94]
> -
>
> Key: HBASE-10076
> URL: https://issues.apache.org/jira/browse/HBASE-10076
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.15
>
> Attachments: hbase-10076-v0.patch
>
>
> MapReduce over Snapshots would be valuable on 0.94.





[jira] [Commented] (HBASE-10137) GeneralBulkAssigner with retain assignment plan can be used in EnableTableHandler to bulk assign the regions

2013-12-12 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847205#comment-13847205
 ] 

rajeshbabu commented on HBASE-10137:


bq. Let's not do this in 0.94 then. I removed the 0.94 fix tag.
Ok. No problem, Lars.

> GeneralBulkAssigner with retain assignment plan can be used in 
> EnableTableHandler to bulk assign the regions
> 
>
> Key: HBASE-10137
> URL: https://issues.apache.org/jira/browse/HBASE-10137
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.96.0, 0.94.14
>Reporter: rajeshbabu
>Assignee: rajeshbabu
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: HBASE-10137.patch
>
>
> Currently in BulkEnabler we assign one region at a time; instead we can use 
> GeneralBulkAssigner to bulk-assign multiple regions at a time.
> {code}
>   for (HRegionInfo region : regions) {
> if (assignmentManager.getRegionStates()
> .isRegionInTransition(region)) {
>   continue;
> }
> final HRegionInfo hri = region;
> pool.execute(Trace.wrap("BulkEnabler.populatePool",new Runnable() {
>   public void run() {
> assignmentManager.assign(hri, true);
>   }
> }));
>   }
> {code}





[jira] [Commented] (HBASE-10137) GeneralBulkAssigner with retain assignment plan can be used in EnableTableHandler to bulk assign the regions

2013-12-12 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847199#comment-13847199
 ] 

rajeshbabu commented on HBASE-10137:


[~te...@apache.org]
bq. What is the purpose for the above change ?
Without the change, TestAdmin#testEnableTableRetainAssignment fails, because in 
the test all servers have the same hostname, so retainAssignment may return a 
server with a different port and start code (and the assertion fails). 
{code}
for (Map.Entry entry : regions.entrySet()) {
  assertEquals(regions2.get(entry.getKey()), entry.getValue());
}
{code}
We could change the test case to check only hostnames, but then it would always 
pass. 
So I felt it is OK to select the old server as the destination if it is among 
the online servers. Is that OK?


> GeneralBulkAssigner with retain assignment plan can be used in 
> EnableTableHandler to bulk assign the regions
> 
>
> Key: HBASE-10137
> URL: https://issues.apache.org/jira/browse/HBASE-10137
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.96.0, 0.94.14
>Reporter: rajeshbabu
>Assignee: rajeshbabu
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: HBASE-10137.patch
>
>
> Currently in BulkEnabler we assign one region at a time; instead we can use 
> GeneralBulkAssigner to bulk-assign multiple regions at a time.
> {code}
>   for (HRegionInfo region : regions) {
> if (assignmentManager.getRegionStates()
> .isRegionInTransition(region)) {
>   continue;
> }
> final HRegionInfo hri = region;
> pool.execute(Trace.wrap("BulkEnabler.populatePool",new Runnable() {
>   public void run() {
> assignmentManager.assign(hri, true);
>   }
> }));
>   }
> {code}





[jira] [Commented] (HBASE-5349) Automagically tweak global memstore and block cache sizes based on workload

2013-12-12 Thread Liang Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847196#comment-13847196
 ] 

Liang Xie commented on HBASE-5349:
--

OK, my concern is gone now. Thanks, all, for the replies :) I haven't studied 
the code yet, but the idea/feature is absolutely cool!

> Automagically tweak global memstore and block cache sizes based on workload
> ---
>
> Key: HBASE-5349
> URL: https://issues.apache.org/jira/browse/HBASE-5349
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: Anoop Sam John
> Fix For: 0.99.0
>
> Attachments: HBASE-5349_V2.patch, HBASE-5349_V3.patch, 
> HBASE-5349_V4.patch, HBASE-5349_V5.patch, WIP_HBASE-5349.patch
>
>
> Hypertable does a neat thing where it changes the size given to the CellCache 
> (our MemStores) and Block Cache based on the workload. If you need an image, 
> scroll down at the bottom of this link: 
> http://www.hypertable.com/documentation/architecture/
> That'd be one less thing to configure.





[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput

2013-12-12 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847195#comment-13847195
 ] 

Feng Honghua commented on HBASE-8755:
-

bq. It does 10% less throughput after ~25 minutes.
==> That is normal when flush/compaction kicks in after ~25 minutes of writes; 
we saw the same level of degradation in long-running tests both with and without 
the patch.

> A new write thread model for HLog to improve the overall HBase write 
> throughput
> ---
>
> Key: HBASE-8755
> URL: https://issues.apache.org/jira/browse/HBASE-8755
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance, wal
>Reporter: Feng Honghua
>Assignee: stack
>Priority: Critical
> Attachments: 8755-syncer.patch, 8755trunkV2.txt, 8755v8.txt, 
> 8755v9.txt, HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, 
> HBASE-8755-0.96-v0.patch, HBASE-8755-trunk-V0.patch, 
> HBASE-8755-trunk-V1.patch, HBASE-8755-trunk-v4.patch, 
> HBASE-8755-trunk-v6.patch, HBASE-8755-trunk-v7.patch, HBASE-8755-v5.patch, 
> thread.out
>
>
> In the current write model, each write handler thread (executing put()) 
> individually goes through a full 'append (hlog local buffer) => HLog writer 
> append (write to hdfs) => HLog writer sync (sync hdfs)' cycle for each write, 
> which incurs heavy contention on updateLock and flushLock.
> The only existing optimization, checking whether the current syncTillHere > 
> txid in the hope that another thread has already written/synced its own txid 
> to hdfs so the write/sync can be omitted, actually helps much less than 
> expected.
> Three of my colleagues (Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi 
> proposed a new write thread model for writing hdfs sequence files, and the 
> prototype implementation shows a 4X throughput improvement (from 17000 to 
> 7+). 
> I applied this new write thread model in HLog, and the performance test in our 
> test cluster shows about a 3X throughput improvement (from 12150 to 31520 for 
> 1 RS, from 22000 to 7 for 5 RS); the 1-RS write throughput (1K row size) 
> even beats that of BigTable (the Percolator paper published in 2011 says 
> Bigtable's write throughput then was 31002). I can provide the detailed 
> performance test results if anyone is interested.
> The change for the new write thread model is as below:
>  1> All put handler threads append their edits to HLog's local pending 
> buffer, notifying the AsyncWriter thread that there are new edits in the 
> local buffer;
>  2> All put handler threads wait in the HLog.syncer() function for the 
> underlying threads to finish the sync that contains their txid;
>  3> A single AsyncWriter thread is responsible for retrieving all the 
> buffered edits in HLog's local pending buffer and writing them to hdfs 
> (hlog.writer.append), notifying the AsyncFlusher thread that there are new 
> writes to hdfs that need a sync;
>  4> A single AsyncFlusher thread is responsible for issuing a sync to hdfs 
> to persist the writes by AsyncWriter, notifying the AsyncNotifier thread 
> that the sync watermark has increased;
>  5> A single AsyncNotifier thread is responsible for notifying all pending 
> put handler threads which are waiting in the HLog.syncer() function;
>  6> There is no LogSyncer thread any more (the AsyncWriter/AsyncFlusher 
> threads always do the same job it did).
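The group-commit core of the model described above can be sketched in plain Java. This is a simplified, hypothetical version (one background thread plays the combined AsyncWriter/AsyncFlusher/AsyncNotifier role, and the hdfs append+sync is elided), not the patch's actual code: handlers append to a shared buffer and block until a sync watermark covers their txid.

```java
import java.util.ArrayList;
import java.util.List;

public class GroupSyncSketch {
  private final List<String> buffer = new ArrayList<>(); // pending edits
  private long lastTxid = 0;   // txid handed to the most recent append
  private long syncedUpTo = 0; // highest txid known durable
  private boolean running = true;

  // Called by many put handler threads.
  public synchronized long append(String edit) {
    buffer.add(edit);
    notifyAll();               // wake the writer thread
    return ++lastTxid;
  }

  // Handlers block here until the watermark covers their own txid.
  public synchronized void waitForSync(long txid) throws InterruptedException {
    while (syncedUpTo < txid) wait();
  }

  // Body of the single background writer/flusher thread.
  public void writerLoop() throws InterruptedException {
    while (true) {
      List<String> batch;
      long upTo;
      synchronized (this) {
        while (running && buffer.isEmpty()) wait();
        if (!running && buffer.isEmpty()) return;
        batch = new ArrayList<>(buffer); // drain the whole pending buffer
        buffer.clear();
        upTo = lastTxid;
      }
      // ... here the real code would append `batch` to hdfs and sync it ...
      synchronized (this) {
        syncedUpTo = Math.max(syncedUpTo, upTo);
        notifyAll();           // release every handler whose txid is covered
      }
    }
  }

  public synchronized void shutdown() { running = false; notifyAll(); }

  public static void main(String[] args) throws Exception {
    GroupSyncSketch log = new GroupSyncSketch();
    Thread writer = new Thread(() -> {
      try { log.writerLoop(); } catch (InterruptedException ignored) { }
    });
    writer.start();
    long txid = 0;
    for (int i = 0; i < 100; i++) txid = log.append("edit-" + i);
    log.waitForSync(txid);     // returns once the batch is "synced"
    log.shutdown();
    writer.join();
    System.out.println("synced through txid " + txid);
  }
}
```

The point of the design is that one drain covers many handler txids, so the per-write lock contention collapses into one batch per writer-loop iteration.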





[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput

2013-12-12 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847194#comment-13847194
 ] 

Feng Honghua commented on HBASE-8755:
-

[~stack] thanks. It seems there is no further blocking issue?

> A new write thread model for HLog to improve the overall HBase write 
> throughput
> ---
>
> Key: HBASE-8755
> URL: https://issues.apache.org/jira/browse/HBASE-8755
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance, wal
>Reporter: Feng Honghua
>Assignee: stack
>Priority: Critical
> Attachments: 8755-syncer.patch, 8755trunkV2.txt, 8755v8.txt, 
> 8755v9.txt, HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, 
> HBASE-8755-0.96-v0.patch, HBASE-8755-trunk-V0.patch, 
> HBASE-8755-trunk-V1.patch, HBASE-8755-trunk-v4.patch, 
> HBASE-8755-trunk-v6.patch, HBASE-8755-trunk-v7.patch, HBASE-8755-v5.patch, 
> thread.out
>
>
> In the current write model, each write handler thread (executing put()) 
> individually goes through a full 'append (hlog local buffer) => HLog writer 
> append (write to hdfs) => HLog writer sync (sync hdfs)' cycle for each write, 
> which incurs heavy contention on updateLock and flushLock.
> The only existing optimization, checking whether the current syncTillHere > 
> txid in the hope that another thread has already written/synced its own txid 
> to hdfs so the write/sync can be omitted, actually helps much less than 
> expected.
> Three of my colleagues (Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi 
> proposed a new write thread model for writing hdfs sequence files, and the 
> prototype implementation shows a 4X throughput improvement (from 17000 to 
> 7+). 
> I applied this new write thread model in HLog, and the performance test in our 
> test cluster shows about a 3X throughput improvement (from 12150 to 31520 for 
> 1 RS, from 22000 to 7 for 5 RS); the 1-RS write throughput (1K row size) 
> even beats that of BigTable (the Percolator paper published in 2011 says 
> Bigtable's write throughput then was 31002). I can provide the detailed 
> performance test results if anyone is interested.
> The change for the new write thread model is as below:
>  1> All put handler threads append their edits to HLog's local pending 
> buffer, notifying the AsyncWriter thread that there are new edits in the 
> local buffer;
>  2> All put handler threads wait in the HLog.syncer() function for the 
> underlying threads to finish the sync that contains their txid;
>  3> A single AsyncWriter thread is responsible for retrieving all the 
> buffered edits in HLog's local pending buffer and writing them to hdfs 
> (hlog.writer.append), notifying the AsyncFlusher thread that there are new 
> writes to hdfs that need a sync;
>  4> A single AsyncFlusher thread is responsible for issuing a sync to hdfs 
> to persist the writes by AsyncWriter, notifying the AsyncNotifier thread 
> that the sync watermark has increased;
>  5> A single AsyncNotifier thread is responsible for notifying all pending 
> put handler threads which are waiting in the HLog.syncer() function;
>  6> There is no LogSyncer thread any more (the AsyncWriter/AsyncFlusher 
> threads always do the same job it did).





[jira] [Resolved] (HBASE-10152) [Duplicated with 10153] improve VerifyReplication to compute BADROWS more accurately

2013-12-12 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John resolved HBASE-10152.


Resolution: Duplicate

> [Duplicated with 10153] improve VerifyReplication to compute BADROWS more 
> accurately
> 
>
> Key: HBASE-10152
> URL: https://issues.apache.org/jira/browse/HBASE-10152
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 0.94.14
>Reporter: cuijianwei
>
> VerifyReplication can compare the source table with its peer table and 
> compute BADROWS. However, the current method of computing BADROWS might not be 
> accurate enough. For example, if the source table contains rows {r1, r2, r3, 
> r4} and the peer table contains rows {r1, r3, r4}, the BADROWS counter will be 
> 3, because 'r2' in the source table makes all later comparisons fail. Would it 
> be better if BADROWS were computed as 1 in this situation? Maybe we can 
> compute BADROWS more accurately with a merge comparison?





[jira] [Commented] (HBASE-5349) Automagically tweak global memstore and block cache sizes based on workload

2013-12-12 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847167#comment-13847167
 ] 

Anoop Sam John commented on HBASE-5349:
---

By default the auto tuning is turned off. One needs to set the four configs 
below to define the ranges of heap %: 
"hbase.regionserver.global.memstore.size.max.range" and 
"hbase.regionserver.global.memstore.size.min.range", which specify the total 
heap % within which the memstore size can vary, and 
"hfile.block.cache.size.max.range" and "hfile.block.cache.size.min.range", which 
specify the total heap % within which the block cache size can vary. 
There are no default values for these, so by default no automatic tuning 
happens.

Is that good enough, [~xieliang007]?
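For example, enabling the tuning in hbase-site.xml might look like the fragment below; the property names are the ones quoted in the comment above, but the values are purely illustrative.

```xml
<!-- Illustrative values only: let memstore vary between 25% and 45% of heap,
     and the block cache between 20% and 40%. -->
<property>
  <name>hbase.regionserver.global.memstore.size.max.range</name>
  <value>0.45</value>
</property>
<property>
  <name>hbase.regionserver.global.memstore.size.min.range</name>
  <value>0.25</value>
</property>
<property>
  <name>hfile.block.cache.size.max.range</name>
  <value>0.40</value>
</property>
<property>
  <name>hfile.block.cache.size.min.range</name>
  <value>0.20</value>
</property>
```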

> Automagically tweak global memstore and block cache sizes based on workload
> ---
>
> Key: HBASE-5349
> URL: https://issues.apache.org/jira/browse/HBASE-5349
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: Anoop Sam John
> Fix For: 0.99.0
>
> Attachments: HBASE-5349_V2.patch, HBASE-5349_V3.patch, 
> HBASE-5349_V4.patch, HBASE-5349_V5.patch, WIP_HBASE-5349.patch
>
>
> Hypertable does a neat thing where it changes the size given to the CellCache 
> (our MemStores) and Block Cache based on the workload. If you need an image, 
> scroll down at the bottom of this link: 
> http://www.hypertable.com/documentation/architecture/
> That'd be one less thing to configure.





[jira] [Updated] (HBASE-10150) Write attachment Id of tested patch into JIRA comment

2013-12-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10150:
---

Status: Patch Available  (was: Open)

> Write attachment Id of tested patch into JIRA comment
> -
>
> Key: HBASE-10150
> URL: https://issues.apache.org/jira/browse/HBASE-10150
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 0.99.0
>
> Attachments: 10150-v1.txt
>
>
> The optimization proposed in HBASE-10044 wouldn't work if QA bot doesn't know 
> the attachment Id of the most recently tested patch.
> The first step is to write attachment Id of tested patch into JIRA comment.
> For details, see HADOOP-10163





[jira] [Updated] (HBASE-10150) Write attachment Id of tested patch into JIRA comment

2013-12-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10150:
---

Attachment: 10150-v1.txt

> Write attachment Id of tested patch into JIRA comment
> -
>
> Key: HBASE-10150
> URL: https://issues.apache.org/jira/browse/HBASE-10150
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
> Fix For: 0.99.0
>
> Attachments: 10150-v1.txt
>
>
> The optimization proposed in HBASE-10044 wouldn't work if QA bot doesn't know 
> the attachment Id of the most recently tested patch.
> The first step is to write attachment Id of tested patch into JIRA comment.
> For details, see HADOOP-10163





[jira] [Assigned] (HBASE-10150) Write attachment Id of tested patch into JIRA comment

2013-12-12 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-10150:
--

Assignee: Ted Yu

> Write attachment Id of tested patch into JIRA comment
> -
>
> Key: HBASE-10150
> URL: https://issues.apache.org/jira/browse/HBASE-10150
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 0.99.0
>
> Attachments: 10150-v1.txt
>
>
> The optimization proposed in HBASE-10044 wouldn't work if QA bot doesn't know 
> the attachment Id of the most recently tested patch.
> The first step is to write attachment Id of tested patch into JIRA comment.
> For details, see HADOOP-10163





[jira] [Commented] (HBASE-10149) TestZKPermissionsWatcher.testPermissionsWatcher test failure

2013-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847154#comment-13847154
 ] 

Hadoop QA commented on HBASE-10149:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12618533/10149.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100 characters.

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at 
org.apache.hadoop.hbase.TestAcidGuarantees.testScanAtomicity(TestAcidGuarantees.java:341)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8152//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8152//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8152//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8152//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8152//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8152//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8152//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8152//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8152//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8152//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8152//console

This message is automatically generated.

> TestZKPermissionsWatcher.testPermissionsWatcher test failure
> 
>
> Key: HBASE-10149
> URL: https://issues.apache.org/jira/browse/HBASE-10149
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.0, 0.99.0
>
> Attachments: 10149.patch, 10149.patch
>
>
> {noformat}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher.testPermissionsWatcher(TestZKPermissionsWatcher.java:119)
> {noformat}
> In testPermissionsWatcher we are not always waiting long enough for the 
> propagation of the permissions change for user "george" to take place.





[jira] [Commented] (HBASE-9047) Tool to handle finishing replication when the cluster is offline

2013-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847148#comment-13847148
 ] 

Hadoop QA commented on HBASE-9047:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12618538/HBASE-9047-0.94-v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8153//console

This message is automatically generated.

> Tool to handle finishing replication when the cluster is offline
> 
>
> Key: HBASE-9047
> URL: https://issues.apache.org/jira/browse/HBASE-9047
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.96.0
>Reporter: Jean-Daniel Cryans
>Assignee: Demai Ni
> Fix For: 0.98.0, 0.96.1, 0.94.15, 0.99.0
>
> Attachments: HBASE-9047-0.94-v1.patch, HBASE-9047-0.94.9-v0.PATCH, 
> HBASE-9047-trunk-v0.patch, HBASE-9047-trunk-v1.patch, 
> HBASE-9047-trunk-v2.patch, HBASE-9047-trunk-v3.patch, 
> HBASE-9047-trunk-v4.patch, HBASE-9047-trunk-v4.patch, 
> HBASE-9047-trunk-v5.patch, HBASE-9047-trunk-v6.patch, 
> HBASE-9047-trunk-v7.patch, HBASE-9047-trunk-v7.patch
>
>
> We're having a discussion on the mailing list about replicating the data on a 
> cluster that was shut down in an offline fashion. The motivation could be 
> that you don't want to bring HBase back up but still need that data on the 
> slave.
> So I have this idea of a tool that would be running on the master cluster 
> while it is down, although it could also run at any time. Basically it would 
> be able to read the replication state of each master region server, finish 
> replicating what's missing to all the slaves, and then clear that state in 
> zookeeper.
> The code that handles replication does most of that already, see 
> ReplicationSourceManager and ReplicationSource. Basically when 
> ReplicationSourceManager.init() is called, it will check all the queues in ZK 
> and try to grab those that aren't attached to a region server. If the whole 
> cluster is down, it will grab all of them.
> The beautiful thing here is that you could start that tool on all your 
> machines and the load will be spread out, but that might not be a big concern 
> if replication wasn't lagging since it would take a few seconds to finish 
> replicating the missing data for each region server.
> I'm guessing when starting ReplicationSourceManager you'd give it a fake 
> region server ID, and you'd tell it not to start its own source.
> FWIW the main difference in how replication is handled between Apache's HBase 
> and Facebook's is that the latter is always done separately from HBase itself. 
> This jira isn't about doing that.
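The queue-grabbing step described above can be sketched as follows; the flat map/set shapes and the class name are illustrative stand-ins, not the actual ReplicationSourceManager API, which reads this state from ZooKeeper:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ReplicationDrainSketch {
    /**
     * Given the per-region-server replication queues (modeled here as a
     * plain map of server name to pending WAL entries) and the set of live
     * servers, return the entries whose owner is no longer running and which
     * must therefore be replayed to the slaves before their state is cleared.
     */
    public static List<String> claimOrphanQueues(Map<String, List<String>> rsQueues,
                                                 Set<String> liveServers) {
        List<String> toReplay = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : rsQueues.entrySet()) {
            if (!liveServers.contains(e.getKey())) {
                // The queue owner is down: claim everything it had left to ship.
                toReplay.addAll(e.getValue());
            }
        }
        return toReplay;
    }
}
```

When the whole cluster is down the live set is empty, so every queue is claimed, matching the behaviour described for ReplicationSourceManager.init().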





[jira] [Updated] (HBASE-10149) TestZKPermissionsWatcher.testPermissionsWatcher test failure

2013-12-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10149:
---

       Resolution: Fixed
    Fix Version/s: 0.99.0
         Assignee: Andrew Purtell
           Status: Resolved  (was: Patch Available)

Committed trivial test fix to trunk and 0.98. 

> TestZKPermissionsWatcher.testPermissionsWatcher test failure
> 
>
> Key: HBASE-10149
> URL: https://issues.apache.org/jira/browse/HBASE-10149
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.0, 0.99.0
>
> Attachments: 10149.patch, 10149.patch
>
>
> {noformat}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher.testPermissionsWatcher(TestZKPermissionsWatcher.java:119)
> {noformat}
> In testPermissionsWatcher we are not always waiting long enough for the 
> propagation of the permissions change for user "george" to take place.
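The fix pattern here — replacing a fixed sleep with a bounded poll of the watched condition — can be sketched like this (WaitUtil is an illustrative helper name, not the code actually committed to the test):

```java
import java.util.function.BooleanSupplier;

public final class WaitUtil {
    private WaitUtil() {}

    /**
     * Polls the condition every intervalMs until it holds or timeoutMs
     * elapses. Returns whether the condition was observed to hold, so a test
     * can assert on the result instead of sleeping a fixed time and hoping
     * the ZooKeeper watcher has already fired.
     */
    public static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            Thread.sleep(intervalMs);
        }
        return condition.getAsBoolean(); // one final check at the deadline
    }
}
```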





[jira] [Commented] (HBASE-10149) TestZKPermissionsWatcher.testPermissionsWatcher test failure

2013-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847145#comment-13847145
 ] 

Hadoop QA commented on HBASE-10149:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12618529/10149.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100 characters.

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8151//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8151//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8151//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8151//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8151//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8151//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8151//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8151//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8151//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8151//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8151//console

This message is automatically generated.

> TestZKPermissionsWatcher.testPermissionsWatcher test failure
> 
>
> Key: HBASE-10149
> URL: https://issues.apache.org/jira/browse/HBASE-10149
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 10149.patch, 10149.patch
>
>
> {noformat}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher.testPermissionsWatcher(TestZKPermissionsWatcher.java:119)
> {noformat}
> In testPermissionsWatcher we are not always waiting long enough for the 
> propagation of the permissions change for user "george" to take place.





[jira] [Updated] (HBASE-9047) Tool to handle finishing replication when the cluster is offline

2013-12-12 Thread Demai Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Demai Ni updated HBASE-9047:


Attachment: HBASE-9047-0.94-v1.patch

Attached a patch for 0.94.

For 0.96, the trunk patch should apply as-is. Do we need to create a separate 
0.96 patch?

Thanks... Demai

> Tool to handle finishing replication when the cluster is offline
> 
>
> Key: HBASE-9047
> URL: https://issues.apache.org/jira/browse/HBASE-9047
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.96.0
>Reporter: Jean-Daniel Cryans
>Assignee: Demai Ni
> Fix For: 0.98.0, 0.96.1, 0.94.15, 0.99.0
>
> Attachments: HBASE-9047-0.94-v1.patch, HBASE-9047-0.94.9-v0.PATCH, 
> HBASE-9047-trunk-v0.patch, HBASE-9047-trunk-v1.patch, 
> HBASE-9047-trunk-v2.patch, HBASE-9047-trunk-v3.patch, 
> HBASE-9047-trunk-v4.patch, HBASE-9047-trunk-v4.patch, 
> HBASE-9047-trunk-v5.patch, HBASE-9047-trunk-v6.patch, 
> HBASE-9047-trunk-v7.patch, HBASE-9047-trunk-v7.patch
>
>
> We're having a discussion on the mailing list about replicating the data on a 
> cluster that was shut down in an offline fashion. The motivation could be 
> that you don't want to bring HBase back up but still need that data on the 
> slave.
> So I have this idea of a tool that would be running on the master cluster 
> while it is down, although it could also run at any time. Basically it would 
> be able to read the replication state of each master region server, finish 
> replicating what's missing to all the slaves, and then clear that state in 
> zookeeper.
> The code that handles replication does most of that already, see 
> ReplicationSourceManager and ReplicationSource. Basically when 
> ReplicationSourceManager.init() is called, it will check all the queues in ZK 
> and try to grab those that aren't attached to a region server. If the whole 
> cluster is down, it will grab all of them.
> The beautiful thing here is that you could start that tool on all your 
> machines and the load will be spread out, but that might not be a big concern 
> if replication wasn't lagging since it would take a few seconds to finish 
> replicating the missing data for each region server.
> I'm guessing when starting ReplicationSourceManager you'd give it a fake 
> region server ID, and you'd tell it not to start its own source.
> FWIW the main difference in how replication is handled between Apache's HBase 
> and Facebook's is that the latter is always done separately from HBase itself. 
> This jira isn't about doing that.





[jira] [Updated] (HBASE-9047) Tool to handle finishing replication when the cluster is offline

2013-12-12 Thread Demai Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Demai Ni updated HBASE-9047:


Fix Version/s: 0.94.15
   0.96.1

> Tool to handle finishing replication when the cluster is offline
> 
>
> Key: HBASE-9047
> URL: https://issues.apache.org/jira/browse/HBASE-9047
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.96.0
>Reporter: Jean-Daniel Cryans
>Assignee: Demai Ni
> Fix For: 0.98.0, 0.96.1, 0.94.15, 0.99.0
>
> Attachments: HBASE-9047-0.94.9-v0.PATCH, HBASE-9047-trunk-v0.patch, 
> HBASE-9047-trunk-v1.patch, HBASE-9047-trunk-v2.patch, 
> HBASE-9047-trunk-v3.patch, HBASE-9047-trunk-v4.patch, 
> HBASE-9047-trunk-v4.patch, HBASE-9047-trunk-v5.patch, 
> HBASE-9047-trunk-v6.patch, HBASE-9047-trunk-v7.patch, 
> HBASE-9047-trunk-v7.patch
>
>
> We're having a discussion on the mailing list about replicating the data on a 
> cluster that was shut down in an offline fashion. The motivation could be 
> that you don't want to bring HBase back up but still need that data on the 
> slave.
> So I have this idea of a tool that would be running on the master cluster 
> while it is down, although it could also run at any time. Basically it would 
> be able to read the replication state of each master region server, finish 
> replicating what's missing to all the slaves, and then clear that state in 
> zookeeper.
> The code that handles replication does most of that already, see 
> ReplicationSourceManager and ReplicationSource. Basically when 
> ReplicationSourceManager.init() is called, it will check all the queues in ZK 
> and try to grab those that aren't attached to a region server. If the whole 
> cluster is down, it will grab all of them.
> The beautiful thing here is that you could start that tool on all your 
> machines and the load will be spread out, but that might not be a big concern 
> if replication wasn't lagging since it would take a few seconds to finish 
> replicating the missing data for each region server.
> I'm guessing when starting ReplicationSourceManager you'd give it a fake 
> region server ID, and you'd tell it not to start its own source.
> FWIW the main difference in how replication is handled between Apache's HBase 
> and Facebook's is that the latter is always done separately from HBase itself. 
> This jira isn't about doing that.





[jira] [Commented] (HBASE-10152) improve VerifyReplication to compute BADROWS more accurately

2013-12-12 Thread cuijianwei (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847140#comment-13847140
 ] 

cuijianwei commented on HBASE-10152:


This issue is a duplicate of 
https://issues.apache.org/jira/browse/HBASE-10153; please ignore it.

> improve VerifyReplication to compute BADROWS more accurately
> 
>
> Key: HBASE-10152
> URL: https://issues.apache.org/jira/browse/HBASE-10152
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 0.94.14
>Reporter: cuijianwei
>
> VerifyReplication could compare the source table with its peer table and 
> compute BADROWS. However, the current BADROWS computing method might not be 
> accurate enough. For example, if source table contains rows as {r1, r2, r3, 
> r4} and peer table contains rows as {r1, r3, r4}, the BADROWS counter will be 
> 3 because 'r2' in source table will make all the later comparisons fail. Will 
> it be better if the BADROWS is computed to 1 in this situation? Maybe, we can 
> compute the BADROWS more accurately in merge comparison?





[jira] [Updated] (HBASE-10152) [Duplicated with 10153] improve VerifyReplication to compute BADROWS more accurately

2013-12-12 Thread cuijianwei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cuijianwei updated HBASE-10152:
---

Summary: [Duplicated with 10153] improve VerifyReplication to compute 
BADROWS more accurately  (was: improve VerifyReplication to compute BADROWS 
more accurately)

> [Duplicated with 10153] improve VerifyReplication to compute BADROWS more 
> accurately
> 
>
> Key: HBASE-10152
> URL: https://issues.apache.org/jira/browse/HBASE-10152
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 0.94.14
>Reporter: cuijianwei
>
> VerifyReplication could compare the source table with its peer table and 
> compute BADROWS. However, the current BADROWS computing method might not be 
> accurate enough. For example, if source table contains rows as {r1, r2, r3, 
> r4} and peer table contains rows as {r1, r3, r4}, the BADROWS counter will be 
> 3 because 'r2' in source table will make all the later comparisons fail. Will 
> it be better if the BADROWS is computed to 1 in this situation? Maybe, we can 
> compute the BADROWS more accurately in merge comparison?





[jira] [Updated] (HBASE-10153) improve VerifyReplication to compute BADROWS more accurately

2013-12-12 Thread cuijianwei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cuijianwei updated HBASE-10153:
---

Description: VerifyReplication could compare the source table with its peer 
table and compute BADROWS. However, the current BADROWS computing method might 
not be accurate enough. For example, if source table contains rows as {r1, r2, 
r3, r4} and peer table contains rows as {r1, r3, r4} BADROWS will be 3 because 
'r2' in source table will make all the later row comparisons fail. Will it be 
better if the BADROWS is computed to 1 in this situation? Maybe, we can compute 
the BADROWS more accurately in merge comparison?  (was: VerifyReplication could 
compare the source table with its peer table and compute BADROWS. However, the 
current BADROWS computing method might not be accurate enough. For example, if 
source table contains rows as {r1, r2, r3, r4} and peer table contains rows as 
{r1, r3, r4}, the BADROWS counter will be 3 because 'r2' in source table will 
make all the later comparisons fail. Will it be better if the BADROWS is 
computed to 1 in this situation? Maybe, we can compute the BADROWS more 
accurately in merge comparison?)

> improve VerifyReplication to compute BADROWS more accurately
> 
>
> Key: HBASE-10153
> URL: https://issues.apache.org/jira/browse/HBASE-10153
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 0.94.14
>Reporter: cuijianwei
> Attachments: HBASE-10153-0.94-v1.patch
>
>
> VerifyReplication could compare the source table with its peer table and 
> compute BADROWS. However, the current BADROWS computing method might not be 
> accurate enough. For example, if source table contains rows as {r1, r2, r3, 
> r4} and peer table contains rows as {r1, r3, r4} BADROWS will be 3 because 
> 'r2' in source table will make all the later row comparisons fail. Will it be 
> better if the BADROWS is computed to 1 in this situation? Maybe, we can 
> compute the BADROWS more accurately in merge comparison?





[jira] [Updated] (HBASE-10153) improve VerifyReplication to compute BADROWS more accurately

2013-12-12 Thread cuijianwei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cuijianwei updated HBASE-10153:
---

Attachment: HBASE-10153-0.94-v1.patch

This patch tries to improve the BADROWS computation:
1. BADROWS is refined into ONLY_IN_SOURCE_TABLE_ROWS, ONLY_IN_PEER_TABLE_ROWS and 
CONTENT_DIFFERENT_ROWS. 
2. these counters are computed in a merge comparison between the source and peer tables.
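The merge comparison in item 2 can be sketched as follows (illustrative code, not the patch itself): walking the two sorted row sets together classifies each row exactly once, so a single row missing from the peer, such as 'r2', no longer fails every later comparison.

```java
import java.util.SortedMap;
import java.util.TreeSet;

public class MergeCompareSketch {
    /**
     * Compares two row->content maps that are sorted by row key and returns
     * {ONLY_IN_SOURCE_TABLE_ROWS, ONLY_IN_PEER_TABLE_ROWS,
     * CONTENT_DIFFERENT_ROWS}, the three counters BADROWS is refined into.
     */
    public static int[] compare(SortedMap<String, String> source,
                                SortedMap<String, String> peer) {
        int onlyInSource = 0, onlyInPeer = 0, contentDifferent = 0;
        TreeSet<String> allRows = new TreeSet<>(source.keySet());
        allRows.addAll(peer.keySet());
        for (String row : allRows) {
            String s = source.get(row);
            String p = peer.get(row);
            if (p == null) {
                onlyInSource++;          // row missing from the peer table
            } else if (s == null) {
                onlyInPeer++;            // row missing from the source table
            } else if (!s.equals(p)) {
                contentDifferent++;      // row present in both but differs
            }
        }
        return new int[] { onlyInSource, onlyInPeer, contentDifferent };
    }
}
```

For the {r1, r2, r3, r4} vs {r1, r3, r4} example this yields one row only in the source and no other differences, instead of a BADROWS count of 3.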

> improve VerifyReplication to compute BADROWS more accurately
> 
>
> Key: HBASE-10153
> URL: https://issues.apache.org/jira/browse/HBASE-10153
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 0.94.14
>Reporter: cuijianwei
> Attachments: HBASE-10153-0.94-v1.patch
>
>
> VerifyReplication could compare the source table with its peer table and 
> compute BADROWS. However, the current BADROWS computing method might not be 
> accurate enough. For example, if source table contains rows as {r1, r2, r3, 
> r4} and peer table contains rows as {r1, r3, r4}, the BADROWS counter will be 
> 3 because 'r2' in source table will make all the later comparisons fail. Will 
> it be better if the BADROWS is computed to 1 in this situation? Maybe, we can 
> compute the BADROWS more accurately in merge comparison?





[jira] [Updated] (HBASE-10107) [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on Jenkins

2013-12-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10107:
---

   Resolution: Fixed
Fix Version/s: 0.99.0
 Assignee: Andrew Purtell
   Status: Resolved  (was: Patch Available)

Committed trivial test fix to trunk and 0.98.

> [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on 
> Jenkins
> ---
>
> Key: HBASE-10107
> URL: https://issues.apache.org/jira/browse/HBASE-10107
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.0, 0.99.0
>
> Attachments: 10107.patch
>
>
> TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation will fail up on Jenkins 
> in builds using "JDK 7 (latest)" but not those using "JDK 6 (latest)". The 
> stacktrace:
> {noformat}
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation(TestHBaseSaslRpcClient.java:119)
> {noformat}





[jira] [Created] (HBASE-10153) improve VerifyReplication to compute BADROWS more accurately

2013-12-12 Thread cuijianwei (JIRA)
cuijianwei created HBASE-10153:
--

 Summary: improve VerifyReplication to compute BADROWS more 
accurately
 Key: HBASE-10153
 URL: https://issues.apache.org/jira/browse/HBASE-10153
 Project: HBase
  Issue Type: Improvement
  Components: Replication
Affects Versions: 0.94.14
Reporter: cuijianwei


VerifyReplication could compare the source table with its peer table and 
compute BADROWS. However, the current BADROWS computing method might not be 
accurate enough. For example, if source table contains rows as {r1, r2, r3, r4} 
and peer table contains rows as {r1, r3, r4}, the BADROWS counter will be 3 
because 'r2' in source table will make all the later comparisons fail. Will it 
be better if the BADROWS is computed to 1 in this situation? Maybe, we can 
compute the BADROWS more accurately in merge comparison?





[jira] [Created] (HBASE-10152) improve VerifyReplication to compute BADROWS more accurately

2013-12-12 Thread cuijianwei (JIRA)
cuijianwei created HBASE-10152:
--

 Summary: improve VerifyReplication to compute BADROWS more 
accurately
 Key: HBASE-10152
 URL: https://issues.apache.org/jira/browse/HBASE-10152
 Project: HBase
  Issue Type: Improvement
  Components: Replication
Affects Versions: 0.94.14
Reporter: cuijianwei


VerifyReplication could compare the source table with its peer table and 
compute BADROWS. However, the current BADROWS computing method might not be 
accurate enough. For example, if source table contains rows as {r1, r2, r3, r4} 
and peer table contains rows as {r1, r3, r4}, the BADROWS counter will be 3 
because 'r2' in source table will make all the later comparisons fail. Will it 
be better if the BADROWS is computed to 1 in this situation? Maybe, we can 
compute the BADROWS more accurately in merge comparison?





[jira] [Updated] (HBASE-10151) No-op HeapMemoryTuner

2013-12-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10151:
---

Affects Version/s: (was: 0.98.0)
Fix Version/s: (was: 0.98.0)

> No-op HeapMemoryTuner
> -
>
> Key: HBASE-10151
> URL: https://issues.apache.org/jira/browse/HBASE-10151
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.99.0
>Reporter: Andrew Purtell
>Assignee: Anoop Sam John
>
> Provide a no-op HeapMemoryTuner that does not change any memory settings, 
> just enforces the old style fixed proportions. 





[jira] [Created] (HBASE-10151) No-op HeapMemoryTuner

2013-12-12 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-10151:
--

 Summary: No-op HeapMemoryTuner
 Key: HBASE-10151
 URL: https://issues.apache.org/jira/browse/HBASE-10151
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.98.0, 0.99.0
Reporter: Andrew Purtell
Assignee: Anoop Sam John
 Fix For: 0.98.0


Provide a no-op HeapMemoryTuner that does not change any memory settings, just 
enforces the old style fixed proportions. 
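A no-op tuner reduces to echoing the current proportions back unchanged; the sketch below illustrates only the idea and deliberately does not follow the real HeapMemoryTuner interface and its context/result types:

```java
public class NoOpTunerSketch {
    /**
     * A tuning step that always recommends keeping the current memstore and
     * block cache fractions, i.e. the old fixed-proportion behaviour where
     * the split is set once at startup and never revisited.
     */
    public static float[] tune(float memstoreFraction, float blockCacheFraction) {
        return new float[] { memstoreFraction, blockCacheFraction };
    }
}
```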





[jira] [Commented] (HBASE-5349) Automagically tweak global memstore and block cache sizes based on workload

2013-12-12 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847111#comment-13847111
 ] 

Andrew Purtell commented on HBASE-5349:
---

bq.  if one production cluster has a lower enough 99th latency requirement 
considering gc factor, a null tuner probably must be provided?

You bet [~xieliang007], see HBASE-10151

> Automagically tweak global memstore and block cache sizes based on workload
> ---
>
> Key: HBASE-5349
> URL: https://issues.apache.org/jira/browse/HBASE-5349
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: Anoop Sam John
> Fix For: 0.99.0
>
> Attachments: HBASE-5349_V2.patch, HBASE-5349_V3.patch, 
> HBASE-5349_V4.patch, HBASE-5349_V5.patch, WIP_HBASE-5349.patch
>
>
> Hypertable does a neat thing where it changes the size given to the CellCache 
> (our MemStores) and Block Cache based on the workload. If you need an image, 
> scroll down at the bottom of this link: 
> http://www.hypertable.com/documentation/architecture/
> That'd be one less thing to configure.





[jira] [Updated] (HBASE-10149) TestZKPermissionsWatcher.testPermissionsWatcher test failure

2013-12-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10149:
---

Attachment: 10149.patch

> TestZKPermissionsWatcher.testPermissionsWatcher test failure
> 
>
> Key: HBASE-10149
> URL: https://issues.apache.org/jira/browse/HBASE-10149
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 10149.patch, 10149.patch
>
>
> {noformat}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher.testPermissionsWatcher(TestZKPermissionsWatcher.java:119)
> {noformat}
> In testPermissionsWatcher we are not always waiting long enough for the 
> propagation of the permissions change for user "george" to take place.





[jira] [Commented] (HBASE-10107) [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on Jenkins

2013-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847098#comment-13847098
 ] 

Hadoop QA commented on HBASE-10107:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12618518/10107.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100 characters.

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8150//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8150//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8150//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8150//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8150//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8150//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8150//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8150//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8150//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8150//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8150//console

This message is automatically generated.

> [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on 
> Jenkins
> ---
>
> Key: HBASE-10107
> URL: https://issues.apache.org/jira/browse/HBASE-10107
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 10107.patch
>
>
> TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation will fail on Jenkins 
> in builds using "JDK 7 (latest)" but not those using "JDK 6 (latest)". The 
> stacktrace:
> {noformat}
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation(TestHBaseSaslRpcClient.java:119)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HBASE-10149) TestZKPermissionsWatcher.testPermissionsWatcher test failure

2013-12-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10149:
---

Attachment: 10149.patch

> TestZKPermissionsWatcher.testPermissionsWatcher test failure
> 
>
> Key: HBASE-10149
> URL: https://issues.apache.org/jira/browse/HBASE-10149
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 10149.patch
>
>
> {noformat}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher.testPermissionsWatcher(TestZKPermissionsWatcher.java:119)
> {noformat}
> In testPermissionsWatcher we are not always waiting long enough for the 
> propagation of the permissions change for user "george" to take place.
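
The usual fix for this kind of flakiness is to replace a fixed sleep with a bounded poll of the condition; below is a generic sketch of the pattern (hypothetical helper class, not the attached patch).

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Poll a condition until it holds or a deadline passes, instead of sleeping
// a fixed interval once and asserting. Hypothetical helper, not the patch.
public class WaitFor {
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long intervalMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                TimeUnit.MILLISECONDS.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return condition.getAsBoolean(); // one last check at the deadline
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // The condition becomes true after ~200 ms, well inside the 5 s budget.
        boolean ok = waitFor(() -> System.currentTimeMillis() - start > 200, 5000, 50);
        System.out.println(ok); // prints: true
    }
}
```

In a test like testPermissionsWatcher, the supplier would check whether the permissions change has propagated, so the test waits only as long as needed but never past the budget.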



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HBASE-10149) TestZKPermissionsWatcher.testPermissionsWatcher test failure

2013-12-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10149:
---

Attachment: (was: 10149.patch)

> TestZKPermissionsWatcher.testPermissionsWatcher test failure
> 
>
> Key: HBASE-10149
> URL: https://issues.apache.org/jira/browse/HBASE-10149
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 10149.patch
>
>
> {noformat}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher.testPermissionsWatcher(TestZKPermissionsWatcher.java:119)
> {noformat}
> In testPermissionsWatcher we are not always waiting long enough for the 
> propagation of the permissions change for user "george" to take place.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (HBASE-10150) Write attachment Id of tested patch into JIRA comment

2013-12-12 Thread Ted Yu (JIRA)
Ted Yu created HBASE-10150:
--

 Summary: Write attachment Id of tested patch into JIRA comment
 Key: HBASE-10150
 URL: https://issues.apache.org/jira/browse/HBASE-10150
 Project: HBase
  Issue Type: Sub-task
Reporter: Ted Yu


The optimization proposed in HBASE-10044 wouldn't work if the QA bot doesn't know 
the attachment Id of the most recently tested patch.

The first step is to write attachment Id of tested patch into JIRA comment.

For details, see HADOOP-10163



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HBASE-10149) TestZKPermissionsWatcher.testPermissionsWatcher test failure

2013-12-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10149:
---

Status: Patch Available  (was: Open)

See what HadoopQA thinks

> TestZKPermissionsWatcher.testPermissionsWatcher test failure
> 
>
> Key: HBASE-10149
> URL: https://issues.apache.org/jira/browse/HBASE-10149
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 10149.patch
>
>
> {noformat}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher.testPermissionsWatcher(TestZKPermissionsWatcher.java:119)
> {noformat}
> In testPermissionsWatcher we are not always waiting long enough for the 
> propagation of the permissions change for user "george" to take place.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HBASE-10149) TestZKPermissionsWatcher.testPermissionsWatcher test failure

2013-12-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10149:
---

Attachment: 10149.patch

> TestZKPermissionsWatcher.testPermissionsWatcher test failure
> 
>
> Key: HBASE-10149
> URL: https://issues.apache.org/jira/browse/HBASE-10149
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 10149.patch
>
>
> {noformat}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher.testPermissionsWatcher(TestZKPermissionsWatcher.java:119)
> {noformat}
> In testPermissionsWatcher we are not always waiting long enough for the 
> propagation of the permissions change for user "george" to take place.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput

2013-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847069#comment-13847069
 ] 

Hadoop QA commented on HBASE-8755:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12618509/8755v9.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8149//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8149//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8149//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8149//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8149//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8149//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8149//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8149//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8149//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8149//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8149//console

This message is automatically generated.

> A new write thread model for HLog to improve the overall HBase write 
> throughput
> ---
>
> Key: HBASE-8755
> URL: https://issues.apache.org/jira/browse/HBASE-8755
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance, wal
>Reporter: Feng Honghua
>Assignee: stack
>Priority: Critical
> Attachments: 8755-syncer.patch, 8755trunkV2.txt, 8755v8.txt, 
> 8755v9.txt, HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, 
> HBASE-8755-0.96-v0.patch, HBASE-8755-trunk-V0.patch, 
> HBASE-8755-trunk-V1.patch, HBASE-8755-trunk-v4.patch, 
> HBASE-8755-trunk-v6.patch, HBASE-8755-trunk-v7.patch, HBASE-8755-v5.patch, 
> thread.out
>
>
> In the current write model, each write handler thread (executing put()) will 
> individually go through a full 'append (hlog local buffer) => HLog writer 
> append (write to hdfs) => HLog writer sync (sync hdfs)' cycle for each write, 
> which incurs heavy contention on updateLock and flushLock.
> The only existing optimization (checking whether the current syncTillHere > txid, 
> in the hope that another thread has already written/synced our txid to hdfs so 
> the write/sync can be omitted) actually helps much less than expected.
> Three of my colleagues(Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi 
> proposed a new write thread model for writing hdfs sequence file and the 
> prototype implementation shows a 4X improvement for throughput (from 17000 to 
> 7+). 
> I apply this new write thread model in HLog and the performance test in our 
> test cluster shows about 3X throughput improvement (from 12150 to 31520 for 1 
> RS, from 22000 to 7 for 5 RS), the 1 RS write throughput (1K row-size) 
> even beats that of BigTable (the Percolator paper published in 2011 says Bigtable's 
> write throughput then is 31002)
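
The contention and the group-commit idea behind the new model can be illustrated with a stripped-down sketch (hypothetical classes, not the actual HLog/FSHLog code from the patch): writers obtain a txid under a short lock and then wait until a single sync thread has made their txid durable, so one sync call acknowledges a whole batch of writers instead of each writer syncing individually.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Stripped-down group-commit sketch: many writer threads, one sync thread.
// Hypothetical classes; not the actual FSHLog code from the patch.
class GroupCommitLog {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition synced = lock.newCondition();
    private long nextTxid = 0;        // next transaction id to hand out
    private long syncedTillHere = -1; // highest txid made durable so far

    // Append an edit to the shared buffer and return its txid
    // (the actual buffering of `edit` is elided).
    long append(byte[] edit) {
        lock.lock();
        try {
            return nextTxid++;
        } finally {
            lock.unlock();
        }
    }

    // Block until our txid is durable; one sync wakes every waiter at once.
    void syncTo(long txid) {
        lock.lock();
        try {
            while (syncedTillHere < txid) {
                synced.awaitUninterruptibly();
            }
        } finally {
            lock.unlock();
        }
    }

    // Called by the single sync thread after flushing the buffer to HDFS.
    void markSynced(long txid) {
        lock.lock();
        try {
            if (txid > syncedTillHere) {
                syncedTillHere = txid;
                synced.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }
}

public class GroupCommitDemo {
    public static void main(String[] args) {
        GroupCommitLog log = new GroupCommitLog();
        long t1 = log.append("row1".getBytes());
        long t2 = log.append("row2".getBytes());
        log.markSynced(t2);   // one flush acknowledges both appends
        log.syncTo(t1);       // returns immediately: t1 <= syncedTillHere
        log.syncTo(t2);
        System.out.println("synced through txid " + t2);
    }
}
```

The point of the design is that the expensive HDFS sync is done once per batch by a dedicated thread, while handler threads only take a short lock to claim a txid and then park.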

[jira] [Commented] (HBASE-10129) support real time rpc invoke latency percentile statistics for methods of HRegionInterface

2013-12-12 Thread cuijianwei (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847067#comment-13847067
 ] 

cuijianwei commented on HBASE-10129:


[~ndimiduk], thanks, I will try it.

> support real time rpc invoke latency percentile statistics for methods of 
> HRegionInterface 
> ---
>
> Key: HBASE-10129
> URL: https://issues.apache.org/jira/browse/HBASE-10129
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.94.14
>Reporter: cuijianwei
> Attachments: HBASE-10129-0.94-v1.patch
>
>
> It is important for applications to get latency statistics when invoking 
> hbase apis. Currently, the average latency of methods in HRegionInterface 
> will be computed in HBaseRpcMetrics. However, user might expect more detail 
> latency statistics, such as 75% percentile latency, 95% percentile latency, 
> etc. Therefore, will it be useful if we computing latency percentiles for rpc 
> invoking of region server methods? 
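
For reference, a percentile over a window of latency samples can be computed with the nearest-rank method; the sketch below uses hypothetical class and method names and is not the metrics code proposed in the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Minimal percentile-from-samples sketch (nearest-rank method).
// Hypothetical helper; real metrics systems use streaming estimators
// to avoid keeping and sorting every sample.
public class LatencyPercentiles {
    static long percentile(List<Long> latenciesMs, double p) {
        if (latenciesMs.isEmpty()) {
            throw new IllegalArgumentException("no samples");
        }
        List<Long> sorted = new ArrayList<>(latenciesMs);
        Collections.sort(sorted);
        // nearest-rank: smallest sample such that at least a fraction p
        // of all samples are less than or equal to it
        int rank = (int) Math.ceil(p * sorted.size());
        return sorted.get(Math.max(rank - 1, 0));
    }

    public static void main(String[] args) {
        List<Long> samples = Arrays.asList(3L, 1L, 4L, 1L, 5L, 9L, 2L, 6L, 5L, 3L);
        System.out.println("p50=" + percentile(samples, 0.50)); // prints: p50=3
        System.out.println("p95=" + percentile(samples, 0.95)); // prints: p95=9
    }
}
```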



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HBASE-5349) Automagically tweak global memstore and block cache sizes based on workload

2013-12-12 Thread Liang Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847058#comment-13847058
 ] 

Liang Xie commented on HBASE-5349:
--

bq. We can provide a null tuner to remove this as a factor when getting to the 
bottom of excessive GCs though.
thanks Andrew, I definitely like this feature. Still, if a production cluster 
has a low enough 99th-percentile latency requirement, considering the GC factor, a 
null tuner should probably be provided? :) 

> Automagically tweak global memstore and block cache sizes based on workload
> ---
>
> Key: HBASE-5349
> URL: https://issues.apache.org/jira/browse/HBASE-5349
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: Anoop Sam John
> Fix For: 0.99.0
>
> Attachments: HBASE-5349_V2.patch, HBASE-5349_V3.patch, 
> HBASE-5349_V4.patch, HBASE-5349_V5.patch, WIP_HBASE-5349.patch
>
>
> Hypertable does a neat thing where it changes the size given to the CellCache 
> (our MemStores) and Block Cache based on the workload. If you need an image, 
> scroll down at the bottom of this link: 
> http://www.hypertable.com/documentation/architecture/
> That'd be one less thing to configure.
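
The tuner the comments discuss can be imagined as a feedback loop over pressure counters: grow whichever of the memstore and block cache is under more pressure, and clamp both sides. This is a hypothetical sketch of the idea (a "null tuner" would simply skip the adjustment), not the actual HBASE-5349 implementation.

```java
// Hypothetical workload-driven tuner sketch, not the HBASE-5349 code:
// shift heap fraction toward whichever side is under more pressure.
public class SimpleTuner {
    static double[] tune(double memstoreFrac, double blockCacheFrac,
                         long flushesBlocked, long cacheEvictions, double step) {
        if (flushesBlocked > cacheEvictions) {          // write pressure: grow memstore
            memstoreFrac += step;
            blockCacheFrac -= step;
        } else if (cacheEvictions > flushesBlocked) {   // read pressure: grow block cache
            memstoreFrac -= step;
            blockCacheFrac += step;
        }
        // clamp so neither side is starved of heap
        memstoreFrac = Math.max(0.1, Math.min(0.7, memstoreFrac));
        blockCacheFrac = Math.max(0.1, Math.min(0.7, blockCacheFrac));
        return new double[] { memstoreFrac, blockCacheFrac };
    }

    public static void main(String[] args) {
        // Many blocked flushes, few evictions: a write-heavy workload,
        // so the memstore fraction grows at the block cache's expense.
        double[] next = tune(0.40, 0.40, 120, 15, 0.02);
        System.out.printf("memstore=%.2f blockCache=%.2f%n", next[0], next[1]);
    }
}
```

A null tuner, as mentioned above, would leave both fractions fixed, removing the tuner as a variable when chasing GC behavior.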



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HBASE-10127) support balance table

2013-12-12 Thread cuijianwei (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847057#comment-13847057
 ] 

cuijianwei commented on HBASE-10127:


Thanks for your comment [~lhofhansl]. I went through the logic of 'balance()' and 
tried the patch in our cluster; the balancing result seems good. I will take a 
closer look at the logic of 'balance()'. On the other hand, if there is a 
specific issue where the resulting balancing was bad, I think we might need to 
fix that problem, because a bad cluster-wide balancing result can also occur when 
there is only one table in the cluster.

> support balance table
> -
>
> Key: HBASE-10127
> URL: https://issues.apache.org/jira/browse/HBASE-10127
> Project: HBase
>  Issue Type: Improvement
>  Components: master, shell
>Affects Versions: 0.94.14
>Reporter: cuijianwei
> Attachments: HBASE-10127-0.94-v1.patch
>
>
> HMaster provides a rpc interface : 'balance()' to balance all the regions 
> among region servers in the cluster. Sometimes, we might want to balance all 
> the regions belonging to a table while keeping the region assignments of 
> other tables. This demand may reveal in a shared cluster where we want to 
> balance regions for one application's table without affecting other 
> applications. Therefore, will it be useful if we extend the current 
> 'balance()' interface to only balance regions of the same table? 
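
The requested behavior, rebalancing one table's regions while leaving other tables' assignments untouched, can be sketched as follows. The types and the naive round-robin placement below are hypothetical illustrations, not the actual HMaster/LoadBalancer API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Sketch: balance only one table's regions across the existing servers.
// Hypothetical representation: regionName -> serverName, where region
// names follow the "table,startKey" naming convention.
public class PerTableBalance {
    static Map<String, String> balanceTable(String table,
                                            Map<String, String> assignments) {
        Map<String, String> result = new HashMap<>(assignments);
        // 1. collect only the target table's regions
        List<String> regions = new ArrayList<>();
        for (String region : assignments.keySet()) {
            if (region.startsWith(table + ",")) {
                regions.add(region);
            }
        }
        // 2. round-robin them across the servers already in use;
        //    every other table's assignment is left untouched
        List<String> servers = new ArrayList<>(new TreeSet<>(assignments.values()));
        Collections.sort(regions);
        for (int i = 0; i < regions.size(); i++) {
            result.put(regions.get(i), servers.get(i % servers.size()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> cur = new HashMap<>();
        cur.put("t1,a", "rs1");
        cur.put("t1,b", "rs1");
        cur.put("t1,c", "rs1");
        cur.put("t2,a", "rs2");
        Map<String, String> next = balanceTable("t1", cur);
        System.out.println(next.get("t2,a")); // t2's assignment is untouched
    }
}
```

A real implementation would reuse the cluster's configured LoadBalancer for the placement step; the sketch only shows the scoping of the plan to a single table.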



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HBASE-10089) Metrics intern table names cause eventual permgen OOM in 0.94

2013-12-12 Thread Liang Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847053#comment-13847053
 ] 

Liang Xie commented on HBASE-10089:
---

[~jmspaggi], thanks for sharing your result! It seems consistent with my 
original understanding.


> Metrics intern table names cause eventual permgen OOM in 0.94
> -
>
> Key: HBASE-10089
> URL: https://issues.apache.org/jira/browse/HBASE-10089
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0, 0.94.14
>Reporter: Dave Latham
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 0.94.15
>
> Attachments: 10089-0.94.txt
>
>
> As part of the metrics system introduced in HBASE-4768 there are two places 
> that hbase uses String interning ( SchemaConfigured and SchemaMetrics ).  
> This includes interning table names.  We have long running environment where 
> we run regular integration tests on our application using hbase.  Those tests 
> create and drop tables with new names regularly.  These leads to filling up 
> the permgen with interned table names.  Workaround is to periodically restart 
> the region servers.
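
One standard way to keep deduplication without String.intern() is a weak interner, where an entry disappears once the table name is no longer referenced anywhere else. The sketch below is a generic illustration of that idea (similar in spirit to Guava's Interners.newWeakInterner), not the attached patch; note the value must hold the string weakly, or the map entry would strongly pin its own key.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Sketch: deduplicate strings (e.g. table names) without String.intern(),
// so names of dropped tables can be garbage-collected instead of filling
// permgen. Hypothetical helper, not the actual HBASE-10089 fix.
public class WeakInterner {
    // The value is a WeakReference so the entry does not strongly
    // reference its own key, which would defeat the WeakHashMap.
    private final Map<String, WeakReference<String>> pool = new WeakHashMap<>();

    public synchronized String intern(String s) {
        WeakReference<String> ref = pool.get(s);
        String existing = (ref == null) ? null : ref.get();
        if (existing != null) {
            return existing;    // reuse the canonical instance
        }
        pool.put(s, new WeakReference<>(s));
        return s;
    }

    public static void main(String[] args) {
        WeakInterner interner = new WeakInterner();
        String a = interner.intern(new String("usertable"));
        String b = interner.intern(new String("usertable"));
        System.out.println(a == b); // prints: true (same canonical instance)
    }
}
```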



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HBASE-10107) [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on Jenkins

2013-12-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10107:
---

Attachment: 10107.patch

> [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on 
> Jenkins
> ---
>
> Key: HBASE-10107
> URL: https://issues.apache.org/jira/browse/HBASE-10107
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 10107.patch
>
>
> TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation will fail on Jenkins 
> in builds using "JDK 7 (latest)" but not those using "JDK 6 (latest)". The 
> stacktrace:
> {noformat}
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation(TestHBaseSaslRpcClient.java:119)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HBASE-10107) [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on Jenkins

2013-12-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10107:
---

Attachment: 10107.patch

> [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on 
> Jenkins
> ---
>
> Key: HBASE-10107
> URL: https://issues.apache.org/jira/browse/HBASE-10107
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
>
> TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation will fail on Jenkins 
> in builds using "JDK 7 (latest)" but not those using "JDK 6 (latest)". The 
> stacktrace:
> {noformat}
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation(TestHBaseSaslRpcClient.java:119)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HBASE-10107) [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on Jenkins

2013-12-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10107:
---

Status: Patch Available  (was: Open)

See what HadoopQA thinks

> [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on 
> Jenkins
> ---
>
> Key: HBASE-10107
> URL: https://issues.apache.org/jira/browse/HBASE-10107
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
>
> TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation will fail on Jenkins 
> in builds using "JDK 7 (latest)" but not those using "JDK 6 (latest)". The 
> stacktrace:
> {noformat}
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation(TestHBaseSaslRpcClient.java:119)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HBASE-10107) [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on Jenkins

2013-12-12 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10107:
---

Attachment: (was: 10107.patch)

> [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on 
> Jenkins
> ---
>
> Key: HBASE-10107
> URL: https://issues.apache.org/jira/browse/HBASE-10107
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
>
> TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation will fail on Jenkins 
> in builds using "JDK 7 (latest)" but not those using "JDK 6 (latest)". The 
> stacktrace:
> {noformat}
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation(TestHBaseSaslRpcClient.java:119)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HBASE-10146) Bump HTrace version to 2.04

2013-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847031#comment-13847031
 ] 

Hadoop QA commented on HBASE-10146:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12618489/HBASE-10146-0.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8147//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8147//console

This message is automatically generated.

> Bump HTrace version to 2.04
> ---
>
> Key: HBASE-10146
> URL: https://issues.apache.org/jira/browse/HBASE-10146
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.96.1, 0.99.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HBASE-10146-0.patch
>
>
> 2.04 has been released with a bug fix for what happens when htrace fails.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HBASE-10076) Backport MapReduce over snapshot files [0.94]

2013-12-12 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847018#comment-13847018
 ] 

Jesse Yates commented on HBASE-10076:
-

[~enis] another question Lars and I had offline: how can we test that the 
locality selection is correct? It's not really covered anywhere in this patch or 
the original.

> Backport MapReduce over snapshot files [0.94]
> -
>
> Key: HBASE-10076
> URL: https://issues.apache.org/jira/browse/HBASE-10076
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.15
>
> Attachments: hbase-10076-v0.patch
>
>
> MapReduce over Snapshots would be valuable on 0.94.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HBASE-10076) Backport MapReduce over snapshot files [0.94]

2013-12-12 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847017#comment-13847017
 ] 

Jesse Yates commented on HBASE-10076:
-

bq. Any interest in bringing in the new test 
TestCellUtil.testOverlappingKeys()? Not needed that much, just checking

Done - good call.

bq. Not sure about the change in TableInputFormatBase. Is this needed? Let's 
leave this out otherwise.

Not needed, but this is a cleaner, better implementation, and it is good to reuse 
the util now that we have it. Agreed it's a little extra and a bit unnecessary, 
but trivially so.

bq. In the original patch, TableMapReduceUtil.initTableMapperJob() now accepts 
an initCredentials param, because we do not want to get tokens from HBase at 
all. Otherwise, if hbase is used with security, offline clusters won't work.

Looks like in 0.94 it's just added based on the config. In trunk, it looks like 
TableMapReduceUtil.initTableSnapshotMapperJob just calls initTableMapperJob 
without any parameter, which always initializes the credentials. Maybe a bug in 
trunk? Seems like you would need a more sweeping change to make it configurable 
as well.

> Backport MapReduce over snapshot files [0.94]
> -
>
> Key: HBASE-10076
> URL: https://issues.apache.org/jira/browse/HBASE-10076
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.15
>
> Attachments: hbase-10076-v0.patch
>
>
> MapReduce over Snapshots would be valuable on 0.94.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HBASE-8369) MapReduce over snapshot files

2013-12-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847005#comment-13847005
 ] 

stack commented on HBASE-8369:
--

bq. Something like that could go into 0.96 surely. stack ?

Could do that (looking at the patch it doesn't look needed in 0.96).

No objection from me adding in hooks into 0.94 as long as... you know what I'm 
going to say.

I'll bring flowers next time we meet [~lhofhansl]

> MapReduce over snapshot files
> -
>
> Key: HBASE-8369
> URL: https://issues.apache.org/jira/browse/HBASE-8369
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce, snapshots
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.98.0
>
> Attachments: HBASE-8369-0.94.patch, HBASE-8369-0.94_v2.patch, 
> HBASE-8369-0.94_v3.patch, HBASE-8369-0.94_v4.patch, HBASE-8369-0.94_v5.patch, 
> HBASE-8369-trunk_v1.patch, HBASE-8369-trunk_v2.patch, 
> HBASE-8369-trunk_v3.patch, hbase-8369_v0.patch, hbase-8369_v11.patch, 
> hbase-8369_v5.patch, hbase-8369_v6.patch, hbase-8369_v7.patch, 
> hbase-8369_v8.patch, hbase-8369_v9.patch
>
>
> The idea is to add an InputFormat, which can run the mapreduce job over 
> snapshot files directly bypassing hbase server layer. The IF is similar in 
> usage to TableInputFormat, taking a Scan object from the user, but instead of 
> running from an online table, it runs from a table snapshot. We do one split 
> per region in the snapshot, and open an HRegion inside the RecordReader. A 
> RegionScanner is used internally for doing the scan without any HRegionServer 
> bits. 
> Users have been asking and searching for ways to run MR jobs by reading 
> directly from hfiles, so this allows new use cases if reading from stale data 
> is ok:
>  - Take snapshots periodically, and run MR jobs only on snapshots.
>  - Export snapshots to remote hdfs cluster, run the MR jobs at that cluster 
> without HBase cluster.
>  - (Future use case) Combine snapshot data with online hbase data: Scan from 
> yesterday's snapshot, but read today's data from online hbase cluster. 



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Comment Edited] (HBASE-8369) MapReduce over snapshot files

2013-12-12 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847004#comment-13847004
 ] 

Lars Hofhansl edited comment on HBASE-8369 at 12/13/13 1:02 AM:


The only changes to existing HBase classes are exactly these hooks, though. 
Without them it cannot be done with outside code. When those are in place 
anyway, might as well add some new classes for M/R stuff; but it's fine to keep 
these outside, they just become part of the M/R job then.

To explain my comment above:
Adding a few classes is not a fork of course, but it starts a slippery slope. 
Once you've started, it's easy to pile on top of that. And there are some HBase 
changes needed, so it is an actual patch we need to maintain.
We have so far completely avoided that (except for some hopefully temporary 
security related changes to HDFS), and I have been a strong advocate for that 
in our organization. We have also always forward ported any changes we made to 
0.96+. So it is frustrating having to start this even (or especially) for such 
a small change.

So please pardon my frustration.
I do not understand the reluctance with this, as it is almost no risk and some 
folks will be using 0.94 for a while.
Whether it's a new "feature" or not is not relevant (IMHO). HBase's slow M/R 
performance could be considered a bug too, and then this would be a bug fix.

We're not breaking up over this :)

So it seems a good compromise would be to get the required hooks into HBase...?
[~jesse_yates], FYI.









[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput

2013-12-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846997#comment-13846997
 ] 

stack commented on HBASE-8755:
--

32 threads on one node writing to a cluster of 4 nodes (~8 threads per server, 
which according to our tests to date shows this model running slower than what 
we have).  It does 10% less throughput after ~25 minutes.  We need to get the 
other speedups in after this goes in.

> A new write thread model for HLog to improve the overall HBase write 
> throughput
> ---
>
> Key: HBASE-8755
> URL: https://issues.apache.org/jira/browse/HBASE-8755
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance, wal
>Reporter: Feng Honghua
>Assignee: stack
>Priority: Critical
> Attachments: 8755-syncer.patch, 8755trunkV2.txt, 8755v8.txt, 
> 8755v9.txt, HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, 
> HBASE-8755-0.96-v0.patch, HBASE-8755-trunk-V0.patch, 
> HBASE-8755-trunk-V1.patch, HBASE-8755-trunk-v4.patch, 
> HBASE-8755-trunk-v6.patch, HBASE-8755-trunk-v7.patch, HBASE-8755-v5.patch, 
> thread.out
>
>
> In the current write model, each write handler thread (executing put()) 
> individually goes through a full 'append (HLog local buffer) => HLog writer 
> append (write to HDFS) => HLog writer sync (sync HDFS)' cycle for each write, 
> which incurs heavy lock contention on updateLock and flushLock.
> The only existing optimization (checking whether the current syncTillHere > 
> txid, in the hope that another thread has already written/synced this txid to 
> HDFS so the write/sync can be skipped) actually helps much less than expected.
> Three of my colleagues (Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi 
> proposed a new write thread model for writing HDFS sequence files, and the 
> prototype implementation shows a 4X throughput improvement (from 17000 to 
> 7+). 
> I applied this new write thread model in HLog, and the performance test in 
> our test cluster shows about a 3X throughput improvement (from 12150 to 31520 
> for 1 RS, from 22000 to 7 for 5 RS); the 1-RS write throughput (1K row size) 
> even beats that of BigTable (the Percolator paper published in 2011 says 
> Bigtable's write throughput then was 31002). I can provide the detailed 
> performance test results if anyone is interested.
> The changes for the new write thread model are as below:
>  1> All put handler threads append their edits to HLog's local pending 
> buffer; (each notifies the AsyncWriter thread that there are new edits in 
> the local buffer)
>  2> All put handler threads wait in the HLog.syncer() function for the 
> underlying threads to finish the sync that covers their txid;
>  3> A single AsyncWriter thread is responsible for retrieving all the 
> buffered edits in HLog's local pending buffer and writing them to HDFS 
> (hlog.writer.append); (it notifies the AsyncFlusher thread that there are 
> new writes to HDFS that need a sync)
>  4> A single AsyncFlusher thread is responsible for issuing a sync to HDFS 
> to persist the writes by the AsyncWriter; (it notifies the AsyncNotifier 
> thread that the sync watermark has increased)
>  5> A single AsyncNotifier thread is responsible for notifying all pending 
> put handler threads which are waiting in the HLog.syncer() function
>  6> No LogSyncer thread any more (since the AsyncWriter/AsyncFlusher threads 
> always do the same job it did)
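The thread pipeline in steps 1>-6> above can be modeled with plain Java monitors. The sketch below is a deliberately simplified, self-contained toy (the class and field names are invented, and the AsyncWriter/AsyncFlusher/AsyncNotifier roles are collapsed into one background thread); it is meant only to show the batching idea: handlers append and block on a txid, and one sync call covers a whole drained batch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the proposed pipeline: handlers append edits to a local
// buffer and wait; a background thread drains the buffer ("append"),
// advances a synced-txid watermark ("sync"), and notifies the waiters.
public class WalPipelineSketch {
    private final List<Long> buffer = new ArrayList<>(); // pending edit txids
    private long lastTxid = 0;        // last assigned txid
    private volatile long syncedTxid = 0; // highest durable txid (watermark)
    private volatile boolean running = true;

    // Handler side: append an edit, get its txid, wake the writer thread.
    public synchronized long append() {
        long txid = ++lastTxid;
        buffer.add(txid);
        notifyAll();
        return txid;
    }

    // Handler side: block until the watermark has passed our txid.
    public synchronized void syncUpTo(long txid) throws InterruptedException {
        while (syncedTxid < txid) wait();
    }

    // Writer side: take the whole pending batch in one go.
    private synchronized long drain() throws InterruptedException {
        while (running && buffer.isEmpty()) wait();
        long high = syncedTxid;
        for (long t : buffer) high = Math.max(high, t);
        buffer.clear(); // stands in for hlog.writer.append() of the batch
        return high;
    }

    public void start() {
        Thread background = new Thread(() -> {
            try {
                while (running) {
                    long high = drain(); // "AsyncWriter": batch append
                    synchronized (this) {
                        // "AsyncFlusher": one sync covers the whole batch;
                        // "AsyncNotifier": wake every pending handler.
                        syncedTxid = Math.max(syncedTxid, high);
                        notifyAll();
                    }
                }
            } catch (InterruptedException ignored) { }
        });
        background.setDaemon(true);
        background.start();
    }

    public static void main(String[] args) throws Exception {
        WalPipelineSketch wal = new WalPipelineSketch();
        wal.start();
        long txid = 0;
        for (int i = 0; i < 100; i++) txid = wal.append();
        wal.syncUpTo(txid); // returns once all 100 edits are "synced"
        System.out.println("synced through txid " + txid);
    }
}
```

The point of the design is visible in drain(): many handler txids are covered by a single watermark advance, which is what replaces the per-write append/sync cycle.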





[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput

2013-12-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846994#comment-13846994
 ] 

stack commented on HBASE-8755:
--

This is what I'll commit.  I've been running it on a small cluster this 
afternoon and, after fixing hardware, it seems to run fine at about the same 
speed as what we have currently (ycsb read/write loading).



[jira] [Updated] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput

2013-12-12 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-8755:
-

Attachment: 8755v9.txt

Fix compile issue.



[jira] [Commented] (HBASE-8369) MapReduce over snapshot files

2013-12-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846979#comment-13846979
 ] 

stack commented on HBASE-8369:
--

bq. We are going to have to start to go down the Facebook path of forking HBase 
for things like this then and our contribution will become less useful over 
time.

Why not add in the changes to core you need to support this feature into 0.94?

Lets not break up over whether a couple of mapreduce classes are in core or not.



[jira] [Commented] (HBASE-8369) MapReduce over snapshot files

2013-12-12 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846981#comment-13846981
 ] 

Andrew Purtell commented on HBASE-8369:
---

bq. Like an appropriate HRegion constructor, etc.

Something like that could go into 0.96 surely. [~stack] ?

bq. We are going to have to start to go down the Facebook path of forking HBase 
for things like this then and our contribution will become less useful over 
time. So be it.

To put this politely (I have strong opinions) the FB fork was a matter of a 
tight internal deployment schedule as opposed to any unwillingness of the 
community to work with their contributions. 

The addition of a couple of extra classes to a private build does not make a 
fork, just like the enhancements to reduce byte copies for "smart clients" that 
Jesse was working on, which went mainly into Phoenix, didn't produce a fork. If 
and when the time comes that a truly incompatible change must be introduced 
that constitutes a real break, we should definitely look hard at that.




[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput

2013-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846984#comment-13846984
 ] 

Hadoop QA commented on HBASE-8755:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12618499/8755v8.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:red}-1 hadoop1.0{color}.  The patch failed to compile against the 
hadoop 1.0 profile.
Here is snippet of errors:
{code}[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile (default-compile) 
on project hbase-server: Compilation failure: Compilation failure:
[ERROR] 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java:[436,42]
 unclosed string literal
[ERROR] 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java:[436,63]
 ';' expected
[ERROR] 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java:[437,17]
 illegal start of expression
[ERROR] 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java:[437,23]
 ';' expected
[ERROR] -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile (default-compile) 
on project hbase-server: Compilation failure
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:213)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
--
Caused by: org.apache.maven.plugin.CompilationFailureException: Compilation 
failure
at 
org.apache.maven.plugin.AbstractCompilerMojo.execute(AbstractCompilerMojo.java:729)
at org.apache.maven.plugin.CompilerMojo.execute(CompilerMojo.java:128)
at 
org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
... 19 more{code}

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8148//console

This message is automatically generated.


[jira] [Commented] (HBASE-8369) MapReduce over snapshot files

2013-12-12 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846976#comment-13846976
 ] 

Lars Hofhansl commented on HBASE-8369:
--

bq. anyone who wants to do this in 0.96 can just take the two mapreduce classes 
and include them in their mapreduce job

That could work as long as we put the necessary hooks into HBase itself - the 
small changes to the existing classes. Like an appropriate HRegion constructor, 
etc.



[jira] [Commented] (HBASE-10076) Backport MapReduce over snapshot files [0.94]

2013-12-12 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846971#comment-13846971
 ] 

Jesse Yates commented on HBASE-10076:
-

thanks for the feedback Enis! I'll update the patch, in the event that someone 
wants it for their installation, or have it ready if we can get it into 0.96 
(as per discussion on HBASE-8369).

> Backport MapReduce over snapshot files [0.94]
> -
>
> Key: HBASE-10076
> URL: https://issues.apache.org/jira/browse/HBASE-10076
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.15
>
> Attachments: hbase-10076-v0.patch
>
>
> MapReduce over Snapshots would be valuable on 0.94.





[jira] [Created] (HBASE-10149) TestZKPermissionsWatcher.testPermissionsWatcher test failure

2013-12-12 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-10149:
--

 Summary: TestZKPermissionsWatcher.testPermissionsWatcher test 
failure
 Key: HBASE-10149
 URL: https://issues.apache.org/jira/browse/HBASE-10149
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Andrew Purtell
 Fix For: 0.98.0


{noformat}
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher.testPermissionsWatcher(TestZKPermissionsWatcher.java:119)
{noformat}

In testPermissionsWatcher we are not always waiting long enough for the 
propagation of the permissions change for user "george" to take place.
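A sturdier fix than lengthening a fixed sleep is to poll for the expected state with a deadline. A minimal sketch of such a helper; the class and method names here are illustrative, not HBase's actual test utilities:

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

final class WaitUtil {
  private WaitUtil() {}

  /** Polls condition every intervalMs until it holds or timeoutMs elapses. */
  static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      if (condition.getAsBoolean()) {
        return true;                 // observed the propagated change in time
      }
      TimeUnit.MILLISECONDS.sleep(intervalMs);
    }
    return condition.getAsBoolean(); // one final check at the deadline
  }
}
```

The failing assertion would then become something like `assertTrue(waitFor(30000, 100, () -> permissionGranted(george)))`, where `permissionGranted` stands in for the actual authorization check.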





[jira] [Commented] (HBASE-8369) MapReduce over snapshot files

2013-12-12 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846968#comment-13846968
 ] 

Lars Hofhansl commented on HBASE-8369:
--

We will use this in our setup, which is based on 0.94. Like us, some folks are 
stuck on a particular release of HBase; that does not mean they are not 
interested in new features.

We are going to have to start going down the Facebook path of forking HBase for 
things like this, and our contributions will become less useful over time. So 
be it.


> MapReduce over snapshot files
> -
>
> Key: HBASE-8369
> URL: https://issues.apache.org/jira/browse/HBASE-8369
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce, snapshots
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.98.0
>
> Attachments: HBASE-8369-0.94.patch, HBASE-8369-0.94_v2.patch, 
> HBASE-8369-0.94_v3.patch, HBASE-8369-0.94_v4.patch, HBASE-8369-0.94_v5.patch, 
> HBASE-8369-trunk_v1.patch, HBASE-8369-trunk_v2.patch, 
> HBASE-8369-trunk_v3.patch, hbase-8369_v0.patch, hbase-8369_v11.patch, 
> hbase-8369_v5.patch, hbase-8369_v6.patch, hbase-8369_v7.patch, 
> hbase-8369_v8.patch, hbase-8369_v9.patch
>
>
> The idea is to add an InputFormat, which can run the mapreduce job over 
> snapshot files directly bypassing hbase server layer. The IF is similar in 
> usage to TableInputFormat, taking a Scan object from the user, but instead of 
> running from an online table, it runs from a table snapshot. We do one split 
> per region in the snapshot, and open an HRegion inside the RecordReader. A 
> RegionScanner is used internally for doing the scan without any HRegionServer 
> bits. 
> Users have been asking and searching for ways to run MR jobs by reading 
> directly from hfiles, so this allows new use cases if reading from stale data 
> is ok:
>  - Take snapshots periodically, and run MR jobs only on snapshots.
>  - Export snapshots to remote hdfs cluster, run the MR jobs at that cluster 
> without HBase cluster.
>  - (Future use case) Combine snapshot data with online hbase data: Scan from 
> yesterday's snapshot, but read today's data from online hbase cluster. 
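The "one split per region" idea in the description above can be sketched independently of HBase internals: each region in the snapshot, identified by its key range, maps to exactly one input split. The types below are simplified stand-ins, not the actual TableSnapshotInputFormat classes:

```java
import java.util.ArrayList;
import java.util.List;

// One input split per snapshot region: a RecordReader would open an HRegion
// over that region's hfiles and scan the [startKey, endKey) range with a
// RegionScanner, never touching a RegionServer.
class SnapshotSplitter {
  record RegionSplit(String startKey, String endKey) {}

  /** startKeys are the sorted region start keys; "" denotes the table start/end. */
  static List<RegionSplit> splitsForSnapshot(List<String> startKeys) {
    List<RegionSplit> splits = new ArrayList<>();
    for (int i = 0; i < startKeys.size(); i++) {
      String end = (i + 1 < startKeys.size()) ? startKeys.get(i + 1) : "";
      splits.add(new RegionSplit(startKeys.get(i), end));
    }
    return splits;
  }
}
```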





[jira] [Commented] (HBASE-10107) [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on Jenkins

2013-12-12 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846965#comment-13846965
 ] 

Andrew Purtell commented on HBASE-10107:


Ok, I will do that momentarily

> [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on 
> Jenkins
> ---
>
> Key: HBASE-10107
> URL: https://issues.apache.org/jira/browse/HBASE-10107
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
>
> TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation will fail on Jenkins 
> in builds using "JDK 7 (latest)" but not those using "JDK 6 (latest)". The 
> stacktrace:
> {noformat}
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation(TestHBaseSaslRpcClient.java:119)
> {noformat}





[jira] [Commented] (HBASE-10107) [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on Jenkins

2013-12-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846962#comment-13846962
 ] 

stack commented on HBASE-10107:
---

bq. So do we disable this test then because some OSes are not sane?

+1 at least for now (unfortunately)

> [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on 
> Jenkins
> ---
>
> Key: HBASE-10107
> URL: https://issues.apache.org/jira/browse/HBASE-10107
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
>
> TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation will fail on Jenkins 
> in builds using "JDK 7 (latest)" but not those using "JDK 6 (latest)". The 
> stacktrace:
> {noformat}
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation(TestHBaseSaslRpcClient.java:119)
> {noformat}





[jira] [Commented] (HBASE-8369) MapReduce over snapshot files

2013-12-12 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846963#comment-13846963
 ] 

Elliott Clark commented on HBASE-8369:
--

I'm -0.5 on getting this into older releases.  To me it seems like the 
community drew a line in the sand on what features made it into 0.96.  We held 
that line for some things (security, etc.), so it seems weird that we would 
pull this into older releases without extraordinary circumstances when we were 
strict in other places.

We've started faster release trains and 0.98 is on the way.  I would vote for 
holding a very strict line on no new features in old releases now that we have 
addressed concerns over release timing.

> MapReduce over snapshot files
> -
>
> Key: HBASE-8369
> URL: https://issues.apache.org/jira/browse/HBASE-8369
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce, snapshots
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.98.0
>
> Attachments: HBASE-8369-0.94.patch, HBASE-8369-0.94_v2.patch, 
> HBASE-8369-0.94_v3.patch, HBASE-8369-0.94_v4.patch, HBASE-8369-0.94_v5.patch, 
> HBASE-8369-trunk_v1.patch, HBASE-8369-trunk_v2.patch, 
> HBASE-8369-trunk_v3.patch, hbase-8369_v0.patch, hbase-8369_v11.patch, 
> hbase-8369_v5.patch, hbase-8369_v6.patch, hbase-8369_v7.patch, 
> hbase-8369_v8.patch, hbase-8369_v9.patch
>
>
> The idea is to add an InputFormat, which can run the mapreduce job over 
> snapshot files directly bypassing hbase server layer. The IF is similar in 
> usage to TableInputFormat, taking a Scan object from the user, but instead of 
> running from an online table, it runs from a table snapshot. We do one split 
> per region in the snapshot, and open an HRegion inside the RecordReader. A 
> RegionScanner is used internally for doing the scan without any HRegionServer 
> bits. 
> Users have been asking and searching for ways to run MR jobs by reading 
> directly from hfiles, so this allows new use cases if reading from stale data 
> is ok:
>  - Take snapshots periodically, and run MR jobs only on snapshots.
>  - Export snapshots to remote hdfs cluster, run the MR jobs at that cluster 
> without HBase cluster.
>  - (Future use case) Combine snapshot data with online hbase data: Scan from 
> yesterday's snapshot, but read today's data from online hbase cluster. 





[jira] [Commented] (HBASE-10146) Bump HTrace version to 2.04

2013-12-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846961#comment-13846961
 ] 

stack commented on HBASE-10146:
---

+1

> Bump HTrace version to 2.04
> ---
>
> Key: HBASE-10146
> URL: https://issues.apache.org/jira/browse/HBASE-10146
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.96.1, 0.99.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HBASE-10146-0.patch
>
>
> 2.04 has been released with a bug fix for what happens when htrace fails.





[jira] [Commented] (HBASE-10076) Backport MapReduce over snapshot files [0.94]

2013-12-12 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846960#comment-13846960
 ] 

Andrew Purtell commented on HBASE-10076:


See 
https://issues.apache.org/jira/browse/HBASE-8369?focusedCommentId=13846957&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846957

> Backport MapReduce over snapshot files [0.94]
> -
>
> Key: HBASE-10076
> URL: https://issues.apache.org/jira/browse/HBASE-10076
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.15
>
> Attachments: hbase-10076-v0.patch
>
>
> MapReduce over Snapshots would be valuable on 0.94.





[jira] [Commented] (HBASE-8369) MapReduce over snapshot files

2013-12-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846958#comment-13846958
 ] 

stack commented on HBASE-8369:
--

I want to hold to no new features after a branch is released unless there are 
extraordinary circumstances.  This one is very nice but doesn't seem to qualify 
as 'extraordinary' given that anyone who wants to do this in 0.96 can just take 
the two mapreduce classes and include them in their mapreduce job -- or go get 
the 0.98 jar (won't that work?).

> MapReduce over snapshot files
> -
>
> Key: HBASE-8369
> URL: https://issues.apache.org/jira/browse/HBASE-8369
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce, snapshots
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.98.0
>
> Attachments: HBASE-8369-0.94.patch, HBASE-8369-0.94_v2.patch, 
> HBASE-8369-0.94_v3.patch, HBASE-8369-0.94_v4.patch, HBASE-8369-0.94_v5.patch, 
> HBASE-8369-trunk_v1.patch, HBASE-8369-trunk_v2.patch, 
> HBASE-8369-trunk_v3.patch, hbase-8369_v0.patch, hbase-8369_v11.patch, 
> hbase-8369_v5.patch, hbase-8369_v6.patch, hbase-8369_v7.patch, 
> hbase-8369_v8.patch, hbase-8369_v9.patch
>
>
> The idea is to add an InputFormat, which can run the mapreduce job over 
> snapshot files directly bypassing hbase server layer. The IF is similar in 
> usage to TableInputFormat, taking a Scan object from the user, but instead of 
> running from an online table, it runs from a table snapshot. We do one split 
> per region in the snapshot, and open an HRegion inside the RecordReader. A 
> RegionScanner is used internally for doing the scan without any HRegionServer 
> bits. 
> Users have been asking and searching for ways to run MR jobs by reading 
> directly from hfiles, so this allows new use cases if reading from stale data 
> is ok:
>  - Take snapshots periodically, and run MR jobs only on snapshots.
>  - Export snapshots to remote hdfs cluster, run the MR jobs at that cluster 
> without HBase cluster.
>  - (Future use case) Combine snapshot data with online hbase data: Scan from 
> yesterday's snapshot, but read today's data from online hbase cluster. 





[jira] [Commented] (HBASE-8369) MapReduce over snapshot files

2013-12-12 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846957#comment-13846957
 ] 

Andrew Purtell commented on HBASE-8369:
---

No, I think the incongruity of having something in 0.94, missing in 0.96, and 
back again in 0.98 is bad practice. If we are blocked from getting it into 
0.96, then it has to start > 0.96. 

> MapReduce over snapshot files
> -
>
> Key: HBASE-8369
> URL: https://issues.apache.org/jira/browse/HBASE-8369
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce, snapshots
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.98.0
>
> Attachments: HBASE-8369-0.94.patch, HBASE-8369-0.94_v2.patch, 
> HBASE-8369-0.94_v3.patch, HBASE-8369-0.94_v4.patch, HBASE-8369-0.94_v5.patch, 
> HBASE-8369-trunk_v1.patch, HBASE-8369-trunk_v2.patch, 
> HBASE-8369-trunk_v3.patch, hbase-8369_v0.patch, hbase-8369_v11.patch, 
> hbase-8369_v5.patch, hbase-8369_v6.patch, hbase-8369_v7.patch, 
> hbase-8369_v8.patch, hbase-8369_v9.patch
>
>
> The idea is to add an InputFormat, which can run the mapreduce job over 
> snapshot files directly bypassing hbase server layer. The IF is similar in 
> usage to TableInputFormat, taking a Scan object from the user, but instead of 
> running from an online table, it runs from a table snapshot. We do one split 
> per region in the snapshot, and open an HRegion inside the RecordReader. A 
> RegionScanner is used internally for doing the scan without any HRegionServer 
> bits. 
> Users have been asking and searching for ways to run MR jobs by reading 
> directly from hfiles, so this allows new use cases if reading from stale data 
> is ok:
>  - Take snapshots periodically, and run MR jobs only on snapshots.
>  - Export snapshots to remote hdfs cluster, run the MR jobs at that cluster 
> without HBase cluster.
>  - (Future use case) Combine snapshot data with online hbase data: Scan from 
> yesterday's snapshot, but read today's data from online hbase cluster. 





[jira] [Comment Edited] (HBASE-5349) Automagically tweak global memstore and block cache sizes based on workload

2013-12-12 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846947#comment-13846947
 ] 

Andrew Purtell edited comment on HBASE-5349 at 12/13/13 12:07 AM:
--

bq. seems if we tweak block cache dynamically, it's more prone to trigger a gc 
that moment, right?

This is largely out of our control, until and unless the JVM exposes some knobs 
for GC and early warning signals. Between memstore and blockcache we can't just 
look at JVM heap occupancy, we will be keeping it pretty full. 

We can provide a null tuner to remove this as a factor when getting to the 
bottom of excessive GCs though.


was (Author: apurtell):
bq. seems if we tweak block cache dynamically, it's more prone to trigger a gc 
that moment, right?

This is largely out of our control, until and unless the JVM exposes some knobs 
for GC and early warning signals. Between memstore and blockcache we can't just 
look at occupancy. 

We can provide a null tuner to remove this as a factor when getting to the 
bottom of excessive GCs though.
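The "null tuner" suggestion can be sketched as a strategy interface whose no-op implementation pins the sizes, taking dynamic resizing out of the picture when chasing GC problems. The interface shape below is illustrative, not HBase's actual tuner API:

```java
/** Decides new heap fractions for memstore and block cache from observed load. */
interface HeapTuner {
  /** Returns {memstoreFraction, blockCacheFraction} to apply next period. */
  float[] tune(float memstorePressure, float cacheEvictionRate,
               float memstoreFraction, float blockCacheFraction);
}

/** Null tuner: never changes the split, so resizing cannot be a GC factor. */
class NullHeapTuner implements HeapTuner {
  @Override
  public float[] tune(float memstorePressure, float cacheEvictionRate,
                      float memstoreFraction, float blockCacheFraction) {
    return new float[] { memstoreFraction, blockCacheFraction }; // unchanged
  }
}
```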

> Automagically tweak global memstore and block cache sizes based on workload
> ---
>
> Key: HBASE-5349
> URL: https://issues.apache.org/jira/browse/HBASE-5349
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: Anoop Sam John
> Fix For: 0.99.0
>
> Attachments: HBASE-5349_V2.patch, HBASE-5349_V3.patch, 
> HBASE-5349_V4.patch, HBASE-5349_V5.patch, WIP_HBASE-5349.patch
>
>
> Hypertable does a neat thing where it changes the size given to the CellCache 
> (our MemStores) and Block Cache based on the workload. If you need an image, 
> scroll down at the bottom of this link: 
> http://www.hypertable.com/documentation/architecture/
> That'd be one less thing to configure.





[jira] [Commented] (HBASE-5349) Automagically tweak global memstore and block cache sizes based on workload

2013-12-12 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846947#comment-13846947
 ] 

Andrew Purtell commented on HBASE-5349:
---

bq. seems if we tweak block cache dynamically, it's more prone to trigger a gc 
that moment, right?

This is largely out of our control, until and unless the JVM exposes some knobs 
for GC and early warning signals. Between memstore and blockcache we can't just 
look at occupancy. 

We can provide a null tuner to remove this as a factor when getting to the 
bottom of excessive GCs though.

> Automagically tweak global memstore and block cache sizes based on workload
> ---
>
> Key: HBASE-5349
> URL: https://issues.apache.org/jira/browse/HBASE-5349
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: Anoop Sam John
> Fix For: 0.99.0
>
> Attachments: HBASE-5349_V2.patch, HBASE-5349_V3.patch, 
> HBASE-5349_V4.patch, HBASE-5349_V5.patch, WIP_HBASE-5349.patch
>
>
> Hypertable does a neat thing where it changes the size given to the CellCache 
> (our MemStores) and Block Cache based on the workload. If you need an image, 
> scroll down at the bottom of this link: 
> http://www.hypertable.com/documentation/architecture/
> That'd be one less thing to configure.





[jira] [Updated] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput

2013-12-12 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-8755:
-

Attachment: 8755v8.txt

Same as [~fenghh]'s patch, only it checks for a null writer before using it -- 
this is currently in the code and seems to make this patch work again (I'm 
testing) -- and adds this on the tail of each Async* thread:

{code}
+  } catch (Exception e) {
+    LOG.error("UNEXPECTED", e);
   } finally {
{code}

Also renames the threads from AsyncHLog* to WAL.Async.  Minor.

> A new write thread model for HLog to improve the overall HBase write 
> throughput
> ---
>
> Key: HBASE-8755
> URL: https://issues.apache.org/jira/browse/HBASE-8755
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance, wal
>Reporter: Feng Honghua
>Assignee: stack
>Priority: Critical
> Attachments: 8755-syncer.patch, 8755trunkV2.txt, 8755v8.txt, 
> HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, HBASE-8755-0.96-v0.patch, 
> HBASE-8755-trunk-V0.patch, HBASE-8755-trunk-V1.patch, 
> HBASE-8755-trunk-v4.patch, HBASE-8755-trunk-v6.patch, 
> HBASE-8755-trunk-v7.patch, HBASE-8755-v5.patch, thread.out
>
>
> In current write model, each write handler thread (executing put()) will 
> individually go through a full 'append (hlog local buffer) => HLog writer 
> append (write to hdfs) => HLog writer sync (sync hdfs)' cycle for each write, 
> which incurs heavy race condition on updateLock and flushLock.
> The only existing optimization, where a handler checks whether the current 
> syncTillHere > txid in the hope that another thread has already written/synced 
> its txid to hdfs (so its own write/sync can be skipped), helps much less than 
> expected.
> Three of my colleagues(Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi 
> proposed a new write thread model for writing hdfs sequence file and the 
> prototype implementation shows a 4X improvement for throughput (from 17000 to 
> 7+). 
> I apply this new write thread model in HLog and the performance test in our 
> test cluster shows about 3X throughput improvement (from 12150 to 31520 for 1 
> RS, from 22000 to 7 for 5 RS), the 1 RS write throughput (1K row-size) 
> even beats the one of BigTable (Precolator published in 2011 says Bigtable's 
> write throughput then is 31002). I can provide the detailed performance test 
> results if anyone is interested.
> The change for new write thread model is as below:
>  1> All put handler threads append the edits to HLog's local pending buffer; 
> (it notifies AsyncWriter thread that there is new edits in local buffer)
>  2> All put handler threads wait in HLog.syncer() function for underlying 
> threads to finish the sync that contains its txid;
>  3> An single AsyncWriter thread is responsible for retrieve all the buffered 
> edits in HLog's local pending buffer and write to the hdfs 
> (hlog.writer.append); (it notifies AsyncFlusher thread that there is new 
> writes to hdfs that needs a sync)
>  4> An single AsyncFlusher thread is responsible for issuing a sync to hdfs 
> to persist the writes by AsyncWriter; (it notifies the AsyncNotifier thread 
> that sync watermark increases)
>  5> An single AsyncNotifier thread is responsible for notifying all pending 
> put handler threads which are waiting in the HLog.syncer() function
>  6> No LogSyncer thread any more (since there is always 
> AsyncWriter/AsyncFlusher threads do the same job it does)
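The handoff in steps 1> through 5> above can be sketched with a single monitor. To keep the sketch short, the AsyncWriter, AsyncFlusher, and AsyncNotifier stages are collapsed into one background step; all names are illustrative, not the patch's actual code:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of the proposed pipeline: many handler threads append
// edits and block until syncedTxid covers their txid; single background
// threads drain the buffer and advance the watermark.
class AsyncWal {
  private final List<byte[]> pendingEdits = new ArrayList<>();
  private long lastAssignedTxid = 0;   // highest txid handed out to a handler
  private long syncedTxid = 0;         // highest txid durably synced to hdfs

  /** Handler path (steps 1-2): buffer the edit, return its txid. */
  synchronized long append(byte[] edit) {
    pendingEdits.add(edit);
    long txid = ++lastAssignedTxid;
    notifyAll();                       // wake AsyncWriter: new edits buffered
    return txid;
  }

  /** Handler path continued: block until the sync covers this txid. */
  synchronized void syncUpTo(long txid) throws InterruptedException {
    while (syncedTxid < txid) {
      wait();                          // parked until the watermark advances
    }
  }

  /** AsyncWriter + AsyncFlusher + AsyncNotifier (steps 3-5) in one step:
   *  drain the buffer, "write and sync" it, then release blocked handlers. */
  synchronized void drainWriteAndSync() {
    pendingEdits.clear();              // stands in for writer.append + sync
    syncedTxid = lastAssignedTxid;
    notifyAll();                       // AsyncNotifier role: wake waiters
  }
}
```

The gain comes from batching: one background write/sync cycle covers every edit buffered since the last cycle, instead of each handler racing through its own cycle.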





[jira] [Commented] (HBASE-10148) [VisibilityController] Tolerate regions in recovery

2013-12-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846922#comment-13846922
 ] 

Ted Yu commented on HBASE-10148:


Riding over the recovery is feasible.
HRegion has this method:
{code}
  public boolean isRecovering() {
{code}
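Riding over recovery could be a bounded wait on that flag before the label-table read; a sketch under assumed names, not the actual VisibilityController fix:

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

class RecoveryRetry {
  /** Waits for isRecovering to clear, then runs the call; fails past the deadline. */
  static <T> T callWhenRecovered(BooleanSupplier isRecovering, Supplier<T> call,
                                 long timeoutMs) throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (isRecovering.getAsBoolean()) {
      if (System.currentTimeMillis() >= deadline) {
        throw new IllegalStateException("region still recovering after " + timeoutMs + " ms");
      }
      TimeUnit.MILLISECONDS.sleep(50); // back off while log replay finishes
    }
    return call.get(); // region readable: e.g. scan the labels table
  }
}
```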

> [VisibilityController] Tolerate regions in recovery
> ---
>
> Key: HBASE-10148
> URL: https://issues.apache.org/jira/browse/HBASE-10148
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Andrew Purtell
>Assignee: Anoop Sam John
> Fix For: 0.98.0
>
>
> Ted Yu reports that enabling distributed log replay by default, like:
> {noformat}
> Index: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
> ===
> --- hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
> (revision 1550575)
> +++ hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
> (working copy)
> @@ -794,7 +794,7 @@
>/** Conf key that enables unflushed WAL edits directly being replayed to 
> region servers */
>public static final String DISTRIBUTED_LOG_REPLAY_KEY = 
> "hbase.master.distributed.log.replay";
> -  public static final boolean DEFAULT_DISTRIBUTED_LOG_REPLAY_CONFIG = false;
> +  public static final boolean DEFAULT_DISTRIBUTED_LOG_REPLAY_CONFIG = true;
>public static final String DISALLOW_WRITES_IN_RECOVERING =
>"hbase.regionserver.disallow.writes.when.recovering";
>public static final boolean DEFAULT_DISALLOW_WRITES_IN_RECOVERING_CONFIG = 
> false;
> {noformat}
> causes TestVisibilityController#testAddVisibilityLabelsOnRSRestart to fail. 
> It reveals an issue with label operations if the label table is recovering:
> {noformat}
> 2013-12-12 14:53:53,133 DEBUG [RpcServer.handler=2,port=58108] 
> visibility.VisibilityController(1046): Adding the label XYZ2013-12-12 
> 14:53:53,137 ERROR [RpcServer.handler=2,port=58108] 
> visibility.VisibilityController(1074): 
> org.apache.hadoop.hbase.exceptions.RegionInRecoveryException: 
> hbase:labels,,138626648.f14a399ba85cbb42c2c3b7547bf17c65. is recovering
> 2013-12-12 14:53:53,151 DEBUG [main] visibility.TestVisibilityLabels(405): 
> response from addLabels: result {
>   exception {
> name: "org.apache.hadoop.hbase.exceptions.RegionInRecoveryException"
> value: "org.apache.hadoop.hbase.exceptions.RegionInRecoveryException: 
> hbase:labels,,138626648.f14a399ba85cbb42c2c3b7547bf17c65. is recovering 
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:)
>  at 
> org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1763) at 
> org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1749) at 
> org.apache.hadoop.hbase.security.visibility.VisibilityController.getExistingLabelsWithAuths(VisibilityController.java:1096)
>  at 
> org.apache.hadoop.hbase.security.visibility.VisibilityController.postBatchMutate(VisibilityController.java:672)"
> {noformat}
> Should we try to ride over this?





[jira] [Commented] (HBASE-8369) MapReduce over snapshot files

2013-12-12 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846918#comment-13846918
 ] 

Jesse Yates commented on HBASE-8369:


It would also be odd to have it deprecated in 0.94, gone in 0.96, and then back 
again in 0.98 - it makes users wonder what happened in the middle there (since 
they probably won't read this jira to get context). 

I guess if we are explicit in the javadocs then it would be alright, just funky.

> MapReduce over snapshot files
> -
>
> Key: HBASE-8369
> URL: https://issues.apache.org/jira/browse/HBASE-8369
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce, snapshots
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.98.0
>
> Attachments: HBASE-8369-0.94.patch, HBASE-8369-0.94_v2.patch, 
> HBASE-8369-0.94_v3.patch, HBASE-8369-0.94_v4.patch, HBASE-8369-0.94_v5.patch, 
> HBASE-8369-trunk_v1.patch, HBASE-8369-trunk_v2.patch, 
> HBASE-8369-trunk_v3.patch, hbase-8369_v0.patch, hbase-8369_v11.patch, 
> hbase-8369_v5.patch, hbase-8369_v6.patch, hbase-8369_v7.patch, 
> hbase-8369_v8.patch, hbase-8369_v9.patch
>
>
> The idea is to add an InputFormat, which can run the mapreduce job over 
> snapshot files directly bypassing hbase server layer. The IF is similar in 
> usage to TableInputFormat, taking a Scan object from the user, but instead of 
> running from an online table, it runs from a table snapshot. We do one split 
> per region in the snapshot, and open an HRegion inside the RecordReader. A 
> RegionScanner is used internally for doing the scan without any HRegionServer 
> bits. 
> Users have been asking and searching for ways to run MR jobs by reading 
> directly from hfiles, so this allows new use cases if reading from stale data 
> is ok:
>  - Take snapshots periodically, and run MR jobs only on snapshots.
>  - Export snapshots to remote hdfs cluster, run the MR jobs at that cluster 
> without HBase cluster.
>  - (Future use case) Combine snapshot data with online hbase data: Scan from 
> yesterday's snapshot, but read today's data from online hbase cluster. 





[jira] [Commented] (HBASE-10005) TestVisibilityLabels fails occasionally

2013-12-12 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846915#comment-13846915
 ] 

Andrew Purtell commented on HBASE-10005:


Ok [~yuzhih...@gmail.com], I opened HBASE-10148

> TestVisibilityLabels fails occasionally
> ---
>
> Key: HBASE-10005
> URL: https://issues.apache.org/jira/browse/HBASE-10005
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Ted Yu
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 0.98.0
>
> Attachments: 
> 10005-TEST-org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.xml,
>  10005-org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.txt, 
> HBASE-10005.patch, HBASE-10005_addendum.patch
>
>
> I got the following test failures running test suite on hadoop-2 where 
> distributed log replay was turned on :
> {code}
> testAddVisibilityLabelsOnRSRestart(org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels)
>   Time elapsed: 0.019 sec  <<< FAILURE!
> java.lang.AssertionError: The count should be 8 expected:<8> but was:<6>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.testAddVisibilityLabelsOnRSRestart(TestVisibilityLabels.java:408)
> ...
> testClearUserAuths(org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels)
>   Time elapsed: 0.002 sec  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.testClearUserAuths(TestVisibilityLabels.java:505)
> {code}
> Logs to be attached





[jira] [Created] (HBASE-10148) [VisibilityController] Tolerate regions in recovery

2013-12-12 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-10148:
--

 Summary: [VisibilityController] Tolerate regions in recovery
 Key: HBASE-10148
 URL: https://issues.apache.org/jira/browse/HBASE-10148
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Andrew Purtell
Assignee: Anoop Sam John
 Fix For: 0.98.0


Ted Yu reports that enabling distributed log replay by default, like:

{noformat}
Index: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
===
--- hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java  
(revision 1550575)
+++ hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java  
(working copy)
@@ -794,7 +794,7 @@

   /** Conf key that enables unflushed WAL edits directly being replayed to 
region servers */
   public static final String DISTRIBUTED_LOG_REPLAY_KEY = 
"hbase.master.distributed.log.replay";
-  public static final boolean DEFAULT_DISTRIBUTED_LOG_REPLAY_CONFIG = false;
+  public static final boolean DEFAULT_DISTRIBUTED_LOG_REPLAY_CONFIG = true;
   public static final String DISALLOW_WRITES_IN_RECOVERING =
   "hbase.regionserver.disallow.writes.when.recovering";
   public static final boolean DEFAULT_DISALLOW_WRITES_IN_RECOVERING_CONFIG = 
false;
{noformat}

causes TestVisibilityController#testAddVisibilityLabelsOnRSRestart to fail. It 
reveals an issue with label operations if the label table is recovering:

{noformat}
2013-12-12 14:53:53,133 DEBUG [RpcServer.handler=2,port=58108] 
visibility.VisibilityController(1046): Adding the label XYZ2013-12-12 
14:53:53,137 ERROR [RpcServer.handler=2,port=58108] 
visibility.VisibilityController(1074): 
org.apache.hadoop.hbase.exceptions.RegionInRecoveryException: 
hbase:labels,,138626648.f14a399ba85cbb42c2c3b7547bf17c65. is recovering
2013-12-12 14:53:53,151 DEBUG [main] visibility.TestVisibilityLabels(405): 
response from addLabels: result {
  exception {
name: "org.apache.hadoop.hbase.exceptions.RegionInRecoveryException"
value: "org.apache.hadoop.hbase.exceptions.RegionInRecoveryException: 
hbase:labels,,138626648.f14a399ba85cbb42c2c3b7547bf17c65. is recovering at 
org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:)
 at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1763) 
at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1749) 
at 
org.apache.hadoop.hbase.security.visibility.VisibilityController.getExistingLabelsWithAuths(VisibilityController.java:1096)
 at 
org.apache.hadoop.hbase.security.visibility.VisibilityController.postBatchMutate(VisibilityController.java:672)"
{noformat}

Should we try to ride over this?





[jira] [Commented] (HBASE-10107) [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on Jenkins

2013-12-12 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846911#comment-13846911
 ] 

Andrew Purtell commented on HBASE-10107:


So do we disable this test then because some OSes are not sane? If you look at 
the line of the test that is failing, there appears to be some extraneous issue 
with Kerberos. 

> [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on 
> Jenkins
> ---
>
> Key: HBASE-10107
> URL: https://issues.apache.org/jira/browse/HBASE-10107
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
>
> TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation will fail up on Jenkins 
> in builds using "JDK 7 (latest)" but not those using "JDK 6 (latest)". The 
> stacktrace:
> {noformat}
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation(TestHBaseSaslRpcClient.java:119)
> {noformat}





[jira] [Commented] (HBASE-8369) MapReduce over snapshot files

2013-12-12 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846909#comment-13846909
 ] 

Enis Soztutar commented on HBASE-8369:
--

bq. We can always just maintain it locally in our own repo.
BTW, I am not against committing this to 0.94, even though 0.96 may or may not have 
it. We should just be very explicit about it (with javadoc, or maybe a 
@deprecated notice?).

> MapReduce over snapshot files
> -
>
> Key: HBASE-8369
> URL: https://issues.apache.org/jira/browse/HBASE-8369
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce, snapshots
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.98.0
>
> Attachments: HBASE-8369-0.94.patch, HBASE-8369-0.94_v2.patch, 
> HBASE-8369-0.94_v3.patch, HBASE-8369-0.94_v4.patch, HBASE-8369-0.94_v5.patch, 
> HBASE-8369-trunk_v1.patch, HBASE-8369-trunk_v2.patch, 
> HBASE-8369-trunk_v3.patch, hbase-8369_v0.patch, hbase-8369_v11.patch, 
> hbase-8369_v5.patch, hbase-8369_v6.patch, hbase-8369_v7.patch, 
> hbase-8369_v8.patch, hbase-8369_v9.patch
>
>
> The idea is to add an InputFormat, which can run the mapreduce job over 
> snapshot files directly bypassing hbase server layer. The IF is similar in 
> usage to TableInputFormat, taking a Scan object from the user, but instead of 
> running from an online table, it runs from a table snapshot. We do one split 
> per region in the snapshot, and open an HRegion inside the RecordReader. A 
> RegionScanner is used internally for doing the scan without any HRegionServer 
> bits. 
> Users have been asking and searching for ways to run MR jobs by reading 
> directly from hfiles, so this allows new use cases if reading from stale data 
> is ok:
>  - Take snapshots periodically, and run MR jobs only on snapshots.
>  - Export snapshots to remote hdfs cluster, run the MR jobs at that cluster 
> without HBase cluster.
>  - (Future use case) Combine snapshot data with online hbase data: Scan from 
> yesterday's snapshot, but read today's data from online hbase cluster. 
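The description above can be made concrete with a driver sketch. This assumes the TableSnapshotInputFormat / TableMapReduceUtil.initTableSnapshotMapperJob API that this JIRA introduces; the snapshot name, restore directory, and mapper are illustrative, and the job needs a live HDFS with snapshot files to actually run:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class SnapshotScanJob {

  // Mapper receives rows exactly as it would from TableInputFormat,
  // but they are read from snapshot hfiles with no region server involved.
  static class RowCounter extends TableMapper<NullWritable, NullWritable> {
    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context ctx) {
      ctx.getCounter("snapshot", "rows").increment(1);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "scan-over-snapshot");
    job.setJarByClass(SnapshotScanJob.class);

    // Same Scan object a TableInputFormat job would take.
    Scan scan = new Scan();
    TableMapReduceUtil.initTableSnapshotMapperJob(
        "my_snapshot",                      // hypothetical snapshot name
        scan, RowCounter.class,
        NullWritable.class, NullWritable.class, job,
        true,                               // ship dependency jars
        new Path("/tmp/snapshot-restore")); // temp dir for restored references

    job.setOutputFormatClass(NullOutputFormat.class);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Note the one-split-per-region behavior described above means parallelism is fixed by the snapshot's region count, not by the scan range.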





[jira] [Commented] (HBASE-10107) [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on Jenkins

2013-12-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846908#comment-13846908
 ] 

Ted Yu commented on HBASE-10107:


From https://builds.apache.org/job/HBase-TRUNK/4720/consoleFull :
bq. Building remotely on ubuntu1 in workspace 
/home/jenkins/jenkins-slave/workspace/HBase-TRUNK

> [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on 
> Jenkins
> ---
>
> Key: HBASE-10107
> URL: https://issues.apache.org/jira/browse/HBASE-10107
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
>
> TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation will fail up on Jenkins 
> in builds using "JDK 7 (latest)" but not those using "JDK 6 (latest)". The 
> stacktrace:
> {noformat}
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation(TestHBaseSaslRpcClient.java:119)
> {noformat}





[jira] [Commented] (HBASE-10005) TestVisibilityLabels fails occasionally

2013-12-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846904#comment-13846904
 ] 

Ted Yu commented on HBASE-10005:


Andy:
Use of OpenJDK was a red herring.
You can reproduce the test failure with the following change:
{code}
Index: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
===
--- hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java  
(revision 1550575)
+++ hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java  
(working copy)
@@ -794,7 +794,7 @@

   /** Conf key that enables unflushed WAL edits directly being replayed to 
region servers */
   public static final String DISTRIBUTED_LOG_REPLAY_KEY = 
"hbase.master.distributed.log.replay";
-  public static final boolean DEFAULT_DISTRIBUTED_LOG_REPLAY_CONFIG = false;
+  public static final boolean DEFAULT_DISTRIBUTED_LOG_REPLAY_CONFIG = true;
   public static final String DISALLOW_WRITES_IN_RECOVERING =
   "hbase.regionserver.disallow.writes.when.recovering";
   public static final boolean DEFAULT_DISALLOW_WRITES_IN_RECOVERING_CONFIG = 
false;
{code}
As shown in the test output above, when testAddVisibilityLabelsOnRSRestart opens 
a scanner to scan the labels table, the region is still in recovery and not 
ready for scanning.
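The diff above flips a compile-time default; the same switch can be toggled per cluster via configuration. A minimal sketch of the equivalent hbase-site.xml fragment, assuming the standard property-override mechanism (the property name is taken from the DISTRIBUTED_LOG_REPLAY_KEY constant in the diff):

```xml
<!-- Enable unflushed WAL edits being replayed directly to region servers,
     reproducing the recovery window the test hits. -->
<property>
  <name>hbase.master.distributed.log.replay</name>
  <value>true</value>
</property>
```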

> TestVisibilityLabels fails occasionally
> ---
>
> Key: HBASE-10005
> URL: https://issues.apache.org/jira/browse/HBASE-10005
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Ted Yu
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 0.98.0
>
> Attachments: 
> 10005-TEST-org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.xml,
>  10005-org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.txt, 
> HBASE-10005.patch, HBASE-10005_addendum.patch
>
>
> I got the following test failures running test suite on hadoop-2 where 
> distributed log replay was turned on :
> {code}
> testAddVisibilityLabelsOnRSRestart(org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels)
>   Time elapsed: 0.019 sec  <<< FAILURE!
> java.lang.AssertionError: The count should be 8 expected:<8> but was:<6>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.testAddVisibilityLabelsOnRSRestart(TestVisibilityLabels.java:408)
> ...
> testClearUserAuths(org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels)
>   Time elapsed: 0.002 sec  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.testClearUserAuths(TestVisibilityLabels.java:505)
> {code}
> Logs to be attached





[jira] [Commented] (HBASE-10076) Backport MapReduce over snapshot files [0.94]

2013-12-12 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846900#comment-13846900
 ] 

Enis Soztutar commented on HBASE-10076:
---

Looks like all the pieces are here :) 
- Remove System.out.println("In restore!");
- We should remove the scanMetrics change to ClientScanner from the original patch:
{code}
 - protected ScanMetrics scanMetrics = null;
{code}
- Any interest in bringing in the new test TestCellUtil.testOverlappingKeys()? 
Not strictly needed, just checking.
- It would be good to have IntegrationTestTableSnapshotInputFormat in the same 
package (mapreduce).
- Not sure about the change in TableInputFormatBase. Is it needed? Let's leave 
it out otherwise.
- In the original patch, TableMapReduceUtil.initTableMapperJob() now accepts 
an initCredentials param, because we do not want to get tokens from HBase at 
all. Otherwise, if HBase is used with security, offline clusters won't work. 


> Backport MapReduce over snapshot files [0.94]
> -
>
> Key: HBASE-10076
> URL: https://issues.apache.org/jira/browse/HBASE-10076
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.15
>
> Attachments: hbase-10076-v0.patch
>
>
> MapReduce over Snapshots would be valuable on 0.94.





[jira] [Commented] (HBASE-10107) [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on Jenkins

2013-12-12 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846899#comment-13846899
 ] 

Andrew Purtell commented on HBASE-10107:


Can't reproduce locally on Ubuntu 13.10 x86_64 with Oracle JRE 7u25 after 10 
repetitions. I didn't see it with OpenJDK 7u45 either. Will continue to 100 but 
not expecting to see anything.

> [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on 
> Jenkins
> ---
>
> Key: HBASE-10107
> URL: https://issues.apache.org/jira/browse/HBASE-10107
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
>
> TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation will fail up on Jenkins 
> in builds using "JDK 7 (latest)" but not those using "JDK 6 (latest)". The 
> stacktrace:
> {noformat}
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation(TestHBaseSaslRpcClient.java:119)
> {noformat}





[jira] [Commented] (HBASE-10107) [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on Jenkins

2013-12-12 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846887#comment-13846887
 ] 

Andrew Purtell commented on HBASE-10107:


I'm not seeing it on something based on CentOS. So are the ASF Jenkins builds 
also running Ubuntu?

> [JDK7] TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation failing on 
> Jenkins
> ---
>
> Key: HBASE-10107
> URL: https://issues.apache.org/jira/browse/HBASE-10107
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0
>
>
> TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation will fail up on Jenkins 
> in builds using "JDK 7 (latest)" but not those using "JDK 6 (latest)". The 
> stacktrace:
> {noformat}
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation(TestHBaseSaslRpcClient.java:119)
> {noformat}





[jira] [Commented] (HBASE-10005) TestVisibilityLabels fails occasionally

2013-12-12 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846881#comment-13846881
 ] 

Andrew Purtell commented on HBASE-10005:


bq. Java: jre-1.7.0-openjdk.x86_64

I don't see this failure anywhere, but have Oracle JVMs for 6 and 7 set up, so 
that could be the difference.

[~yuzhih...@gmail.com], do you know anyone running OpenJDK in production?



> TestVisibilityLabels fails occasionally
> ---
>
> Key: HBASE-10005
> URL: https://issues.apache.org/jira/browse/HBASE-10005
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Ted Yu
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 0.98.0
>
> Attachments: 
> 10005-TEST-org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.xml,
>  10005-org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.txt, 
> HBASE-10005.patch, HBASE-10005_addendum.patch
>
>
> I got the following test failures running test suite on hadoop-2 where 
> distributed log replay was turned on :
> {code}
> testAddVisibilityLabelsOnRSRestart(org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels)
>   Time elapsed: 0.019 sec  <<< FAILURE!
> java.lang.AssertionError: The count should be 8 expected:<8> but was:<6>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.testAddVisibilityLabelsOnRSRestart(TestVisibilityLabels.java:408)
> ...
> testClearUserAuths(org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels)
>   Time elapsed: 0.002 sec  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.testClearUserAuths(TestVisibilityLabels.java:505)
> {code}
> Logs to be attached





[jira] [Commented] (HBASE-10005) TestVisibilityLabels fails occasionally

2013-12-12 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846884#comment-13846884
 ] 

Andrew Purtell commented on HBASE-10005:


bq. Clarification: testAddVisibilityLabelsOnRSRestart fails when log replay is 
turned on.

Wait. We are crossing comments on JIRA again. Please provide steps for 
reproducing the problem, I think we are missing some detail there.

> TestVisibilityLabels fails occasionally
> ---
>
> Key: HBASE-10005
> URL: https://issues.apache.org/jira/browse/HBASE-10005
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Ted Yu
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 0.98.0
>
> Attachments: 
> 10005-TEST-org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.xml,
>  10005-org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.txt, 
> HBASE-10005.patch, HBASE-10005_addendum.patch
>
>
> I got the following test failures running test suite on hadoop-2 where 
> distributed log replay was turned on :
> {code}
> testAddVisibilityLabelsOnRSRestart(org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels)
>   Time elapsed: 0.019 sec  <<< FAILURE!
> java.lang.AssertionError: The count should be 8 expected:<8> but was:<6>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.testAddVisibilityLabelsOnRSRestart(TestVisibilityLabels.java:408)
> ...
> testClearUserAuths(org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels)
>   Time elapsed: 0.002 sec  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.testClearUserAuths(TestVisibilityLabels.java:505)
> {code}
> Logs to be attached





[jira] [Commented] (HBASE-8369) MapReduce over snapshot files

2013-12-12 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846883#comment-13846883
 ] 

Lars Hofhansl commented on HBASE-8369:
--

We have a backport patch already in HBASE-10076. We can always just maintain it 
locally in our own repo.

This is a very low-risk change, though... almost no existing classes are 
changed. And it would be a good story for addressing HBase's poor scan 
performance (even with M/R). What do you say [~stack]? :)


> MapReduce over snapshot files
> -
>
> Key: HBASE-8369
> URL: https://issues.apache.org/jira/browse/HBASE-8369
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce, snapshots
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.98.0
>
> Attachments: HBASE-8369-0.94.patch, HBASE-8369-0.94_v2.patch, 
> HBASE-8369-0.94_v3.patch, HBASE-8369-0.94_v4.patch, HBASE-8369-0.94_v5.patch, 
> HBASE-8369-trunk_v1.patch, HBASE-8369-trunk_v2.patch, 
> HBASE-8369-trunk_v3.patch, hbase-8369_v0.patch, hbase-8369_v11.patch, 
> hbase-8369_v5.patch, hbase-8369_v6.patch, hbase-8369_v7.patch, 
> hbase-8369_v8.patch, hbase-8369_v9.patch
>
>
> The idea is to add an InputFormat, which can run the mapreduce job over 
> snapshot files directly bypassing hbase server layer. The IF is similar in 
> usage to TableInputFormat, taking a Scan object from the user, but instead of 
> running from an online table, it runs from a table snapshot. We do one split 
> per region in the snapshot, and open an HRegion inside the RecordReader. A 
> RegionScanner is used internally for doing the scan without any HRegionServer 
> bits. 
> Users have been asking and searching for ways to run MR jobs by reading 
> directly from hfiles, so this allows new use cases if reading from stale data 
> is ok:
>  - Take snapshots periodically, and run MR jobs only on snapshots.
>  - Export snapshots to remote hdfs cluster, run the MR jobs at that cluster 
> without HBase cluster.
>  - (Future use case) Combine snapshot data with online hbase data: Scan from 
> yesterday's snapshot, but read today's data from online hbase cluster. 





[jira] [Commented] (HBASE-10005) TestVisibilityLabels fails occasionally

2013-12-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846876#comment-13846876
 ] 

Ted Yu commented on HBASE-10005:


Clarification:
testAddVisibilityLabelsOnRSRestart fails when log replay is turned on.
I added some debug logging:
{code}
2013-12-12 14:53:53,131 DEBUG [RpcServer.handler=2,port=58108] 
visibility.VisibilityController(1271): The list of auths are [system]
2013-12-12 14:53:53,133 DEBUG [RpcServer.handler=2,port=58108] 
visibility.VisibilityController(1046): Adding the label ABC
2013-12-12 14:53:53,133 DEBUG [RpcServer.handler=2,port=58108] 
visibility.VisibilityController(1046): Adding the label XYZ2013-12-12 
14:53:53,137 ERROR [RpcServer.handler=2,port=58108] 
visibility.VisibilityController(1074): 
org.apache.hadoop.hbase.exceptions.RegionInRecoveryException: 
hbase:labels,,138626648.f14a399ba85cbb42c2c3b7547bf17c65. is recovering
2013-12-12 14:53:53,151 DEBUG [main] visibility.TestVisibilityLabels(405): 
response from addLabels: result {
  exception {
name: 
"org.apache.hadoop.hbase.exceptions.RegionInRecoveryException"
value: "org.apache.hadoop.hbase.exceptions.RegionInRecoveryException: 
hbase:labels,,138626648.f14a399ba85cbb42c2c3b7547bf17c65. is 
recovering\n\tat 
org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:)\n\tat
 
org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1763)\n\tat
 
org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1749)\n\tat
 
org.apache.hadoop.hbase.security.visibility.VisibilityController.getExistingLabelsWithAuths(VisibilityController.java:1096)\n\tat
 
org.apache.hadoop.hbase.security.visibility.VisibilityController.postBatchMutate(VisibilityController.java:672)\n\tat
 
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postBatchMutate(RegionCoprocessorHost.java:1069)\n\tat
 
org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:2401)\n\tat
 
org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2087)\n\tat
 
org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2037)\n\tat
 
org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2041)\n\tat
 
org.apache.hadoop.hbase.security.visibility.VisibilityController.addLabels(VisibilityController.java:1059)\n\tat
 
org.apache.hadoop.hbase.protobuf.generated.VisibilityLabelsProtos$VisibilityLabelsService$1.addLabels(VisibilityLabelsProtos.java:5014)\n\tat
 
org.apache.hadoop.hbase.protobuf.generated.VisibilityLabelsProtos$VisibilityLabelsService.callMethod(VisibilityLabelsProtos.java:5178)\n\tat
 
org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:5357)\n\tat
 
org.apache.hadoop.hbase.regionserver.HRegionServer.execService(HRegionServer.java:3275)\n\tat
 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:28458)\n\tat
 org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2008)\n\tat 
org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:92)\n\tat 
org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)\n\tat
 
org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)\n\tat
 
org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)\n\tat
 java.lang.Thread.run(Thread.java:744)\n"
  }
}
{code}

> TestVisibilityLabels fails occasionally
> ---
>
> Key: HBASE-10005
> URL: https://issues.apache.org/jira/browse/HBASE-10005
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Ted Yu
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 0.98.0
>
> Attachments: 
> 10005-TEST-org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.xml,
>  10005-org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.txt, 
> HBASE-10005.patch, HBASE-10005_addendum.patch
>
>
> I got the following test failures running test suite on hadoop-2 where 
> distributed log replay was turned on :
> {code}
> testAddVisibilityLabelsOnRSRestart(org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels)
>   Time elapsed: 0.019 sec  <<< FAILURE!
> java.lang.AssertionError: The count should be 8 expected:<8> but was:<6>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.testAddVisibilityLabelsOnRSRestart(TestVisibilityLabels.java:408)
> ...
> testClearUserAuths(org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels)
>   Time elapsed: 0.002 sec  <<< FAILURE!
> java.lang.AssertionError

[jira] [Updated] (HBASE-10106) Remove some unnecessary code from TestOpenTableInCoprocessor

2013-12-12 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-10106:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to all marked branches.

> Remove some unnecessary code from TestOpenTableInCoprocessor
> 
>
> Key: HBASE-10106
> URL: https://issues.apache.org/jira/browse/HBASE-10106
> Project: HBase
>  Issue Type: Test
>Affects Versions: 0.98.0, 0.96.0, 0.94.15, 0.99.0
>Reporter: Benoit Sigoure
>Assignee: Benoit Sigoure
>Priority: Trivial
> Attachments: hbase-10106-0.94.patch, hbase-10106.txt
>
>
> {code}
> diff --git 
> a/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestOpenTableInCoprocessor.java
>  
> b/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestOpenTableInCoprocessor.java
> index 7bc2a78..67b97ce 100644
> --- 
> a/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestOpenTableInCoprocessor.java
> +++ 
> b/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestOpenTableInCoprocessor.java
> @@ -69,8 +69,6 @@ public class TestOpenTableInCoprocessor {
>  public void prePut(final ObserverContext 
> e, final Put put,
>  final WALEdit edit, final Durability durability) throws IOException {
>HTableInterface table = e.getEnvironment().getTable(otherTable);
> -  Put p = new Put(new byte[] { 'a' });
> -  p.add(family, null, new byte[] { 'a' });
>table.put(put);
>table.flushCommits();
>completed[0] = true;
> {code}





[jira] [Updated] (HBASE-10146) Bump HTrace version to 2.04

2013-12-12 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-10146:
--

Affects Version/s: 0.99.0
   0.96.1
   0.98.0
   Status: Patch Available  (was: Open)

> Bump HTrace version to 2.04
> ---
>
> Key: HBASE-10146
> URL: https://issues.apache.org/jira/browse/HBASE-10146
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.96.1, 0.99.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HBASE-10146-0.patch
>
>
> 2.04 has been released with a bug fix for what happens when htrace fails.





[jira] [Commented] (HBASE-8369) MapReduce over snapshot files

2013-12-12 Thread Bryan Keller (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846864#comment-13846864
 ] 

Bryan Keller commented on HBASE-8369:
-

FWIW, you can use the 0.94 patch I submitted without modifying the 0.94 release 
if a couple of minor changes are implemented (mostly naming). I have been using 
it in production for a while now with 0.94. Perhaps it could be tweaked and 
offered separately in a "contrib" directory or something in 0.94, along with 
the caveat about file permissions.

> MapReduce over snapshot files
> -
>
> Key: HBASE-8369
> URL: https://issues.apache.org/jira/browse/HBASE-8369
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce, snapshots
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.98.0
>
> Attachments: HBASE-8369-0.94.patch, HBASE-8369-0.94_v2.patch, 
> HBASE-8369-0.94_v3.patch, HBASE-8369-0.94_v4.patch, HBASE-8369-0.94_v5.patch, 
> HBASE-8369-trunk_v1.patch, HBASE-8369-trunk_v2.patch, 
> HBASE-8369-trunk_v3.patch, hbase-8369_v0.patch, hbase-8369_v11.patch, 
> hbase-8369_v5.patch, hbase-8369_v6.patch, hbase-8369_v7.patch, 
> hbase-8369_v8.patch, hbase-8369_v9.patch
>
>
> The idea is to add an InputFormat, which can run the mapreduce job over 
> snapshot files directly bypassing hbase server layer. The IF is similar in 
> usage to TableInputFormat, taking a Scan object from the user, but instead of 
> running from an online table, it runs from a table snapshot. We do one split 
> per region in the snapshot, and open an HRegion inside the RecordReader. A 
> RegionScanner is used internally for doing the scan without any HRegionServer 
> bits. 
> Users have been asking and searching for ways to run MR jobs by reading 
> directly from hfiles, so this allows new use cases if reading from stale data 
> is ok:
>  - Take snapshots periodically, and run MR jobs only on snapshots.
>  - Export snapshots to remote hdfs cluster, run the MR jobs at that cluster 
> without HBase cluster.
>  - (Future use case) Combine snapshot data with online hbase data: Scan from 
> yesterday's snapshot, but read today's data from online hbase cluster. 





[jira] [Commented] (HBASE-5487) Generic framework for Master-coordinated tasks

2013-12-12 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846861#comment-13846861
 ] 

Jonathan Hsieh commented on HBASE-5487:
---

Matteo and Aleks bring up an interesting case that any new master design should 
handle: HBASE-10136.

> Generic framework for Master-coordinated tasks
> --
>
> Key: HBASE-5487
> URL: https://issues.apache.org/jira/browse/HBASE-5487
> Project: HBase
>  Issue Type: New Feature
>  Components: master, regionserver, Zookeeper
>Affects Versions: 0.94.0
>Reporter: Mubarak Seyed
>Assignee: Sergey Shelukhin
>Priority: Critical
> Attachments: Entity management in Master - part 1.pdf, Entity 
> management in Master - part 1.pdf, Is the FATE of Assignment Manager 
> FATE.pdf, Region management in Master.pdf, Region management in Master5.docx, 
> hbckMasterV2-long.pdf, hbckMasterV2b-long.pdf
>
>
> Need a framework to execute master-coordinated tasks in a fault-tolerant 
> manner. 
> Master-coordinated tasks such as online-scheme change and delete-range 
> (deleting region(s) based on start/end key) can make use of this framework.
> The advantages of framework are
> 1. Eliminate repeated code in Master, ZooKeeper tracker and Region-server for 
> master-coordinated tasks
> 2. Ability to abstract the common functions across Master -> ZK and RS -> ZK
> 3. Easy to plugin new master-coordinated tasks without adding code to core 
> components





[jira] [Updated] (HBASE-10146) Bump HTrace version to 2.04

2013-12-12 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-10146:
--

Attachment: HBASE-10146-0.patch

> Bump HTrace version to 2.04
> ---
>
> Key: HBASE-10146
> URL: https://issues.apache.org/jira/browse/HBASE-10146
> Project: HBase
>  Issue Type: Bug
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HBASE-10146-0.patch
>
>
> 2.04 has been released with a bug fix for what happens when htrace fails.





[jira] [Commented] (HBASE-9047) Tool to handle finishing replication when the cluster is offline

2013-12-12 Thread Demai Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846854#comment-13846854
 ] 

Demai Ni commented on HBASE-9047:
-

[~lhofhansl],[~stack], thanks. I will upload a 0.94 patch tonight. 

> Tool to handle finishing replication when the cluster is offline
> 
>
> Key: HBASE-9047
> URL: https://issues.apache.org/jira/browse/HBASE-9047
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.96.0
>Reporter: Jean-Daniel Cryans
>Assignee: Demai Ni
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE-9047-0.94.9-v0.PATCH, HBASE-9047-trunk-v0.patch, 
> HBASE-9047-trunk-v1.patch, HBASE-9047-trunk-v2.patch, 
> HBASE-9047-trunk-v3.patch, HBASE-9047-trunk-v4.patch, 
> HBASE-9047-trunk-v4.patch, HBASE-9047-trunk-v5.patch, 
> HBASE-9047-trunk-v6.patch, HBASE-9047-trunk-v7.patch, 
> HBASE-9047-trunk-v7.patch
>
>
> We're having a discussion on the mailing list about replicating the data on a 
> cluster that was shut down in an offline fashion. The motivation could be 
> that you don't want to bring HBase back up but still need that data on the 
> slave.
> So I have this idea of a tool that would run on the master cluster while it 
> is down, although it could also run at any time. Basically it would be able 
> to read the replication state of each master region server, finish 
> replicating what's missing to all the slaves, and then clear that state in 
> ZooKeeper.
> The code that handles replication does most of that already, see 
> ReplicationSourceManager and ReplicationSource. Basically when 
> ReplicationSourceManager.init() is called, it will check all the queues in ZK 
> and try to grab those that aren't attached to a region server. If the whole 
> cluster is down, it will grab all of them.
> The beautiful thing here is that you could start that tool on all your 
> machines and the load would be spread out. That might not be a big concern 
> if replication wasn't lagging, though, since it would then take only a few 
> seconds to finish replicating the missing data for each region server.
> I'm guessing when starting ReplicationSourceManager you'd give it a fake 
> region server ID, and you'd tell it not to start its own source.
> FWIW, the main difference in how replication is handled between Apache's 
> HBase and Facebook's is that the latter always does it separately from HBase 
> itself. This jira isn't about doing that.
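The queue-claiming idea described above can be sketched in miniature. This is an illustrative Python sketch, NOT HBase's actual API (which is Java and ZooKeeper-backed); the dict of queues, `live_servers`, and the `replicate` callback are stand-ins for the real ZK-backed structures:

```python
# Illustrative sketch: a worker claims replication queues that are no longer
# attached to a running region server, drains them to the slave cluster,
# then clears that state -- mirroring what ReplicationSourceManager.init()
# is described to do when it finds orphaned queues in ZK.

def claim_and_drain(queues, live_servers, replicate):
    """queues: {server_id: [wal_entry, ...]} -- replication state per RS.
    live_servers: ids of region servers that are still running.
    replicate: callback that ships one entry to the slave cluster.
    Returns the ids of the queues this worker finished and cleared."""
    finished = []
    for server_id in list(queues):
        if server_id in live_servers:
            continue  # queue still owned by a running region server
        # "Grab" the orphaned queue; in HBase this would be an atomic
        # ZooKeeper operation so two workers can't claim the same queue.
        pending = queues.pop(server_id)
        for entry in pending:
            replicate(entry)  # finish shipping the missing edits
        finished.append(server_id)  # state cleared for this server
    return finished
```

If the whole cluster is down, `live_servers` is empty and the tool grabs every queue, matching the behavior described above; running it on many machines spreads the queues across workers.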





[jira] [Updated] (HBASE-10146) Bump HTrace version to 2.04

2013-12-12 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-10146:
--

Summary: Bump HTrace version to 2.04  (was: Bump HTrace version)

> Bump HTrace version to 2.04
> ---
>
> Key: HBASE-10146
> URL: https://issues.apache.org/jira/browse/HBASE-10146
> Project: HBase
>  Issue Type: Bug
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>
> 2.04 has been released with a bug fix for what happens when HTrace fails.





[jira] [Created] (HBASE-10147) Canary additions

2013-12-12 Thread stack (JIRA)
stack created HBASE-10147:
-

 Summary: Canary additions
 Key: HBASE-10147
 URL: https://issues.apache.org/jira/browse/HBASE-10147
 Project: HBase
  Issue Type: Improvement
Reporter: stack


I've been using the canary to quickly identify the dodgy machine in my cluster. 
It is useful for this. What would make it better:

+ Rather than saying how long it took to get a region after you have gotten 
the region, it'd be sweet to log the region name and the server it is on 
BEFORE you go to get the region. I ask for this because, as is, I have to 
wait for the canary to time out, which can be a while.
+ Second ask: when I pass -t and it fails, it should say what it failed 
against -- which region and hopefully which server location (might be hard).
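The two asks above amount to a small change in probe bookkeeping. A hedged Python sketch (the names `timed_probe` and `probe` are illustrative, not the Canary's real Java API):

```python
# Sketch of the first ask: log the region name and server BEFORE issuing the
# read, so a hung probe can be attributed immediately instead of waiting for
# the timeout. Also covers the second ask: on failure, report exactly which
# region (and server) the probe failed against.
import logging
import time

def timed_probe(region, server, probe):
    """probe: callable that performs one canary read against `region`.
    Returns the elapsed time in seconds on success; re-raises on failure."""
    logging.info("reading from region %s on server %s", region, server)
    start = time.monotonic()
    try:
        probe(region)
    except Exception:
        # Say what failed and where, rather than a bare timeout/error.
        logging.error("read FAILED for region %s on server %s", region, server)
        raise
    elapsed = time.monotonic() - start
    logging.info("read of region %s took %.0f ms", region, elapsed * 1000)
    return elapsed
```

The key design point is simply ordering: the identifying log line is emitted before the potentially blocking read, so the operator sees the culprit even while the probe hangs.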





[jira] [Commented] (HBASE-8369) MapReduce over snapshot files

2013-12-12 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846851#comment-13846851
 ] 

Enis Soztutar commented on HBASE-8369:
--

It will be super confusing for users if it comes in 0.94 but not in 0.96. 
Either the 0.94 version should come with a very visible warning that this 
feature won't be available in 0.96, or it should not come at all. 

> MapReduce over snapshot files
> -
>
> Key: HBASE-8369
> URL: https://issues.apache.org/jira/browse/HBASE-8369
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce, snapshots
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.98.0
>
> Attachments: HBASE-8369-0.94.patch, HBASE-8369-0.94_v2.patch, 
> HBASE-8369-0.94_v3.patch, HBASE-8369-0.94_v4.patch, HBASE-8369-0.94_v5.patch, 
> HBASE-8369-trunk_v1.patch, HBASE-8369-trunk_v2.patch, 
> HBASE-8369-trunk_v3.patch, hbase-8369_v0.patch, hbase-8369_v11.patch, 
> hbase-8369_v5.patch, hbase-8369_v6.patch, hbase-8369_v7.patch, 
> hbase-8369_v8.patch, hbase-8369_v9.patch
>
>
> The idea is to add an InputFormat which can run the mapreduce job over 
> snapshot files directly, bypassing the HBase server layer. The IF is similar 
> in usage to TableInputFormat, taking a Scan object from the user, but instead 
> of running against an online table, it runs against a table snapshot. We do 
> one split per region in the snapshot, and open an HRegion inside the 
> RecordReader. A RegionScanner is used internally for doing the scan without 
> any HRegionServer bits. 
> Users have been asking and searching for ways to run MR jobs by reading 
> directly from hfiles, so this allows new use cases if reading from stale data 
> is ok:
>  - Take snapshots periodically, and run MR jobs only on snapshots.
>  - Export snapshots to a remote hdfs cluster, and run the MR jobs at that 
> cluster without an HBase cluster.
>  - (Future use case) Combine snapshot data with online hbase data: Scan from 
> yesterday's snapshot, but read today's data from online hbase cluster. 
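The "one split per region" strategy described above can be illustrated with a toy sketch. This is hedged Python with simplified stand-ins (key-pair tuples, empty string for an unbounded key), not HBase's actual snapshot manifest or `TableSnapshotInputFormat` classes:

```python
# Sketch: compute one MR input split per snapshot region, keeping only the
# regions that overlap the user's Scan range -- the split-planning step that
# happens before each RecordReader opens its HRegion.

def splits_for_snapshot(regions, scan_start, scan_stop):
    """regions: list of (start_key, end_key) pairs; "" means unbounded.
    Returns the regions overlapping the half-open range [scan_start, scan_stop)."""
    splits = []
    for start, end in regions:
        # A region misses the scan if it ends at/before the scan start...
        if end != "" and scan_start != "" and end <= scan_start:
            continue
        # ...or starts at/after the scan stop key.
        if scan_stop != "" and start != "" and start >= scan_stop:
            continue
        splits.append((start, end))  # one split per overlapping region
    return splits
```

With unbounded scan keys every region yields a split, which is the common full-table case; a bounded Scan prunes regions entirely outside its range before any HRegion is opened.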




