[ 
https://issues.apache.org/jira/browse/HBASE-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461184#comment-13461184
 ] 

Alexander Alten-Lorenz commented on HBASE-6694:
-----------------------------------------------

Confirmed that the patch is working. Job.xml contains:

I spent a whole day populating the table with columns in CF cf1:
hbase shell> for r in 1 .. 10 do for c in 1 .. 100000000 do put 'test1', "row-#{r}", "cf1:c#{c}", "1" end end
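
For context, rows this wide are the case scanner batching targets: without a batch limit a scan hands back each row as one complete result, while with a limit a row arrives in partial results of at most that many columns. The following is only a conceptual sketch in plain Ruby, not HBase code; `batched_results` is a hypothetical helper and the 250-column row is a small stand-in for the real test rows.

```ruby
# Hypothetical helper (not HBase API): split one wide row into partial
# results of at most `batch` columns, as a batched scan does conceptually.
def batched_results(row_key, columns, batch)
  columns.each_slice(batch).map { |chunk| { row: row_key, columns: chunk } }
end

# A small stand-in for one of the wide test rows: 250 columns in cf1.
columns = (1..250).map { |c| "cf1:c#{c}" }
parts = batched_results("row-1", columns, 100)

# Print how many columns each partial result carries.
parts.each { |part| puts "#{part[:row]}: #{part[:columns].size} columns" }
```

With a batch of 100, the 250-column row comes back as three partial results, so no single result has to hold the whole row in memory.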

===========

With -Dhbase.export.scanner.batch=100:

HBASE_CLASSPATH="/usr/lib/hadoop-0.20-mapreduce/hadoop-core-2.0.0-mr1-cdh4.0.1.jar" \
  bin/hbase org.apache.hadoop.hbase.mapreduce.Export \
  -Dhbase.export.scanner.batch=100 test1 /home/hdfs/test2.export
---
{code}
12/09/22 01:17:23 DEBUG mapreduce.TableInputFormatBase: getSplits: split -> 0 -> hadoop4.internal:,
12/09/22 01:17:24 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
12/09/22 01:17:24 WARN conf.Configuration: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
12/09/22 01:17:25 INFO mapred.JobClient: Running job: job_201209212254_0010
12/09/22 01:17:26 INFO mapred.JobClient:  map 0% reduce 0%
12/09/22 01:17:59 INFO mapred.JobClient:  map 100% reduce 0%
12/09/22 01:18:02 INFO mapred.JobClient: Job complete: job_201209212254_0010
12/09/22 01:18:03 INFO mapred.JobClient: Counters: 24
12/09/22 01:18:03 INFO mapred.JobClient:   File System Counters
12/09/22 01:18:03 INFO mapred.JobClient:     FILE: Number of bytes read=0
12/09/22 01:18:03 INFO mapred.JobClient:     FILE: Number of bytes written=84332
12/09/22 01:18:03 INFO mapred.JobClient:     FILE: Number of read operations=0
12/09/22 01:18:03 INFO mapred.JobClient:     FILE: Number of large read operations=0
12/09/22 01:18:03 INFO mapred.JobClient:     FILE: Number of write operations=0
12/09/22 01:18:03 INFO mapred.JobClient:     HDFS: Number of bytes read=70
12/09/22 01:18:03 INFO mapred.JobClient:     HDFS: Number of bytes written=62070728
12/09/22 01:18:03 INFO mapred.JobClient:     HDFS: Number of read operations=1
12/09/22 01:18:03 INFO mapred.JobClient:     HDFS: Number of large read operations=0
12/09/22 01:18:03 INFO mapred.JobClient:     HDFS: Number of write operations=1
12/09/22 01:18:03 INFO mapred.JobClient:   Job Counters
12/09/22 01:18:03 INFO mapred.JobClient:     Launched map tasks=1
12/09/22 01:18:03 INFO mapred.JobClient:     Data-local map tasks=1
12/09/22 01:18:03 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=35760
12/09/22 01:18:03 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
12/09/22 01:18:03 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/09/22 01:18:03 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/09/22 01:18:03 INFO mapred.JobClient:   Map-Reduce Framework
12/09/22 01:18:03 INFO mapred.JobClient:     Map input records=15258
12/09/22 01:18:03 INFO mapred.JobClient:     Map output records=15258
12/09/22 01:18:03 INFO mapred.JobClient:     Input split bytes=70
12/09/22 01:18:03 INFO mapred.JobClient:     Spilled Records=0
12/09/22 01:18:03 INFO mapred.JobClient:     CPU time spent (ms)=5970
12/09/22 01:18:03 INFO mapred.JobClient:     Physical memory (bytes) snapshot=106557440
12/09/22 01:18:03 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=570249216
12/09/22 01:18:03 INFO mapred.JobClient:     Total committed heap usage (bytes)=42663936
{code}

The export is readable in HDFS.
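
A rough reading of the counters from this run, as a sketch only. It assumes each map input record corresponds to one batched scan result of at most 100 cells, and that the HDFS byte count includes whatever file overhead the export adds, so the figures are bounds and averages rather than exact cell counts. The variable names are illustrative.

```ruby
# Counter values copied from the job output of the batched run.
map_input_records  = 15_258
hdfs_bytes_written = 62_070_728
batch              = 100 # from -Dhbase.export.scanner.batch=100

# Assumption: one record holds at most `batch` cells, so this is an upper bound.
max_cells = map_input_records * batch

# Average output size per exported record, including any file overhead.
avg_bytes_per_record = hdfs_bytes_written.to_f / map_input_records

puts "at most #{max_cells} cells exported"
puts format("about %.0f bytes written per record", avg_bytes_per_record)
```

That works out to at most 1,525,800 cells and roughly 4 KB written per record for this run.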

==========

Without the -D switch:

The RegionServer (RS) timed out:
{code}
2012-09-22 01:27:27,937 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fhadoop4%3A8020%2Fhbase%2F.logs%2Fhadoop4.internal%2C60020%2C1348269190284-splitting%2Fhadoop4.internal%252C60020%252C1348269190284.1348269200784 ver = 0
2012-09-22 01:27:28,938 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2012-09-22 01:27:28,938 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: resubmitting unassigned task(s) after timeout
2012-09-22 01:27:29,237 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fhadoop4%3A8020%2Fhbase%2F.logs%2Fhadoop4.internal%2C60020%2C1348269190284-splitting%2Fhadoop4.internal%252C60020%252C1348269190284.1348269200784 ver = 0
2012-09-22 01:27:29,239 INFO org.apache.hadoop.hbase.master.SplitLogManager: task /hbase/splitlog/RESCAN0000000057 entered state done hadoop4.internal,60000,1348269184131
2012-09-22 01:27:29,239 INFO org.apache.hadoop.hbase.master.SplitLogManager: task /hbase/splitlog/RESCAN0000000058 entered state done hadoop4.internal,60000,1348269184131
2012-09-22 01:27:29,282 DEBUG org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted /hbase/splitlog/RESCAN0000000057
2012-09-22 01:27:29,282 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: deleted task without in memory state /hbase/splitlog/RESCAN0000000057
2012-09-22 01:27:29,283 DEBUG org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted /hbase/splitlog/RESCAN0000000058
2012-09-22 01:27:29,283 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: deleted task without in memory state /hbase/splitlog/RESCAN0000000058
{code}

====
Test environment:
Virtual machine with 2 GB RAM, 512 MB heap exported for HBase, Hadoop in cluster mode, HBase pseudo-distributed. I would say it worked.
                
> Test scanner batching in export job feature HBASE-6372 AND report on 
> improvement HBASE-6372 adds
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6694
>                 URL: https://issues.apache.org/jira/browse/HBASE-6694
>             Project: HBase
>          Issue Type: Task
>            Reporter: stack
>            Assignee: Alexander Alten-Lorenz
>         Attachments: HBASE-6694.patch
>
>
> From tail of HBASE-6372, Jon had raised issue that test added did not 
> actually test the feature.  This issue is about adding a test of HBASE-6372.  
> We should also have numbers for the improvement that HBASE-6372 brings.
