Looks like it was a timeout issue; I saw the bulk load log messages repeated more than once.

Does anyone know what the timeout setting name is on completebulkload?
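
For reference, this is roughly what I plan to try, assuming hbase.rpc.timeout is the client-side setting the bulk load call is hitting (I have not confirmed that is the right knob on 0.90.4; the table name "Repo" and the /bulkoutput path are just taken from the log paths quoted below). As far as I understand, the completebulkload tool is just LoadIncrementalHFiles, so this should be equivalent to the command line with the timeout overridden:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class BulkLoadWithLongerTimeout {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Assumption: hbase.rpc.timeout (ms, default 60000) is the client-side
        // RPC timeout the bulk load call runs into. Raising it should give the
        // regionserver time to finish copying large HFiles across filesystems
        // before the client gives up and retries the same files.
        conf.setInt("hbase.rpc.timeout", 600000);

        // "Repo" is inferred from the /hbase/Repo/... paths in the log;
        // /bulkoutput is the HFileOutputFormat output directory from the MR job.
        HTable table = new HTable(conf, "Repo");
        LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
        loader.doBulkLoad(new Path("/bulkoutput"), table);
    }
}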

Billy



"Billy Pearson" <sa...@pearsonwholesale.com> wrote in message news:jnslmk$9ic$1...@dough.gmane.org...
OK, I have an MR job whose output I am trying to import into HBase. With small input loads it finishes within seconds and the command line returns. When I add more input to the MapReduce job, completebulkload hangs on the command line and never returns.

When I run a large completebulkload it keeps trying to copy the input files many times.

Excerpt from the regionserver log:

2012-05-02 18:56:11,629 INFO org.apache.hadoop.hbase.regionserver.Store: File hdfs://node1/bulkoutput/idjuice/436678602442940864 on different filesystem than destination store - moving to this filesystem.
2012-05-02 18:56:12,287 INFO org.apache.hadoop.hbase.regionserver.Store: Copied to temporary path on dst filesystem: hdfs://node1.hadoop.compspy.com/hbase/Repo/751e4b3c4a27e680e8d481be3e11507e/.tmp/6957718427566497105
2012-05-02 18:56:12,288 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming bulk load file hdfs://node1.hadoop.compspy.com/hbase/Repo/751e4b3c4a27e680e8d481be3e11507e/.tmp/6957718427566497105 to hdfs://node1.hadoop.compspy.com/hbase/Repo/751e4b3c4a27e680e8d481be3e11507e/idjuice/499207355457162362
2012-05-02 18:56:12,310 INFO org.apache.hadoop.hbase.regionserver.Store: Moved hfile hdfs://node1.hadoop.compspy.com/hbase/Repo/751e4b3c4a27e680e8d481be3e11507e/.tmp/6957718427566497105 into store directory hdfs://node1.hadoop.compspy.com/hbase/Repo/751e4b3c4a27e680e8d481be3e11507e/idjuice - updating store file list.

I can see the same input file being loaded over and over in the logs.

The final size of the table loaded with TableMapReduceUtil.initTableReducerJob was around 680 MB total across all the stores, as reported by the regionserver. I then ran the import on a larger dataset (same data format, same MR job, just more input), and after the table got to 5 GB I killed the regionserver, truncated the table, and restarted the whole cluster.


The MapReduce job's map output is configured by HFileOutputFormat.configureIncrementalLoad(job, table);
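
For completeness, the job setup looks roughly like this (a sketch, not my exact code: the line-parsing mapper and the "v" qualifier are placeholders, while "idjuice" and "Repo" are taken from the log paths above):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BulkLoadJob {

    // Placeholder mapper: parses "rowkey<TAB>value" lines into Puts.
    // The real parsing logic is not part of this sketch.
    static class LineMapper
            extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
        protected void map(LongWritable key, Text line, Context context)
                throws IOException, InterruptedException {
            String[] parts = line.toString().split("\t", 2);
            byte[] row = Bytes.toBytes(parts[0]);
            Put put = new Put(row);
            put.add(Bytes.toBytes("idjuice"), Bytes.toBytes("v"), Bytes.toBytes(parts[1]));
            context.write(new ImmutableBytesWritable(row), put);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "bulk load");
        job.setJarByClass(BulkLoadJob.class);
        job.setMapperClass(LineMapper.class);
        job.setMapOutputKeyClass(ImmutableBytesWritable.class);
        job.setMapOutputValueClass(Put.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // e.g. /bulkoutput

        // configureIncrementalLoad sets the reducer, the HFileOutputFormat and a
        // TotalOrderPartitioner matched to the table's current region boundaries,
        // so the job writes one HFile per region per column family.
        HTable table = new HTable(conf, "Repo");
        HFileOutputFormat.configureIncrementalLoad(job, table);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}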


The only difference I can see between the two sets of files is that the small one has 1 file to import per column family and the larger one has 8 files per column family.

Any suggestions on where the problem could be?
It seems kind of odd that the small input would work so easily while the large one just runs out of control.

running
hbase-0.90.4-cdh3u3.jar
export HBASE_CLASSPATH=/etc/hbase/conf:/etc/hadoop/conf:/etc/zookeeper




