Data not loaded in table via ImportTSV
Hi,

The background thread is this:
http://mail-archives.apache.org/mod_mbox/hbase-user/201304.mbox/%3ce689a42b73c5a545ad77332a4fc75d8c1efbd80...@vshinmsmbx01.vshodc.lntinfotech.com%3E

I'm referring to the HBase doc:
http://hbase.apache.org/book/ops_mgt.html#importtsv

Accordingly, my command is:

HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.94.6.1.jar importtsv '-Dimporttsv.separator=;' -Dimporttsv.columns=HBASE_ROW_KEY,CUSTOMER_INFO:NAME,CUSTOMER_INFO:EMAIL,CUSTOMER_INFO:ADDRESS,CUSTOMER_INFO:MOBILE -Dimporttsv.bulk.output=hdfs://cldx-1139-1033:9000/hbase/storefileoutput CUSTOMERS hdfs://cldx-1139-1033:9000/hbase/copiedFromLocal/customer.txt

/*classpath echoed here*/
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/home/hduser/hadoop_ecosystem/apache_hadoop/hadoop_installation/hadoop-1.0.4/libexec/../lib/native/Linux-amd64-64
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:java.compiler=NA
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:os.version=3.2.0-23-generic
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:user.name=hduser
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/hduser
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/hduser/hadoop_ecosystem/apache_hbase/hbase_installation/hbase-0.94.6.1/bin
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=cldx-1140-1034:2181 sessionTimeout=18 watcher=hconnection
13/04/16 17:18:43 INFO zookeeper.ClientCnxn: Opening socket connection to server cldx-1140-1034/172.25.6.71:2181. Will not attempt to authenticate using SASL (unknown error)
13/04/16 17:18:43 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 5483@cldx-1139-1033
13/04/16 17:18:43 INFO zookeeper.ClientCnxn: Socket connection established to cldx-1140-1034/172.25.6.71:2181, initiating session
13/04/16 17:18:43 INFO zookeeper.ClientCnxn: Session establishment complete on server cldx-1140-1034/172.25.6.71:2181, sessionid = 0x13def2889530023, negotiated timeout = 18
13/04/16 17:18:44 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=cldx-1140-1034:2181 sessionTimeout=18 watcher=catalogtracker-on-org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@34d03009
13/04/16 17:18:44 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 5483@cldx-1139-1033
13/04/16 17:18:44 INFO zookeeper.ClientCnxn: Opening socket connection to server cldx-1140-1034/172.25.6.71:2181. Will not attempt to authenticate using SASL (unknown error)
13/04/16 17:18:44 INFO zookeeper.ClientCnxn: Socket connection established to cldx-1140-1034/172.25.6.71:2181, initiating session
13/04/16 17:18:44 INFO zookeeper.ClientCnxn: Session establishment complete on server cldx-1140-1034/172.25.6.71:2181, sessionid = 0x13def2889530024, negotiated timeout = 18
13/04/16 17:18:44 INFO zookeeper.ZooKeeper: Session: 0x13def2889530024 closed
13/04/16 17:18:44 INFO zookeeper.ClientCnxn: EventThread shut down
13/04/16 17:18:44 INFO mapreduce.HFileOutputFormat: Looking up current regions for table org.apache.hadoop.hbase.client.HTable@238cfdf
13/04/16 17:18:44 INFO mapreduce.HFileOutputFormat: Configuring 1 reduce partitions to match current region count
13/04/16 17:18:44 INFO mapreduce.HFileOutputFormat: Writing partition information to hdfs://cldx-1139-1033:9000/user/hduser/partitions_4159cd24-b8ff-4919-854b-a7d1da5069ad
13/04/16 17:18:44 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/04/16 17:18:44 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
13/04/16 17:18:44 INFO compress.CodecPool: Got brand-new compressor
13/04/16 17:18:44 INFO mapreduce.HFileOutputFormat: Incremental table output configured.
13/04/16 17:18:47 INFO input.FileInputFormat: Total input paths to process : 1
13/04/16 17:18:47 WARN snappy.LoadSnappy: Snappy native library not loaded
13/04/16 17:18:47 INFO mapred.JobClient: Running job: job_201304091909_0010
13/04/16 17:18:48 INFO mapred.JobClient: map 0% reduce 0%
13/04/16 17:19:07 INFO mapred.JobClient: map 100% reduce 0%
13/04/16 17:19:19 INFO mapred.JobClient: map 100% reduce 100%
13/04/16 17:19:24 INFO mapred.JobClient: Job complete: job_201304091909_0010
13/04/16 17:19:24 INFO mapred.JobClient: Counters: 30
13/04/16 17:19:24 INFO mapred.JobClient: Job Counters
13/04/16 17:19:24 INFO mapred.JobClient: Launched reduce tasks=1
13/04/16 17:19:24 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=16567
13/04/16 17:19:24 INFO mapred.JobClient: Total time spent by all reduces waiting
RE: Data not loaded in table via ImportTSV
Hi,

Have you used the tool LoadIncrementalHFiles after the ImportTSV?

-Anoop-

From: Omkar Joshi [omkar.jo...@lntinfotech.com]
Sent: Tuesday, April 16, 2013 12:01 PM
To: user@hbase.apache.org
Subject: Data not loaded in table via ImportTSV
RE: Data not loaded in table via ImportTSV
Hi Anoop,

Actually, I got confused after reading the doc. I thought a simple importtsv command (which also takes the table name as an argument) would suffice. But as you pointed out, completebulkload is required.

HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.94.6.1.jar completebulkload hdfs://cldx-1139-1033:9000/hbase/storefileoutput PRODUCTS

Thanks for the help!

Regards,
Omkar Joshi

-----Original Message-----
From: Anoop Sam John [mailto:anoo...@huawei.com]
Sent: Tuesday, April 16, 2013 12:26 PM
To: user@hbase.apache.org
Subject: RE: Data not loaded in table via ImportTSV
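
For anyone landing on this thread later, the two steps can be chained roughly as sketched here. This is only a sketch built from the commands quoted in the thread. The DRY_RUN switch and the run/bulk_load helpers are hypothetical names added for illustration, and the sketch assumes both steps target the same table (CUSTOMERS here), since the HFiles are generated against that table's regions. By default it only prints the two commands; note that the printed form drops shell quoting, so the separator argument needs its quotes back if you paste it by hand.

```shell
#!/bin/sh
# Sketch of the two-step bulk load discussed in this thread.
# Paths, jar name and column spec are copied from the messages above.
# DRY_RUN, run and bulk_load are hypothetical helpers, not part of HBase.

HBASE_JAR="${HBASE_HOME}/hbase-0.94.6.1.jar"
HFILE_DIR="hdfs://cldx-1139-1033:9000/hbase/storefileoutput"
INPUT="hdfs://cldx-1139-1033:9000/hbase/copiedFromLocal/customer.txt"
TABLE="CUSTOMERS"
COLUMN_SPEC="HBASE_ROW_KEY,CUSTOMER_INFO:NAME,CUSTOMER_INFO:EMAIL,CUSTOMER_INFO:ADDRESS,CUSTOMER_INFO:MOBILE"

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

bulk_load() {
  # Only query the HBase classpath when the commands are really executed.
  if [ "${DRY_RUN:-1}" != "1" ]; then
    HADOOP_CLASSPATH=$("${HBASE_HOME}/bin/hbase" classpath)
    export HADOOP_CLASSPATH
  fi

  # Step 1: importtsv with importtsv.bulk.output writes HFiles to HFILE_DIR
  # instead of writing into the table.
  run "${HADOOP_HOME}/bin/hadoop" jar "$HBASE_JAR" importtsv \
    '-Dimporttsv.separator=;' \
    "-Dimporttsv.columns=$COLUMN_SPEC" \
    "-Dimporttsv.bulk.output=$HFILE_DIR" \
    "$TABLE" "$INPUT" || return 1

  # Step 2: completebulkload (LoadIncrementalHFiles) moves those HFiles
  # into the table's regions.
  run "${HADOOP_HOME}/bin/hadoop" jar "$HBASE_JAR" completebulkload "$HFILE_DIR" "$TABLE"
}

bulk_load
```

Run it with DRY_RUN=0 to execute the commands for real. Step 2 is skipped if step 1 fails.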