Hello

I am using Pig version 0.8.0.

A = load '/passwd' using PigStorage(':');
B = foreach A generate $0 as id, $2 as value;
dump B;

The result of the first part (the dump) is here:

(twilli,6259)
(saamodt,6260)
(hailu268,6261)
(oddsen,6262)
(neuhaus,6263)
(zoila,6264)
(elinmn,6265)
(diego,6266)
(fsudmann,6267)
(yanliang,6268)
(nestor,6269)

As I understand it, the problem is in the second part:

store B into 'table2' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a','-loadKey');

I suspect the problem is the row key; I am not sure how HBaseStorage manages it. What I want is for the first field to be the row key and the second field to be a column of the HBase table.
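If the first field is indeed taken as the row key on store, then I wonder whether '-loadKey' (which I understand to be a load-side option) is the problem, and whether the fix is simply to drop it from the store, e.g. (untested on my cluster):

    A = load '/passwd' using PigStorage(':');
    B = foreach A generate $0 as id, $2 as value;
    -- on store, HBaseStorage should take the first field (id) as the row key,
    -- so only the remaining field needs a column mapping:
    store B into 'table2' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a');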

When I run the query I get the following output on my task tracker:

grunt> A = load '/passwd' using PigStorage(':');B = foreach A generate $0 as id, $2 as value;store B into 'table2' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a','-loadKey');
2011-04-27 10:29:29,785 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2011-04-27 10:29:29,785 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
2011-04-27 10:29:29,913 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
2011-04-27 10:29:29,913 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:host.name=haisen10.ux.uis.no
2011-04-27 10:29:29,913 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.6.0_23
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Sun Microsystems Inc.
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.home=/opt/jdk1.6.0_23/jre
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=/etc/hbase/conf:/usr/lib/pig/bin/../conf:/opt/jdk/lib/tools.jar:/usr/lib/pig/bin/../pig-0.8.0-cdh3u0-core.jar:/usr/lib/pig/bin/../build/pig-*-SNAPSHOT.jar:/usr/lib/pig/bin/../lib/ant-contrib-1.0b3.jar:/usr/lib/pig/bin/../lib/automaton.jar:/usr/lib/pig/bin/../build/ivy/lib/Pig/*.jar:/usr/lib/hadoop/hadoop-core-0.20.2-cdh3u0.jar:/usr/lib/hadoop/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop/lib/commons-net-1.4.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2-cdh3u0.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.LICENSE.txt:/usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop/lib/jdiff:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jetty-6.1.26.jar:/usr/lib/hadoop/lib/jetty-servlet-tester-6.1.26.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop/lib/jsp-2.1:/usr/lib/hadoop/lib/junit-4.5.jar:/usr/lib/hadoop/lib/kfs-0.2.2.jar:/usr/lib/hadoop/lib/kfs-0.2.LICENSE.txt:/usr/lib/hadoop/lib/log4j-1.2.15.jar:/usr/lib/hadoop/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop/lib/oro-2.0.8.jar:/usr/lib/hadoop/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/etc/hbase/conf::/usr/lib/hadoop/conf
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/opt/jdk1.6.0_23/jre/lib/amd64/server:/opt/jdk1.6.0_23/jre/lib/amd64:/opt/jdk1.6.0_23/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/tmp
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.name=Linux
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.arch=amd64
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.version=2.6.18-194.32.1.el5.centos.plus
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:user.name=haisen
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:user.home=/home/ekstern/haisen
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/import/br1raid6a1c1/haisen
2011-04-27 10:29:29,915 [main] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=haisen11:2181 sessionTimeout=180000 watcher=hconnection
2011-04-27 10:29:29,923 [main-SendThread()] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server haisen11/152.94.1.130:2181
2011-04-27 10:29:29,926 [main-SendThread(haisen11:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to haisen11/152.94.1.130:2181, initiating session
2011-04-27 10:29:29,936 [main-SendThread(haisen11:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server haisen11/152.94.1.130:2181, sessionid = 0x12f8c18a1340177, negotiated timeout = 40000
2011-04-27 10:29:29,972 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@67f31652; hsa=haisen10.ux.uis.no:60020
2011-04-27 10:29:30,018 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is haisen10.ux.uis.no:60020
2011-04-27 10:29:30,020 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cache hit for row <> in tableName .META.: location server haisen10.ux.uis.no:60020, location region name .META.,,1.1028785192
2011-04-27 10:29:30,024 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=table2,,00000000000000 for max=10 rows
2011-04-27 10:29:30,028 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for table2,,1303809998908.0a8a5a1a398c449de8f29a2cf082f30e. is haisen6.ux.uis.no:60020
2011-04-27 10:29:30,030 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cache hit for row <> in tableName table2: location server haisen6.ux.uis.no:60020, location region name table2,,1303809998908.0a8a5a1a398c449de8f29a2cf082f30e.
2011-04-27 10:29:30,031 [main] INFO org.apache.hadoop.hbase.mapreduce.TableOutputFormat - Created table instance for table2
2011-04-27 10:29:30,068 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: B: Store(table2:org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a','-loadKey')) - scope-6 Operator Key: scope-6)
2011-04-27 10:29:30,085 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2011-04-27 10:29:30,122 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2011-04-27 10:29:30,122 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2011-04-27 10:29:30,187 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2011-04-27 10:29:30,204 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-04-27 10:29:31,684 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2011-04-27 10:29:31,709 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2011-04-27 10:29:32,059 [Thread-7] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=haisen11:2181 sessionTimeout=180000 watcher=hconnection
2011-04-27 10:29:32,060 [Thread-7-SendThread()] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server haisen11/152.94.1.130:2181
2011-04-27 10:29:32,061 [Thread-7-SendThread(haisen11:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to haisen11/152.94.1.130:2181, initiating session
2011-04-27 10:29:32,063 [Thread-7-SendThread(haisen11:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server haisen11/152.94.1.130:2181, sessionid = 0x12f8c18a1340178, negotiated timeout = 40000
2011-04-27 10:29:32,070 [Thread-7] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@1f248f2b; hsa=haisen10.ux.uis.no:60020
2011-04-27 10:29:32,074 [Thread-7] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for .META.,,1.1028785192 is haisen10.ux.uis.no:60020
2011-04-27 10:29:32,074 [Thread-7] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cache hit for row <> in tableName .META.: location server haisen10.ux.uis.no:60020, location region name .META.,,1.1028785192
2011-04-27 10:29:32,076 [Thread-7] DEBUG org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=table2,,00000000000000 for max=10 rows
2011-04-27 10:29:32,080 [Thread-7] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cached location for table2,,1303809998908.0a8a5a1a398c449de8f29a2cf082f30e. is haisen6.ux.uis.no:60020
2011-04-27 10:29:32,081 [Thread-7] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Cache hit for row <> in tableName table2: location server haisen6.ux.uis.no:60020, location region name table2,,1303809998908.0a8a5a1a398c449de8f29a2cf082f30e.
2011-04-27 10:29:32,082 [Thread-7] INFO org.apache.hadoop.hbase.mapreduce.TableOutputFormat - Created table instance for table2
2011-04-27 10:29:32,102 [Thread-7] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-04-27 10:29:32,102 [Thread-7] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2011-04-27 10:29:32,110 [Thread-7] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2011-04-27 10:29:32,211 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2011-04-27 10:29:32,953 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201104251150_0071
2011-04-27 10:29:32,954 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://haisen11:50030/jobdetails.jsp?jobid=job_201104251150_0071
2011-04-27 10:29:52,654 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201104251150_0071 has failed! Stop running all dependent jobs
2011-04-27 10:29:52,666 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2011-04-27 10:29:52,674 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2011-04-27 10:29:52,677 [main] INFO org.apache.pig.tools.pigstats.PigStats - Script Statistics:

HadoopVersion   PigVersion     UserId  StartedAt            FinishedAt           Features
0.20.2-cdh3u0   0.8.0-cdh3u0   haisen  2011-04-27 10:29:30  2011-04-27 10:29:52  UNKNOWN

Failed!

Failed Jobs:
JobId   Alias   Feature Message Outputs
job_201104251150_0071 A,B MAP_ONLY Message: Job failed! Error - NA table2,

Input(s):
Failed to read data from "/passwd"

Output(s):
Failed to produce result in "table2"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_201104251150_0071


2011-04-27 10:29:52,677 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!


thank you

Byambajargal

On 4/27/11 06:07, Bill Graham wrote:
What version of Pig are you running and what errors are you seeing on
the task trackers?

On Tue, Apr 26, 2011 at 4:46 AM, byambajargal<byambaa.0...@gmail.com>  wrote:
Hello ...
I have a question for you

I am running a Pig job (below) that simply reads from HDFS and stores into HBase.
When I start the job, the first part works fine but the second part fails.
Could you give me a direction on how to move data from HDFS to HBase?


A = load '/passwd' using PigStorage(':');
B = foreach A generate $0 as id, $2 as value;
dump B;
store B into 'table2' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a', '-loadKey');

thank you for your help

Byambajargal



On 4/25/11 18:26, Dmitriy Ryaboy wrote:
The first element of the relation you store must be the row key. You aren't
loading the row key, so load > store isn't working.
Try
my_data = LOAD 'hbase://table1' using
org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1', '-loadKey') ;
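With the row key loaded as the first field of my_data, the store side is the mirror image; something along these lines should write it back (a sketch, not run against your cluster):

    store my_data into 'hbase://table2' using
        org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1');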

On Mon, Apr 25, 2011 at 5:32 AM, byambajargal <byambaa.0...@gmail.com> wrote:

Hello guys

I am running the Cloudera distribution cdh3u0 on my cluster with Pig and HBase.
I can read data from HBase using the following Pig query:

my_data = LOAD 'hbase://table1' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1');
dump my_data;

But when I try to store data into HBase the same way, the job fails:

store my_data into 'hbase://table2' using
org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1');

table1 and table2 have the same structure and the same column.


the table i have:

hbase(main):029:0* scan 'table1'
ROW                 COLUMN+CELL
  row1               column=cf:1, timestamp=1303731834050, value=value1
  row2               column=cf:1, timestamp=1303731849901, value=value2
  row3               column=cf:1, timestamp=1303731858637, value=value3
3 row(s) in 0.0470 seconds


thanks

Byambajargal




