Hello
I am using Pig version 0.8.0.
A = load '/passwd' using PigStorage(':');
B = foreach A generate $0 as id, $2 as value;
dump B;
The result of the first part is:
(twilli,6259)
(saamodt,6260)
(hailu268,6261)
(oddsen,6262)
(neuhaus,6263)
(zoila,6264)
(elinmn,6265)
(diego,6266)
(fsudmann,6267)
(yanliang,6268)
(nestor,6269)
As I understand it, the problem is in the second part:
store B into 'table2' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a','-loadKey');
I suspect the problem is with the row key; I am not sure how HBaseStorage
manages the row key. What I want is for the first item to become the row
key and the second item to become a column of the HBase table.
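If I read the HBaseStorage documentation correctly, on a store the first
field of each tuple is used as the row key and only the remaining fields
are mapped to the listed columns, while '-loadKey' is a load-side option.
So I would expect the store to look something like this (just a sketch,
assuming table2 already exists with column family cf):

store B into 'hbase://table2' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a');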
When I run the query, I get the following output on my task tracker:
grunt> A = load '/passwd' using PigStorage(':'); B = foreach A generate $0 as id, $2 as value; store B into 'table2' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a','-loadKey');
2011-04-27 10:29:29,785 [main] INFO
org.apache.pig.tools.pigstats.ScriptState - Pig features used in the
script: UNKNOWN
2011-04-27 10:29:29,785 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
pig.usenewlogicalplan is set to true. New logical plan will be used.
2011-04-27 10:29:29,913 [main] INFO org.apache.zookeeper.ZooKeeper -
Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011
22:27 GMT
2011-04-27 10:29:29,913 [main] INFO org.apache.zookeeper.ZooKeeper -
Client environment:host.name=haisen10.ux.uis.no
2011-04-27 10:29:29,913 [main] INFO org.apache.zookeeper.ZooKeeper -
Client environment:java.version=1.6.0_23
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
Client environment:java.vendor=Sun Microsystems Inc.
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
Client environment:java.home=/opt/jdk1.6.0_23/jre
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
Client
environment:java.class.path=/etc/hbase/conf:/usr/lib/pig/bin/../conf:/opt/jdk/lib/tools.jar:/usr/lib/pig/bin/../pig-0.8.0-cdh3u0-core.jar:/usr/lib/pig/bin/../build/pig-*-SNAPSHOT.jar:/usr/lib/pig/bin/../lib/ant-contrib-1.0b3.jar:/usr/lib/pig/bin/../lib/automaton.jar:/usr/lib/pig/bin/../build/ivy/lib/Pig/*.jar:/usr/lib/hadoop/hadoop-core-0.20.2-cdh3u0.jar:/usr/lib/hadoop/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop/lib/commons-net-1.4.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2-cdh3u0.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.LICENSE.txt:/usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop/lib/jdiff:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jetty-6.1.26.jar:/usr/lib/hadoop/lib/jetty-servlet-tester-6.1.26.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop/lib/jsp-2.1:/usr/lib/hadoop/lib/junit-4.5.jar:/usr/lib/hadoop/lib/kfs-0.2.2.jar:/usr/lib/hadoop/lib/kfs-0.2.LICENSE.txt:/usr/lib/hadoop/lib/log4j-1.2.15.jar:/usr/lib/hadoop/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop/lib/oro-2.0.8.jar:/usr/lib/hadoop/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/etc/hbase/conf::/usr/lib/hadoop/conf
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
Client
environment:java.library.path=/opt/jdk1.6.0_23/jre/lib/amd64/server:/opt/jdk1.6.0_23/jre/lib/amd64:/opt/jdk1.6.0_23/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
Client environment:java.io.tmpdir=/tmp
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
Client environment:java.compiler=<NA>
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
Client environment:os.name=Linux
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
Client environment:os.arch=amd64
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
Client environment:os.version=2.6.18-194.32.1.el5.centos.plus
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
Client environment:user.name=haisen
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
Client environment:user.home=/home/ekstern/haisen
2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
Client environment:user.dir=/import/br1raid6a1c1/haisen
2011-04-27 10:29:29,915 [main] INFO org.apache.zookeeper.ZooKeeper -
Initiating client connection, connectString=haisen11:2181
sessionTimeout=180000 watcher=hconnection
2011-04-27 10:29:29,923 [main-SendThread()] INFO
org.apache.zookeeper.ClientCnxn - Opening socket connection to server
haisen11/152.94.1.130:2181
2011-04-27 10:29:29,926 [main-SendThread(haisen11:2181)] INFO
org.apache.zookeeper.ClientCnxn - Socket connection established to
haisen11/152.94.1.130:2181, initiating session
2011-04-27 10:29:29,936 [main-SendThread(haisen11:2181)] INFO
org.apache.zookeeper.ClientCnxn - Session establishment complete on
server haisen11/152.94.1.130:2181, sessionid = 0x12f8c18a1340177,
negotiated timeout = 40000
2011-04-27 10:29:29,972 [main] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
- Lookedup root region location,
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@67f31652;
hsa=haisen10.ux.uis.no:60020
2011-04-27 10:29:30,018 [main] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
- Cached location for .META.,,1.1028785192 is haisen10.ux.uis.no:60020
2011-04-27 10:29:30,020 [main] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
- Cache hit for row <> in tableName .META.: location server
haisen10.ux.uis.no:60020, location region name .META.,,1.1028785192
2011-04-27 10:29:30,024 [main] DEBUG
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at
row=table2,,00000000000000 for max=10 rows
2011-04-27 10:29:30,028 [main] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
- Cached location for
table2,,1303809998908.0a8a5a1a398c449de8f29a2cf082f30e. is
haisen6.ux.uis.no:60020
2011-04-27 10:29:30,030 [main] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
- Cache hit for row <> in tableName table2: location server
haisen6.ux.uis.no:60020, location region name
table2,,1303809998908.0a8a5a1a398c449de8f29a2cf082f30e.
2011-04-27 10:29:30,031 [main] INFO
org.apache.hadoop.hbase.mapreduce.TableOutputFormat - Created table
instance for table2
2011-04-27 10:29:30,068 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name:
B:
Store(table2:org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a','-loadKey'))
- scope-6 Operator Key: scope-6)
2011-04-27 10:29:30,085 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler
- File concatenation threshold: 100 optimistic? false
2011-04-27 10:29:30,122 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
- MR plan size before optimization: 1
2011-04-27 10:29:30,122 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
- MR plan size after optimization: 1
2011-04-27 10:29:30,187 [main] INFO
org.apache.pig.tools.pigstats.ScriptState - Pig script settings are
added to the job
2011-04-27 10:29:30,204 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-04-27 10:29:31,684 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- Setting up single store job
2011-04-27 10:29:31,709 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 1 map-reduce job(s) waiting for submission.
2011-04-27 10:29:32,059 [Thread-7] INFO org.apache.zookeeper.ZooKeeper
- Initiating client connection, connectString=haisen11:2181
sessionTimeout=180000 watcher=hconnection
2011-04-27 10:29:32,060 [Thread-7-SendThread()] INFO
org.apache.zookeeper.ClientCnxn - Opening socket connection to server
haisen11/152.94.1.130:2181
2011-04-27 10:29:32,061 [Thread-7-SendThread(haisen11:2181)] INFO
org.apache.zookeeper.ClientCnxn - Socket connection established to
haisen11/152.94.1.130:2181, initiating session
2011-04-27 10:29:32,063 [Thread-7-SendThread(haisen11:2181)] INFO
org.apache.zookeeper.ClientCnxn - Session establishment complete on
server haisen11/152.94.1.130:2181, sessionid = 0x12f8c18a1340178,
negotiated timeout = 40000
2011-04-27 10:29:32,070 [Thread-7] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
- Lookedup root region location,
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@1f248f2b;
hsa=haisen10.ux.uis.no:60020
2011-04-27 10:29:32,074 [Thread-7] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
- Cached location for .META.,,1.1028785192 is haisen10.ux.uis.no:60020
2011-04-27 10:29:32,074 [Thread-7] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
- Cache hit for row <> in tableName .META.: location server
haisen10.ux.uis.no:60020, location region name .META.,,1.1028785192
2011-04-27 10:29:32,076 [Thread-7] DEBUG
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at
row=table2,,00000000000000 for max=10 rows
2011-04-27 10:29:32,080 [Thread-7] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
- Cached location for
table2,,1303809998908.0a8a5a1a398c449de8f29a2cf082f30e. is
haisen6.ux.uis.no:60020
2011-04-27 10:29:32,081 [Thread-7] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
- Cache hit for row <> in tableName table2: location server
haisen6.ux.uis.no:60020, location region name
table2,,1303809998908.0a8a5a1a398c449de8f29a2cf082f30e.
2011-04-27 10:29:32,082 [Thread-7] INFO
org.apache.hadoop.hbase.mapreduce.TableOutputFormat - Created table
instance for table2
2011-04-27 10:29:32,102 [Thread-7] INFO
org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input
paths to process : 1
2011-04-27 10:29:32,102 [Thread-7] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total
input paths to process : 1
2011-04-27 10:29:32,110 [Thread-7] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total
input paths (combined) to process : 1
2011-04-27 10:29:32,211 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 0% complete
2011-04-27 10:29:32,953 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- HadoopJobId: job_201104251150_0071
2011-04-27 10:29:32,954 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- More information at:
http://haisen11:50030/jobdetails.jsp?jobid=job_201104251150_0071
2011-04-27 10:29:52,654 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- job job_201104251150_0071 has failed! Stop running all dependent jobs
2011-04-27 10:29:52,666 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete
2011-04-27 10:29:52,674 [main] ERROR
org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2011-04-27 10:29:52,677 [main] INFO
org.apache.pig.tools.pigstats.PigStats - Script Statistics:
HadoopVersion   PigVersion     UserId  StartedAt            FinishedAt           Features
0.20.2-cdh3u0   0.8.0-cdh3u0   haisen  2011-04-27 10:29:30  2011-04-27 10:29:52  UNKNOWN

Failed!

Failed Jobs:
JobId                  Alias  Feature   Message                          Outputs
job_201104251150_0071  A,B    MAP_ONLY  Message: Job failed! Error - NA  table2,
Input(s):
Failed to read data from "/passwd"
Output(s):
Failed to produce result in "table2"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_201104251150_0071
2011-04-27 10:29:52,677 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Failed!
Thank you,
Byambajargal
On 4/27/11 06:07, Bill Graham wrote:
What version of Pig are you running and what errors are you seeing on
the task trackers?
On Tue, Apr 26, 2011 at 4:46 AM, byambajargal<byambaa.0...@gmail.com> wrote:
Hello ...
I have a question for you.
I am running the following Pig job, which reads from HDFS and stores into HBase.
When I start the job, the first part works fine but the second part fails.
Could you give me some direction on how to move data from HDFS to HBase?
A = load '/passwd' using PigStorage(':');
B = foreach A generate $0 as id, $2 as value;
dump B;
store B into 'table2' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a', '-loadKey');
Thank you for your help,
Byambajargal
On 4/25/11 18:26, Dmitriy Ryaboy wrote:
The first element of the relation you store must be the row key. You
aren't loading the row key, so the load-then-store isn't working.
Try
my_data = LOAD 'hbase://table1' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1', '-loadKey');
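A minimal end-to-end sketch along those lines (assuming table1 and table2
both already exist with column family cf) would be:

my_data = LOAD 'hbase://table1' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1', '-loadKey');
store my_data into 'hbase://table2' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1');

Here '-loadKey' prepends the row key as the first field of my_data, and on
the store that first field becomes the row key of table2.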
On Mon, Apr 25, 2011 at 5:32 AM,
byambajargal<byambaa.0...@gmail.com>wrote:
Hello guys
I am running the Cloudera distribution cdh3u0 on my cluster with Pig and
HBase. I can read data from HBase using the following Pig query:
my_data = LOAD 'hbase://table1' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1');
dump my_data;
But when I try to store data into HBase the same way, the job fails:
store my_data into 'hbase://table2' using
org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1');
table1 and table2 have the same structure and the same column.
The table I have:
hbase(main):029:0* scan 'table1'
ROW COLUMN+CELL
row1 column=cf:1, timestamp=1303731834050, value=value1
row2 column=cf:1, timestamp=1303731849901, value=value2
row3 column=cf:1, timestamp=1303731858637, value=value3
3 row(s) in 0.0470 seconds
thanks
Byambajargal