Hi all,

We use Pig 0.5.0 from a release distribution (not trunk), with HBase 0.20.1.

The hbase-0.20.1/conf/hbase-site.xml file is included in the CLASSPATH.
The hbase-site.xml content is as follows (is this correct?):
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
    <description>The directory shared by region servers.
    </description>
  </property>
</configuration>
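One thing that may be missing: as far as I understand, in HBase 0.20.x the client locates the cluster through ZooKeeper, so the client-side hbase-site.xml usually also needs the quorum address. A hedged sketch of what such an addition might look like for a single-node, localhost setup (the value below is an assumption, adjust to your cluster):

```xml
<!-- Possible addition inside <configuration> in hbase-site.xml.
     Assumed values for a pseudo-distributed, localhost-only setup. -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>localhost</value>
  <description>Comma-separated list of ZooKeeper hosts the client
  contacts to find the HBase master (assumed: localhost).</description>
</property>
```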

Is my URL to the database correct ('hbase://test'), and which Pig execution mode
should I use (mapreduce or local)?
I now get different results and a stack trace (from the commands below).
Maybe this is helpful: the bold part of the stack trace seems to indicate a wrong
URL to the database table; is that a correct interpretation of the stack trace?

pig -x mapreduce
grunt> B = load 'hbase://test' using
org.apache.pig.backend.hadoop.hbase.HBaseStorage('data') as (col_a);
grunt> dump B;
Still the same result:
Retrying connect to server: localhost/127.0.0.1:60000. Already tried 0 time(s).
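For what it's worth, the "Retrying connect to server: localhost/127.0.0.1:60000" line usually just means nothing is accepting TCP connections on the HMaster port (60000 is the default master port in 0.20.x). A quick generic probe, plain Python and nothing HBase-specific, can confirm whether anything is actually listening there:

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Check the (assumed default) HBase master port; prints True or False
# depending on whether a master is listening locally.
print(port_open("localhost", 60000))
```

`telnet localhost 60000` would tell you the same thing from the shell.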

pig-0.5.0/bin/pig -x local
grunt> B = load 'hbase://test' using
org.apache.pig.backend.hadoop.hbase.HBaseStorage('data') as (col_a);
grunt> dump B;
2009-11-20 17:15:01,725 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 1002: Unable to store alias B
Details at logfile: /Users/jorislops/Desktop/pig_1258733695965.log

Pig Stack Trace
---------------
ERROR 1002: Unable to store alias B

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias B
    at org.apache.pig.PigServer.openIterator(PigServer.java:475)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:532)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:190)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
    at org.apache.pig.Main.main(Main.java:363)
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias B
    at org.apache.pig.PigServer.store(PigServer.java:530)
    at org.apache.pig.PigServer.openIterator(PigServer.java:458)
    ... 6 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Wrong FS: hbase://test, expected: file:///
    at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:184)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:773)
    at org.apache.pig.PigServer.store(PigServer.java:522)
    ... 7 more
*Caused by: java.lang.IllegalArgumentException: Wrong FS: hbase://test, expected: file:///*
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:305)
    at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:47)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:357)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:643)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.isContainer(HDataStorage.java:203)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:131)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:147)
    at org.apache.pig.impl.io.FileLocalizer.fullPath(FileLocalizer.java:532)
    at org.apache.pig.impl.io.FileLocalizer.open(FileLocalizer.java:346)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad.setUp(POLoad.java:103)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad.getNext(POLoad.java:131)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
    at org.apache.pig.backend.local.executionengine.physicalLayer.counters.POCounter.getNext(POCounter.java:71)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getNext(POStore.java:117)
    at org.apache.pig.backend.local.executionengine.LocalPigLauncher.runPipeline(LocalPigLauncher.java:146)
    at org.apache.pig.backend.local.executionengine.LocalPigLauncher.launchPig(LocalPigLauncher.java:109)
    at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:165)
    ... 9 more
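An aside on the bold "Wrong FS" line: judging from the trace, in local mode Pig hands the load location to Hadoop's local filesystem (RawLocalFileSystem), which only accepts file:// or scheme-less paths, so any URI with an hbase:// scheme is rejected before HBaseStorage ever sees it. A small illustration of the scheme mismatch (plain Python URI parsing, nothing Pig-specific):

```python
from urllib.parse import urlparse

uri = urlparse("hbase://test")
print(uri.scheme)  # -> hbase  (not "file", hence "Wrong FS ... expected: file:///")
print(uri.netloc)  # -> test
```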

Thanks,
Joris

2009/11/19 Jeff Zhang <zjf...@gmail.com>

> Hi Morris,
>
> Do you use the pig in trunk ?
> If you want to use hbase, you should put hbase configuration in
> hbase-site.xml, and put this file on your classpath.
>
>
> Jeff Zhang
>
>
> On Thu, Nov 19, 2009 at 8:20 AM, Morris Swertz <m.a.swe...@rug.nl> wrote:
>
> >
> > Hi all,
> >
> > I am trying to load data from HBase into Pig with HBaseStorage. Something is
> > going wrong: no data from the HBase 'test' table shows up in Pig, only
> > errors.
> >
> > I configured the Hadoop and HBase in Pseudo-Distributed Operation mode.
> >
> > What follows are the commands that I did and the output it produced.
> >
> >
> > //try with pig in remote mode!
> >
> > pig -x mapreduce
> >
> > B = load 'hbase://test' using
> > org.apache.pig.backend.hadoop.hbase.HBaseStorage('data') as (col_a);
> >
> > dump B;
> >
> > output:
> >
> > 2009-11-19 13:56:02,810 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
> >
> > 2009-11-19 13:56:02,810 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
> >
> > 2009-11-19 13:56:04,708 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> >
> > 2009-11-19 13:56:04,729 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> >
> > 2009-11-19 13:56:04,739 [Thread-5] WARN  org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> >
> > 2009-11-19 13:56:05,024 [Thread-5] INFO  org.apache.pig.backend.hadoop.hbase.HBaseStorage - tablename: file:/Users/jorislops/Desktop/pig-0.5.0/test
> >
> > 2009-11-19 13:56:05,231 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> >
> > 2009-11-19 13:56:06,222 [Thread-5] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:60000. Already tried 0 time(s).
> >
> > 2009-11-19 13:56:06,222 [Thread-5] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: localhost/127.0.0.1:60000. Already tried 1 time(s).
> >
> >
> >
> > //port 60000 is used by a java program
> >
> >
> >
> > pig -x local
> >
> > B = load 'test' using
> > org.apache.pig.backend.hadoop.hbase.HBaseStorage('data') as (col_a);
> >
> > dump B;
> >
> > output:
> >
> > 2009-11-19 13:53:18,425 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - Successfully stored result in: "file:/tmp/temp-1663248768/tmp-1939618752"
> >
> > 2009-11-19 13:53:18,436 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - Records written : 0
> >
> > 2009-11-19 13:53:18,436 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - Bytes written : 0
> >
> > 2009-11-19 13:53:18,436 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - 100% complete!
> >
> > 2009-11-19 13:53:18,436 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - Success!!
> >
> > //there is nothing in /tmp/temp-1663248768/tmp-1939618752 (it's empty)
> >
> >
> >
> > I tried different paths to the HBase table: 'hbase://test', 'test', and
> > 'hbase://localhost:60000/test'.
> >
> >
> >
> > Here is how I started the system (Hadoop + HBase) and verified that it is
> > working as I expected.
> >
> >
> >
> > bin/hadoop namenode -format
> >
> > bin/start-all.sh
> >
> > //both Namenode and Jobtracker are running, verified via
> > http://localhost:50070 and http://localhost:50030
> >
> >
> >
> > bin/start-hbase.sh
> >
> > //both master and regionserver are running, checked via localhost:60010
> > localhost:20 localhost:30
> >
> > //also zookeeper Quorum is started at port localhost:2181
> >
> >
> >
> > //fill a test table in hbase
> >
> > hbase-0.20.1/bin/hbase shell
> >
> > create 'test', 'data'
> >
> > put 'test', 'row1', 'data', 'value1'
> >
> > scan 'test'
> >
> > //localhost:60010 shows that the test table is in HBase.
> >
> >
> >
> > Hope that someone knows the solution.
> >
> > Thanks,
> >
> > Joris
> >
