Yes, this was the problem. I think the HBaseStorage class is fine; I just needed to configure our Hadoop cluster to "talk" to HBase correctly...if I were writing a Java MR job I would have to do the same thing.

Some better documentation and examples on how to use HBaseStorage are all we need.
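For concreteness, "the same thing" in a standalone Java MR job looks roughly like this. This is only a sketch against the 0.20.x-era HBase mapreduce API; the class and mapper names are made up for illustration:

import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class PiggytestScan {

    // Hypothetical mapper: just emits each row key.
    static class RowKeyMapper extends TableMapper<Text, Text> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context ctx)
                throws IOException, InterruptedException {
            ctx.write(new Text(row.get()), new Text(""));
        }
    }

    public static void main(String[] args) throws Exception {
        // Reads hbase-site.xml from the classpath; without it the client
        // silently falls back to a ZooKeeper quorum of localhost:2181.
        HBaseConfiguration conf = new HBaseConfiguration();
        Job job = new Job(conf, "scan piggytest");
        job.setJarByClass(PiggytestScan.class);
        TableMapReduceUtil.initTableMapperJob("piggytest", new Scan(),
                RowKeyMapper.class, Text.class, Text.class, job);
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(NullOutputFormat.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

As far as I can tell, HBaseStorage goes through the same HBase client code path, so it has the same classpath requirement.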
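And a quick way to confirm you're hitting this (the localhost:2181 errors quoted below are the symptom): a tiny check, again just a sketch assuming the 0.20.x API, with a class name of my choosing:

import org.apache.hadoop.hbase.HBaseConfiguration;

public class QuorumCheck {
    public static void main(String[] args) {
        HBaseConfiguration conf = new HBaseConfiguration();
        // Prints "localhost" when hbase-site.xml is NOT on the classpath,
        // i.e. the client is about to look for ZooKeeper at localhost:2181.
        System.out.println("hbase.zookeeper.quorum = "
                + conf.get("hbase.zookeeper.quorum"));
    }
}

Run it with the same classpath your Pig/Hadoop processes use; if it prints localhost, the config isn't being picked up.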
On Nov 22, 2010, at 12:10 PM, Dmitriy Ryaboy wrote:

> Why is it connecting to localhost? Sounds like you don't have the
> appropriate config files on the path. Hm, maybe we should serialize those
> in the constructor so that you don't have to have them on the JT classpath
> (I have them on the JT classpath, so this never came up). Can you confirm
> that this is the problem?
>
> D
>
> On Fri, Nov 19, 2010 at 10:33 PM, Corbin Hoenes <cor...@tynt.com> wrote:
>
>> Hey Jeff,
>>
>> It wasn't starting a job, but I got a bit further by registering the
>> pig8 jar in my Pig script. It seemed to have a bunch of dependencies
>> (Google common collections, ZooKeeper, etc.) built into that jar.
>>
>> Now I am seeing this in the web UI logs:
>>
>> 2010-11-19 23:19:44,200 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server localhost/127.0.0.1:2181
>> 2010-11-19 23:19:44,201 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x0 to sun.nio.ch.selectionkeyi...@65efb4be
>> java.net.ConnectException: Connection refused
>>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:885)
>> 2010-11-19 23:19:44,201 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown input
>> java.nio.channels.ClosedChannelException
>>         at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>>         at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>         at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:951)
>>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
>> 2010-11-19 23:19:44,201 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
>> java.nio.channels.ClosedChannelException
>>         at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>>         at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>         at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
>>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
>> 2010-11-19 23:19:44,303 WARN org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to create /hbase -- check quorum servers, currently=localhost:2181
>> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>>
>> It looks like it doesn't know where my hbase/conf/hbase-site.xml file is.
>> Not sure how this would get passed to the HBaseStorage class?
>>
>> On Nov 19, 2010, at 5:09 PM, Jeff Zhang wrote:
>>
>>> Does the MapReduce job start? Could you check the logs on the Hadoop side?
>>>
>>> On Sat, Nov 20, 2010 at 7:56 AM, Corbin Hoenes <cor...@tynt.com> wrote:
>>>
>>>> We are trying to use the HBaseStorage LoadFunc in Pig 0.8 and getting
>>>> an exception.
>>>>
>>>> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias raw
>>>>         at org.apache.pig.PigServer.openIterator(PigServer.java:754)
>>>>         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
>>>>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
>>>>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>>>>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
>>>>         at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
>>>>         at org.apache.pig.Main.run(Main.java:465)
>>>>         at org.apache.pig.Main.main(Main.java:107)
>>>> Caused by: java.io.IOException: Couldn't retrieve job.
>>>>         at org.apache.pig.PigServer.store(PigServer.java:818)
>>>>         at org.apache.pig.PigServer.openIterator(PigServer.java:728)
>>>>         ... 7 more
>>>>
>>>> Other jobs seem to work.
>>>>
>>>> What are the requirements for getting HBase storage to work? This is what I am doing:
>>>>
>>>> 1 - added the HBase config and Hadoop config to my PIG_CLASSPATH
>>>> 2 - ran this Pig script:
>>>>
>>>> REGISTER ../lib/hbase-0.20.6.jar
>>>>
>>>> raw = LOAD 'hbase://piggytest'
>>>>       USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('content:field1 anchor:field1a anchor:field2a')
>>>>       AS (content_field1, anchor_field1a, anchor_field2a);
>>>>
>>>> dump raw;
>>>>
>>>> What else am I missing?
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang