Yes, this was the problem. I think the HBaseStorage class is fine; I just needed to configure our Hadoop cluster to "talk" to HBase correctly...if I were writing a Java MR job I would have to do the same thing.

Some better documentation and examples on how to use HBaseStorage are all we need.
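For concreteness, "the same thing" in a standalone Java MR job looks roughly like this. This is only a sketch against the 0.20.x-era HBase mapreduce API; the class and mapper names are made up for illustration:

import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class PiggytestScan {

    // Hypothetical mapper: just emits each row key.
    static class RowKeyMapper extends TableMapper<Text, Text> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context ctx)
                throws IOException, InterruptedException {
            ctx.write(new Text(row.get()), new Text(""));
        }
    }

    public static void main(String[] args) throws Exception {
        // Reads hbase-site.xml from the classpath; without it the client
        // silently falls back to a ZooKeeper quorum of localhost:2181.
        HBaseConfiguration conf = new HBaseConfiguration();
        Job job = new Job(conf, "scan piggytest");
        job.setJarByClass(PiggytestScan.class);
        TableMapReduceUtil.initTableMapperJob("piggytest", new Scan(),
                RowKeyMapper.class, Text.class, Text.class, job);
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(NullOutputFormat.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

As far as I can tell, HBaseStorage goes through the same HBase client code path, so it has the same classpath requirement.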
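And a quick way to confirm you're hitting this (the localhost:2181 errors quoted below are the symptom): a tiny check, again just a sketch assuming the 0.20.x API, with a class name of my choosing:

import org.apache.hadoop.hbase.HBaseConfiguration;

public class QuorumCheck {
    public static void main(String[] args) {
        HBaseConfiguration conf = new HBaseConfiguration();
        // Prints "localhost" when hbase-site.xml is NOT on the classpath,
        // i.e. the client is about to look for ZooKeeper at localhost:2181.
        System.out.println("hbase.zookeeper.quorum = "
                + conf.get("hbase.zookeeper.quorum"));
    }
}

Run it with the same classpath your Pig/Hadoop processes use; if it prints localhost, the config isn't being picked up.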
On Nov 22, 2010, at 12:10 PM, Dmitriy Ryaboy wrote:

> Why is it connecting to localhost? Sounds like you don't have the
> appropriate config files on the path. Hm, maybe we should serialize those
> in the constructor so that you don't have to have them on the JT classpath
> (I have them on the JT classpath, so this never came up). Can you confirm
> that this is the problem?
>
> D
>
> On Fri, Nov 19, 2010 at 10:33 PM, Corbin Hoenes <cor...@tynt.com> wrote:
>
>> Hey Jeff,
>>
>> It wasn't starting a job, but I got a bit further by registering the
>> pig8 jar in my Pig script. It seemed to have a bunch of dependencies
>> (Google common collections, ZooKeeper, etc.) built into that jar.
>>
>> Now I am seeing this in the web UI logs:
>>
>> 2010-11-19 23:19:44,200 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server localhost/127.0.0.1:2181
>> 2010-11-19 23:19:44,201 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x0 to sun.nio.ch.selectionkeyi...@65efb4be
>> java.net.ConnectException: Connection refused
>>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:885)
>> 2010-11-19 23:19:44,201 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown input
>> java.nio.channels.ClosedChannelException
>>         at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>>         at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>         at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:951)
>>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
>> 2010-11-19 23:19:44,201 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
>> java.nio.channels.ClosedChannelException
>>         at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>>         at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>         at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
>>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
>> 2010-11-19 23:19:44,303 WARN org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to create /hbase -- check quorum servers, currently=localhost:2181
>> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>>
>> It looks like it doesn't know where my hbase/conf/hbase-site.xml file is.
>> Not sure how this would get passed to the HBaseStorage class?
>>
>> On Nov 19, 2010, at 5:09 PM, Jeff Zhang wrote:
>>
>>> Does the MapReduce job start? Could you check the logs on the Hadoop side?
>>>
>>> On Sat, Nov 20, 2010 at 7:56 AM, Corbin Hoenes <cor...@tynt.com> wrote:
>>>
>>>> We are trying to use the HBaseStorage LoadFunc in Pig 0.8 and getting
>>>> an exception.
>>>>
>>>> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias raw
>>>>         at org.apache.pig.PigServer.openIterator(PigServer.java:754)
>>>>         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
>>>>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
>>>>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>>>>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
>>>>         at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
>>>>         at org.apache.pig.Main.run(Main.java:465)
>>>>         at org.apache.pig.Main.main(Main.java:107)
>>>> Caused by: java.io.IOException: Couldn't retrieve job.
>>>>         at org.apache.pig.PigServer.store(PigServer.java:818)
>>>>         at org.apache.pig.PigServer.openIterator(PigServer.java:728)
>>>>         ... 7 more
>>>>
>>>> Other jobs seem to work.
>>>>
>>>> What are the requirements for getting HBase storage to work? This is what I am doing:
>>>>
>>>> 1 - added the HBase config and Hadoop config to my PIG_CLASSPATH
>>>> 2 - ran this Pig script:
>>>>
>>>> REGISTER ../lib/hbase-0.20.6.jar
>>>>
>>>> raw = LOAD 'hbase://piggytest'
>>>>       USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('content:field1 anchor:field1a anchor:field2a')
>>>>       AS (content_field1, anchor_field1a, anchor_field2a);
>>>>
>>>> dump raw;
>>>>
>>>> What else am I missing?
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang