[ https://issues.apache.org/jira/browse/PIG-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13635902#comment-13635902 ]

Daniel Dai commented on PIG-2786:
---------------------------------

[~ndimiduk], I tried your patch. The frontend is OK, but on the backend I get:
{code}
Error: java.lang.ClassNotFoundException: com.google.protobuf.Message
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        at org.apache.hadoop.hbase.io.HbaseObjectWritable.<clinit>(HbaseObjectWritable.java:265)
        at org.apache.hadoop.hbase.ipc.Invocation.write(Invocation.java:139)
        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.sendParam(HBaseClient.java:612)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:975)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
        at $Proxy7.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138)
        at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:990)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:885)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:987)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:889)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:846)
        at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:234)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:174)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:133)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:201)
        at org.apache.pig.backend.hadoop.hbase.HBaseStorage.getOutputFormat(HBaseStorage.java:843)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.getCommitters(PigOutputCommitter.java:89)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.<init>(PigOutputCommitter.java:67)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:279)
        at org.apache.hadoop.mapred.Task.initialize(Task.java:515)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
{code}

The issue is that we need to ship protobuf.jar and the zookeeper jar to the 
backend. This can be done through "-Dpig.additional.jars". 
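
As a rough sketch of that workaround, something like the following could build the jar list and pass it to Pig. The jar paths and the script name (myscript.pig) are placeholders, not taken from the patch, and this assumes pig.additional.jars accepts a colon-separated list of local jar paths.

```shell
# Hypothetical jar locations; adjust to the local HBase/ZooKeeper install.
PROTOBUF_JAR=/usr/lib/hbase/lib/protobuf-java.jar
ZOOKEEPER_JAR=/usr/lib/zookeeper/zookeeper.jar

# Assumption: pig.additional.jars takes a colon-separated list of jars
# that Pig ships to the backend along with the job.
ADDITIONAL_JARS="${PROTOBUF_JAR}:${ZOOKEEPER_JAR}"

# Print the command instead of running it, so the sketch is safe to paste.
echo pig -Dpig.additional.jars="${ADDITIONAL_JARS}" myscript.pig
```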

But shipping more jars every time seems to add unnecessary overhead for 
non-HBase Pig jobs. I would suggest either:
1. Use the flag "-useHBase", which is similar to "-useHCatalog"
2. Enrich the LoadFunc/StoreFunc to ship jars automatically if needed

I prefer #2 since it would solve all similar problems. It may need some effort 
but would pay off in the long term.
                
> enhance Pig launcher script wrt. HBase integration
> --------------------------------------------------
>
>                 Key: PIG-2786
>                 URL: https://issues.apache.org/jira/browse/PIG-2786
>             Project: Pig
>          Issue Type: Improvement
>          Components: grunt
>    Affects Versions: 0.10.0
>            Reporter: Roman Shaposhnik
>            Assignee: Roman Shaposhnik
>            Priority: Minor
>              Labels: hbase
>         Attachments: 0001-PIG-2786-launch-script-should-locate-HBase.patch
>
>
> The current bin/pig script suffers from a couple of issues as far as 
> integration with HBase is concerned:
>   # it only detects ZK/HBase jars under a PIG_HOME/share/.. layout
>   # it doesn't detect HBase dependencies
> The proposal here would be to ask HBase itself for its classpath

