Just to close the loop on this... I did not have time to experiment with other EMR versions, so I am just going with emr-4.9.2 for the near future, since Pig Phoenix storage works as expected there when running the script from the command line.
However, made an action item for a future date to try submitting the Pig script as an EMR step to see if I get better results.

Thanks,
Steve

On Mon, Aug 21, 2017 at 4:48 PM, Steve Terrell <sterr...@oculus360.us> wrote:

> Thanks for the extra info! Will let everyone know if I solve this.
>
> On Mon, Aug 21, 2017 at 4:24 PM, anil gupta <anilgupt...@gmail.com> wrote:
>
>> And forgot to mention that we invoke our pig scripts through oozie.
>>
>> On Mon, Aug 21, 2017 at 2:20 PM, anil gupta <anilgupt...@gmail.com>
>> wrote:
>>
>>> Sorry, cant share the pig script.
>>> Here is what we are registering:
>>> REGISTER /usr/lib/phoenix/phoenix-4.7.0-HBase-1.2-client.jar;
>>> REGISTER /usr/lib/pig/lib/piggybank.jar;
>>>
>>>
>>> Following is the classpath of Hadoop and Yarn:
>>> [hadoop@ip-52-143 ~]$ hadoop classpath
>>> /etc/hadoop/conf:/usr/lib/hadoop/lib/*:/usr/lib/hadoop/.//*:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/*:/usr/lib/hadoop-hdfs/.//*:/usr/lib/hadoop-yarn/lib/*:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce/lib/*:/usr/lib/hadoop-mapreduce/.//*::/etc/tez/conf:/usr/lib/tez/*:/usr/lib/tez/lib/*:/usr/lib/hadoop-lzo/lib/*:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar:/usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar:/usr/share/aws/emr/kinesis/lib/emr-kinesis-hadoop.jar:/usr/share/aws/emr/cloudwatch-sink/lib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*
>>> [hadoop@ip-52-143 ~]$ yarn classpath
>>> /etc/hadoop/conf:/etc/hadoop/conf:/etc/hadoop/conf:/usr/lib/hadoop/lib/*:/usr/lib/hadoop/.//*:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/*:/usr/lib/hadoop-hdfs/.//*:/usr/lib/hadoop-yarn/lib/*:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce/lib/*:/usr/lib/hadoop-mapreduce/.//*::/etc/tez/conf:/usr/lib/tez/*:/usr/lib/tez/lib/*:/usr/lib/hadoop-lzo/lib/*:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar:/usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar:/usr/share/aws/emr/kinesis/lib/emr-kinesis-hadoop.jar:/usr/share/aws/emr/cloudwatch-sink/lib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-yarn/lib/*
>>>
>>>
>>>
>>> On Mon, Aug 21, 2017 at 11:21 AM, Steve Terrell <sterr...@oculus360.us>
>>> wrote:
>>>
>>>> Hmm... just repeated my test on emr-5.2.0. This time I went with the
>>>> default EMR console selections for master and core nodes (2 of them).
>>>>
>>>> When running my simple Pig Phoenix store script, still getting the
>>>> errors I got for other 5.x.x versions:
>>>> 2017-08-21 17:50:52,431 [ERROR] [main] |app.DAGAppMaster|: Error starting DAGAppMaster
>>>> java.lang.NoSuchMethodError: org.apache.hadoop.yarn.api.records.ContainerId.fromString(Ljava/lang/String;)Lorg/apache/hadoop/yarn/api/records/ContainerId;
>>>> at org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:179)
>>>> at org.apache.tez.dag.app.DAGAppMaster.main(DAGAppMaster.java:2304)
>>>>
>>>> The simple test script:
>>>> REGISTER /usr/lib/phoenix/phoenix-client.jar;
>>>> A = load '/steve/a.txt' as (TXT:chararray);
>>>> store A into 'hbase://A_TABLE' using org.apache.phoenix.pig.PhoenixHBaseStorage('10.0.100.51','-batchSize 2500');
>>>>
>>>> Calling directly from the command line like
>>>> pig try.pig
>>>>
>>>> Maybe other people are calling their Phoenix Pig script some other way
>>>> (EMR steps) or with different parameters? Details where this works would
>>>> really help out a lot.
>>>>
>>>> Thanks,
>>>> Steve
>>>>
>>>> On Mon, Aug 21, 2017 at 10:23 AM, Steve Terrell <sterr...@oculus360.us>
>>>> wrote:
>>>>
>>>>> Anil,
>>>>>
>>>>> That's good news (about 5.2). Any chance I could see your
>>>>>
>>>>> - full pig command line
>>>>> - PIG_CLASSPATH env variable
>>>>> - pig script or at least the REGISTER and PhoenixHBaseStorage()
>>>>> lines?
>>>>>
>>>>> Might help me figure out what I'm doing wrong or differently.
>>>>>
>>>>> One thing I did not mention because I thought it should not matter is
>>>>> that to avoid extra costs while testing, I was only running a master node
>>>>> with no slaves (no task or core nodes). Maybe lack of slaves causes
>>>>> problems not normally seen. Interesting...
>>>>>
>>>>> Thanks so much,
>>>>> Steve
>>>>>
>>>>>
>>>>> On Sun, Aug 20, 2017 at 11:04 AM, anil gupta <anilgupt...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hey Steve,
>>>>>>
>>>>>> We are currently using EMR5.2 and pig-phoenix is working fine for us.
>>>>>> We are gonna try EMR5.8 next week.
>>>>>>
>>>>>> HTH,
>>>>>> Anil
>>>>>>
>>>>>> On Fri, Aug 18, 2017 at 9:00 AM, Steve Terrell <sterr...@oculus360.us>
>>>>>> wrote:
>>>>>>
>>>>>>> More info...
>>>>>>>
>>>>>>> By trial and error, I tested different EMR versions and made a
>>>>>>> little incomplete list of which ones support Pig Phoenix storage and
>>>>>>> which ones don't:
>>>>>>>
>>>>>>> emr-5.8.0   JacksonJaxbJsonProvider error
>>>>>>> emr-5.6.0   JacksonJaxbJsonProvider error
>>>>>>> emr-5.4.0   JacksonJaxbJsonProvider error
>>>>>>> emr-5.3.1   ContainerId.fromString() error
>>>>>>> emr-5.3.0   ContainerId.fromString() error
>>>>>>> emr-5.0.0   ContainerId.fromString() error
>>>>>>> emr-4.9.2   Works!
>>>>>>> emr-4.7.0   Works!
>>>>>>>
>>>>>>> I ran out of time trying to get 5.8.0 working, so will start using
>>>>>>> 4.9.2. But I would like to switch to 5.8.0 if anyone has a solution.
>>>>>>> Meanwhile, I hope this list saves other people some time and headache.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Steve
>>>>>>>
>>>>>>> On Thu, Aug 17, 2017 at 2:40 PM, Steve Terrell <sterr...@oculus360.us>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I'm running EMR 5.8.0 with these applications installed:
>>>>>>>> Pig 0.16.0, Phoenix 4.11.0, HBase 1.3.1
>>>>>>>>
>>>>>>>> Here is my pig script (try.pig):
>>>>>>>>
>>>>>>>> REGISTER /usr/lib/phoenix/phoenix-4.11.0-HBase-1.3-client.jar;
>>>>>>>> A = load '/steve/a.txt' as (TXT:chararray);
>>>>>>>> store A into 'hbase://A_TABLE' using org.apache.phoenix.pig.PhoenixHBaseStorage('10.0.100.51','-batchSize 2500');
>>>>>>>>
>>>>>>>> I run it like this from the command line:
>>>>>>>> pig try.pig
>>>>>>>>
>>>>>>>> When it fails, I dig into the hadoop task logs and find this:
>>>>>>>> 2017-08-17 19:11:37,539 [ERROR] [main] |app.DAGAppMaster|: Error starting DAGAppMaster
>>>>>>>> java.lang.NoClassDefFoundError: org/apache/phoenix/shaded/org/codehaus/jackson/jaxrs/JacksonJaxbJsonProvider
>>>>>>>> at java.lang.ClassLoader.defineClass1(Native Method)
>>>>>>>> at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>>>>>>>> at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>>>>>>> at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>>>>>>>> at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>>>>>> at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.serviceInit(TimelineClientImpl.java:269)
>>>>>>>> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>>>>>>>> at org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService.serviceInit(ATSHistoryLoggingService.java:102)
>>>>>>>> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>>>>>>>> at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>>>>>>>> at org.apache.tez.dag.history.HistoryEventHandler.serviceInit(HistoryEventHandler.java:73)
>>>>>>>> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>>>>>>>> at org.apache.tez.dag.app.DAGAppMaster.initServices(DAGAppMaster.java:1922)
>>>>>>>> at org.apache.tez.dag.app.DAGAppMaster.serviceInit(DAGAppMaster.java:624)
>>>>>>>> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>>>>>>>> at org.apache.tez.dag.app.DAGAppMaster$8.run(DAGAppMaster.java:2557)
>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>> at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>>>>>>>> at org.apache.tez.dag.app.DAGAppMaster.initAndStartAppMaster(DAGAppMaster.java:2554)
>>>>>>>> at org.apache.tez.dag.app.DAGAppMaster.main(DAGAppMaster.java:2359)
>>>>>>>> Caused by: java.lang.ClassNotFoundException: org.apache.phoenix.shaded.org.codehaus.jackson.jaxrs.JacksonJaxbJsonProvider
>>>>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>>>>>> ... 28 more
>>>>>>>>
>>>>>>>> Has anyone been able to get
>>>>>>>> org.apache.phoenix.pig.PhoenixHBaseStorage()
>>>>>>>> to work on recent EMR versions? Please help if you can.
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>> Steve
>>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Thanks & Regards,
>>>>>> Anil Gupta
>>>>>>
>>>
>>> --
>>> Thanks & Regards,
>>> Anil Gupta
>>>
>>
>> --
>> Thanks & Regards,
>> Anil Gupta
>>
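
A note on the two failures quoted above: a NoSuchMethodError on ContainerId.fromString and a NoClassDefFoundError on a shaded Jackson class are the kind of error that can show up when the classpath supplies a different copy of a class than the caller expects, or no copy at all. This thread never confirms the root cause, so treat that as a guess. One way to check it is to list which jars on a classpath bundle a given class. The following is a minimal Python 3 sketch of that check, not a definitive tool; the function name jars_containing is made up for illustration, and the only input assumed is a colon-separated classpath string like the `hadoop classpath` output shown earlier in the thread.

```python
import glob
import os
import zipfile


def jars_containing(class_name, classpath):
    """Return the jars named by a colon-separated classpath that bundle class_name.

    Hypothetical helper for illustration; not part of Hadoop, Pig, or Phoenix.
    """
    # A class a.b.C is stored inside a jar as the zip entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    seen = set()
    for element in classpath.split(":"):
        if element.endswith("*"):
            # Wildcard entries such as /usr/lib/hadoop/lib/* stand for every jar in that directory.
            candidates = sorted(glob.glob(element + ".jar"))
        elif element.endswith(".jar"):
            candidates = [element]
        else:
            # Empty elements and plain directories (conf dirs) are skipped in this sketch.
            continue
        for jar in candidates:
            if jar in seen or not os.path.isfile(jar):
                continue
            seen.add(jar)
            try:
                with zipfile.ZipFile(jar) as archive:
                    if entry in archive.namelist():
                        hits.append(jar)
            except zipfile.BadZipFile:
                # Not a readable zip archive; ignore it rather than abort the scan.
                continue
    return hits
```

For example, passing the `hadoop classpath` output with `:/usr/lib/phoenix/phoenix-client.jar` appended, together with the class name `org.apache.hadoop.yarn.api.records.ContainerId`, would show whether the Phoenix client jar carries its own copy of that class next to the Hadoop one. Whether it does on any particular EMR release is not something this thread establishes.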