A little bit more searching shows this: http://www.harshj.com/2010/04/25/writing-and-reading-avro-data-files-using-python/
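
For what it's worth, the byte named in the UnicodeDecodeError quoted below, 0xa0, is a non-breaking space in Latin-1 and is not a valid UTF-8 start byte. That suggests the writer put non-UTF-8 bytes into a string field, whereas Avro strings are meant to be UTF-8. Here is a minimal sketch of how one might confirm that kind of problem on a raw value. The sample bytes and the `decode_report` helper are hypothetical, and the Latin-1 fallback is only an assumption about where the data came from (it could just as well be Windows-1252 or binary data). It is written for Python 3.

```python
def decode_report(data):
    """Try UTF-8 first; on failure, return a Latin-1 decoding plus
    the position and value of the first offending byte."""
    try:
        return data.decode("utf-8"), None
    except UnicodeDecodeError as err:
        bad = data[err.start:err.start + 1]
        # Latin-1 maps every byte to a code point, so this never fails,
        # but it is only a guess at the original encoding.
        return data.decode("latin-1"), (err.start, bad)


# Hypothetical field value, not taken from the real file.
raw = b"price:\xa0100"
text, problem = decode_report(raw)
print(repr(text), problem)
```

If this is what is going on, the more durable fix is probably on the writing side: make sure the strings are UTF-8 before they are handed to the Avro writer.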

On Thu, Feb 2, 2012 at 2:48 PM, Russell Jurney <russell.jur...@gmail.com> wrote:

> The jars being used are:
>
> REGISTER /me/pig/build/ivy/lib/Pig/avro-1.5.3.jar
> REGISTER /me/pig/build/ivy/lib/Pig/json-simple-1.1.jar
> REGISTER /me/pig/contrib/piggybank/java/piggybank.jar
> REGISTER /me/pig/build/ivy/lib/Pig/jackson-core-asl-1.7.3.jar
> REGISTER /me/pig/build/ivy/lib/Pig/jackson-mapper-asl-1.7.3.jar
>
> On Thu, Feb 2, 2012 at 2:41 PM, James Baldassari <jbaldass...@gmail.com> wrote:
>
>> HI Russell,
>>
>> I'm not sure about the Python error, but the Java error looks like a
>> classpath problem, not a schema parsing issue. The NoSuchMethodError in
>> the stack trace indicates that Avro was trying to invoke a method in the
>> Jackson library that wasn't present at run-time. My guess is that your
>> program (or Pig?) either has two incompatible versions of the Jackson
>> library on its classpath or maybe Avro's Jackson dependency has been
>> excluded and a version that is incompatible with Avro is on the classpath.
>>
>> Which version of Avro is being used? Running 'mvn dependency:tree' in
>> Avro trunk I see that it's depending on Jackson 1.8.6. Can you verify that
>> only one version of Jackson is on the classpath and that it's the version
>> that is required by whatever version of Avro is on the classpath?
>>
>> -James
>>
>> On Thu, Feb 2, 2012 at 5:21 PM, Russell Jurney <russell.jur...@gmail.com> wrote:
>>
>>> Correction: when I read the file in Python, I get the error below. It
>>> looks like a unicode problem? Can one tell Avro how to handle this?
>>>
>>> Traceback (most recent call last):
>>>   File "./cat_avro", line 21, in <module>
>>>     for record in df_reader:
>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/datafile.py", line 354, in next
>>>     datum = self.datum_reader.read(self.datum_decoder)
>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 445, in read
>>>     return self.read_data(self.writers_schema, self.readers_schema, decoder)
>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 490, in read_data
>>>     return self.read_record(writers_schema, readers_schema, decoder)
>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 690, in read_record
>>>     field_val = self.read_data(field.type, readers_field.type, decoder)
>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 488, in read_data
>>>     return self.read_union(writers_schema, readers_schema, decoder)
>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 654, in read_union
>>>     return self.read_data(selected_writers_schema, readers_schema, decoder)
>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 458, in read_data
>>>     return self.read_data(writers_schema, s, decoder)
>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 468, in read_data
>>>     return decoder.read_utf8()
>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 233, in read_utf8
>>>     return unicode(self.read_bytes(), "utf-8")
>>> UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position 543: invalid start byte
>>>
>>> On Thu, Feb 2, 2012 at 2:06 PM, Russell Jurney <russell.jur...@gmail.com> wrote:
>>>
>>>> I am writing Avro records in Ruby using the avro ruby gem in 1.8.7. I
>>>> have problems with loading these files sometimes. As a result, I am unable
>>>> to write large files that are readable.
>>>>
>>>> The exception I get is below. Anyone have an idea what this means? It
>>>> looks like Avro is having trouble parsing the schema. The avro files parse
>>>> in Ruby and Python, just not Pig. Are there more rigorous checks in Java?
>>>>
>>>> Pig Stack Trace
>>>> ---------------
>>>> ERROR 2998: Unhandled internal error. org.codehaus.jackson.JsonFactory.enable(Lorg/codehaus/jackson/JsonParser$Feature;)Lorg/codehaus/jackson/JsonFactory;
>>>>
>>>> java.lang.NoSuchMethodError: org.codehaus.jackson.JsonFactory.enable(Lorg/codehaus/jackson/JsonParser$Feature;)Lorg/codehaus/jackson/JsonFactory;
>>>>     at org.apache.avro.Schema.<clinit>(Schema.java:82)
>>>>     at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.<clinit>(AvroStorageUtils.java:49)
>>>>     at org.apache.pig.piggybank.storage.avro.AvroStorage.getAvroSchema(AvroStorage.java:163)
>>>>     at org.apache.pig.piggybank.storage.avro.AvroStorage.getAvroSchema(AvroStorage.java:144)
>>>>     at org.apache.pig.piggybank.storage.avro.AvroStorage.getSchema(AvroStorage.java:269)
>>>>     at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:150)
>>>>     at org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:109)
>>>>     at org.apache.pig.newplan.logical.visitor.LineageFindRelVisitor.visit(LineageFindRelVisitor.java:100)
>>>>     at org.apache.pig.newplan.logical.relational.LOLoad.accept(LOLoad.java:218)
>>>>     at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>>>>     at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
>>>>     at org.apache.pig.newplan.logical.visitor.CastLineageSetter.<init>(CastLineageSetter.java:57)
>>>>     at org.apache.pig.PigServer$Graph.compile(PigServer.java:1679)
>>>>     at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1610)
>>>>     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1582)
>>>>     at org.apache.pig.PigServer.registerQuery(PigServer.java:584)
>>>>     at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
>>>>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
>>>>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
>>>>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
>>>>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
>>>>     at org.apache.pig.Main.run(Main.java:495)
>>>>     at org.apache.pig.Main.main(Main.java:111)
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>>
>>>> ================================================================================
>>>>
>>>> --
>>>> Russell Jurney
>>>> twitter.com/rjurney
>>>> russell.jur...@gmail.com
>>>> datasyndrome.com
>>>
>>> --
>>> Russell Jurney
>>> twitter.com/rjurney
>>> russell.jur...@gmail.com
>>> datasyndrome.com
>
> --
> Russell Jurney
> twitter.com/rjurney
> russell.jur...@gmail.com
> datasyndrome.com

--
Russell Jurney
twitter.com/rjurney
russell.jur...@gmail.com
datasyndrome.com
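
One way to follow up on James's question about duplicate Jackson versions is to list which jars actually contain `org.codehaus.jackson.JsonFactory`, the class named in the NoSuchMethodError. The sketch below does that by treating each jar as a zip archive. `jars_containing` is a hypothetical helper, and which jars to pass in is an assumption: presumably everything REGISTERed above plus whatever jar Pig itself is launched from, since that could carry its own copy of Jackson (not confirmed in this thread).

```python
import sys
import zipfile


def jars_containing(class_name, jar_paths):
    """Return the jars (in input order) that contain the given class."""
    # A class a.b.C is stored in a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for path in jar_paths:
        with zipfile.ZipFile(path) as jar:
            if entry in jar.namelist():
                hits.append(path)
    return hits


if __name__ == "__main__":
    # Usage: python find_class.py first.jar second.jar ...
    for hit in jars_containing("org.codehaus.jackson.JsonFactory", sys.argv[1:]):
        print(hit)
```

More than one hit would be consistent with the two-versions-on-the-classpath explanation, though it would not by itself show which copy gets loaded first.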