[jira] Assigned: (AVRO-517) Resolving Decoder fails in some cases
[ https://issues.apache.org/jira/browse/AVRO-517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvalluvan M. G. reassigned AVRO-517:
----------------------------------------

    Assignee: Thiruvalluvan M. G.

> Resolving Decoder fails in some cases
> -------------------------------------
>
>                 Key: AVRO-517
>                 URL: https://issues.apache.org/jira/browse/AVRO-517
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.3.2
>            Reporter: Scott Carey
>            Assignee: Thiruvalluvan M. G.
>            Priority: Critical
>
> User reports that reading an 'actual' schema of
>     string, string, int
> fails when using an expected schema of:
>     string, string
> Sample code and details in the comments.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
Questions re integrating Avro into Cascading process
Hi all,

We're looking at creating a Cascading Scheme for Avro, and have a few questions below. These are very general, as this is more of a scoping phase (as in, are we crazy to try this?), so apologies in advance for the lack of detail.

For context, Cascading is an open source project that provides a workflow API on top of Hadoop. The key unit of data is a tuple, which corresponds to a record - you have fields (names) and values. Cascading uses a generalized "tap" concept for reading & writing tuples, where a tap uses a scheme to handle the low-level mapping from Cascading-land to/from the storage format.

So the goal here is to define a Cascading Scheme that will run on 0.18.3 and later versions of Hadoop, and provide general support for reading/writing tuples from/to an Avro-format Hadoop part-x file. We grabbed the recently committed AvroXXX code from org.apache.avro.mapred (thanks Doug & Scott), and began building the Cascading scheme to bridge between AvroWrapper keys and Cascading tuples.

1. What's the best approach if we want to dynamically define the Avro schema, based on a list of field names and types (classes)? This assumes it's possible to dynamically define & use a schema, of course.

2. How much has the new Hadoop map-reduce support code been tested?

3. Will there be issues with running on 0.18.3, 0.19.2, etc.? I saw some discussion about Hadoop using the older Jackson 1.0.1 jar, and that then creating problems. Anything else?

4. The key integration point, besides the fields+classes-to-schema issue above, is mapping between Cascading tuples and AvroWrapper. If we're using (I assume) the generic format, any input on how we'd do this two-way conversion?

Thanks!

-- Ken

Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g
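[Regarding question 1: the generic API does let you build a record schema programmatically rather than parsing JSON. A minimal sketch of one way to do it — the class name, the field-name-to-class map, and the type-mapping helper here are all hypothetical, not part of any Cascading or Avro API:]

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;

public class DynamicSchemaSketch {

  // Map a Java class onto an Avro primitive type; only a few cases shown.
  static Schema.Type toAvroType(Class<?> c) {
    if (c == String.class) return Schema.Type.STRING;
    if (c == Integer.class || c == int.class) return Schema.Type.INT;
    if (c == Long.class || c == long.class) return Schema.Type.LONG;
    if (c == Double.class || c == double.class) return Schema.Type.DOUBLE;
    if (c == Boolean.class || c == boolean.class) return Schema.Type.BOOLEAN;
    throw new IllegalArgumentException("unmapped class: " + c);
  }

  // Build a record schema from an ordered map of field name -> class.
  static Schema schemaFor(String recordName, Map<String, Class<?>> fields) {
    Schema record = Schema.createRecord(recordName, null, null, false);
    List<Schema.Field> avroFields = new ArrayList<Schema.Field>();
    for (Map.Entry<String, Class<?>> e : fields.entrySet()) {
      avroFields.add(new Schema.Field(
          e.getKey(), Schema.create(toAvroType(e.getValue())), null, null));
    }
    record.setFields(avroFields);
    return record;
  }

  public static void main(String[] args) {
    Map<String, Class<?>> fields = new LinkedHashMap<String, Class<?>>();
    fields.put("url", String.class);
    fields.put("fetchTime", Long.class);
    Schema s = schemaFor("CascadingTuple", fields);
    System.out.println(s);

    // A GenericData.Record built against that schema could then be
    // populated from a Cascading tuple's values, field by field.
    GenericData.Record record = new GenericData.Record(s);
    record.put("url", "http://example.com");
    record.put("fetchTime", 1234L);
    System.out.println(record);
  }
}
```

[The reverse direction (Avro record back to tuple) would presumably iterate the schema's fields in order and pull each value out of the record by name.]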
[jira] Created: (AVRO-518) make check in c++ is broken because of typo & missing boost_filesystem library
make check in c++ is broken because of typo & missing boost_filesystem library
------------------------------------------------------------------------------

                 Key: AVRO-518
                 URL: https://issues.apache.org/jira/browse/AVRO-518
             Project: Avro
          Issue Type: Bug
          Components: c++
         Environment: linux w/boost 1.42
            Reporter: John Plevyak
         Attachments: avro-cpp-buffer-jp-v1.patch

"make check" in c++ is broken because of typo & missing boost_filesystem library.

The typo is inverting BOOST and HAVE in api/buffer/detail/BufferDetailIterator.hh:

-#ifdef BOOST_HAVE_ASIO
+#ifdef HAVE_BOOST_ASIO

The missing library requires adding a new m4 macro. I will include a patch.
[jira] Updated: (AVRO-518) make check in c++ is broken because of typo & missing boost_filesystem library
[ https://issues.apache.org/jira/browse/AVRO-518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Plevyak updated AVRO-518:
------------------------------

    Attachment: avro-cpp-buffer-jp-v1.patch

minimal patch to get "make check" to work on trunk

> make check in c++ is broken because of typo & missing boost_filesystem library
> ------------------------------------------------------------------------------
>
>                 Key: AVRO-518
>                 URL: https://issues.apache.org/jira/browse/AVRO-518
>             Project: Avro
>          Issue Type: Bug
>          Components: c++
>         Environment: linux w/boost 1.42
>            Reporter: John Plevyak
>         Attachments: avro-cpp-buffer-jp-v1.patch
>
>
> "make check" in c++ is broken because of typo & missing boost_filesystem
> library.
> The typo is inverting BOOST and HAVE in
> api/buffer/detail/BufferDetailIterator.hh
> -#ifdef BOOST_HAVE_ASIO
> +#ifdef HAVE_BOOST_ASIO
> The missing library requires adding a new m4 macro.
> I will include a patch.
[jira] Commented: (AVRO-517) Resolving Decoder fails in some cases
[ https://issues.apache.org/jira/browse/AVRO-517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857417#action_12857417 ]

Scott Carey commented on AVRO-517:
----------------------------------

Sample code that shows this issue:

{code}
import java.io.File;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericData.Record;
import org.apache.avro.util.Utf8;

public class AddressBook {

  String fileName = "AddressBook.db";

  String prefix = "{\"type\":\"record\",\"name\": \"Person\",\"fields\":[";
  String suffix = "]}";
  String fieldFirst = "{\"name\":\"First\",\"type\":\"string\"}";
  String fieldLast = "{\"name\":\"Last\",\"type\":\"string\"}";
  String fieldAge = "{\"name\":\"Age\",\"type\":\"int\"}";

  Schema personSchema = Schema.parse(prefix + fieldFirst + "," + fieldLast + "," + fieldAge + suffix);
  Schema ageSchema = Schema.parse(prefix + fieldAge + suffix);
  Schema extractSchema = Schema.parse(prefix + fieldFirst + "," + fieldLast + suffix);

  /**
   * @param args
   * @throws IOException
   */
  public static void main(String[] args) throws IOException {
    AddressBook ab = new AddressBook();
    ab.init();
    ab.browseAge();
    ab.browseName();
  }

  public void init() throws IOException {
    DataFileWriter writer = new DataFileWriter(
        new GenericDatumWriter(personSchema)).create(personSchema, new File(fileName));
    try {
      writer.append(createPerson("Dante", "Hicks", 27));
      writer.append(createPerson("Randal", "Graves", 20));
      writer.append(createPerson("Steve", "Jobs", 31));
    } finally {
      writer.close();
    }
  }

  private Record createPerson(String first, String last, int age) {
    Record person = new GenericData.Record(personSchema);
    person.put("First", new Utf8(first));
    person.put("Last", new Utf8(last));
    person.put("Age", age);
    return person;
  }

  public void browseAge() throws IOException {
    GenericDatumReader dr = new GenericDatumReader();
    dr.setExpected(ageSchema);
    DataFileReader reader = new DataFileReader(new File(fileName), dr);
    try {
      while (reader.hasNext()) {
        Record person = reader.next();
        System.out.println(person.get("Age").toString());
      }
    } finally {
      reader.close();
    }
  }

  public void browseName() throws IOException {
    GenericDatumReader dr = new GenericDatumReader();
    dr.setExpected(extractSchema);
    DataFileReader reader = new DataFileReader(new File(fileName), dr);
    try {
      while (reader.hasNext()) {
        Record person = reader.next();
        System.out.println(person.get("First").toString() + " " + person.get("Last").toString() + "\t");
      }
    } finally {
      reader.close();
    }
  }
}
{code}

User comments:

{quote}
Hi,

27
20
31
Dante Hicks
Exception in thread "main" org.apache.avro.AvroRuntimeException: java.io.EOFException
	at org.apache.avro.file.DataFileStream.next(DataFileStream.java:184)
	at cn.znest.test.avro.AddressBook.browseName(AddressBook.java:91)
	at cn.znest.test.avro.AddressBook.main(AddressBook.java:43)
Caused by: java.io.EOFException
	at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:163)
	at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:262)
	at org.apache.avro.io.ValidatingDecoder.readString(ValidatingDecoder.java:93)
	at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:277)
	at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:271)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:83)
	at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:105)
	at org.apach
[jira] Created: (AVRO-517) Resolving Decoder fails in some cases
Resolving Decoder fails in some cases
-------------------------------------

                 Key: AVRO-517
                 URL: https://issues.apache.org/jira/browse/AVRO-517
             Project: Avro
          Issue Type: Bug
          Components: java
    Affects Versions: 1.3.2
            Reporter: Scott Carey
            Priority: Critical

User reports that reading an 'actual' schema of

    string, string, int

fails when using an expected schema of:

    string, string

Sample code and details in the comments.