[ 
https://issues.apache.org/jira/browse/AVRO-465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845405#action_12845405
 ] 

Matt Massie commented on AVRO-465:
----------------------------------

Jeff-

I'd like to understand this problem a little better.  The C implementation 
shouldn't require you to know the file's schema ahead of time.

If you pass in NULL for the reader's schema, then the writer's schema will be 
used.  This is a documentation bug since I don't explicitly explain this 
anywhere.

Can you please try to read the data file with the reader's schema set to NULL?

Btw, the relevant code is in datum_read.c, around line 303:

{code}
if (readers_schema == NULL) {
     readers_schema = writers_schema;
} else if (!avro_schema_match(writers_schema, readers_schema)) {
     return EINVAL;
}
{code}
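
To make the fallback concrete, here is a rough, self-contained sketch of that 
same branch. It is not the library's code: toy_schema_t, toy_schema_match() 
and toy_pick_schema() are hypothetical stand-ins I made up for illustration, 
and the toy match only compares type names, whereas the real 
avro_schema_match() does considerably more.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an Avro schema: only a type name. */
typedef struct {
    const char *type_name;
} toy_schema_t;

/* Toy compatibility check (assumption): two schemas match when their
 * type names are equal. The real avro_schema_match() is more involved. */
static int toy_schema_match(const toy_schema_t *writers_schema,
                            const toy_schema_t *readers_schema)
{
    return strcmp(writers_schema->type_name, readers_schema->type_name) == 0;
}

/* Mirrors the branch quoted from datum_read.c: a NULL reader's schema
 * falls back to the writer's schema; a non-matching one yields EINVAL.
 * On success, *effective points at the schema that would be used. */
static int toy_pick_schema(const toy_schema_t *writers_schema,
                           const toy_schema_t *readers_schema,
                           const toy_schema_t **effective)
{
    assert(writers_schema != NULL && effective != NULL);

    if (readers_schema == NULL) {
        readers_schema = writers_schema;
    } else if (!toy_schema_match(writers_schema, readers_schema)) {
        return EINVAL;
    }

    *effective = readers_schema;
    return 0;
}
```

Calling toy_pick_schema() with NULL as the reader's schema succeeds and hands 
back the writer's schema, which is the behaviour I'm asking you to try against 
the real data file.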


> C implementation requires you to know a file's schema before reading
> --------------------------------------------------------------------
>
>                 Key: AVRO-465
>                 URL: https://issues.apache.org/jira/browse/AVRO-465
>             Project: Avro
>          Issue Type: Bug
>          Components: c
>    Affects Versions: 1.3.0
>            Reporter: Jeff Hodges
>         Attachments: AVRO-465-schema_for_reader.patch
>
>
> The C implementation gives the user no way of reading the objects in a data 
> file without knowing the file's schema ahead of time.
> While it does fill in the writers_schema part of the avro_file_reader_t on 
> read, this field is not available to the API as it is left out of avro.h. Two 
> options present themselves: 1) preserve the API as is and add an 
> avro_schema_from_file_reader() function or 2) move the avro_file_reader_t and 
> avro_file_writer_t structs to avro.h.
> A third option, that I don't approve of, is providing a function that reads 
> from a datafile but uses the writers_schema in the reader given instead of 
> requiring another schema to be passed into it. This is problematic because 
> anyone using the API would have fewer debugging and testing options when 
> dealing with interop datasets. Any problem that occurs might just be the 
> schema in the file being off, or whatever.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
