[ 
https://issues.apache.org/jira/browse/AVRO-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875513#comment-16875513
 ] 

Kengo Seki edited comment on AVRO-2429 at 6/29/19 1:40 PM:
-----------------------------------------------------------

[~Fokko] The file I attached is a valid Avro file, as the following output shows:
{code:java}
$ java -jar avro-tools-1.9.0.jar getschema /tmp/uuid.avro 
19/06/29 22:25:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
{
  "type" : "record",
  "name" : "foo",
  "fields" : [ {
    "name" : "bar",
    "type" : {
      "type" : "string",
      "logicalType" : "uuid"
    }
  } ]
}
$ java -jar avro-tools-1.9.0.jar tojson /tmp/uuid.avro 
19/06/29 22:25:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
{"bar":"ee5ae08a-9a70-11e9-af01-47be1a54dc74"}
{code}
However, reading it with the latest Python 2 bindings raises a SchemaParseException:
{code:java}
$ python -c "from avro.datafile import *; from avro.io import *; DataFileReader(open('/tmp/uuid.avro'), DatumReader())"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/sekikn/repo/avro/lang/py/src/avro/datafile.py", line 257, in __init__
    self.datum_reader.writers_schema = schema.parse(self.get_meta(SCHEMA_KEY))
  File "/home/sekikn/repo/avro/lang/py/src/avro/schema.py", line 986, in parse
    return make_avsc_object(json_data, names)
  File "/home/sekikn/repo/avro/lang/py/src/avro/schema.py", line 941, in make_avsc_object
    return RecordSchema(name, namespace, fields, names, type, doc, other_props)
  File "/home/sekikn/repo/avro/lang/py/src/avro/schema.py", line 754, in __init__
    field_objects = RecordSchema.make_field_objects(fields, names)
  File "/home/sekikn/repo/avro/lang/py/src/avro/schema.py", line 720, in make_field_objects
    other_props)
  File "/home/sekikn/repo/avro/lang/py/src/avro/schema.py", line 384, in __init__
    raise SchemaParseException(fail_msg)
avro.schema.SchemaParseException: Type property "{u'logicalType': u'uuid', u'type': u'string'}" not a valid Avro schema: Currently does not support uuid logical type
{code}
[According to the 
specification|http://avro.apache.org/docs/1.9.0/spec.html#Logical+Types], 
falling back to the underlying type when a logical type is unknown seems better 
than raising an exception.
{quote}Language implementations must ignore unknown logical types when reading, 
and should use the underlying Avro type.
{quote}
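
The fallback the specification describes could look roughly like the sketch below. It is only an illustration and not the actual avro.schema code: the function name strip_unknown_logical_types and the KNOWN_LOGICAL_TYPES set are hypothetical, and it assumes "decimal" is the only logical type the reader understands. It uses only the standard json module and drops the "logicalType" annotation from any type it does not recognize, so the underlying type is what remains.

```python
import json

# Assumption: the only logical type this hypothetical reader supports.
KNOWN_LOGICAL_TYPES = {"decimal"}

# Schema attributes that can themselves contain nested schemas.
NESTED_KEYS = ("type", "items", "values", "fields")


def strip_unknown_logical_types(schema):
    """Return a copy of a JSON-decoded schema without unknown logical types.

    A node such as {"type": "string", "logicalType": "uuid"} keeps its
    underlying type and loses only the "logicalType" annotation.
    """
    if isinstance(schema, list):
        # A union, or the list of fields of a record.
        return [strip_unknown_logical_types(item) for item in schema]
    if not isinstance(schema, dict):
        # A primitive type name such as "string".
        return schema
    result = {}
    for key, value in schema.items():
        if key == "logicalType" and value not in KNOWN_LOGICAL_TYPES:
            continue  # ignore the unknown logical type
        if key in NESTED_KEYS:
            value = strip_unknown_logical_types(value)
        result[key] = value
    return result


# The schema reported by getschema for the attached file.
SCHEMA_JSON = """
{"type": "record", "name": "foo",
 "fields": [{"name": "bar",
             "type": {"type": "string", "logicalType": "uuid"}}]}
"""

cleaned = strip_unknown_logical_types(json.loads(SCHEMA_JSON))
print(json.dumps(cleaned))
```

With this approach the "bar" field would be read as a plain string, while a "decimal" annotation would be left untouched.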


> Avro 1.9.0 fails when reading logical types other than "decimal"
> ----------------------------------------------------------------
>
>                 Key: AVRO-2429
>                 URL: https://issues.apache.org/jira/browse/AVRO-2429
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: python
>    Affects Versions: 1.9.0
>            Reporter: Chamikara Jayalath
>            Priority: Major
>         Attachments: uuid.avro
>
>
> [https://github.com/apache/avro/pull/82] added support for the Avro "decimal" 
> logical type, but it also added an assertion that causes a reader to fail on 
> other logical types. 
> [https://github.com/apache/avro/blob/master/lang/py/src/avro/schema.py#L821]
> I believe this is a regression, since the Avro library previously read the 
> underlying primitive type instead of failing.
> Can we restore the old (Avro 1.8.1) behavior for logical types other than 
> "decimal" by removing this assertion, so that the primitive type is returned?
>  
> cc: [~Fokko] [~mtth]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
