Re: Avro SerDe Issue w/ Manual Partitions?

2016-03-06 Thread Chris Miller
For anyone running into this same issue, it looks like Avro deserialization is just broken when used with SparkSQL and partitioned schemas. I created a bug report with details and a simplified example of how to reproduce it: https://issues.apache.org/jira/browse/SPARK-13709 -- Chris Miller On

Re: Avro SerDe Issue w/ Manual Partitions?

2016-03-03 Thread Chris Miller
One more thing -- just to set aside any question about my specific schema or data, I used the sample schema and data record from Oracle's documentation on Avro support. It's a pretty simple schema: https://docs.oracle.com/cd/E26161_02/html/GettingStartedGuide/jsonbinding-overview.html When I

Re: Avro SerDe Issue w/ Manual Partitions?

2016-03-03 Thread Chris Miller
No, the name of the field is *enum1* -- the name of the field's type is *enum1_values*. It should not be looking for enum1_values -- that's not the way the specification states that the standard works, and it's not how any other implementation reads Avro data. For what it's worth, if I change
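
To illustrate the point about field names versus type names, here is a minimal sketch in plain Python (standard library only, not the Avro library or the Hive SerDe). It assumes a schema shaped like the one discussed in this thread: only the field name enum1, the type name enum1_values, the namespace com.cmiller, and the record { "foo1": "test123", "enum1": "BLUE" } come from the messages; the record name test2 and the extra enum symbols are hypothetical. The helper check_record is likewise made up for illustration and handles only the two types used here.

```python
import json

# Assumed schema. Only "enum1", "enum1_values", "com.cmiller" and the
# symbol "BLUE" appear in the thread; the record name "test2" and the
# other symbols are hypothetical.
SCHEMA = json.loads("""
{
  "namespace": "com.cmiller",
  "name": "test2",
  "type": "record",
  "fields": [
    {"name": "foo1", "type": "string"},
    {"name": "enum1",
     "type": {"type": "enum", "name": "enum1_values",
              "symbols": ["BLUE", "RED", "GREEN"]}}
  ]
}
""")

# The data record quoted in the thread.
RECORD = {"foo1": "test123", "enum1": "BLUE"}


def check_record(schema, record):
    """Return a list of problems; an empty list means the record matches.

    Fields are looked up by the field's "name", never by the name of the
    field's type. Only "string" and enum types are handled in this sketch.
    """
    problems = []
    for field in schema["fields"]:
        name = field["name"]
        ftype = field["type"]
        if name not in record:
            problems.append(f"missing field {name!r}")
            continue
        value = record[name]
        if ftype == "string":
            if not isinstance(value, str):
                problems.append(f"field {name!r} is not a string")
        elif isinstance(ftype, dict) and ftype.get("type") == "enum":
            if value not in ftype["symbols"]:
                problems.append(f"field {name!r} has unknown symbol {value!r}")
    return problems


if __name__ == "__main__":
    # Keyed by field name: matches the schema.
    print(check_record(SCHEMA, RECORD))
    # Keyed by type name: the field "enum1" is reported as missing.
    print(check_record(SCHEMA, {"foo1": "test123", "enum1_values": "BLUE"}))
```

Under these assumptions, the record keyed by enum1 passes, while a record keyed by enum1_values fails with a missing-field problem, which is the behavior the Avro specification describes for record fields.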

Re: Avro SerDe Issue w/ Manual Partitions?

2016-03-03 Thread Igor Berman
Your field name is *enum1_values*, but your data is { "foo1": "test123", *"enum1"*: "BLUE" }, i.e. since you defined an enum and not union(null, enum), it tries to find a value for enum1_values and doesn't find one... On 3 March 2016 at 11:30, Chris Miller wrote: > I've been

Re: Avro SerDe Issue w/ Manual Partitions?

2016-03-03 Thread Chris Miller
I've been digging into this a little deeper. Here's what I've found: test1.avsc: { "namespace": "com.cmiller", "name": "test1", "type": "record", "fields": [ { "name":"foo1", "type":"string" } ] } test2.avsc: {

Avro SerDe Issue w/ Manual Partitions?

2016-03-02 Thread Chris Miller
Hi, I have a strange issue occurring when I use manual partitions. If I create a table as follows, I am able to query the data with no problem: CREATE TABLE test1 ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT