[ https://issues.apache.org/jira/browse/DRILL-4056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jacques Nadeau resolved DRILL-4056.
-----------------------------------
    Resolution: Fixed

Resolved in 45d0326ccbf9bad8936374174116ae8e17461cb0

> Avro deserialization corrupts data
> ----------------------------------
>
>                 Key: DRILL-4056
>                 URL: https://issues.apache.org/jira/browse/DRILL-4056
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Other
>    Affects Versions: 1.3.0
>         Environment: Ubuntu 15.04 - Oracle Java
>            Reporter: Stefán Baxter
>            Assignee: Jason Altekruse
>             Fix For: 1.3.0
>
>         Attachments: test.zip
>
>
> I have an Avro file that supports the following data/schema:
> {"field":"some", "classification":{"variant":"Gæst"}}
> When I select 10 rows from this file I get:
> +---------------------+
> | EXPR$0              |
> +---------------------+
> | Gæst                |
> | Voksen              |
> | Voksen              |
> | Invitation KIF KBH  |
> | Invitation KIF KBH  |
> | Ordinarie pris KBH  |
> | Ordinarie pris KBH  |
> | Biljetter 200 krBH  |
> | Biljetter 200 krBH  |
> | Biljetter 200 krBH  |
> +---------------------+
> The bug is that the field values are incorrectly deserialized: the tail of
> the previous row's value is retained when the subsequent row's value is shorter.
> The SQL query:
> "select s.classification.variant variant from dfs.<some> as s limit 10;"
> As a result, "Ordinarie pris" becomes "Ordinarie pris KBH" because the
> previous row had the value "Invitation KIF KBH".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
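
The symptom described above (a shorter value picking up the tail of the previous, longer value) is what typically happens when a reader reuses one byte buffer across rows and the consumer decodes the whole backing array instead of only the valid length. The following is a minimal sketch of that class of bug, not Drill's actual code and not the committed fix. `ReusableText` is a hypothetical stand-in written for this illustration. Avro's `Utf8` class has a similar shape, exposing its backing array through `getBytes()` and the valid length through `getByteLength()`, but whether that is the exact cause here is an assumption. The input value "Biljetter 200 kr" is inferred from the reported output, not stated in the report.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class StaleBufferDemo {

    /** Hypothetical reusable text holder: the backing array grows but never shrinks. */
    static final class ReusableText {
        private byte[] bytes = new byte[0];
        private int length;

        void set(String value) {
            byte[] encoded = value.getBytes(StandardCharsets.UTF_8);
            if (encoded.length > bytes.length) {
                bytes = new byte[encoded.length];
            }
            // Only the first encoded.length bytes are overwritten;
            // anything after that is left over from an earlier, longer value.
            System.arraycopy(encoded, 0, bytes, 0, encoded.length);
            length = encoded.length;
        }

        byte[] backingArray() { return bytes; }
        int length() { return length; }
    }

    /** Buggy decode: reads the whole backing array, including stale bytes. */
    static String decodeIgnoringLength(ReusableText t) {
        return new String(t.backingArray(), StandardCharsets.UTF_8);
    }

    /** Correct decode: reads only the bytes that belong to the current value. */
    static String decodeWithLength(ReusableText t) {
        return new String(t.backingArray(), 0, t.length(), StandardCharsets.UTF_8);
    }

    /** Feeds each row through one shared holder and decodes it. */
    static List<String> readAll(String[] rows, boolean respectLength) {
        ReusableText shared = new ReusableText();
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            shared.set(row);
            out.add(respectLength ? decodeWithLength(shared) : decodeIgnoringLength(shared));
        }
        return out;
    }

    public static void main(String[] args) {
        String[] rows = {"Invitation KIF KBH", "Ordinarie pris", "Biljetter 200 kr"};
        System.out.println(readAll(rows, false)); // [Invitation KIF KBH, Ordinarie pris KBH, Biljetter 200 krBH]
        System.out.println(readAll(rows, true));  // [Invitation KIF KBH, Ordinarie pris, Biljetter 200 kr]
    }
}
```

With the length ignored, the sketch reproduces the corrupted values from the report. Passing the valid length to the decoder returns the original values.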