[ 
https://issues.apache.org/jira/browse/TAJO-711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961875#comment-13961875
 ] 

David Chen commented on TAJO-711:
---------------------------------

I have an initial implementation of this done and have posted an RB. There are 
still a few changes and some validation work I would like to do before this is 
fully ready:

 * Test the use of {{avro.schema.url}}. Currently, the tests cover only 
{{avro.schema.literal}}.
 * Converting between Avro and Tajo is slightly tricky because data sets would 
usually use the Avro schema as the "true" schema. I would like to do more 
validation to find additional corner cases. For this ticket, I'll do a 
best-effort validation with flat schemas, though Avro support might not be 
truly battle-tested until TAJO-710 is done, because most of our data here at 
LinkedIn has nested schemas.
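
To make the flat-schema conversion above concrete, here is a minimal Python 
sketch. It is not the code in the patch. It assumes the table properties are 
available as a plain dict, that {{avro.schema.literal}} takes precedence over 
{{avro.schema.url}} when both are set, and that the Avro-to-Tajo type mapping 
shown is a reasonable one. The names {{load_avro_schema}} and 
{{to_tajo_columns}} are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumed mapping from Avro primitive types to Tajo type names.
# Illustrative only; not taken from the patch.
AVRO_TO_TAJO = {
    "boolean": "BOOLEAN",
    "int": "INT4",
    "long": "INT8",
    "float": "FLOAT4",
    "double": "FLOAT8",
    "string": "TEXT",
    "bytes": "BLOB",
}


def load_avro_schema(props):
    """Parse the Avro schema named by (hypothetical) table properties."""
    # Assumption: the literal wins if both properties are present.
    if "avro.schema.literal" in props:
        return json.loads(props["avro.schema.literal"])
    if "avro.schema.url" in props:
        with urlopen(props["avro.schema.url"]) as resp:
            return json.loads(resp.read().decode("utf-8"))
    raise ValueError("neither avro.schema.literal nor avro.schema.url is set")


def to_tajo_columns(schema):
    """Map a flat Avro record schema to a list of (name, tajo_type) pairs."""
    if not isinstance(schema, dict) or schema.get("type") != "record":
        raise ValueError("top-level Avro schema must be a record")
    columns = []
    for field in schema["fields"]:
        ftype = field["type"]
        # Unwrap a nullable union such as ["null", "string"].
        if isinstance(ftype, list):
            non_null = [t for t in ftype if t != "null"]
            if len(non_null) != 1:
                raise ValueError("unsupported union in field %r" % field["name"])
            ftype = non_null[0]
        # Anything that is not a bare primitive name (records, arrays, maps,
        # ...) is treated as nested and rejected in this sketch.
        if not isinstance(ftype, str) or ftype not in AVRO_TO_TAJO:
            raise ValueError("unsupported type in field %r" % field["name"])
        columns.append((field["name"], AVRO_TO_TAJO[ftype]))
    return columns
```

For a record with a {{long}} field {{id}} and a nullable {{string}} field 
{{name}}, this yields {{[("id", "INT8"), ("name", "TEXT")]}}. A field whose 
type is itself a record raises {{ValueError}}, which is where TAJO-710 would 
come in.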

Something else we would want to look at is schema evolution across partitions. 
I haven't looked too closely at TAJO-283 yet, but are we storing table 
properties in the partitions? For example, say that partitions i...j are 
created with Avro schema A, set by either the {{avro.schema.url}} or 
{{avro.schema.literal}} property. Now, partitions j+1...k are created with an 
evolved Avro schema A'. Does the current implementation of partitions in Tajo 
support storing such properties within the partitions? In any event, if this 
might be an issue, we can create a separate ticket for this work.
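
The scenario above can be sketched as follows. This is a hypothetical Python 
model, not Tajo's partition implementation: it assumes each partition may 
carry its own property map that overrides the table-level properties, so that 
partitions written with schema A and with the evolved schema A' each resolve 
to the schema they were written with. {{TableMeta}} and 
{{schema_property_for}} are made-up names.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

SCHEMA_KEYS = ("avro.schema.literal", "avro.schema.url")


@dataclass
class TableMeta:
    # Table-level properties, e.g. the original schema A.
    properties: Dict[str, str]
    # Hypothetical per-partition overrides, keyed by partition name.
    partition_properties: Dict[str, Dict[str, str]] = field(default_factory=dict)


def schema_property_for(table: TableMeta, partition: str) -> Tuple[str, str]:
    """Return the (key, value) schema property in effect for a partition.

    Partition-level properties are consulted first; the table-level
    properties are the fallback.
    """
    scopes = (table.partition_properties.get(partition, {}), table.properties)
    for props in scopes:
        for key in SCHEMA_KEYS:
            if key in props:
                return key, props[key]
    raise KeyError("no Avro schema property for partition %r" % partition)
```

With schema A set on the table and A' stored only on the later partitions, 
older partitions fall back to A while newer ones resolve to A'. Without some 
per-partition storage of this kind, every partition would be read with 
whatever schema the table currently declares.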

> Add Avro storage support
> ------------------------
>
>                 Key: TAJO-711
>                 URL: https://issues.apache.org/jira/browse/TAJO-711
>             Project: Tajo
>          Issue Type: New Feature
>            Reporter: David Chen
>            Assignee: David Chen
>         Attachments: TAJO-711.patch
>
>
> Add {{FileScanner}} and {{FileAppender}} for reading from and writing to Avro.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
