[ 
https://issues.apache.org/jira/browse/SQOOP-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13580827#comment-13580827
 ] 

Jarek Jarcec Cecho commented on SQOOP-777:
------------------------------------------

Hi Hari,
thank you very much for taking over this issue. We've discussed the 
intermediate format a lot on our earlier design conference calls ([meeting 
minutes|https://cwiki.apache.org/confluence/display/SQOOP/Sqoop2+Weekly+Meeting+Minutes]
 are available on the wiki). Avro was one of the possibilities that we 
explored. The outcome of those discussions was to use a text format that is 
very close to what {{mysqldump}} and {{pg_dump}} produce, so that Sqoop2 
performance can be comparable to those tools. The reasoning is that we did not 
want to force stream-oriented connectors to fully parse all the incoming data 
into the Avro format as it goes through the framework; we would rather let 
them "decorate" the stream as it passes.
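
To make the trade-off concrete, here is a minimal sketch in Java. It is only 
an illustration, not the Sqoop2 API: the class and method names are 
hypothetical, and the record layout (comma-separated fields, single-quoted 
strings, backslash escapes) is an assumption loosely modeled on dump-style 
text output. {{decorate}} stands for the stream path that never looks inside 
the fields, while {{parseFields}} stands for the full parse that a 
stream-oriented connector would be spared.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the Sqoop2 code base.
public class IntermediateFormatSketch {

    /**
     * Stream path: forward the record text untouched and only normalize
     * the record terminator. Field contents are never inspected.
     */
    public static String decorate(String rawRecord) {
        int end = rawRecord.length();
        while (end > 0) {
            char last = rawRecord.charAt(end - 1);
            if (last != '\n' && last != '\r') {
                break;
            }
            end--;
        }
        return rawRecord.substring(0, end) + "\n";
    }

    /**
     * Parse path: split one record into fields, honoring single-quoted
     * strings and backslash escapes (assumed format, see lead-in).
     */
    public static List<String> parseFields(String record) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (inQuotes && c == '\\' && i + 1 < record.length()) {
                // Keep the escaped character literally.
                current.append(record.charAt(++i));
            } else if (c == '\'') {
                inQuotes = !inQuotes;
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String record = "1,'Doe, J.','2013-02-18'\r\n";
        // Stream-oriented connector: cheap pass-through.
        System.out.print(decorate(record));
        // Connector that needs individual values: pays for the parse.
        System.out.println(parseFields(decorate(record).trim()));
    }
}
```

The pass-through does a constant amount of work at the end of each record, 
whereas the parse touches every character, which is the cost the text format 
is meant to avoid for stream-oriented connectors.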

Jarcec
                
> Sqoop2: Implement intermediate data format representation policy 
> -----------------------------------------------------------------
>
>                 Key: SQOOP-777
>                 URL: https://issues.apache.org/jira/browse/SQOOP-777
>             Project: Sqoop
>          Issue Type: New Feature
>    Affects Versions: 2.0.0
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Hari Shreedharan
>             Fix For: 2.0.0
>
>
> We should enforce our intermediate data format policy, as currently 
> each driver can do it differently and that might break things.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
