[ https://issues.apache.org/jira/browse/SQOOP-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13580827#comment-13580827 ]
Jarek Jarcec Cecho commented on SQOOP-777:
------------------------------------------

Hi Hari,
thank you very much for taking over this issue. We discussed the intermediate format at length on our earlier design conference calls ([meeting minutes|https://cwiki.apache.org/confluence/display/SQOOP/Sqoop2+Weekly+Meeting+Minutes] are available on the wiki). Avro was one of the possibilities that we explored. The outcome of those discussions was to use a text format very close to what {{mysqldump}} and {{pg_dump}} produce, so that Sqoop2 performance can be comparable to those tools. The reasoning is that we did not want to force stream-oriented connectors to fully parse all incoming data into Avro format as it goes through the framework, but rather to let them "decorate" the stream as it passes.

Jarcec

> Sqoop2: Implement intermediate data format representation policy
> -----------------------------------------------------------------
>
>                 Key: SQOOP-777
>                 URL: https://issues.apache.org/jira/browse/SQOOP-777
>             Project: Sqoop
>          Issue Type: New Feature
>    Affects Versions: 2.0.0
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Hari Shreedharan
>             Fix For: 2.0.0
>
>
> We should enforce our intermediate data format policy, as currently
> each driver can do it differently and that might break things.
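
To make the trade-off described in the comment above concrete, here is a rough sketch contrasting the two approaches. It is not Sqoop2 code: the function names {{decorate}}, {{parse_fully}} and {{convert}} are hypothetical, and the record layout (comma-separated fields, single-quoted strings, {{NULL}} for missing values) is only an assumed approximation of dump-style text output.

```python
import csv
import io


def decorate(source, sink):
    """Stream-oriented path (hypothetical): forward each record as-is.

    The only "decoration" applied here is guaranteeing a trailing newline;
    fields are never split or converted.
    """
    count = 0
    for line in source:
        sink.write(line if line.endswith("\n") else line + "\n")
        count += 1
    return count


def convert(field):
    """Simplified type conversion used only by the full-parse path.

    Assumption: NULL marks a missing value. Because the csv module drops
    the quotes, a quoted 'NULL' string is not distinguished here.
    """
    if field == "NULL":
        return None
    try:
        return int(field)
    except ValueError:
        return field


def parse_fully(source):
    """Full-parse path (hypothetical): split and convert every field."""
    reader = csv.reader(source, quotechar="'")
    return [[convert(field) for field in fields] for fields in reader]


if __name__ == "__main__":
    data = "1,'alice',NULL\n2,'bob, jr',42"
    out = io.StringIO()
    print(decorate(io.StringIO(data), out), "records passed through")
    print(parse_fully(io.StringIO(data)))
```

The pass-through path touches each record once and never allocates per-field objects, which is the kind of saving the text-format decision appears to be aiming for; the full-parse path stands in for what a conversion into a structured format would require of every connector.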