I am trying to use Sqoop 1.4.4 to import data from a MySQL database directly
to S3, and I am running into an issue: if any one of the file splits is larger
than 5 GB, the import fails.
The details are in my Stack Overflow post below - I promise to follow good
cross-posting etiquette :)
http://stackoverflow.com/questions/25068747/sqoop-import-to-s3-hits-5-gb-limit
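
As a rough sanity check on split sizes, here is a minimal sketch of how I would
estimate the smallest mapper count (the value passed to Sqoop's --num-mappers)
that keeps each output file under the limit. It assumes the 5 GB ceiling
applies per file and that the splits come out roughly even, which Sqoop does
not guarantee if the split column is skewed. The function name, the 80%
headroom, and the 60 GiB table size are hypothetical, for illustration only.

```python
# Assumed per-file ceiling, in bytes (5 GiB). This is an assumption about
# where the import fails, not a value read from Sqoop or S3 at runtime.
S3_SINGLE_FILE_LIMIT = 5 * 1024**3


def min_mappers(total_bytes, limit_bytes=S3_SINGLE_FILE_LIMIT, headroom_pct=80):
    """Return the smallest mapper count for which an evenly sized split
    stays under headroom_pct percent of limit_bytes.

    Hypothetical helper: it only does the arithmetic and assumes the
    splits are evenly sized, so treat the result as a lower bound.
    """
    if total_bytes <= 0:
        return 1
    # Integer arithmetic throughout to avoid floating-point rounding.
    effective_limit = limit_bytes * headroom_pct // 100
    # Ceiling division: -(-a // b) rounds up for positive integers.
    return max(1, -(-total_bytes // effective_limit))


# Hypothetical example: a table that exports to about 60 GiB of data.
table_bytes = 60 * 1024**3
mappers = min_mappers(table_bytes)
```

If the split column is unevenly distributed, a single split can still exceed
the limit even with this many mappers, so the number is only a starting point.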
One of my main questions is whether I should be using Sqoop 2 rather than
Sqoop 1.4.4. Also, should I be importing to HDFS first and then copying the
data over to S3 for permanent storage? Thanks!