Spark, Hive, and Pig can all fairly easily handle CSV to Avro/Parquet.

-----Original Message-----
From: Matt [mailto:[email protected]]
Sent: Monday, March 28, 2016 2:09 PM
To: [email protected]
Subject: CSV to Parquet?

Can Sqoop perform conversion from flat / CSV data files into Parquet or Avro? I know Sqoop's primary purpose is RDBMS <-> Hadoop, but it seems like it could be a good file format conversion utility as well. If this is too far outside of Sqoop's mandate, are there other tools that automate bulk conversion to Parquet, hopefully with parallelism for performance?
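
For illustration, here is a rough sketch of what the Spark route could look like in PySpark. It is not taken from the thread: it assumes Spark 2.0 or later (where the CSV reader is built in), the helper name csv_to_parquet is made up for this example, and the input and output paths in main() are placeholders. Spark splits the input files across executors, so the conversion runs in parallel without extra work.

```python
def csv_to_parquet(spark, csv_path, parquet_path, header=True):
    """Read CSV file(s) with Spark and rewrite them as Parquet.

    `spark` is an existing SparkSession. `csv_path` may be a single file,
    a directory, or a glob; Spark reads the pieces in parallel.
    """
    df = (
        spark.read
        .option("header", str(header).lower())  # first line holds column names
        .option("inferSchema", "true")          # extra pass over the data to guess types
        .csv(csv_path)
    )
    # One Parquet part-file is written per partition of the DataFrame.
    df.write.mode("overwrite").parquet(parquet_path)
    return df


def main():
    # Imported here so the helper above can be loaded without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("csv-to-parquet").getOrCreate()
    try:
        # Placeholder paths, assumed for this sketch only.
        csv_to_parquet(spark, "/data/in/*.csv", "/data/out/parquet")
    finally:
        spark.stop()
```

To run it, call main() from a script submitted with spark-submit. Schema inference is convenient but costs an extra read of the data; for large inputs, passing an explicit schema to the reader is likely the better choice.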
