-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25222/
-----------------------------------------------------------
(Updated Sept. 2, 2014, 2:18 p.m.)
Review request for Sqoop.
Changes
-------
updated the description
Bugs: SQOOP-1395
https://issues.apache.org/jira/browse/SQOOP-1395
Repository: sqoop-trunk
Description (updated)
-------
If you import a table "users", Sqoop will generate an entity class named
"users.java". The class is compiled, submitted, and used by a MapReduce job.
If the target file format is Avro or Parquet, an Avro schema is generated as
well. According to the Avro specification, the entity class is described as a
"record", and the name of that record is "users".
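To make the record naming concrete, here is a minimal sketch of what such a
schema looks like when rendered as JSON. The helper class RecordSchemaSketch
and the columns "id" and "name" are assumptions for illustration only; this is
not the code in AvroSchemaGenerator.java, and it does no JSON escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Sqoop: renders an Avro record schema as JSON.
final class RecordSchemaSketch {

    // Builds {"type":"record","name":<recordName>,"fields":[...]}.
    // Each column is declared nullable through a union with "null".
    // Names are inserted verbatim; no JSON escaping is performed.
    static String toAvroJson(String recordName, Map<String, String> columns) {
        StringJoiner fields = new StringJoiner(",", "[", "]");
        for (Map.Entry<String, String> column : columns.entrySet()) {
            fields.add("{\"name\":\"" + column.getKey()
                + "\",\"type\":[\"null\",\"" + column.getValue() + "\"]}");
        }
        return "{\"type\":\"record\",\"name\":\"" + recordName
            + "\",\"fields\":" + fields + "}";
    }

    public static void main(String[] args) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "int");       // assumed column
        columns.put("name", "string");  // assumed column
        System.out.println(toAvroJson("users", columns));
    }
}
```

The "name" attribute of the record is the value that matters for the rest of
this description.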
For Parquet file format handling, we use the Kite SDK to manage Parquet file
reading and writing with minimal effort. Kite requires an Avro schema and
expects all data records to be packed into GenericRecord instances. This is
where the problem arises. Kite reads the schema first and tries to instantiate
a record class based on the record name. In this case, Kite tries to
instantiate a "users" class. Unfortunately, the generated entity class
"users.java" already has that name, which causes the MapReduce job to fail.
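The collision can be modelled roughly as follows. This is a simplified sketch
and not Kite's actual code: the class RecordLookupSketch, the Resolution enum,
and the fallback to a generic record when no class matches are all assumptions
made for illustration.

```java
import java.util.Collections;
import java.util.Set;

// Simplified model of name-based record resolution. NOT Kite's implementation.
final class RecordLookupSketch {

    enum Resolution { CLASS_ON_CLASSPATH, GENERIC_RECORD }

    // If a class with the record's name is visible, it is picked up;
    // otherwise (assumed fallback) a generic record is used.
    static Resolution resolve(String recordName, Set<String> classNamesOnClasspath) {
        return classNamesOnClasspath.contains(recordName)
            ? Resolution.CLASS_ON_CLASSPATH
            : Resolution.GENERIC_RECORD;
    }

    public static void main(String[] args) {
        // The generated entity class "users" shares its name with the record.
        Set<String> generated = Collections.singleton("users");
        System.out.println(resolve("users", generated));
    }
}
```

In this model the record name "users" resolves to the generated entity class
instead of a generic record, which is the situation described above.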
To solve this problem, I intend to give the entity class and the Avro record
different names.
The patch will:
1. Create the entity class with a suffix.
2. Use the table name as the Avro record name (instead of the short class name
of the entity).
3. Remove SqoopAvroRecord, as it is no longer required. (ClassWriter.java is
reverted to its previous state.)
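The naming scheme in the list above can be sketched as follows. The class
NamingSketch and the suffix value "_entity" are hypothetical; the actual suffix
is defined in the patch and is not stated in this description.

```java
// Hypothetical sketch of the naming scheme; not the code in TableClassName.java.
final class NamingSketch {

    // Assumed suffix, chosen only for illustration.
    static final String ENTITY_SUFFIX = "_entity";

    // Step 1: the generated entity class gets a suffix.
    static String entityClassName(String tableName) {
        return tableName + ENTITY_SUFFIX;
    }

    // Step 2: the Avro record keeps the plain table name.
    static String avroRecordName(String tableName) {
        return tableName;
    }

    public static void main(String[] args) {
        System.out.println(entityClassName("users"));
        System.out.println(avroRecordName("users"));
    }
}
```

Because the two names now differ, a lookup by record name no longer finds the
generated entity class.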
Diffs
-----
src/java/org/apache/sqoop/avro/AvroUtil.java 4b37d58
src/java/org/apache/sqoop/lib/SqoopAvroRecord.java 80875d2
src/java/org/apache/sqoop/mapreduce/AvroImportMapper.java 6fc656f
src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java 905ba8c
src/java/org/apache/sqoop/mapreduce/ParquetImportMapper.java cc2982c
src/java/org/apache/sqoop/mapreduce/ParquetJob.java a74432a
src/java/org/apache/sqoop/orm/AvroSchemaGenerator.java 806bace
src/java/org/apache/sqoop/orm/ClassWriter.java 4f9dedd
src/java/org/apache/sqoop/orm/TableClassName.java 88ab622
Diff: https://reviews.apache.org/r/25222/diff/
Testing
-------
Manually verified the unit tests for the Avro and Parquet file formats.
Manually tested local Parquet and Avro import and export.
File Attachments
----------------
SQOOP-1395.patch
https://reviews.apache.org/media/uploaded/files/2014/09/01/65cde72b-70a2-439c-86e5-eedfaada46f5__SQOOP-1395.patch
Thanks,
Qian Xu