----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66548/#review202879 -----------------------------------------------------------
Hi Daniel, I did not have time for a full review yet, though already found a few smaller flaws, and one conceptional one. Conceptional: I'd strongly suggest to extract ORC reading/writing only as a subtask as ACID Hive tables. The clean advantages would be: We would be able to test ORC file import/export without any Hive related changes, and thus testing the correctness with PrestoDB, or any other tool which can read read/write ORC files. What do you think? Smaller ones: See inline. Regards, Attila ivy.xml Lines 144-149 (patched) <https://reviews.apache.org/r/66548/#comment284910> Are we sure this is the total exclude list what we need, and no Hive or other library related "debris" are coming with it? ivy/libraries.properties Lines 62 (patched) <https://reviews.apache.org/r/66548/#comment284911> Shouldn't we aim for the nohive version of 1.4.3? src/java/org/apache/sqoop/mapreduce/OrcImportMapper.java Lines 47-51 (patched) <https://reviews.apache.org/r/66548/#comment284912> Aren't we missing a super call here? src/java/org/apache/sqoop/mapreduce/OrcImportMapper.java Lines 53 (patched) <https://reviews.apache.org/r/66548/#comment284913> Occording to the general code guidelines and Joschua Bloch (Effective Java) this is a very dangerous practice to initialize an object in a non-final initializer. Please correct this! - Attila Szabo On May 2, 2018, 12:12 p.m., daniel voros wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66548/ > ----------------------------------------------------------- > > (Updated May 2, 2018, 12:12 p.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3311 > https://issues.apache.org/jira/browse/SQOOP-3311 > > > Repository: sqoop-trunk > > > Description > ------- > > Hive 3 will introduce a switch (HIVE-18294) to create eligible tables as ACID > by default. This will probably result in increased usage of ACID tables and > the need to support importing into ACID tables with Sqoop. > > Currently the only table format supporting full ACID tables is ORC. > > The easiest and most effective way to support importing into these tables > would be to write out files as ORC and keep using LOAD DATA as we do for all > other Hive tables (supported since HIVE-17361). > > Workaround could be to create table as textfile (as before) and then CTAS > from that. This would push the responsibility of creating ORC format to Hive. > However it would result in writing every record twice; in text format and in > ORC. > > Note that ORC is only necessary for full ACID tables. Insert-only (aka. > micromanaged) ACID tables can use arbitrary file format. > > Supporting full ACID tables would also be the first step in making > "lastmodified" incremental imports work with Hive. > > > Diffs > ----- > > ivy.xml 6af94d9d > ivy/libraries.properties c44b50bc > src/docs/man/import-common-args.txt 22e3448e > src/docs/man/sqoop-import-all-tables.txt 6db38ad8 > src/docs/user/hcatalog.txt 2ae1d54d > src/docs/user/help.txt 8a0d1477 > src/docs/user/import-all-tables.txt fbad47b2 > src/docs/user/import.txt 2d074f49 > src/java/org/apache/sqoop/SqoopOptions.java d9984af3 > src/java/org/apache/sqoop/hive/TableDefWriter.java 27d988c5 > src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java a5962ba4 > src/java/org/apache/sqoop/mapreduce/OrcImportMapper.java PRE-CREATION > src/java/org/apache/sqoop/tool/BaseSqoopTool.java 783651a4 > src/java/org/apache/sqoop/tool/ExportTool.java 060f2c07 > src/java/org/apache/sqoop/tool/ImportTool.java ee79d8b7 > src/java/org/apache/sqoop/util/OrcConversionContext.java PRE-CREATION > src/java/org/apache/sqoop/util/OrcUtil.java PRE-CREATION > src/test/org/apache/sqoop/TestAllTables.java 56d1f577 > src/test/org/apache/sqoop/TestOrcImport.java PRE-CREATION > src/test/org/apache/sqoop/hive/TestTableDefWriter.java 3ea61f64 > src/test/org/apache/sqoop/util/TestOrcConversionContext.java PRE-CREATION > src/test/org/apache/sqoop/util/TestOrcUtil.java PRE-CREATION > > > Diff: https://reviews.apache.org/r/66548/diff/7/ > > > Testing > ------- > > - added some unit tests > - tested basic Hive import scenarios on a cluster > > > Thanks, > > daniel voros > >