-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66548/#review203762
-----------------------------------------------------------



Hi Dani,

Thank you for submitting this new feature, and sorry for the late review.
I have left some minor comments inline, and I suggest adding some more 
documentation explaining exactly what is supported with ORC files.
The implementation suggests that we only support HDFS and Hive imports at 
the moment, so export is not covered yet. If that is true, I think we should 
emphasize it in the documentation.


src/java/org/apache/sqoop/hive/TableDefWriter.java
Lines 194 (patched)
<https://reviews.apache.org/r/66548/#comment286060>

    I am not that familiar with the Hive CREATE TABLE statement, but as far 
as I understand, 'STORED AS ORC' basically means that we will use 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat, 
org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat and 
org.apache.hadoop.hive.ql.io.orc.OrcSerde. Is that correct?



src/java/org/apache/sqoop/hive/TableDefWriter.java
Line 197 (original), 200 (patched)
<https://reviews.apache.org/r/66548/#comment286061>

    Does ORC support different compression codecs? If so, I think we should 
emphasize in the documentation (and/or implement a fail fast) that Sqoop does 
not support the --compression-codec option with ORC files at the moment.
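
    A minimal sketch of the fail fast I have in mind, assuming the patch adds 
a SqoopOptions.FileLayout.OrcFile value (getCompressionCodec(), 
InvalidOptionsException and HELP_STR already exist in the code base):

        // Hypothetical check in the import option validation:
        if (options.getFileLayout() == SqoopOptions.FileLayout.OrcFile
            && options.getCompressionCodec() != null) {
          throw new InvalidOptionsException(
              "--compression-codec is currently not supported with ORC files."
              + HELP_STR);
        }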



src/java/org/apache/sqoop/mapreduce/OrcImportMapper.java
Lines 41 (patched)
<https://reviews.apache.org/r/66548/#comment286065>

    Can we just use NullWritable.get() instead of introducing a field called 
"nada"?



src/java/org/apache/sqoop/mapreduce/OrcImportMapper.java
Lines 42 (patched)
<https://reviews.apache.org/r/66548/#comment286064>

    Can we make this field private?



src/java/org/apache/sqoop/util/OrcConversionContext.java
Lines 98 (patched)
<https://reviews.apache.org/r/66548/#comment286063>

    typo: tinyiny



src/java/org/apache/sqoop/util/OrcUtil.java
Lines 55 (patched)
<https://reviews.apache.org/r/66548/#comment286062>

    We use Hive types as ORC schema types here; is this always going to be 
correct?
    I am not too familiar with the ORC type system: does it support all the 
Hive data types?
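
    For what it's worth, org.apache.orc.TypeDescription can parse Hive-style 
type strings, so this could be verified with a quick check (sketch):

        import org.apache.orc.TypeDescription;

        // Hive-style column types generally parse as an ORC schema:
        TypeDescription schema = TypeDescription.fromString(
            "struct<id:int,name:varchar(64),price:decimal(10,2)>");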



src/test/org/apache/sqoop/TestOrcImport.java
Lines 50 (patched)
<https://reviews.apache.org/r/66548/#comment286066>

    It seems that this test case only covers the HDFS import; can we add test 
cases that cover the Hive import too?
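
    A rough sketch of what such a test could look like (the --as-orcfile flag 
name and the verification step are assumptions on my part, please adapt them 
to the harness TestOrcImport already uses):

        @Test
        public void testHiveImportCreatesOrcBackedTable() throws Exception {
          String[] args = {
              "--connect", getConnectString(),
              "--table", getTableName(),
              "--hive-import",
              "--as-orcfile", // assuming this is the flag introduced here
          };
          runImport(args);
          // Then assert on the row count and that DESCRIBE FORMATTED reports
          // OrcInputFormat/OrcSerde for the created table.
        }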


- Szabolcs Vasas


On May 2, 2018, 12:12 p.m., daniel voros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66548/
> -----------------------------------------------------------
> 
> (Updated May 2, 2018, 12:12 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3311
>     https://issues.apache.org/jira/browse/SQOOP-3311
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> Hive 3 will introduce a switch (HIVE-18294) to create eligible tables as ACID 
> by default. This will probably result in increased usage of ACID tables and 
> the need to support importing into ACID tables with Sqoop.
> 
> Currently, the only file format supporting full ACID tables is ORC.
> 
> The easiest and most effective way to support importing into these tables 
> would be to write out files as ORC and keep using LOAD DATA as we do for all 
> other Hive tables (supported since HIVE-17361).
> 
> A workaround could be to create the table as textfile (as before) and then 
> CTAS from that. This would push the responsibility of creating the ORC 
> format over to Hive. However, it would result in writing every record 
> twice: once in text format and once in ORC.
> 
> Note that ORC is only necessary for full ACID tables. Insert-only (aka 
> micromanaged) ACID tables can use an arbitrary file format.
> 
> Supporting full ACID tables would also be the first step in making 
> "lastmodified" incremental imports work with Hive.
> 
> 
> Diffs
> -----
> 
>   ivy.xml 1f587f3e 
>   ivy/libraries.properties 565a8bf5 
>   src/docs/man/import-common-args.txt 22e3448e 
>   src/docs/man/sqoop-import-all-tables.txt 6db38ad8 
>   src/docs/user/hcatalog.txt 2ae1d54d 
>   src/docs/user/help.txt 8a0d1477 
>   src/docs/user/import-all-tables.txt fbad47b2 
>   src/docs/user/import.txt 2d074f49 
>   src/java/org/apache/sqoop/SqoopOptions.java d9984af3 
>   src/java/org/apache/sqoop/hive/TableDefWriter.java 27d988c5 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java a5962ba4 
>   src/java/org/apache/sqoop/mapreduce/OrcImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 783651a4 
>   src/java/org/apache/sqoop/tool/ExportTool.java 060f2c07 
>   src/java/org/apache/sqoop/tool/ImportTool.java ee79d8b7 
>   src/java/org/apache/sqoop/util/OrcConversionContext.java PRE-CREATION 
>   src/java/org/apache/sqoop/util/OrcUtil.java PRE-CREATION 
>   src/test/org/apache/sqoop/TestAllTables.java 16933a82 
>   src/test/org/apache/sqoop/TestOrcImport.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/TestTableDefWriter.java 3ea61f64 
>   src/test/org/apache/sqoop/util/TestOrcConversionContext.java PRE-CREATION 
>   src/test/org/apache/sqoop/util/TestOrcUtil.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/66548/diff/8/
> 
> 
> Testing
> -------
> 
> - added some unit tests
> - tested basic Hive import scenarios on a cluster
> 
> 
> Thanks,
> 
> daniel voros
> 
>
