[ https://issues.apache.org/jira/browse/SQOOP-318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089126#comment-13089126 ]
[email protected] commented on SQOOP-318:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1597/
-----------------------------------------------------------
(Updated 2011-08-22 23:01:36.319406)
Review request for Sqoop.
Changes
-------
I added lzop to the CodecMap and modified the tests to reference the codec by
its short name. I added a blurb at the end of the Hive documentation describing
the splitting you get with the lzop codec. I also fixed the checkstyle issues.
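
As a rough illustration of the short-name lookup described above, here is a
minimal sketch. It is not the actual CodecMap source: the class name
CodecShortNames, the resolve() method, and the pass-through behaviour for
unknown names are assumptions made for this example. The fully qualified
LzopCodec class name is the one shipped with the hadoop-lzo library.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a short-name-to-codec lookup; NOT the real
// com.cloudera.sqoop.io.CodecMap implementation.
final class CodecShortNames {
  // Short name -> fully qualified codec class name.
  private static final Map<String, String> CODECS = new TreeMap<>();

  static {
    // LzopCodec as packaged by hadoop-lzo.
    CODECS.put("lzop", "com.hadoop.compression.lzo.LzopCodec");
  }

  private CodecShortNames() { }

  /**
   * Returns the mapped class name for a known short name. Any other
   * value is returned unchanged (an assumption of this sketch), so a
   * caller can still pass a full class name.
   */
  static String resolve(String nameOrClass) {
    if (nameOrClass == null) {
      return null;
    }
    String mapped = CODECS.get(nameOrClass);
    return mapped != null ? mapped : nameOrClass;
  }

  public static void main(String[] args) {
    System.out.println(resolve("lzop"));
  }
}
```

With a lookup like this, a test can refer to the codec as "lzop" instead of
spelling out the class name.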
Summary
-------
I added a check when generating the create table string to see whether the
LzopCodec is in use. If it is, the create table command ends with

    STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
    OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"

otherwise it ends with the standard

    STORED AS TEXTFILE

I also added a call to the DistributedLzoIndexer before the data is imported
into Hive.
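
The codec check described above can be sketched roughly as follows. This is
only an illustration, not the code in the patch: the StorageClause class and
its forCodec() method are hypothetical names, and the comparison against the
hadoop-lzo class name com.hadoop.compression.lzo.LzopCodec is an assumption
about how the check might be written.

```java
// Hypothetical sketch of choosing the storage clause for a Hive
// create table statement; NOT the actual TableDefWriter code.
final class StorageClause {
  // Fully qualified LzopCodec name from hadoop-lzo (assumed here to be
  // the value the codec option resolves to).
  static final String LZOP_CODEC = "com.hadoop.compression.lzo.LzopCodec";

  private StorageClause() { }

  /** Returns the clause to append at the end of the create table command. */
  static String forCodec(String codecClassName) {
    if (LZOP_CODEC.equals(codecClassName)) {
      // LZO-aware input format so that indexed .lzo files can be split.
      return "STORED AS INPUTFORMAT "
          + "\"com.hadoop.mapred.DeprecatedLzoTextInputFormat\" "
          + "OUTPUTFORMAT "
          + "\"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat\"";
    }
    // Every other codec (or none) keeps the standard clause.
    return "STORED AS TEXTFILE";
  }

  public static void main(String[] args) {
    System.out.println(forCodec(LZOP_CODEC));
    System.out.println(forCodec(null));
  }
}
```

Keeping the decision in one small function makes the create table syntax easy
to unit test without having LZO on the build path.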
This addresses bug SQOOP-318.
https://issues.apache.org/jira/browse/SQOOP-318
Diffs (updated)
-----
src/docs/user/hive.txt 059d7cb
src/java/com/cloudera/sqoop/hive/HiveImport.java 36c17ba
src/java/com/cloudera/sqoop/hive/TableDefWriter.java 7dd9135
src/java/com/cloudera/sqoop/io/CodecMap.java 8564164
src/test/com/cloudera/sqoop/hive/TestTableDefWriter.java 43b755e
Diff: https://reviews.apache.org/r/1597/diff
Testing
-------
The patch includes a test for the create table syntax. I tested the call to the
indexer manually; I'm not sure how to automate that without making LZO a build
requirement.
Thanks,
Joey
> Add support for splittable lzo files with Hive
> ----------------------------------------------
>
> Key: SQOOP-318
> URL: https://issues.apache.org/jira/browse/SQOOP-318
> Project: Sqoop
> Issue Type: Improvement
> Components: hive-integration
> Affects Versions: 1.3.0
> Reporter: Joey Echeverria
> Assignee: Joey Echeverria
> Priority: Minor
> Attachments: SQOOP-318-1.patch, SQOOP-318-2.patch
>
>
> When importing LZO-compressed files into Hive, it would be useful to create
> the Hive table with the com.hadoop.mapred.DeprecatedLzoTextInputFormat. It
> would also be nice to automatically run the DistributedLzoIndexer so that the
> LZO files can be split.