[ 
https://issues.apache.org/jira/browse/SQOOP-318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087887#comment-13087887
 ] 

[email protected] commented on SQOOP-318:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1597/
-----------------------------------------------------------

Review request for Sqoop.


Summary
-------

I added a check when generating the create table string to see if the LzopCodec 
is in use. If it is, it outputs

STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"

at the end of the create table command; otherwise, it outputs the standard

STORED AS TEXTFILE
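
As a rough illustration of the check described above (this is not the code in
the diff): the class and method names below are hypothetical, and the fully
qualified codec name is assumed to be the one shipped in hadoop-lzo, since the
summary only says "LzopCodec".

```java
public class StorageClauseSketch {
    // Assumed fully qualified name of the LZOP codec (hadoop-lzo).
    public static final String LZOP_CODEC = "com.hadoop.compression.lzo.LzopCodec";

    /**
     * Returns the storage clause to append to the generated CREATE TABLE
     * statement, based on the compression codec class name in use.
     * A null or non-LZOP codec falls back to plain text storage.
     */
    public static String storageClause(String codecClassName) {
        if (LZOP_CODEC.equals(codecClassName)) {
            return "STORED AS INPUTFORMAT "
                + "\"com.hadoop.mapred.DeprecatedLzoTextInputFormat\" "
                + "OUTPUTFORMAT "
                + "\"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat\"";
        }
        return "STORED AS TEXTFILE";
    }

    public static void main(String[] args) {
        // Hypothetical usage: print the clause for an LZOP import and a plain one.
        System.out.println(storageClause(LZOP_CODEC));
        System.out.println(storageClause(null));
    }
}
```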

I also added a call to the DistributedLzoIndexer before the data is imported 
into Hive.
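
For reference, the indexing step is roughly equivalent to running the
hadoop-lzo indexer over the import directory from the command line. The sketch
below only assembles that command; it is not how the patch invokes the indexer,
and the class name, method name, jar location, and directory are hypothetical
placeholders.

```java
import java.util.Arrays;
import java.util.List;

public class LzoIndexCommandSketch {
    // Indexer class provided by hadoop-lzo.
    public static final String INDEXER_CLASS =
        "com.hadoop.compression.lzo.DistributedLzoIndexer";

    /**
     * Builds the "hadoop jar" command that indexes the .lzo files under
     * importDir so they can be split. Both arguments are caller-supplied.
     */
    public static List<String> indexCommand(String hadoopLzoJar, String importDir) {
        return Arrays.asList("hadoop", "jar", hadoopLzoJar, INDEXER_CLASS, importDir);
    }

    public static void main(String[] args) {
        // Placeholder paths; substitute the real jar and import directory.
        List<String> cmd = indexCommand("/path/to/hadoop-lzo.jar", "/path/to/import/dir");
        System.out.println(String.join(" ", cmd));
    }
}
```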


This addresses bug SQOOP-318.
    https://issues.apache.org/jira/browse/SQOOP-318


Diffs
-----

  src/java/com/cloudera/sqoop/hive/HiveImport.java 36c17ba 
  src/java/com/cloudera/sqoop/hive/TableDefWriter.java 7dd9135 
  src/test/com/cloudera/sqoop/hive/TestTableDefWriter.java 43b755e 

Diff: https://reviews.apache.org/r/1597/diff


Testing
-------

The patch includes a test for the create table syntax. I manually tested the 
call to the indexer; I'm not sure how to automate that without making LZO a 
build requirement.


Thanks,

Joey



> Add support for splittable lzo files with Hive
> ----------------------------------------------
>
>                 Key: SQOOP-318
>                 URL: https://issues.apache.org/jira/browse/SQOOP-318
>             Project: Sqoop
>          Issue Type: Improvement
>          Components: hive-integration
>    Affects Versions: 1.3.0
>            Reporter: Joey Echeverria
>            Assignee: Joey Echeverria
>            Priority: Minor
>         Attachments: SQOOP-318-1.patch
>
>
> When importing LZO-compressed files into Hive, it would be useful to create 
> the Hive table with the com.hadoop.mapred.DeprecatedLzoTextInputFormat. It 
> would also be nice to automatically run the DistributedLzoIndexer so that the 
> LZO files can be split.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
