[
https://issues.apache.org/jira/browse/SQOOP-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jarek Jarcec Cecho updated SQOOP-312:
-------------------------------------
Fix Version/s: (was: 1.4.1-incubating)
> Support for hive dynamic partitions with SQOOP import
> -----------------------------------------------------
>
> Key: SQOOP-312
> URL: https://issues.apache.org/jira/browse/SQOOP-312
> Project: Sqoop
> Issue Type: New Feature
> Reporter: Bejoy KS
>
> Currently, to populate the dynamic partitions of a Hive table using Sqoop
> import, we need to perform the following steps:
> 1. Analyze the db table and identify the distinct values of the column
> to be partitioned on.
> 2. If there are n distinct values for the column, create n different
> Sqoop import commands, each with a where clause that selects the rows
> for one specific value, along with
> --hive-partition-key <key-name/column name> and --hive-partition-value
> <value-string/column value>.
> This approach becomes a bottleneck for larger tables that span
> millions of rows. Such tables should be partitioned in Hive, and there could
> be at least 300 to 500 partitions, i.e. 300 to 500 Sqoop imports.
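To make the cost of the per-partition approach concrete, here is a minimal shell sketch that only prints one `sqoop import` command per distinct value. The JDBC URL, table names, and column names are hypothetical placeholders, not taken from the issue. The `--columns` list that leaves out the partition column is also an assumption about how the import would need to be set up, since Sqoop versions may reject a partition key that is also an imported column.

```shell
#!/bin/sh
# Sketch only: prints commands, does not run Sqoop.
# All values below are hypothetical placeholders.
JDBC_URL="jdbc:mysql://dbhost/sales"
SRC_TABLE="orders"
HIVE_TABLE="orders_part"
PART_COL="country"
DATA_COLS="id,amount"   # imported columns, assumed to exclude the partition column

# Print the import command for a single partition value ($1).
build_import_cmd() {
  value="$1"
  printf "sqoop import --connect %s --table %s --columns %s --where \"%s='%s'\" --hive-import --hive-table %s --hive-partition-key %s --hive-partition-value %s\n" \
    "$JDBC_URL" "$SRC_TABLE" "$DATA_COLS" "$PART_COL" "$value" \
    "$HIVE_TABLE" "$PART_COL" "$value"
}

# Print one command per distinct value passed as an argument.
print_all_imports() {
  for v in "$@"; do
    build_import_cmd "$v"
  done
}

# In practice the list would come from a SELECT DISTINCT on the source table.
print_all_imports US IN DE
```

With 300 to 500 distinct values, the loop would emit that many separate import jobs, which is the overhead the issue describes.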
> We are currently working around this hurdle as follows:
> 1. Sqoop import the whole db table into a non-partitioned Hive table.
> 2. Manually create a partitioned Hive table.
> 3. Use Hive QL to load the data from the non-partitioned Hive table into the
> corresponding partitions of the partitioned Hive table.
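Step 3 of the workaround above can be sketched as a Hive dynamic-partition insert. The shell snippet below only generates the Hive QL text. The table and column names are hypothetical, and it assumes the partitioned table already exists with the partition column declared last.

```shell
#!/bin/sh
# Sketch only: generates Hive QL for a dynamic-partition load.
# All names below are hypothetical placeholders.
STAGING_TABLE="orders_staging"   # non-partitioned table filled by Sqoop
PART_TABLE="orders_part"         # manually created partitioned table
PART_COL="country"
DATA_COLS="id, amount"

# Print the statements that move staged rows into their partitions.
# The partition column must be the last column in the SELECT list.
build_repartition_hql() {
  cat <<EOF
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE ${PART_TABLE} PARTITION (${PART_COL})
SELECT ${DATA_COLS}, ${PART_COL} FROM ${STAGING_TABLE};
EOF
}

build_repartition_hql
# Against a real Hive installation this could be run as:
#   hive -e "$(build_repartition_hql)"
```

With several hundred partitions, Hive limits such as `hive.exec.max.dynamic.partitions.pernode` may also need to be raised, depending on the cluster configuration.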
> We would like Sqoop import to offer parameters that perform the above steps
> within Sqoop itself.