Thanks for the response. I was planning to use OraOop to automatically import Oracle partitions into Hive partitions, but based on the conversation below, I just learned it's not possible.
From an automation perspective, I think running one Sqoop job per partition and creating the same partition in Hive is the better option.

Gwen/David: Yes, importing Oracle partitions into Hive partitions would be a good feature to have. Any idea why there have been no commits to OraOop since 2012?

Regards,
Venkat

-----Original Message-----
From: Gwen Shapira [mailto:[email protected]]
Sent: Tuesday, August 05, 2014 6:24 PM
To: [email protected]
Subject: Re: Import Partitions from Oracle to Hive Partitions

Having OraOop automatically handle partitions in Hive will be a cool feature. I agree that this will be limited to OraOop for now.

On Tue, Aug 5, 2014 at 5:08 PM, David Robson <[email protected]> wrote:
> Yes now that you mention Sqoop is limited to one partition in Hive I do
> remember that! I would think we could modify Sqoop to create subfolders for
> each partition - instead of how it now creates a separate file for each
> partition? This would probably be limited to the direct (OraOop) connector as
> it is aware of partitions (existing connector doesn't read data dictionary
> directly).
>
> In the meantime Venkat - you could look at the option I mentioned - then
> manually move the files into separate folders - at least you'll have each
> partition in a separate file rather than spread throughout all files. The
> other thing you could look at is the option below - you could run one Sqoop
> job per partition:
>
> Specify The Partitions To Import
>
> -Doraoop.import.partitions=PartitionA,PartitionB --table OracleTableName
>
> Imports PartitionA and PartitionB of OracleTableName.
>
> Notes:
> You can enclose an individual partition name in double quotes to
> retain the letter case or if the name has special characters.
> -Doraoop.import.partitions='"PartitionA",PartitionB' --table OracleTableName
> If the partition name is not double quoted then its name will be
> automatically converted to upper case, PARTITIONB for above.
> When using double quotes the entire list of partition names must be
> enclosed in single quotes.
> If the last partition name in the list is double quoted then there
> must be a comma at the end of the list.
> -Doraoop.import.partitions='"PartitionA","PartitionB",' --table OracleTableName
>
> Name each partition to be included. There is no facility to provide a range
> of partition names.
>
> There is no facility to define sub partitions. The entire partition is
> included/excluded as per the filter.
>
>
> -----Original Message-----
> From: Gwen Shapira [mailto:[email protected]]
> Sent: Wednesday, 6 August 2014 8:44 AM
> To: [email protected]
> Subject: Re: Import Partitions from Oracle to Hive Partitions
>
> Hive expects a directory for each partition, so getting data with OraOop will
> require some post-processing - copy files into properly named directories and
> adding the new partitions to a hive table.
>
> Sqoop has the --hive-partition-key and --hive-partition-value, but this
> assumes that all the data sqooped will fit into a single partition.
>
>
> On Tue, Aug 5, 2014 at 3:40 PM, David Robson <[email protected]>
> wrote:
>> Hi Venkat,
>>
>> I’m not sure what this will do in regards to Hive partitions – I’ll
>> test it out when I get into the office and get back to you. But this
>> option will make it so there is one file for each Oracle partition –
>> which might be of interest to you.
>>
>> Match Hadoop Files to Oracle Table Partitions
>>
>> -Doraoop.chunk.method={ROWID|PARTITION}
>>
>> To import data from a partitioned table in such a way that the
>> resulting HDFS folder structure in Hadoop will match the table’s
>> partitions, set the chunk method to PARTITION. The alternative
>> (default) chunk method is ROWID.
>>
>> Notes:
>>
>> - For the number of Hadoop files to match the number of Oracle
>>   partitions, set the number of mappers to be greater than or equal
>>   to the number of partitions.
>>
>> - If the table is not partitioned then value PARTITION will lead to
>>   an error.
>>
>> David
>>
>>
>> From: Venkat, Ankam [mailto:[email protected]]
>> Sent: Wednesday, 6 August 2014 3:56 AM
>> To: '[email protected]'
>> Subject: Import Partitions from Oracle to Hive Partitions
>>
>> I am trying to import partitions from Oracle table to Hive partitions.
>>
>> Can somebody provide the syntax using regular JDBC connector and
>> Oraoop connector?
>>
>> Thanks in advance.
>>
>> Regards,
>>
>> Venkat
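
For reference, the one-Sqoop-job-per-partition approach discussed in this thread could be scripted roughly as follows. This is only a sketch and has not been run against a real cluster. The helper names (`oraoop_partition_arg`, `build_sqoop_jobs`), the connection string, the table names, and the mapping from Oracle partition names to Hive partition values are all hypothetical. It also assumes that `-Doraoop.import.partitions` can be combined with `--hive-import`, `--hive-partition-key` and `--hive-partition-value` in a single job, which nobody in the thread has confirmed; depending on the Sqoop/OraOop version, `--direct` may be needed as well. The script only prints the commands (a dry run) so they can be reviewed first.

```python
import shlex
from typing import Dict, List


def oraoop_partition_arg(partition: str, keep_case: bool = False) -> str:
    """Build the -Doraoop.import.partitions argument for a single partition."""
    if keep_case:
        # Per the notes quoted in the thread: a double-quoted name keeps its
        # letter case, and a trailing comma is required when the last name in
        # the list is double quoted.
        return f'-Doraoop.import.partitions="{partition}",'
    # Unquoted names are converted to upper case by OraOop.
    return f"-Doraoop.import.partitions={partition}"


def build_sqoop_jobs(
    connect: str,
    username: str,
    oracle_table: str,
    hive_table: str,
    hive_partition_key: str,
    partitions: Dict[str, str],
) -> List[List[str]]:
    """Return one sqoop argv list per Oracle partition.

    `partitions` maps an Oracle partition name to the Hive partition value
    it should be loaded into (a hypothetical mapping chosen by the user).
    """
    jobs = []
    for oracle_partition, hive_value in partitions.items():
        jobs.append([
            "sqoop", "import",
            # Generic -D options must come directly after the tool name.
            oraoop_partition_arg(oracle_partition),
            "--connect", connect,
            "--username", username,
            "--table", oracle_table,
            "--hive-import",
            "--hive-table", hive_table,
            "--hive-partition-key", hive_partition_key,
            "--hive-partition-value", hive_value,
        ])
    return jobs


if __name__ == "__main__":
    # Placeholder values for illustration only.
    jobs = build_sqoop_jobs(
        connect="jdbc:oracle:thin:@//dbhost:1521/ORCL",
        username="scott",
        oracle_table="SALES",
        hive_table="sales",
        hive_partition_key="sale_month",
        partitions={"P2014_07": "2014-07", "P2014_08": "2014-08"},
    )
    for argv in jobs:
        # Dry run: print a shell-safe command line instead of executing it.
        print(shlex.join(argv))
```

Because each argument is built as a separate list element, the single quotes shown in the OraOop notes are not needed here; `shlex.join` adds the shell quoting when a command is printed. Password handling is deliberately left out.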
