Write your own Spark UDF and apply it to all varchar columns.

Within this UDF you can use the SimpleDateFormat parse method. If parsing 
fails (parse returns null or throws a ParseException, depending on the 
overload), you return the content as varchar; otherwise you return a date. If 
the content is null, you return null.
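
For what it's worth, a rough sketch of such a UDF in Scala (untested; the 
"yyyy-MM-dd" pattern and the name parseDateUdf are just placeholders, adjust 
them to whatever your varchar columns actually hold). One caveat: a Spark UDF 
has a single return type, so the "keep as varchar vs. return a date" decision 
has to happen per column on the DataFrame, not per row:

import java.sql.Date
import java.text.SimpleDateFormat
import org.apache.spark.sql.functions.udf

// Assumed date pattern; replace with the format stored in the Hive column.
val datePattern = "yyyy-MM-dd"

// Returns a java.sql.Date when the string parses, null otherwise
// (and null in, null out).
val parseDateUdf = udf { (s: String) =>
  if (s == null) null
  else {
    val fmt = new SimpleDateFormat(datePattern)
    fmt.setLenient(false)                      // reject e.g. "2018-13-45"
    try new Date(fmt.parse(s).getTime)
    catch { case _: java.text.ParseException => null }
  }
}

You would then apply it to each string column, e.g. 
df.withColumn("c", parseDateUdf(col("c"))), and keep the original varchar 
column whenever the parsed one is null where the original was not.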

Alternatively, you can define an insert function in PL/SQL on the Oracle side.

Another alternative is to read the Oracle metadata for the table at runtime and 
then adapt your conversion based on this. 
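
Again only a sketch, in Scala, assuming Spark 2.2+ and hypothetical connection 
settings (jdbcUrl, user, password) plus an assumed "yyyy-MM-dd" source format; 
the helper names oracleDateColumns/alignDateColumns are made up for the 
example. The idea is to query ALL_TAB_COLUMNS for the target table and cast 
the matching Hive varchar columns to date before the JDBC write:

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, to_date}

// Hypothetical connection settings; replace with your own.
val jdbcUrl  = "jdbc:oracle:thin:@//dbhost:1521/ORCL"
val jdbcUser = "user"
val jdbcPass = "password"

// Which columns of the target Oracle table are of type DATE?
def oracleDateColumns(spark: SparkSession, table: String): Seq[String] = {
  val meta = spark.read.format("jdbc")
    .option("url", jdbcUrl)
    .option("user", jdbcUser)
    .option("password", jdbcPass)
    .option("dbtable",
      s"(SELECT column_name, data_type FROM all_tab_columns " +
        s"WHERE table_name = '${table.toUpperCase}')")
    .load()
  meta.filter(col("DATA_TYPE") === "DATE")
    .select("COLUMN_NAME")
    .collect()
    .map(_.getString(0))
    .toSeq
}

// Cast the matching varchar columns in the Hive DataFrame to date.
// "yyyy-MM-dd" is an assumed format; change it to match your data.
def alignDateColumns(df: DataFrame, dateCols: Seq[String]): DataFrame =
  dateCols.foldLeft(df) { (acc, oracleCol) =>
    acc.columns.find(_.equalsIgnoreCase(oracleCol)) match {
      case Some(c) => acc.withColumn(c, to_date(col(c), "yyyy-MM-dd"))
      case None    => acc
    }
  }

After that you can write the aligned DataFrame with the usual 
df.write.mode(...).jdbc(jdbcUrl, targetTable, connectionProperties).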

However, this may not be perfect depending on your use case. Could you please 
provide more details/examples? Are you aiming at a generic Hive-to-Oracle 
import tool using Spark? Would Sqoop not be an alternative?

> On 20. Mar 2018, at 03:45, Gurusamy Thirupathy <thirug...@gmail.com> wrote:
> 
> Hi guha,
> 
> Thanks for your quick response; options a and b are in our table already. For 
> option b it's the same problem again: we don't know which column is a date.
> 
> 
> Thanks,
> -G
> 
>> On Sun, Mar 18, 2018 at 9:36 PM, Deepak Sharma <deepakmc...@gmail.com> wrote:
>> The other approach would be to write to a temp table and then merge the data.
>> But this may be an expensive solution.
>> 
>> Thanks
>> Deepak
>> 
>>> On Mon, Mar 19, 2018, 08:04 Gurusamy Thirupathy <thirug...@gmail.com> wrote:
>>> Hi,
>>> 
>>> I am trying to read data from Hive as a DataFrame and then write the DF 
>>> into an Oracle database. In this case, the date field/column in Hive has 
>>> type Varchar(20), but the corresponding column type in Oracle is Date. 
>>> While reading from Hive, the Hive table names are decided dynamically 
>>> (read from another table) based on some job condition (e.g. Job1). There 
>>> are multiple tables like this, so the column and table names are known 
>>> only at run time, and I can't do the type conversion explicitly when 
>>> reading from Hive.
>>> 
>>> So is there any utility/API available in Spark to handle this conversion?
>>> 
>>> 
>>> Thanks,
>>> Guru
> 
> 
> 
> -- 
> Thanks,
> Guru
