Hi Jörn,

Thanks for sharing the different options. Yes, we are trying to build a
generic Hive-to-Oracle export tool on Spark.
FYI, we are currently using Sqoop and are trying to migrate from Sqoop to
Spark.

Thanks
-G

On Tue, Mar 20, 2018 at 2:17 AM, Jörn Franke <jornfra...@gmail.com> wrote:

> Write your own Spark UDF. Apply it to all varchar columns.
>
> Within this UDF you can use the SimpleDateFormat parse method. If this
> method returns null, you return the content as varchar; if not, you return
> a date. If the content is null, you return null.
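>
> For illustration, a minimal sketch of such a UDF in Scala (a Spark column
> must hold a single type, so this version returns a date when the string
> parses and null otherwise; the "yyyy-MM-dd" format is an assumption):
>
> import java.sql.Date
> import java.text.{ParsePosition, SimpleDateFormat}
> import org.apache.spark.sql.functions.{col, udf}
>
> // Try to parse the string with an assumed format; null means
> // "not a date" (and null input stays null).
> def parseOrNull(s: String): Date = {
>   if (s == null) return null
>   val sdf = new SimpleDateFormat("yyyy-MM-dd")
>   sdf.setLenient(false)
>   val parsed = sdf.parse(s, new ParsePosition(0))
>   if (parsed == null) null else new Date(parsed.getTime)
> }
>
> val toDateOrNull = udf(parseOrNull _)
>
> // Applied per column, e.g.:
> // df.withColumn("order_dt", toDateOrNull(col("order_dt")))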
>
> Alternatively, you can define an insert function in PL/SQL on the Oracle
> side.
>
> Another alternative is to read the Oracle metadata for the table at
> runtime and then adapt your conversion based on this.
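>
> A sketch of that variant (the JDBC URL, credentials, and table names are
> placeholders; the WHERE 1=0 subquery is just a trick to fetch the Oracle
> schema without reading any rows):
>
> import org.apache.spark.sql.functions.col
>
> // Fetch only the target table's schema from Oracle.
> val oracleSchema = spark.read.format("jdbc")
>   .option("url", "jdbc:oracle:thin:@//host:1521/service")
>   .option("dbtable", "(SELECT * FROM target_table WHERE 1=0) t")
>   .option("user", "user")
>   .option("password", "password")
>   .load()
>   .schema
>
> // Cast each Hive column (hiveDf is the DataFrame read from Hive) to the
> // type Oracle expects. A plain cast assumes Spark-parseable strings;
> // otherwise plug in a parsing UDF like the one above.
> val aligned = hiveDf.select(oracleSchema.fields.map(f =>
>   col(f.name).cast(f.dataType).as(f.name)): _*)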
>
> However, this may not be perfect depending on your use case. Can you
> please provide more details/examples? Do you aim at a generic Hive to
> Oracle import tool using Spark? Would Sqoop not be an alternative?
>
> On 20. Mar 2018, at 03:45, Gurusamy Thirupathy <thirug...@gmail.com>
> wrote:
>
> Hi Guha,
>
> Thanks for your quick response. Options a and b are already in our table.
> For option b, it's the same problem again: we don't know which column is a
> date.
>
>
> Thanks,
> -G
>
> On Sun, Mar 18, 2018 at 9:36 PM, Deepak Sharma <deepakmc...@gmail.com>
> wrote:
>
>> The other approach would be to write to a temp table and then merge the
>> data. But this may be an expensive solution.
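>>
>> A rough sketch of that approach (jdbcUrl, credentials, table names, the
>> join key, and the TO_DATE format are all placeholders):
>>
>> import java.sql.DriverManager
>>
>> // 1) Bulk-load the DataFrame into an Oracle staging table over JDBC.
>> df.write.format("jdbc")
>>   .option("url", jdbcUrl)
>>   .option("dbtable", "stage_table")
>>   .option("user", user).option("password", password)
>>   .mode("overwrite")
>>   .save()
>>
>> // 2) Merge staging into the target, letting Oracle convert the
>> //    varchar to DATE.
>> val conn = DriverManager.getConnection(jdbcUrl, user, password)
>> try {
>>   conn.createStatement().executeUpdate(
>>     """MERGE INTO target_table t
>>       |USING stage_table s ON (t.id = s.id)
>>       |WHEN MATCHED THEN UPDATE SET t.dt = TO_DATE(s.dt, 'YYYY-MM-DD')
>>       |WHEN NOT MATCHED THEN INSERT (id, dt)
>>       |  VALUES (s.id, TO_DATE(s.dt, 'YYYY-MM-DD'))""".stripMargin)
>> } finally conn.close()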
>>
>> Thanks
>> Deepak
>>
>> On Mon, Mar 19, 2018, 08:04 Gurusamy Thirupathy <thirug...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I am trying to read data from Hive as a DataFrame and then write the DF
>>> into an Oracle database. In this case, the date field/column in Hive has
>>> type varchar(20), but the corresponding column type in Oracle is Date.
>>> While reading from Hive, the Hive table names are decided dynamically
>>> (read from another table) based on some job condition (e.g. Job1). There
>>> are multiple tables like this, so the column and table names are decided
>>> only at run time, and I can't do the type conversion explicitly when
>>> reading from Hive.
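>>>
>>> For context, the flow is roughly as below (the control table, column,
>>> and connection names are made up):
>>>
>>> // Resolve the Hive table name at run time from a control table.
>>> val tbl = spark.sql(
>>>   "SELECT hive_table FROM job_control WHERE job_name = 'Job1'")
>>>   .first().getString(0)
>>>
>>> // Only here do we learn the schema, including which varchar
>>> // columns actually hold dates.
>>> val df = spark.table(tbl)
>>>
>>> df.write.format("jdbc")
>>>   .option("url", jdbcUrl)
>>>   .option("dbtable", oracleTable)
>>>   .option("user", user).option("password", password)
>>>   .mode("append")
>>>   .save()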
>>>
>>> So is there any utility/API available in Spark to handle this
>>> conversion?
>>>
>>>
>>> Thanks,
>>> Guru
>>>
>>
>
>
> --
> Thanks,
> Guru
>
>


-- 
Thanks,
Guru
