HI Jorn, Thanks for your sharing different options, yes we are trying to build a generic tool for Hive to Spark export. FYI, currently we are using sqoop, we are trying to migrate from sqoop to spark.
Thanks -G On Tue, Mar 20, 2018 at 2:17 AM, Jörn Franke <jornfra...@gmail.com> wrote: > Write your own Spark UDF. Apply it to all varchar columns. > > Within this udf you can use the SimpleDateFormat parse method. If this > method returns null you return the content as varchar if not you return a > date. If the content is null you return null. > > Alternatively you can define an insert function as pl/sql on Oracle side. > > Another alternative is to read the Oracle metadata for the table at > runtime and then adapt your conversion based on this. > > However, this may not be perfect depending on your use case. Can you > please provide more details/examples? Do you aim at a generic hive to > Oracle import tool using Spark? Sqoop would not be an alternative? > > On 20. Mar 2018, at 03:45, Gurusamy Thirupathy <thirug...@gmail.com> > wrote: > > Hi guha, > > Thanks for your quick response, option a and b are in our table already. > For option b, again the same problem, we don't know which column is date. > > > Thanks, > -G > > On Sun, Mar 18, 2018 at 9:36 PM, Deepak Sharma <deepakmc...@gmail.com> > wrote: > >> The other approach would to write to temp table and then merge the data. >> But this may be expensive solution. >> >> Thanks >> Deepak >> >> On Mon, Mar 19, 2018, 08:04 Gurusamy Thirupathy <thirug...@gmail.com> >> wrote: >> >>> Hi, >>> >>> I am trying to read data from Hive as DataFrame, then trying to write >>> the DF into the Oracle data base. In this case, the date field/column in >>> hive is with Type Varchar(20) >>> but the corresponding column type in Oracle is Date. While reading from >>> hive , the hive table names are dynamically decided(read from another >>> table) based on some job condition(ex. Job1). There are multiple tables >>> like this, so column and the table names are decided only run time. So I >>> can't do type conversion explicitly when read from Hive. >>> >>> So is there any utility/api available in Spark to achieve this >>> conversion issue? >>> >>> >>> Thanks, >>> Guru >>> >> > > > -- > Thanks, > Guru > > -- Thanks, Guru