Re: Oracle 11g Hive 2.1 metastore backend

2018-06-07 Thread Arjun kr
Hi Mike,


Yes. The CREATE TABLE statements fail with the exception below. Could you
please check the version of the datanucleus-rdbms.jar that you have?


Caused by: MetaException(message:java.lang.ClassCastException: 
org.datanucleus.store.rdbms.mapping.datastore.ClobImpl cannot be cast to 
oracle.sql.CLOB)
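
For reference, the DataNucleus version can usually be read straight from the
Hive lib directory. A minimal sketch, assuming the standard layout where the
jars live under $HIVE_HOME/lib:

    # The version is embedded in the jar file name, e.g. datanucleus-rdbms-x.y.z.jar.
    ls $HIVE_HOME/lib/datanucleus-rdbms-*.jar

    # Alternatively, read it from the jar manifest.
    unzip -p $HIVE_HOME/lib/datanucleus-rdbms-*.jar META-INF/MANIFEST.MF | grep -i version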

Alan, thanks for your inputs.

Regards,

Arjun

From: Alan Gates 
Sent: Thursday, June 7, 2018 1:38 AM
To: user@hive.apache.org
Subject: Re: Oracle 11g Hive 2.1 metastore backend

We currently run our Oracle tests against 11g, but that is only for the 3.0 and 
beyond releases.  Given the error I am guessing this is a result of the Oracle 
version plus the datanucleus version, which we changed between 2.1 and 2.3.

Alan.

On Wed, Jun 6, 2018 at 12:12 PM Mich Talebzadeh 
<mich.talebza...@gmail.com> wrote:
My Hive is 2.3.2

my Oracle is 12c

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing 
options

and these are the Hive connections:

sessions as on JUN 06 2018 08:21 PM

LOGIN     SID/Serial#  LOGGED IN    HOST    OS PID        Client PID   PROGRAM          MEM/KB  Logical I/O  Physical I/O  ACT
--------  -----------  -----------  ------  ------------  -----------  ---------------  ------  -----------  ------------  ---
HIVEUSER  46,34139     06/06 07:29  rhes75  oracle/28441  hduser/1234  JDBC Thin Clien  1,088   46           0             N
HIVEUSER  325,8569     06/06 08:01  rhes75  oracle/28748  hduser/1234  JDBC Thin Clien  1,088   44           0             N
HIVEUSER  407,64925    06/06 07:29  rhes75  oracle/28437  hduser/1234  JDBC Thin Clien  1,088   44           0             N


I have no issues

Is this your issue?

Caused by: MetaException(message:java.lang.ClassCastException: 
org.datanucleus.store.rdbms.mapping.datastore.ClobImpl cannot be cast to 
oracle.sql.CLOB)

HTH,

Mich


Dr Mich Talebzadeh



LinkedIn  
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from 
relying on this email's technical content is explicitly disclaimed. The author 
will in no case be liable for any monetary damages arising from such loss, 
damage or destruction.



On 6 June 2018 at 05:10, Arjun kr 
<arjun...@outlook.com> wrote:
Hi All,

Is anyone using Oracle 11g configured as the Hive 2.1 metastore backend? I'm
encountering the exception below with Oracle 11g configured as the Hive 2.1
metastore backend. Any help would be appreciated.

2018-05-23T13:05:03,219 DEBUG [main] transport.TSaslTransport: CLIENT: reading 
data length: 211
2018-05-23T13:05:03,220 DEBUG [main] transport.TSaslTransport: data length 
after unwrap: 179
2018-05-23T13:05:03,245 ERROR [main] exec.DDLTask: 
org.apache.hadoop.hive.ql.metadata.HiveException: 
MetaException(message:java.lang.ClassCastException: 
org.datanucleus.store.rdbms.mapping.datastore.ClobImpl cannot be cast to 
oracle.sql.CLOB)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:842)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:847)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3992)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:332)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2074)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1745)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1454)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1172)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1162)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:234)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:185)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:401)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:337)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:435)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:451)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:763)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:729)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:652)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:647)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)

Re: Oracle 11g Hive 2.1 metastore backend

2018-06-06 Thread Alan Gates
We currently run our Oracle tests against 11g, but that is only for the 3.0
and beyond releases.  Given the error I am guessing this is a result of the
Oracle version plus the datanucleus version, which we changed between 2.1
and 2.3.

Alan.

On Wed, Jun 6, 2018 at 12:12 PM Mich Talebzadeh 
wrote:

> My Hive is 2.3.2
>
> my Oracle is 12c
>
> Connected to:
> Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit
> Production
> With the Partitioning, OLAP, Advanced Analytics and Real Application
> Testing options
>
> and these are the Hive connections:
>
> sessions as on JUN 06 2018 08:21 PM
>
> LOGIN     SID/Serial#  LOGGED IN    HOST    OS PID        Client PID   PROGRAM          MEM/KB  Logical I/O  Physical I/O  ACT
> --------  -----------  -----------  ------  ------------  -----------  ---------------  ------  -----------  ------------  ---
> HIVEUSER  46,34139     06/06 07:29  rhes75  oracle/28441  hduser/1234  JDBC Thin Clien  1,088   46           0             N
> HIVEUSER  325,8569     06/06 08:01  rhes75  oracle/28748  hduser/1234  JDBC Thin Clien  1,088   44           0             N
> HIVEUSER  407,64925    06/06 07:29  rhes75  oracle/28437  hduser/1234  JDBC Thin Clien  1,088   44           0             N
>
>
> I have no issues
>
> Is this your issue?
>
> Caused by: MetaException(message:java.lang.ClassCastException:
> org.datanucleus.store.rdbms.mapping.datastore.ClobImpl cannot be cast to
> oracle.sql.CLOB)
>
> HTH,
>
> Mich
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> Disclaimer: Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 6 June 2018 at 05:10, Arjun kr  wrote:
>
>> Hi All,
>>
>> Is anyone using Oracle 11g configured as the Hive 2.1 metastore backend? I'm
>> encountering the exception below with Oracle 11g configured as the Hive 2.1
>> metastore backend. Any help would be appreciated.
>>
>> 2018-05-23T13:05:03,219 DEBUG [main] transport.TSaslTransport: CLIENT:
>> reading data length: 211
>> 2018-05-23T13:05:03,220 DEBUG [main] transport.TSaslTransport: data
>> length after unwrap: 179
>> 2018-05-23T13:05:03,245 ERROR [main] exec.DDLTask:
>> org.apache.hadoop.hive.ql.metadata.HiveException:
>> MetaException(message:java.lang.ClassCastException:
>> org.datanucleus.store.rdbms.mapping.datastore.ClobImpl cannot be cast to
>> oracle.sql.CLOB)
>> at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:842)
>> at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:847)
>> at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3992)
>> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:332)
>> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
>> at
>> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
>> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2074)
>> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1745)
>> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1454)
>> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1172)
>> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1162)
>> at
>> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:234)
>> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:185)
>> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:401)
>> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:337)
>> at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:435)
>> at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:451)
>> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:763)
>> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:729)
>> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:652)
>> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:647)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:498)
>> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>> Caused by: MetaException(message:java.lang.ClassCastException:
>> org.datanucleus.store.rdbms.mapping.datastore.ClobImpl cannot be cast to
>> oracle.sql.CLOB)

Re: Oracle 11g Hive 2.1 metastore backend

2018-06-06 Thread Mich Talebzadeh
My Hive is 2.3.2

my Oracle is 12c

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application
Testing options

and these are the Hive connections:

sessions as on JUN 06 2018 08:21 PM

LOGIN     SID/Serial#  LOGGED IN    HOST    OS PID        Client PID   PROGRAM          MEM/KB  Logical I/O  Physical I/O  ACT
--------  -----------  -----------  ------  ------------  -----------  ---------------  ------  -----------  ------------  ---
HIVEUSER  46,34139     06/06 07:29  rhes75  oracle/28441  hduser/1234  JDBC Thin Clien  1,088   46           0             N
HIVEUSER  325,8569     06/06 08:01  rhes75  oracle/28748  hduser/1234  JDBC Thin Clien  1,088   44           0             N
HIVEUSER  407,64925    06/06 07:29  rhes75  oracle/28437  hduser/1234  JDBC Thin Clien  1,088   44           0             N


I have no issues

Is this your issue?

Caused by: MetaException(message:java.lang.ClassCastException:
org.datanucleus.store.rdbms.mapping.datastore.ClobImpl cannot be cast to
oracle.sql.CLOB)

HTH,

Mich

Dr Mich Talebzadeh



LinkedIn
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


Disclaimer: Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 6 June 2018 at 05:10, Arjun kr  wrote:

> Hi All,
>
> Is anyone using Oracle 11g configured as the Hive 2.1 metastore backend? I'm
> encountering the exception below with Oracle 11g configured as the Hive 2.1
> metastore backend. Any help would be appreciated.
>
> 2018-05-23T13:05:03,219 DEBUG [main] transport.TSaslTransport: CLIENT:
> reading data length: 211
> 2018-05-23T13:05:03,220 DEBUG [main] transport.TSaslTransport: data length
> after unwrap: 179
> 2018-05-23T13:05:03,245 ERROR [main] exec.DDLTask:
> org.apache.hadoop.hive.ql.metadata.HiveException:
> MetaException(message:java.lang.ClassCastException:
> org.datanucleus.store.rdbms.mapping.datastore.ClobImpl cannot be cast to
> oracle.sql.CLOB)
> at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:842)
> at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:847)
> at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3992)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:332)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(
> TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2074)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1745)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1454)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1172)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1162)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(
> CliDriver.java:234)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:185)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:401)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:337)
> at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:435)
> at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:451)
> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:763)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:729)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:652)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:647)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: MetaException(message:java.lang.ClassCastException:
> org.datanucleus.store.rdbms.mapping.datastore.ClobImpl cannot be cast to
> oracle.sql.CLOB)
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_
> table_with_environment_context_result$create_table_
> with_environment_context_resultStandardScheme.read(
> ThriftHiveMetastore.java:41498)
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_
> table_with_environment_context_result$create_table_
> with_environment_context_resultStandardScheme.read(
> ThriftHiveMetastore.java:41466)
> at 

Oracle 11g Hive 2.1 metastore backend

2018-06-05 Thread Arjun kr
Hi All,

Is anyone using Oracle 11g configured as the Hive 2.1 metastore backend? I'm
encountering the exception below with Oracle 11g configured as the Hive 2.1
metastore backend. Any help would be appreciated.

2018-05-23T13:05:03,219 DEBUG [main] transport.TSaslTransport: CLIENT: reading 
data length: 211
2018-05-23T13:05:03,220 DEBUG [main] transport.TSaslTransport: data length 
after unwrap: 179
2018-05-23T13:05:03,245 ERROR [main] exec.DDLTask: 
org.apache.hadoop.hive.ql.metadata.HiveException: 
MetaException(message:java.lang.ClassCastException: 
org.datanucleus.store.rdbms.mapping.datastore.ClobImpl cannot be cast to 
oracle.sql.CLOB)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:842)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:847)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3992)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:332)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2074)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1745)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1454)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1172)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1162)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:234)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:185)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:401)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:337)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:435)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:451)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:763)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:729)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:652)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:647)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: MetaException(message:java.lang.ClassCastException: 
org.datanucleus.store.rdbms.mapping.datastore.ClobImpl cannot be cast to 
oracle.sql.CLOB)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:41498)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:41466)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result.read(ThriftHiveMetastore.java:41392)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:1183)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:1169)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:2334)
at 
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.create_table_with_environment_context(SessionHiveMetaStoreClient.java:93)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:747)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:735)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
at com.sun.proxy.$Proxy20.createTable(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2265)
at com.sun.proxy.$Proxy20.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:832)
... 26 more

Thanks,

Arjun



Reg:Sqoop Import-Oracle to Hive-Parquet

2016-11-29 Thread kishore kumar
Hi Experts,

We are trying to use Parquet for importing data from Oracle to Hive. We are
encountering the error below; could anyone help me resolve this issue?

We are using Sqoop version 1.4.6 and Hive version 1.2.

Error:



16/11/28 21:21:46 INFO hive.metastore: Connected to metastore.

16/11/28 21:21:46 ERROR tool.ImportTool: Imported Failed: Cannot convert
unsupported type: timestamp
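
One workaround that has been reported for this Parquet limitation is to map the
offending column to a Java String on import and cast it back to a timestamp
inside Hive. A hedged sketch (the connection string, table name, and TS_COL are
placeholders, not values from this thread):

    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username scott -P \
      --table MYTABLE \
      --as-parquetfile \
      --hive-import \
      --map-column-java TS_COL=String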


Thanks,

Kishore Kumar.

-- 
Thanks,
Kishore.


Re: Oracle to Hive

2013-07-10 Thread Michael Malak
Untested:

SELECT a.c100, a.c300, b.c400
  FROM t1 a
  JOIN t2 b
  ON a.c200 = b.c200
  JOIN (SELECT DISTINCT a2.c100
          FROM t1 a2
          JOIN t2 b2
          ON a2.c200 = b2.c200
        WHERE b2.c400 = SYSDATE - 1) a3
  ON a.c100 = a3.c100
  WHERE b.c400 = SYSDATE - 1
   AND a.c300 = 0
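
Substituting the Hive date expression from the original question below for
SYSDATE - 1 gives one complete (equally untested) sketch, runnable from the
shell:

    hive -e "
    SELECT a.c100, a.c300, b.c400
      FROM t1 a
      JOIN t2 b ON a.c200 = b.c200
      JOIN (SELECT DISTINCT a2.c100
              FROM t1 a2
              JOIN t2 b2 ON a2.c200 = b2.c200
             WHERE b2.c400 = date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'), 1)) a3
        ON a.c100 = a3.c100
     WHERE b.c400 = date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'), 1)
       AND a.c300 = 0"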




 From: Raj Hadoop <hadoop...@yahoo.com>
To: Hive <user@hive.apache.org>
Sent: Wednesday, July 10, 2013 3:30 PM
Subject: Oracle to Hive

All,
 
Can anyone give me tips on how to convert the following Oracle SQL to a Hive
query?
 
 
SELECT a.c100, a.c300, b.c400
  FROM t1 a JOIN t2 b ON a.c200 = b.c200
 WHERE a.c100 IN (SELECT DISTINCT a.c100
 FROM t1 a JOIN t2 b ON a.c200 = b.c200
    WHERE b.c400 = SYSDATE - 1)
   AND b.c400 = SYSDATE - 1
   AND a.c300 = 0
 
 
The SYSDATE can be replaced by
date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'), 1) in Hive.
 
But I wanted to know the rest of the query. Any pointers or tips so that I can 
start on my own.
 
Thanks in advance.
 
Regards,
Raj

RE: Data mismatch when importing data from Oracle to Hive through Sqoop without an error

2013-03-08 Thread Ajit Kumar Shreevastava
Hi Venkat,

Almost all columns have some value except these three.

Regards,
Ajit

-Original Message-
From: Venkat Ranganathan [mailto:vranganat...@hortonworks.com] 
Sent: Wednesday, March 06, 2013 9:36 PM
To: user@hive.apache.org
Cc: u...@sqoop.apache.org
Subject: Re: Data mismatch when importing data from Oracle to Hive through 
Sqoop without an error

Hi Ajit

Do you know if the rest of the columns are also null when the three non-null
columns are null?

Venkat

On Wed, Mar 6, 2013 at 12:35 AM, Ajit Kumar Shreevastava 
<ajit.shreevast...@hcl.com> wrote:
 Hi Abhijeet,



 Thanks for your response.

 If values that don't fit in a double are getting inserted as NULL, then the
 counts should not mismatch between the two.

 Here the inserted null values are extra rows, on top of the values that are
 already present in both the Oracle table and the Hive table.



 Correct me if I am wrong in interpretation.



 Thanks and Regards,

 Ajit Kumar Shreevastava



 From: abhijeet gaikwad [mailto:abygaikwa...@gmail.com]
 Sent: Wednesday, March 06, 2013 1:46 PM
 To: user@hive.apache.org
 Cc: u...@sqoop.apache.org
 Subject: Re: Data mismatch when importing data from Oracle to Hive 
 through Sqoop without an error



 Sqoop maps numeric and decimal types (RDBMS) to double (Hive). I think 
 the values that don't fit in double must be getting inserted as NULL.
 You can see this warning in your logs.

 Thanks,
 Abhijeet
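 
 If the double mapping is the culprit, one way to sidestep it would be to
 override the Hive column types at import time. A hypothetical sketch (the
 connection details are placeholders; the column list assumes the three key
 columns from this thread):
 
     sqoop import \
       --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
       --username scott -P \
       --table BTTN \
       --hive-import \
       --map-column-hive BTTN_ID=STRING,DATA_INST_ID=STRING,SCR_ID=STRING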

 On Wed, Mar 6, 2013 at 1:32 PM, Ajit Kumar Shreevastava 
 <ajit.shreevast...@hcl.com> wrote:

 Hi all,

 I have noticed one interesting thing in the result sets below.

 I fired one query in both the Oracle and Hive shells and found the
 following result set:



 SQL> select count(1) from bttn

   2  where bttn_id is null or data_inst_id is null or scr_id is null;



   COUNT(1)

 ----------

  0

 hive> select count(1) from bttn

  where bttn_id is null or data_inst_id is null or scr_id is null;

 Total MapReduce jobs = 1

 Launching Job 1 out of 1

 Number of reduce tasks determined at compile time: 1

 In order to change the average load for a reducer (in bytes):

   set hive.exec.reducers.bytes.per.reducer=<number>

 In order to limit the maximum number of reducers:

   set hive.exec.reducers.max=<number>

 In order to set a constant number of reducers:

   set mapred.reduce.tasks=<number>

 Starting Job = job_201303051835_0020, Tracking URL =
 http://NHCLT-PC44-2:50030/jobdetails.jsp?jobid=job_201303051835_0020

 Kill Command = /home/hadoop/hadoop-1.0.3/bin/hadoop job  -kill
 job_201303051835_0020

 Hadoop job information for Stage-1: number of mappers: 1; number of
 reducers: 1

 2013-03-06 13:22:56,908 Stage-1 map = 0%,  reduce = 0%

 2013-03-06 13:23:05,928 Stage-1 map = 100%,  reduce = 0%, Cumulative 
 CPU 5.2 sec

 2013-03-06 13:23:06,931 Stage-1 map = 100%,  reduce = 0%, Cumulative 
 CPU 5.2 sec

 2013-03-06 13:23:07,934 Stage-1 map = 100%,  reduce = 0%, Cumulative 
 CPU 5.2 sec

 2013-03-06 13:23:08,938 Stage-1 map = 100%,  reduce = 0%, Cumulative 
 CPU 5.2 sec

 2013-03-06 13:23:09,941 Stage-1 map = 100%,  reduce = 0%, Cumulative 
 CPU 5.2 sec

 2013-03-06 13:23:10,944 Stage-1 map = 100%,  reduce = 0%, Cumulative 
 CPU 5.2 sec

 2013-03-06 13:23:11,947 Stage-1 map = 100%,  reduce = 0%, Cumulative 
 CPU 5.2 sec

 2013-03-06 13:23:12,956 Stage-1 map = 100%,  reduce = 0%, Cumulative 
 CPU 5.2 sec

 2013-03-06 13:23:13,959 Stage-1 map = 100%,  reduce = 0%, Cumulative 
 CPU 5.2 sec

 2013-03-06 13:23:14,962 Stage-1 map = 100%,  reduce = 33%, Cumulative 
 CPU
 5.2 sec

 2013-03-06 13:23:15,965 Stage-1 map = 100%,  reduce = 33%, Cumulative 
 CPU
 5.2 sec

 2013-03-06 13:23:16,969 Stage-1 map = 100%,  reduce = 33%, Cumulative 
 CPU
 5.2 sec

 2013-03-06 13:23:17,974 Stage-1 map = 100%,  reduce = 100%, Cumulative 
 CPU
 6.95 sec

 2013-03-06 13:23:18,977 Stage-1 map = 100%,  reduce = 100%, Cumulative 
 CPU
 6.95 sec

 2013-03-06 13:23:19,981 Stage-1 map = 100%,  reduce = 100%, Cumulative 
 CPU
 6.95 sec

 2013-03-06 13:23:20,985 Stage-1 map = 100%,  reduce = 100%, Cumulative 
 CPU
 6.95 sec

 2013-03-06 13:23:21,988 Stage-1 map = 100%,  reduce = 100%, Cumulative 
 CPU
 6.95 sec

 2013-03-06 13:23:22,995 Stage-1 map = 100%,  reduce = 100%, Cumulative 
 CPU
 6.95 sec

 2013-03-06 13:23:23,998 Stage-1 map = 100%,  reduce = 100%, Cumulative 
 CPU
 6.95 sec

 MapReduce Total cumulative CPU time: 6 seconds 950 msec

 Ended Job = job_201303051835_0020

 MapReduce Jobs Launched:

 Job 0: Map: 1  Reduce: 1   Cumulative CPU: 6.95 sec   HDFS Read: 184270926
 HDFS Write: 4 SUCCESS

 Total MapReduce CPU Time Spent: 6 seconds 950 msec

 OK

 986

 Time taken: 35.983 seconds

 hive>



 and 739169 - 738183 = 986



 Can anyone tell me why this happened, as BTTN_ID, DATA_INST_ID, and
 SCR_ID have NOT NULL constraints in the BTTN table and also form its
 composite primary key?

 Also, how can I prevent this unnecessary data generation in the
 Hive table?



 Regards

 Ajit Kumar Shreevastava



 From

Re: Data mismatch when importing data from Oracle to Hive through Sqoop without an error

2013-03-06 Thread Venkat Ranganathan
Hi Ajit

Do you know if the rest of the columns are also null when the three non-null
columns are null?

Venkat

On Wed, Mar 6, 2013 at 12:35 AM, Ajit Kumar Shreevastava
<ajit.shreevast...@hcl.com> wrote:
 Hi Abhijeet,



 Thanks for your response.

 If values that don't fit in a double are getting inserted as NULL, then the
 counts should not mismatch between the two.

 Here the inserted null values are extra rows, on top of the values that are
 already present in both the Oracle table and the Hive table.



 Correct me if I am wrong in interpretation.



 Thanks and Regards,

 Ajit Kumar Shreevastava



 From: abhijeet gaikwad [mailto:abygaikwa...@gmail.com]
 Sent: Wednesday, March 06, 2013 1:46 PM
 To: user@hive.apache.org
 Cc: u...@sqoop.apache.org
 Subject: Re: Data mismatch when importing data from Oracle to Hive through
 Sqoop without an error



 Sqoop maps numeric and decimal types (RDBMS) to double (Hive). I think the
 values that don't fit in double must be getting inserted as NULL.
 You can see this warning in your logs.

 Thanks,
 Abhijeet

 On Wed, Mar 6, 2013 at 1:32 PM, Ajit Kumar Shreevastava
 <ajit.shreevast...@hcl.com> wrote:

 Hi all,

 I have noticed one interesting thing in the result sets below.

 I fired one query in both the Oracle and Hive shells and found the following
 result set:



 SQL> select count(1) from bttn

   2  where bttn_id is null or data_inst_id is null or scr_id is null;



   COUNT(1)

 ----------

  0

 hive> select count(1) from bttn

  where bttn_id is null or data_inst_id is null or scr_id is null;

 Total MapReduce jobs = 1

 Launching Job 1 out of 1

 Number of reduce tasks determined at compile time: 1

 In order to change the average load for a reducer (in bytes):

   set hive.exec.reducers.bytes.per.reducer=<number>

 In order to limit the maximum number of reducers:

   set hive.exec.reducers.max=<number>

 In order to set a constant number of reducers:

   set mapred.reduce.tasks=<number>

 Starting Job = job_201303051835_0020, Tracking URL =
 http://NHCLT-PC44-2:50030/jobdetails.jsp?jobid=job_201303051835_0020

 Kill Command = /home/hadoop/hadoop-1.0.3/bin/hadoop job  -kill
 job_201303051835_0020

 Hadoop job information for Stage-1: number of mappers: 1; number of
 reducers: 1

 2013-03-06 13:22:56,908 Stage-1 map = 0%,  reduce = 0%

 2013-03-06 13:23:05,928 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
 sec

 2013-03-06 13:23:06,931 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
 sec

 2013-03-06 13:23:07,934 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
 sec

 2013-03-06 13:23:08,938 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
 sec

 2013-03-06 13:23:09,941 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
 sec

 2013-03-06 13:23:10,944 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
 sec

 2013-03-06 13:23:11,947 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
 sec

 2013-03-06 13:23:12,956 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
 sec

 2013-03-06 13:23:13,959 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2
 sec

 2013-03-06 13:23:14,962 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU
 5.2 sec

 2013-03-06 13:23:15,965 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU
 5.2 sec

 2013-03-06 13:23:16,969 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU
 5.2 sec

 2013-03-06 13:23:17,974 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
 6.95 sec

 2013-03-06 13:23:18,977 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
 6.95 sec

 2013-03-06 13:23:19,981 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
 6.95 sec

 2013-03-06 13:23:20,985 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
 6.95 sec

 2013-03-06 13:23:21,988 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
 6.95 sec

 2013-03-06 13:23:22,995 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
 6.95 sec

 2013-03-06 13:23:23,998 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
 6.95 sec

 MapReduce Total cumulative CPU time: 6 seconds 950 msec

 Ended Job = job_201303051835_0020

 MapReduce Jobs Launched:

 Job 0: Map: 1  Reduce: 1   Cumulative CPU: 6.95 sec   HDFS Read: 184270926
 HDFS Write: 4 SUCCESS

 Total MapReduce CPU Time Spent: 6 seconds 950 msec

 OK

 986

 Time taken: 35.983 seconds

 hive>



 and 739169 - 738183 = 986



 Can anyone tell me why this happened, as BTTN_ID, DATA_INST_ID, and SCR_ID
 have NOT NULL constraints in the BTTN table and also form its composite
 primary key?

 Also, how can I prevent this unnecessary data generation in the Hive
 table?



 Regards

 Ajit Kumar Shreevastava



 From: Ajit Kumar Shreevastava
 Sent: Wednesday, March 06, 2013 12:40 PM
 To: 'u...@sqoop.apache.org'
 Cc: user@hive.apache.org
 Subject: Data mismatch when importing data from Oracle to Hive through Sqoop
 without an error



 Hi,



 I have a table BTTN in an Oracle database with 738183 records:



 SQL> select count(1) from bttn;



   COUNT(1)

 ----------

 738183



 Now I want to import

Re: Data mismatch when importing data from Oracle to Hive through Sqoop without an error

2013-03-06 Thread Jarek Jarcec Cecho
Hi Ajit,
I've seen a similar issue many times. Does your table have textual data? If so,
can it happen that your textual data contains Hive delimiters such as new-line
characters? If it does, Sqoop might create two lines for one single row in the
table, which will consequently be seen as two rows in Hive. As Hive will
implicitly convert any invalid values into NULL, that would also explain your
NULL values.

Sqoop offers arguments --hive-drop-import-delims and --hive-delims-replacement 
to deal with this problem. More information can be found in Sqoop user guide 
[1].
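
A hedged sketch of an import using those arguments (the connection details and
table name are placeholders, not values from this thread):

    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username scott -P \
      --table BTTN \
      --hive-import \
      --hive-drop-import-delims

    # Or, to substitute rather than drop the delimiter characters:
    #   --hive-delims-replacement ' '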

Jarcec

Links:
1: 
http://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_importing_data_into_hive

On Wed, Mar 06, 2013 at 10:50:26PM +0530, abhijeet gaikwad wrote:
 If you see the Hadoop job counters in your logs, the numbers are correct up to
 that point, so the plain import into HDFS is working fine. The only reasonable
 explanation is that there is an issue inserting the data into Hive.
 
 Have a look at the file on HDFS; it should contain the correct number of
 records with correct data. But if you do "select * ..." from the Hive console,
 it will give you NULL values for invalid data, i.e. data that cannot be
 converted to the respective data type of the Hive table column. This is
 because Hive imposes data type constraints on the data in files on HDFS.
 Those extra records are a mystery for me too. I don't think the actual file on
 HDFS has extra records.
 
 Sqoop is creating the Hive table incorrectly, which may have caused this
 issue. Create the table manually with correct data types (or as strings, if no
 equivalent is available) and try loading the data. If this works fine, we know
 where the issue is!
 
 Thanks,
 Abhijeet
 On Wed, Mar 6, 2013 at 2:05 PM, Ajit Kumar Shreevastava 
 <ajit.shreevast...@hcl.com> wrote:
 
 
   Hi Abhijeet,
 
  Thanks for your response.
 
  If values that don't fit in a double are getting inserted as NULL, then the
  counts should not mismatch between the two.
 
  Here the inserted null values are extra rows, on top of the values that are
  already present in both the Oracle table and the Hive table.
 
  Correct me if I am wrong in interpretation.
 
  Thanks and Regards,
 
  Ajit Kumar Shreevastava
 
  From: abhijeet gaikwad [mailto:abygaikwa...@gmail.com]
  Sent: Wednesday, March 06, 2013 1:46 PM
  To: user@hive.apache.org
  Cc: u...@sqoop.apache.org
  Subject: Re: Data mismatch when importing data from Oracle to Hive
  through Sqoop without an error
 
  Sqoop maps numeric and decimal types (RDBMS) to double (Hive). I think the
  values that don't fit in double must be getting inserted as NULL.
  You can see this warning in your logs.
 
  Thanks,
  Abhijeet
 
  On Wed, Mar 6, 2013 at 1:32 PM, Ajit Kumar Shreevastava 
  <ajit.shreevast...@hcl.com> wrote:
 
  Hi all,
 
  I have noticed one interesting thing in the result sets below.
 
  I fired one query in both the Oracle and Hive shells and found the
  following result set:
 
   
 
  SQL> select count(1) from bttn
 
    2  where bttn_id is null or data_inst_id is null or scr_id is null;
 
 
 
    COUNT(1)
 
  ----------
 
   0
 
  hive> select count(1) from bttn
 
   where bttn_id is null or data_inst_id is null or scr_id is null;
 
  Total MapReduce jobs = 1
 
  Launching Job 1 out of 1
 
  Number of reduce tasks determined at compile time: 1
 
  In order to change the average load for a reducer (in bytes):
 
    set hive.exec.reducers.bytes.per.reducer=<number>
 
  In order to limit the maximum number of reducers:
 
    set hive.exec.reducers.max=<number>
 
  In order to set a constant number of reducers:
 
    set mapred.reduce.tasks=<number>
 
  Starting Job = job_201303051835_0020, Tracking URL =
  http://NHCLT-PC44-2:50030/jobdetails.jsp?jobid=job_201303051835_0020
  
 
  Kill Command = /home/hadoop/hadoop-1.0.3/bin/hadoop job  -kill
  job_201303051835_0020
 
  Hadoop job information for Stage-1: number of mappers: 1; number of
  reducers: 1
 
  2013-03-06 13:22:56,908 Stage-1 map = 0%,  reduce = 0%
 
  2013-03-06 13:23:05,928 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
  5.2 sec
 
  2013-03-06 13:23:06,931 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
  5.2 sec
 
  2013-03-06 13:23:07,934 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
  5.2 sec
 
  2013-03-06 13:23:08,938 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
  5.2 sec
 
  2013-03-06 13:23:09,941 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
  5.2 sec
 
  2013-03-06 13:23:10,944 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
  5.2 sec
 
  2013-03-06 13:23:11,947 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
  5.2 sec
 
  2013-03-06 13:23:12,956 Stage-1 map = 100%,  reduce = 0