Hi Ajit, do you know if the rest of the columns are also null when the three non-null columns are null?
Venkat On Wed, Mar 6, 2013 at 12:35 AM, Ajit Kumar Shreevastava <ajit.shreevast...@hcl.com> wrote: > Hi Abhijeet, > > > > Thanks for your response. > > If values that don't fit in double are getting inserted as NULL, then the > counts should not mismatch in either case. > > Here the null values inserted are extra values, in addition to the values > already present in both the Oracle table and the Hive table. > > > > Correct me if I am wrong in my interpretation. > > > > Thanks and Regards, > > Ajit Kumar Shreevastava > > > > From: abhijeet gaikwad [mailto:abygaikwa...@gmail.com] > Sent: Wednesday, March 06, 2013 1:46 PM > To: user@hive.apache.org > Cc: u...@sqoop.apache.org > Subject: Re: Data mismatch when importing data from Oracle to Hive through > Sqoop without an error > > > > Sqoop maps numeric and decimal types (RDBMS) to double (Hive). I think the > values that don't fit in double must be getting inserted as NULL. > You can see this warning in your logs. > > Thanks, > Abhijeet > > On Wed, Mar 6, 2013 at 1:32 PM, Ajit Kumar Shreevastava > <ajit.shreevast...@hcl.com> wrote: > > Hi all, > > I have noticed one interesting thing in the below result set. 
> > I have fired one query in both Oracle and Hive shell and found the following > result set: > > > > SQL> select count(1) from bttn > > 2 where bttn_id is null or data_inst_id is null or scr_id is null; > > > > COUNT(1) > > ---------- > > 0 > > hive> select count(1) from bttn > > > where bttn_id is null or data_inst_id is null or scr_id is null; > > Total MapReduce jobs = 1 > > Launching Job 1 out of 1 > > Number of reduce tasks determined at compile time: 1 > > In order to change the average load for a reducer (in bytes): > > set hive.exec.reducers.bytes.per.reducer=<number> > > In order to limit the maximum number of reducers: > > set hive.exec.reducers.max=<number> > > In order to set a constant number of reducers: > > set mapred.reduce.tasks=<number> > > Starting Job = job_201303051835_0020, Tracking URL = > http://NHCLT-PC44-2:50030/jobdetails.jsp?jobid=job_201303051835_0020 > > Kill Command = /home/hadoop/hadoop-1.0.3/bin/hadoop job -kill > job_201303051835_0020 > > Hadoop job information for Stage-1: number of mappers: 1; number of > reducers: 1 > > 2013-03-06 13:22:56,908 Stage-1 map = 0%, reduce = 0% > > 2013-03-06 13:23:05,928 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 5.2 > sec > > 2013-03-06 13:23:06,931 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 5.2 > sec > > 2013-03-06 13:23:07,934 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 5.2 > sec > > 2013-03-06 13:23:08,938 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 5.2 > sec > > 2013-03-06 13:23:09,941 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 5.2 > sec > > 2013-03-06 13:23:10,944 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 5.2 > sec > > 2013-03-06 13:23:11,947 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 5.2 > sec > > 2013-03-06 13:23:12,956 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 5.2 > sec > > 2013-03-06 13:23:13,959 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 5.2 > sec > > 2013-03-06 13:23:14,962 Stage-1 map = 100%, reduce = 33%, Cumulative CPU > 5.2 sec > > 
2013-03-06 13:23:15,965 Stage-1 map = 100%, reduce = 33%, Cumulative CPU > 5.2 sec > > 2013-03-06 13:23:16,969 Stage-1 map = 100%, reduce = 33%, Cumulative CPU > 5.2 sec > > 2013-03-06 13:23:17,974 Stage-1 map = 100%, reduce = 100%, Cumulative CPU > 6.95 sec > > 2013-03-06 13:23:18,977 Stage-1 map = 100%, reduce = 100%, Cumulative CPU > 6.95 sec > > 2013-03-06 13:23:19,981 Stage-1 map = 100%, reduce = 100%, Cumulative CPU > 6.95 sec > > 2013-03-06 13:23:20,985 Stage-1 map = 100%, reduce = 100%, Cumulative CPU > 6.95 sec > > 2013-03-06 13:23:21,988 Stage-1 map = 100%, reduce = 100%, Cumulative CPU > 6.95 sec > > 2013-03-06 13:23:22,995 Stage-1 map = 100%, reduce = 100%, Cumulative CPU > 6.95 sec > > 2013-03-06 13:23:23,998 Stage-1 map = 100%, reduce = 100%, Cumulative CPU > 6.95 sec > > MapReduce Total cumulative CPU time: 6 seconds 950 msec > > Ended Job = job_201303051835_0020 > > MapReduce Jobs Launched: > > Job 0: Map: 1 Reduce: 1 Cumulative CPU: 6.95 sec HDFS Read: 184270926 > HDFS Write: 4 SUCCESS > > Total MapReduce CPU Time Spent: 6 seconds 950 msec > > OK > > 986 > > Time taken: 35.983 seconds > > hive> > > > > and 739169 - 738183 = 986 > > > > Can anyone tell me why this happened, as BTTN_ID, DATA_INST_ID and SCR_ID > have NOT NULL constraints on the BTTN table and also form its composite > primary key? > > Also, how can I prevent this unnecessary data generation in the Hive > table? 
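One likely explanation for rows that violate the source table's NOT NULL constraints (an assumption on my part; the thread does not confirm which column is responsible) is a record-delimiter collision: Sqoop's Hive import writes delimited text with fields terminated by \001 and lines by \012, so a free-text column containing an embedded newline splits one exported record into two. A minimal Python sketch with made-up column values:

```python
# Hive's default text layout for Sqoop imports: fields end with \001,
# records end with \n (\012).
FIELD_SEP = "\001"
LINE_SEP = "\n"

# Hypothetical BTTN row (BTTN_ID, BTTN_ASGN_LVL_NAME, SCR_ID) whose
# free-text name column happens to contain a newline.
row = ["101", "Press\nOK", "5"]

line = FIELD_SEP.join(row) + LINE_SEP
# What a newline-delimited reader sees when it parses the file back:
records = [r.split(FIELD_SEP) for r in line.split(LINE_SEP) if r]

print(len(records))   # 2 -- one source row became two Hive rows
print(records[1])     # ['OK', '5'] -- "OK" lands in the BTTN_ID slot;
                      # cast to DOUBLE it becomes NULL, hence NULL keys
```

If this is the cause, re-importing with Sqoop's --hive-drop-import-delims option (or --hive-delims-replacement), which strips \n, \r and \001 from string fields during import, should make the Oracle and Hive counts match.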
> > > > Regards > > Ajit Kumar Shreevastava > > > > From: Ajit Kumar Shreevastava > Sent: Wednesday, March 06, 2013 12:40 PM > To: 'u...@sqoop.apache.org' > Cc: user@hive.apache.org > Subject: Data mismatch when importing data from Oracle to Hive through Sqoop > without an error > > > > Hi, > > > > I have a table BTTN in an Oracle database having 738183 records: > > > > SQL> select count(1) from bttn; > > > > COUNT(1) > > ---------- > > 738183 > > > > Now I want to import this table to HIVE and I have fired the following > command at the command prompt: > > > > [hadoop@NHCLT-PC44-2 sqoop-oper]$ sqoop import --connect > jdbc:oracle:thin:@10.99.42.11:1521/clouddb --username HDFSUSER --table BTTN > --verbose -P --hive-table bttn --create-hive-table --hive-import --hive-home > /home/hadoop/user/hive/warehouse > > The output is as follows: > > Warning: /usr/lib/hbase does not exist! HBase imports will fail. > > Please set $HBASE_HOME to the root of your HBase installation. > > 13/03/06 12:16:25 DEBUG tool.BaseSqoopTool: Enabled debug logging. > > Enter password: > > 13/03/06 12:16:32 INFO tool.BaseSqoopTool: Using Hive-specific delimiters > for output. You can override > > 13/03/06 12:16:32 INFO tool.BaseSqoopTool: delimiters with > --fields-terminated-by, etc. > > 13/03/06 12:16:32 DEBUG sqoop.ConnFactory: Loaded manager factory: > com.cloudera.sqoop.manager.DefaultManagerFactory > > 13/03/06 12:16:32 DEBUG sqoop.ConnFactory: Trying ManagerFactory: > com.cloudera.sqoop.manager.DefaultManagerFactory > > 13/03/06 12:16:32 DEBUG manager.DefaultManagerFactory: Trying with scheme: > jdbc:oracle:thin:@10.99.42.11 > > 13/03/06 12:16:32 DEBUG manager.OracleManager$ConnCache: Instantiated new > connection cache. 
> > 13/03/06 12:16:32 INFO manager.SqlManager: Using default fetchSize of 1000 > > 13/03/06 12:16:32 DEBUG sqoop.ConnFactory: Instantiated ConnManager > org.apache.sqoop.manager.OracleManager@2393385d > > 13/03/06 12:16:32 INFO tool.CodeGenTool: Beginning code generation > > 13/03/06 12:16:32 DEBUG manager.OracleManager: Using column names query: > SELECT t.* FROM BTTN t WHERE 1=0 > > 13/03/06 12:16:32 DEBUG manager.OracleManager: Creating a new connection for > jdbc:oracle:thin:@10.99.42.11:1521/clouddb, using username: HDFSUSER > > 13/03/06 12:16:32 DEBUG manager.OracleManager: No connection paramenters > specified. Using regular API for making connection. > > 13/03/06 12:16:32 INFO manager.OracleManager: Time zone has been set to GMT > > 13/03/06 12:16:32 DEBUG manager.SqlManager: Using fetchSize for next query: > 1000 > > 13/03/06 12:16:32 INFO manager.SqlManager: Executing SQL statement: SELECT > t.* FROM BTTN t WHERE 1=0 > > 13/03/06 12:16:32 DEBUG manager.OracleManager$ConnCache: Caching released > connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: selected columns: > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: BTTN_ID > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: DATA_INST_ID > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: SCR_ID > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: BTTN_NU > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: CAT > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: WDTH > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: HGHT > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: KEY_SCAN > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: KEY_SHFT > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: FRGND_CPTN_COLR > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: FRGND_CPTN_COLR_PRSD > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: BKGD_CPTN_COLR > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: BKGD_CPTN_COLR_PRSD > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: BLM_FL > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: LCLZ_FL > > 13/03/06 
12:16:32 DEBUG orm.ClassWriter: MENU_ITEM_NU > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: BTTN_ASGN_LVL_ID > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: ON_ATVT > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: ON_CLIK > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: ENBL_FL > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: BLM_SET_ID > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: BTTN_ASGN_LVL_NAME > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: MKT_ID > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: CRTE_TS > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: CRTE_USER_ID > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: UPDT_TS > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: UPDT_USER_ID > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: DEL_TS > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: DEL_USER_ID > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: DLTD_FL > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: MENU_ITEM_NA > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: PRD_CD > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: BLM_SET_NA > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: SOUND_FILE_ID > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: IS_DYNMC_BTTN > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: FRGND_CPTN_COLR_ID > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: FRGND_CPTN_COLR_PRSD_ID > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: BKGD_CPTN_COLR_ID > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: BKGD_CPTN_COLR_PRSD_ID > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: Writing source file: > /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.java > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: Table name: BTTN > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: Columns: BTTN_ID:2, DATA_INST_ID:2, > SCR_ID:2, BTTN_NU:2, CAT:2, WDTH:2, HGHT:2, KEY_SCAN:2, KEY_SHFT:2, > FRGND_CPTN_COLR:12, FRGND_CPTN_COLR_PRSD:12, BKGD_CPTN_COLR:12, > BKGD_CPTN_COLR_PRSD:12, BLM_FL:2, LCLZ_FL:2, MENU_ITEM_NU:2, > BTTN_ASGN_LVL_ID:2, ON_ATVT:2, ON_CLIK:2, ENBL_FL:2, BLM_SET_ID:2, > BTTN_ASGN_LVL_NAME:12, MKT_ID:2, CRTE_TS:93, CRTE_USER_ID:12, UPDT_TS:93, > 
UPDT_USER_ID:12, DEL_TS:93, DEL_USER_ID:12, DLTD_FL:2, MENU_ITEM_NA:12, > PRD_CD:2, BLM_SET_NA:12, SOUND_FILE_ID:2, IS_DYNMC_BTTN:2, > FRGND_CPTN_COLR_ID:2, FRGND_CPTN_COLR_PRSD_ID:2, BKGD_CPTN_COLR_ID:2, > BKGD_CPTN_COLR_PRSD_ID:2, > > 13/03/06 12:16:32 DEBUG orm.ClassWriter: sourceFilename is BTTN.java > > 13/03/06 12:16:32 DEBUG orm.CompilationManager: Found existing > /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/ > > 13/03/06 12:16:32 INFO orm.CompilationManager: HADOOP_HOME is > /home/hadoop/hadoop-1.0.3/libexec/.. > > 13/03/06 12:16:32 DEBUG orm.CompilationManager: Adding source file: > /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.java > > 13/03/06 12:16:32 DEBUG orm.CompilationManager: Invoking javac with args: > > 13/03/06 12:16:32 DEBUG orm.CompilationManager: -sourcepath > > 13/03/06 12:16:32 DEBUG orm.CompilationManager: > /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/ > > 13/03/06 12:16:32 DEBUG orm.CompilationManager: -d > > 13/03/06 12:16:32 DEBUG orm.CompilationManager: > /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/ > > 13/03/06 12:16:32 DEBUG orm.CompilationManager: -classpath > > 13/03/06 12:16:32 DEBUG orm.CompilationManager: > 
/home/hadoop/hadoop-1.0.3/libexec/../conf:/usr/java/jdk1.6.0_32/lib/tools.jar:/home/hadoop/hadoop-1.0.3/libexec/..:/home/hadoop/hadoop-1.0.3/libexec/../hadoop-core-1.0.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/asm-3.2.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/aspectjrt-1.6.5.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/aspectjtools-1.6.5.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-beanutils-1.7.0.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-beanutils-core-1.8.0.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-cli-1.2.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-codec-1.4.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-collections-3.2.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-configuration-1.6.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-daemon-1.0.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-digester-1.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-el-1.0.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-httpclient-3.0.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-io-2.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-lang-2.4.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-logging-1.1.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-logging-api-1.0.4.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-math-2.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-net-1.4.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/core-3.1.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/hadoop-capacity-scheduler-1.0.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/hadoop-fairscheduler-1.0.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/hadoop-thriftfs-1.0.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/hsqldb-1.8.0.10.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jackson-core-asl-1.8.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jackson-mapper-asl-1.8.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jasper-compiler-5.5.12.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jasper-runti
me-5.5.12.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jdeb-0.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jersey-core-1.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jersey-json-1.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jersey-server-1.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jets3t-0.6.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jetty-6.1.26.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jetty-util-6.1.26.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jsch-0.1.42.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/junit-4.5.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/kfs-0.2.2.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/log4j-1.2.15.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/mockito-all-1.8.5.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/oro-2.0.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/servlet-api-2.5-20081211.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/slf4j-api-1.4.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/slf4j-log4j12-1.4.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/xmlenc-0.52.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jsp-2.1/jsp-2.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jsp-2.1/jsp-api-2.1.jar:/home/hadoop/sqoop/conf::/home/hadoop/sqoop/lib/ant-contrib-1.0b3.jar:/home/hadoop/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar:/home/hadoop/sqoop/lib/avro-1.5.3.jar:/home/hadoop/sqoop/lib/avro-ipc-1.5.3.jar:/home/hadoop/sqoop/lib/avro-mapred-1.5.3.jar:/home/hadoop/sqoop/lib/commons-io-1.4.jar:/home/hadoop/sqoop/lib/hsqldb-1.8.0.10.jar:/home/hadoop/sqoop/lib/jackson-core-asl-1.7.3.jar:/home/hadoop/sqoop/lib/jackson-mapper-asl-1.7.3.jar:/home/hadoop/sqoop/lib/jopt-simple-3.2.jar:/home/hadoop/sqoop/lib/ojdbc6.jar:/home/hadoop/sqoop/lib/paranamer-2.3.jar:/home/hadoop/sqoop/lib/snappy-java-1.0.3.2.jar:/home/hadoop/sqoop/sqoop-1.4.2.jar:/home/hadoop/sqoop/sqoop-test-1.4.2.jar::/home/hadoop/hadoop-1.0.3/hadoop-core-1.0.3.jar:/home/hadoop/sqoop/sqoop-1.4.2.jar > > Note: /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.java > uses or 
overrides a deprecated API. > > Note: Recompile with -Xlint:deprecation for details. > > 13/03/06 12:16:34 INFO orm.CompilationManager: Writing jar file: > /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.jar > > 13/03/06 12:16:34 DEBUG orm.CompilationManager: Scanning for .class files in > directory: /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e > > 13/03/06 12:16:34 DEBUG orm.CompilationManager: Got classfile: > /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.class -> > BTTN.class > > 13/03/06 12:16:34 DEBUG orm.CompilationManager: Finished writing jar file > /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.jar > > 13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Got cached > connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER > > 13/03/06 12:16:34 INFO manager.OracleManager: Time zone has been set to GMT > > 13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Caching released > connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER > > 13/03/06 12:16:34 WARN manager.OracleManager: The table BTTN contains a > multi-column primary key. Sqoop will default to the column BTTN_ID only for > this job. > > 13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Got cached > connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER > > 13/03/06 12:16:34 INFO manager.OracleManager: Time zone has been set to GMT > > 13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Caching released > connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER > > 13/03/06 12:16:34 WARN manager.OracleManager: The table BTTN contains a > multi-column primary key. Sqoop will default to the column BTTN_ID only for > this job. 
> > 13/03/06 12:16:34 INFO mapreduce.ImportJobBase: Beginning import of BTTN > > 13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Got cached > connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER > > 13/03/06 12:16:34 INFO manager.OracleManager: Time zone has been set to GMT > > 13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Caching released > connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER > > 13/03/06 12:16:34 DEBUG mapreduce.DataDrivenImportJob: Using table class: > BTTN > > 13/03/06 12:16:34 DEBUG mapreduce.DataDrivenImportJob: Using InputFormat: > class com.cloudera.sqoop.mapreduce.db.OracleDataDrivenDBInputFormat > > 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/sqoop-1.4.2.jar > > 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/lib/ojdbc6.jar > > 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/sqoop-1.4.2.jar > > 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/sqoop-1.4.2.jar > > 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/lib/jackson-mapper-asl-1.7.3.jar > > 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/lib/hsqldb-1.8.0.10.jar > > 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/lib/avro-ipc-1.5.3.jar > > 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/lib/jopt-simple-3.2.jar > > 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/lib/ojdbc6.jar > > 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/lib/jackson-core-asl-1.7.3.jar > > 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/lib/ant-contrib-1.0b3.jar > > 
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar > > 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/lib/snappy-java-1.0.3.2.jar > > 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/lib/paranamer-2.3.jar > > 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/lib/avro-1.5.3.jar > > 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/lib/commons-io-1.4.jar > > 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: > file:/home/hadoop/sqoop/lib/avro-mapred-1.5.3.jar > > 13/03/06 12:16:35 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT > MIN(BTTN_ID), MAX(BTTN_ID) FROM BTTN > > 13/03/06 12:16:35 INFO mapred.JobClient: Running job: job_201303051835_0014 > > 13/03/06 12:16:36 INFO mapred.JobClient: map 0% reduce 0% > > 13/03/06 12:16:52 INFO mapred.JobClient: map 25% reduce 0% > > 13/03/06 12:17:01 INFO mapred.JobClient: map 75% reduce 0% > > 13/03/06 12:17:13 INFO mapred.JobClient: map 100% reduce 0% > > 13/03/06 12:17:19 INFO mapred.JobClient: Job complete: job_201303051835_0014 > > 13/03/06 12:17:19 INFO mapred.JobClient: Counters: 18 > > 13/03/06 12:17:19 INFO mapred.JobClient: Job Counters > > 13/03/06 12:17:19 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=71591 > > 13/03/06 12:17:19 INFO mapred.JobClient: Total time spent by all reduces > waiting after reserving slots (ms)=0 > > 13/03/06 12:17:19 INFO mapred.JobClient: Total time spent by all maps > waiting after reserving slots (ms)=0 > > 13/03/06 12:17:19 INFO mapred.JobClient: Launched map tasks=4 > > 13/03/06 12:17:19 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0 > > 13/03/06 12:17:19 INFO mapred.JobClient: File Output Format Counters > > 13/03/06 12:17:19 INFO mapred.JobClient: Bytes Written=184266237 > > 13/03/06 12:17:19 INFO 
mapred.JobClient: FileSystemCounters > > 13/03/06 12:17:19 INFO mapred.JobClient: HDFS_BYTES_READ=454 > > 13/03/06 12:17:19 INFO mapred.JobClient: FILE_BYTES_WRITTEN=117870 > > 13/03/06 12:17:19 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=184266237 > > 13/03/06 12:17:19 INFO mapred.JobClient: File Input Format Counters > > 13/03/06 12:17:19 INFO mapred.JobClient: Bytes Read=0 > > 13/03/06 12:17:19 INFO mapred.JobClient: Map-Reduce Framework > > 13/03/06 12:17:19 INFO mapred.JobClient: Map input records=738183 > > 13/03/06 12:17:19 INFO mapred.JobClient: Physical memory (bytes) > snapshot=735293440 > > 13/03/06 12:17:19 INFO mapred.JobClient: Spilled Records=0 > > 13/03/06 12:17:19 INFO mapred.JobClient: CPU time spent (ms)=28950 > > 13/03/06 12:17:19 INFO mapred.JobClient: Total committed heap usage > (bytes)=581435392 > > 13/03/06 12:17:19 INFO mapred.JobClient: Virtual memory (bytes) > snapshot=6086893568 > > 13/03/06 12:17:19 INFO mapred.JobClient: Map output records=738183 > > 13/03/06 12:17:19 INFO mapred.JobClient: SPLIT_RAW_BYTES=454 > > 13/03/06 12:17:19 INFO mapreduce.ImportJobBase: Transferred 175.73 MB in > 45.4959 seconds (3.8625 MB/sec) > > 13/03/06 12:17:19 INFO mapreduce.ImportJobBase: Retrieved 738183 records. 
> > 13/03/06 12:17:19 DEBUG hive.HiveImport: Hive.inputTable: BTTN > > 13/03/06 12:17:19 DEBUG hive.HiveImport: Hive.outputTable: bttn > > 13/03/06 12:17:19 DEBUG manager.OracleManager: Using column names query: > SELECT t.* FROM BTTN t WHERE 1=0 > > 13/03/06 12:17:19 DEBUG manager.OracleManager$ConnCache: Got cached > connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER > > 13/03/06 12:17:19 INFO manager.OracleManager: Time zone has been set to GMT > > 13/03/06 12:17:19 DEBUG manager.SqlManager: Using fetchSize for next query: > 1000 > > 13/03/06 12:17:19 INFO manager.SqlManager: Executing SQL statement: SELECT > t.* FROM BTTN t WHERE 1=0 > > 13/03/06 12:17:19 DEBUG manager.OracleManager$ConnCache: Caching released > connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column BTTN_ID had to be cast to > a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column DATA_INST_ID had to be > cast to a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column SCR_ID had to be cast to > a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column BTTN_NU had to be cast to > a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column CAT had to be cast to a > less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column WDTH had to be cast to a > less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column HGHT had to be cast to a > less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column KEY_SCAN had to be cast > to a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column KEY_SHFT had to be cast > to a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column BLM_FL had to be cast to > a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column LCLZ_FL had to be cast to > a 
less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column MENU_ITEM_NU had to be > cast to a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column BTTN_ASGN_LVL_ID had to > be cast to a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column ON_ATVT had to be cast to > a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column ON_CLIK had to be cast to > a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column ENBL_FL had to be cast to > a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column BLM_SET_ID had to be cast > to a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column MKT_ID had to be cast to > a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column CRTE_TS had to be cast to > a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column UPDT_TS had to be cast to > a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column DEL_TS had to be cast to > a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column DLTD_FL had to be cast to > a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column PRD_CD had to be cast to > a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column SOUND_FILE_ID had to be > cast to a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column IS_DYNMC_BTTN had to be > cast to a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column FRGND_CPTN_COLR_ID had to > be cast to a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column FRGND_CPTN_COLR_PRSD_ID > had to be cast to a less precise type in Hive > > 13/03/06 12:17:19 WARN hive.TableDefWriter: Column BKGD_CPTN_COLR_ID had to > be cast to a less precise type in Hive > > 13/03/06 12:17:19 
WARN hive.TableDefWriter: Column BKGD_CPTN_COLR_PRSD_ID > had to be cast to a less precise type in Hive > > 13/03/06 12:17:19 DEBUG hive.TableDefWriter: Create statement: CREATE TABLE > `bttn` ( `BTTN_ID` DOUBLE, `DATA_INST_ID` DOUBLE, `SCR_ID` DOUBLE, `BTTN_NU` > DOUBLE, `CAT` DOUBLE, `WDTH` DOUBLE, `HGHT` DOUBLE, `KEY_SCAN` DOUBLE, > `KEY_SHFT` DOUBLE, `FRGND_CPTN_COLR` STRING, `FRGND_CPTN_COLR_PRSD` STRING, > `BKGD_CPTN_COLR` STRING, `BKGD_CPTN_COLR_PRSD` STRING, `BLM_FL` DOUBLE, > `LCLZ_FL` DOUBLE, `MENU_ITEM_NU` DOUBLE, `BTTN_ASGN_LVL_ID` DOUBLE, > `ON_ATVT` DOUBLE, `ON_CLIK` DOUBLE, `ENBL_FL` DOUBLE, `BLM_SET_ID` DOUBLE, > `BTTN_ASGN_LVL_NAME` STRING, `MKT_ID` DOUBLE, `CRTE_TS` STRING, > `CRTE_USER_ID` STRING, `UPDT_TS` STRING, `UPDT_USER_ID` STRING, `DEL_TS` > STRING, `DEL_USER_ID` STRING, `DLTD_FL` DOUBLE, `MENU_ITEM_NA` STRING, > `PRD_CD` DOUBLE, `BLM_SET_NA` STRING, `SOUND_FILE_ID` DOUBLE, > `IS_DYNMC_BTTN` DOUBLE, `FRGND_CPTN_COLR_ID` DOUBLE, > `FRGND_CPTN_COLR_PRSD_ID` DOUBLE, `BKGD_CPTN_COLR_ID` DOUBLE, > `BKGD_CPTN_COLR_PRSD_ID` DOUBLE) COMMENT 'Imported by sqoop on 2013/03/06 > 12:17:19' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES TERMINATED > BY '\012' STORED AS TEXTFILE > > 13/03/06 12:17:19 DEBUG hive.TableDefWriter: Load statement: LOAD DATA > INPATH 'hdfs://NHCLT-PC44-2:8020/user/hadoop/BTTN' INTO TABLE `bttn` > > 13/03/06 12:17:19 INFO hive.HiveImport: Removing temporary files from import > process: hdfs://NHCLT-PC44-2:8020/user/hadoop/BTTN/_logs > > 13/03/06 12:17:19 INFO hive.HiveImport: Loading uploaded data into Hive > > 13/03/06 12:17:19 DEBUG hive.HiveImport: Using external Hive process. 
> > 13/03/06 12:17:21 INFO hive.HiveImport: Logging initialized using > configuration in file:/home/hadoop/Hive/conf/hive-log4j.properties > > 13/03/06 12:17:21 INFO hive.HiveImport: Hive history > file=/home/hadoop/tmp/hadoop/hive_job_log_hadoop_201303061217_790206596.txt > > 13/03/06 12:17:32 INFO hive.HiveImport: OK > > 13/03/06 12:17:32 INFO hive.HiveImport: Time taken: 11.345 seconds > > 13/03/06 12:17:32 INFO hive.HiveImport: Loading data to table default.bttn > > 13/03/06 12:17:33 INFO hive.HiveImport: Table default.bttn stats: > [num_partitions: 0, num_files: 5, num_rows: 0, total_size: 184266237, > raw_data_size: 0] > > 13/03/06 12:17:33 INFO hive.HiveImport: OK > > 13/03/06 12:17:33 INFO hive.HiveImport: Time taken: 0.36 seconds > > 13/03/06 12:17:33 INFO hive.HiveImport: Hive import complete. > > 13/03/06 12:17:33 INFO hive.HiveImport: Export directory is empty, removing > it. > > [hadoop@NHCLT-PC44-2 sqoop-oper]$ > > > > Now I have fired the following query at the hive prompt: > > > > hive> select count(1) from bttn; > > Total MapReduce jobs = 1 > > Launching Job 1 out of 1 > > Number of reduce tasks determined at compile time: 1 > > In order to change the average load for a reducer (in bytes): > > set hive.exec.reducers.bytes.per.reducer=<number> > > In order to limit the maximum number of reducers: > > set hive.exec.reducers.max=<number> > > In order to set a constant number of reducers: > > set mapred.reduce.tasks=<number> > > Starting Job = job_201303051835_0016, Tracking URL = > http://NHCLT-PC44-2:50030/jobdetails.jsp?jobid=job_201303051835_0016 > > Kill Command = /home/hadoop/hadoop-1.0.3/bin/hadoop job -kill > job_201303051835_0016 > > Hadoop job information for Stage-1: number of mappers: 1; number of > reducers: 1 > > 2013-03-06 12:21:45,350 Stage-1 map = 0%, reduce = 0% > > 2013-03-06 12:21:54,370 Stage-1 map = 76%, reduce = 0% > > 2013-03-06 12:21:57,378 Stage-1 map = 100%, reduce = 0%, Cumulative CPU > 3.68 sec > > 2013-03-06 12:21:58,381 
Stage-1 map = 100%, reduce = 0%, Cumulative CPU > 3.68 sec > > 2013-03-06 12:21:59,385 Stage-1 map = 100%, reduce = 0%, Cumulative CPU > 3.68 sec > > 2013-03-06 12:22:00,388 Stage-1 map = 100%, reduce = 0%, Cumulative CPU > 3.68 sec > > 2013-03-06 12:22:01,391 Stage-1 map = 100%, reduce = 0%, Cumulative CPU > 3.68 sec > > 2013-03-06 12:22:02,394 Stage-1 map = 100%, reduce = 0%, Cumulative CPU > 3.68 sec > > 2013-03-06 12:22:03,397 Stage-1 map = 100%, reduce = 0%, Cumulative CPU > 3.68 sec > > 2013-03-06 12:22:04,401 Stage-1 map = 100%, reduce = 0%, Cumulative CPU > 3.68 sec > > 2013-03-06 12:22:05,404 Stage-1 map = 100%, reduce = 0%, Cumulative CPU > 3.68 sec > > 2013-03-06 12:22:06,407 Stage-1 map = 100%, reduce = 0%, Cumulative CPU > 3.68 sec > > 2013-03-06 12:22:07,410 Stage-1 map = 100%, reduce = 0%, Cumulative CPU > 3.68 sec > > 2013-03-06 12:22:08,415 Stage-1 map = 100%, reduce = 0%, Cumulative CPU > 3.68 sec > > 2013-03-06 12:22:09,418 Stage-1 map = 100%, reduce = 100%, Cumulative CPU > 7.37 sec > > 2013-03-06 12:22:10,421 Stage-1 map = 100%, reduce = 100%, Cumulative CPU > 7.37 sec > > 2013-03-06 12:22:11,425 Stage-1 map = 100%, reduce = 100%, Cumulative CPU > 7.37 sec > > 2013-03-06 12:22:12,428 Stage-1 map = 100%, reduce = 100%, Cumulative CPU > 7.37 sec > > 2013-03-06 12:22:13,432 Stage-1 map = 100%, reduce = 100%, Cumulative CPU > 7.37 sec > > 2013-03-06 12:22:14,435 Stage-1 map = 100%, reduce = 100%, Cumulative CPU > 7.37 sec > > 2013-03-06 12:22:15,439 Stage-1 map = 100%, reduce = 100%, Cumulative CPU > 7.37 sec > > MapReduce Total cumulative CPU time: 7 seconds 370 msec > > Ended Job = job_201303051835_0016 > > MapReduce Jobs Launched: > > Job 0: Map: 1 Reduce: 1 Cumulative CPU: 7.37 sec HDFS Read: 184270926 > HDFS Write: 7 SUCCESS > > Total MapReduce CPU Time Spent: 7 seconds 370 msec > > OK > > 739169 > > Time taken: 37.639 seconds > > hive> > > > > The data present in oracle database is 738183 but imported into HIVE is > 739169. 
> > Can you please let me know the issue? > > I have already pasted my entire screen log here. > > > > Thanks and Regards, > > Ajit Kumar Shreevastava
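Separately from the row-count mismatch, the "had to be cast to a less precise type" warnings in the log above are worth acting on: Oracle NUMBER holds up to 38 significant decimal digits exactly, while Hive's DOUBLE is an IEEE-754 binary64 with roughly 15-16. A quick Python check (the ID value is made up for illustration) of the silent corruption this mapping allows:

```python
# A value that fits in Oracle NUMBER(38) exactly but exceeds the
# 53-bit significand of a 64-bit double.
big_id = 12345678901234567890

as_double = float(big_id)          # what a Hive DOUBLE column would hold
print(int(as_double) == big_id)    # False: the key came back changed
print(int(as_double))              # a nearby, but different, integer
```

Sqoop's --map-column-hive option lets you override the default type mapping per column, e.g. --map-column-hive BTTN_ID=BIGINT,DATA_INST_ID=BIGINT,SCR_ID=BIGINT, so the key columns keep exact integer semantics in Hive.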