Dmitry Zagorulkin created SQOOP-3123:
----------------------------------------

             Summary: Import from oracle using oraoop with map-column-java to 
avro fails if special characters encounter in table name or column name 
                 Key: SQOOP-3123
                 URL: https://issues.apache.org/jira/browse/SQOOP-3123
             Project: Sqoop
          Issue Type: Bug
          Components: connectors/oracle
    Affects Versions: 1.4.6, 1.4.7
            Reporter: Dmitry Zagorulkin


I'm trying to import data from Oracle to Avro using oraoop.

My table:

{code}
CREATE TABLE "IBS"."BRITISH#CATS"
(    "ID" NUMBER,
     "C_CODE" VARCHAR2(10),
     "C_USE_START#DATE" DATE,
     "C_USE_USE#NEXT_DAY" VARCHAR2(1),
     "C_LIM_MIN#DAT" DATE,
     "C_LIM_MIN#TIME" TIMESTAMP,
     "C_LIM_MIN#SUM" NUMBER,
     "C_OWNCODE" VARCHAR2(1),
     "C_LIMIT#SUM_LIMIT" NUMBER(17,2),
     "C_L@M" NUMBER(17,2),
     "C_1_THROW" NUMBER NOT NULL ENABLE,
     "C_#_LIMITS" NUMBER NOT NULL ENABLE
) SEGMENT CREATION IMMEDIATE
PCTFREE 70 PCTUSED 40 INITRANS 2 MAXTRANS 255
NOCOMPRESS LOGGING
STORAGE(INITIAL 2097152 NEXT 524288 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "WORK" ;
{code}

My first script is:
{code}
./sqoop import \
  -Doraoop.timestamp.string=false \
  --direct \
  --connect jdbc:oracle:thin:@localhost:49161:XE \
  --username system \
  --password oracle \
  --table IBS.BRITISH#CATS \
  --target-dir /Users/Dmitry/Developer/Java/sqoop/bin/imported \
  --as-avrodatafile \
  --map-column-java ID=String,C_CODE=String,C_USE_START#DATE=String,C_USE_USE#NEXT_DAY=String,C_LIM_MIN#DAT=String,C_LIM_MIN#TIME=String,C_LIM_MIN#SUM=String,C_OWNCODE=String,C_LIMIT#SUM_LIMIT=String,C_L_M=String,C_1_THROW=String,C_#_LIMITS=String
{code}

It fails with:

{code}
2017-01-13 16:11:21,348 ERROR [main] tool.ImportTool (ImportTool.java:run(625)) - Import failed: No column by the name C_LIMIT#SUM_LIMITfound while importing data; expecting one of [C_LIMIT_SUM_LIMIT, C_OWNCODE, C_L_M, C___LIMITS, C_LIM_MIN_DAT, C_1_THROW, C_CODE, C_USE_START_DATE, C_LIM_MIN_SUM, ID, C_LIM_MIN_TIME, C_USE_USE_NEXT_DAY]
{code}
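
The names in the "expecting one of" list look like the original column names with every character other than letters, digits, and underscores replaced by "_". As a rough sketch under that assumption (this only mimics the names in the error message and is not Sqoop's actual sanitization code; sanitize_column and build_mapping are hypothetical helper names), a --map-column-java value with the expected names could be generated like this:

```shell
# Hypothetical helper: replace every character outside [A-Za-z0-9_] with "_".
# Assumption: this matches the names listed in the error message above;
# it is not taken from Sqoop's source.
sanitize_column() {
  printf '%s' "$1" | sed 's/[^A-Za-z0-9_]/_/g'
}

# Hypothetical helper: build a --map-column-java value that maps every
# given column (after sanitizing its name) to String.
build_mapping() {
  mapping=""
  for col in "$@"; do
    name=$(sanitize_column "$col")
    # Append with a comma separator, except before the first entry.
    mapping="${mapping:+$mapping,}$name=String"
  done
  printf '%s\n' "$mapping"
}

# Example with a few of the columns from the table definition.
build_mapping 'ID' 'C_USE_START#DATE' 'C_L@M' 'C_#_LIMITS'
```

The result can be pasted after --map-column-java. Single quotes around the original names keep the shell from interpreting "#" and "@".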

After that, I found that Sqoop had replaced all special characters with
underscores. My second script is:

{code}
./sqoop import \
  -D oraoop.timestamp.string=false \
  --direct \
  --connect jdbc:oracle:thin:@localhost:49161:XE \
  --username system \
  --password oracle \
  --table IBS.BRITISH#CATS \
  --target-dir /Users/Dmitry/Developer/Java/sqoop/bin/imported \
  --as-avrodatafile \
  --map-column-java ID=String,C_CODE=String,C_USE_START_DATE=String,C_USE_USE_NEXT_DAY=String,C_LIM_MIN_DAT=String,C_LIM_MIN_TIME=String,C_LIM_MIN_SUM=String,C_OWNCODE=String,C_LIMIT_SUM_LIMIT=String,C_L_M=String,C_1_THROW=String,C___LIMITS=String \
  --verbose
{code}

This one fails with: Caused by: org.apache.avro.UnresolvedUnionException: Not in union
["null","long"]: 2017-01-13 11:22:53.0

{code}
2017-01-13 16:14:54,687 WARN  [Thread-26] mapred.LocalJobRunner (LocalJobRunner.java:run(560)) - job_local1372531461_0001
java.lang.Exception: org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: Not in union ["null","long"]: 2017-01-13 11:22:53.0
        at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: Not in union ["null","long"]: 2017-01-13 11:22:53.0
        at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:308)
        at org.apache.sqoop.mapreduce.AvroOutputFormat$1.write(AvroOutputFormat.java:112)
        at org.apache.sqoop.mapreduce.AvroOutputFormat$1.write(AvroOutputFormat.java:108)
        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:655)
        at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
        at org.apache.sqoop.mapreduce.AvroImportMapper.map(AvroImportMapper.java:73)
        at org.apache.sqoop.mapreduce.AvroImportMapper.map(AvroImportMapper.java:39)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.avro.UnresolvedUnionException: Not in union ["null","long"]: 2017-01-13 11:22:53.0
        at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:709)
        at org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:192)
        at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:110)
        at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
        at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:150)
        at org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:153)
        at org.apache.avro.specific.SpecificDatumWriter.writeField(SpecificDatumWriter.java:90)
        at org.apache.avro.reflect.ReflectDatumWriter.writeField(ReflectDatumWriter.java:182)
        at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143)
        at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105)
        at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
        at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:150)
        at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60)
        at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:302)
        ... 17 more
{code}

I found that this is an old, known problem and that "oraoop.timestamp.string=false"
is supposed to solve it, but it does not.

What do you think?
Also, please assign this issue to me.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
