CLOB data not imported into HBase from Oracle

Michal Taborsky Wed, 12 Jun 2013 14:12:59 -0700

Hello,

I am running Sqoop 1.3.0-cdh3u4, as part of Cloudera CDH.


I am trying to get data from Oracle 11gR2 to HBase. The import works, but
CLOB columns are not making it into HBase.

My simplest test case:

In Oracle:
CREATE TABLE TABLE1 (NUMCOL NUMBER, STRCOL VARCHAR2(20 BYTE), CLOBCOL CLOB);
INSERT INTO TABLE1 (NUMCOL, STRCOL, CLOBCOL) VALUES (1, 'strval', 'clobval');

The Sqoop command I run is the following (the connect parameter is
shortened, but it works):

sqoop import --connect="jdbc:oracle:thin:..." --table TABLE1 \
  --hbase-table table1 --hbase-create-table --hbase-row-key NUMCOL \
  --column-family d -m 1

The job runs OK; the only surprising thing is the second-to-last line:
13/06/12 23:00:06 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 7.3188 seconds (0 bytes/sec)
13/06/12 23:00:06 INFO mapreduce.ImportJobBase: Retrieved 1 records.

Anyway, after looking at the table in HBase:

# hbase shell
Version 0.90.6-cdh3u4, r, Mon May  7 13:14:00 PDT 2012

hbase(main):001:0> scan 'table1'
ROW                            COLUMN+CELL
 1                             column=d:STRCOL, timestamp=1371070804479, value=strval
1 row(s) in 0.6070 seconds

The CLOBCOL is not there. CLOB handling in Sqoop must work in general,
because when I import the same table into Hive or into a plain text file,
the CLOB data is there. The problem occurs only when importing into HBase.
I searched the Sqoop JIRA and the internet at large, but could not find any
mention of CLOBs not making it into HBase.
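
For comparison, a plain-text import of the same table might look roughly
like the sketch below. The target directory is an assumed placeholder, not
the path from the actual run, and the connect string is shortened in the
same way as above.

```shell
# Comparison run (sketch): same table, written as text files to HDFS
# instead of HBase. TARGET_DIR is a hypothetical placeholder path.
TARGET_DIR=/tmp/table1_text

sqoop import --connect="jdbc:oracle:thin:..." --table TABLE1 \
  --target-dir "$TARGET_DIR" --as-textfile -m 1

# Print the imported rows; with a text import the CLOB value is expected
# to show up as the last field of the record.
hadoop fs -cat "$TARGET_DIR/part-m-*"
```

The only difference from the HBase command is that the --hbase-* and
--column-family options are replaced by --target-dir.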

Thank you for your help,
Michal Taborsky
