Hello, I am running Sqoop 1.3.0-cdh3u4 as part of Cloudera CDH.
I am trying to get data from Oracle 11gR2 into HBase. The import works, but CLOB columns are not making it into HBase.

My simplest test case, in Oracle:

    CREATE TABLE TABLE1 (
      NUMCOL  NUMBER,
      STRCOL  VARCHAR2(20 BYTE),
      CLOBCOL CLOB
    );

    INSERT INTO TABLE1 (NUMCOL, STRCOL, CLOBCOL)
    VALUES (1, 'strval', 'clobval');

The Sqoop command I run is the following (the connect parameter is shortened, but works):

    sqoop import --connect="jdbc:oracle:thin:..." \
      --table TABLE1 \
      --hbase-table table1 --hbase-create-table \
      --hbase-row-key NUMCOL --column-family d \
      -m 1

The job runs OK; the only surprising thing is the second-to-last line of the output:

    13/06/12 23:00:06 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 7.3188 seconds (0 bytes/sec)
    13/06/12 23:00:06 INFO mapreduce.ImportJobBase: Retrieved 1 records.

Anyway, this is what the table looks like in HBase afterwards:

    # hbase shell
    Version 0.90.6-cdh3u4, r, Mon May 7 13:14:00 PDT 2012

    hbase(main):001:0> scan 'table1'
    ROW    COLUMN+CELL
     1     column=d:STRCOL, timestamp=1371070804479, value=strval
    1 row(s) in 0.6070 seconds

CLOBCOL is not there. CLOB handling in Sqoop must work in general, because when I import the same table into Hive or into a plain text file, the CLOB data is there. The problem exists only when importing into HBase.

I tried searching the Sqoop JIRA and the internet at large, but could not find any mention of CLOBs not getting into HBase.

Thank you for your help,
Michal Taborsky
