Dear all,

When I try to fetch a web page (e.g.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html ) with the MySQL storage
definition, I see the following error in my Hadoop logs (there is no error
with HBase):

java.io.IOException: java.sql.BatchUpdateException: Data truncation: Data too long for column 'content' at row 1
    at org.gora.sql.store.SqlStore.flush(SqlStore.java:316)
    at org.gora.sql.store.SqlStore.close(SqlStore.java:163)
    at org.gora.mapreduce.GoraOutputFormat$1.close(GoraOutputFormat.java:72)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:567)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)

The type of the column 'content' is BLOB.
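
For context, a MySQL BLOB column holds at most 65,535 bytes, so one plausible cause (an assumption, not something confirmed by the logs above) is that the fetched page content is larger than that. The following is a minimal sketch of the size check involved; the class BlobFit and the method smallestBlobType are hypothetical names used for illustration and are not part of Gora or Nutch.

```java
public class BlobFit {
    // Maximum payload sizes, in bytes, of MySQL's binary large-object types.
    static final long TINYBLOB_MAX = 255L;
    static final long BLOB_MAX = 65535L;
    static final long MEDIUMBLOB_MAX = 16777215L;
    static final long LONGBLOB_MAX = 4294967295L;

    // Hypothetical helper: returns the smallest MySQL BLOB type
    // that can hold a value of the given length in bytes.
    static String smallestBlobType(long contentLength) {
        if (contentLength < 0) {
            throw new IllegalArgumentException("negative length: " + contentLength);
        }
        if (contentLength <= TINYBLOB_MAX) return "TINYBLOB";
        if (contentLength <= BLOB_MAX) return "BLOB";
        if (contentLength <= MEDIUMBLOB_MAX) return "MEDIUMBLOB";
        if (contentLength <= LONGBLOB_MAX) return "LONGBLOB";
        throw new IllegalArgumentException("too large for any BLOB type: " + contentLength);
    }

    public static void main(String[] args) {
        // Assumed page size for illustration only; not measured from the page above.
        long pageSize = 500000L;
        System.out.println(smallestBlobType(pageSize)); // prints MEDIUMBLOB
    }
}
```

If the content really does exceed 65,535 bytes, the column would need to be a MEDIUMBLOB or LONGBLOB for the insert to succeed.
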
This issue may be important for the further development of Gora.
Should I file it in the Nutch JIRA, on GitHub (Gora), or not at all?

Environment: Ubuntu 10.04
JVM: 1.6.0_20
Nutch 2.0 (trunk)
MySQL / HBase (0.20.6) / Hadoop (0.20.2), pseudo-distributed


Best regards,

Faruk Berksöz
