Hello all,

When I try to read data from an HBase table, I get an "unread block data" exception. I am running HBase and Spark on a single node (my workstation). My code is in Java, and I'm running it from the Eclipse IDE.

Here are the versions I'm using:
Cloudera: 2.5.0-cdh5.2.1
Hadoop: 2.5.0-cdh5.2.1
HBase: 0.98.6-cdh5.2.1
Java: 1.7.0_65
Ubuntu: 14.04.1 LTS
Spark: 1.2

This is the exception I'm getting:

15/01/12 15:47:10 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 10.211.55.3): java.lang.IllegalStateException: unread block data

I notice the assembly jar that ships with Spark 1.2 is called spark-assembly-1.2.0-hadoop2.0.0-mr1-cdh4.2.0.jar, so I'm assuming it was built against CDH4, while I'm running CDH5. Could that mismatch be my problem?

I've tried explicitly adding the Hadoop and HBase JARs in my code, but it made no difference. This is what I tried:

sparkContext.addJar("/usr/lib/hbase/hbase-server-0.98.6-cdh5.2.1.jar");
sparkContext.addJar("/usr/lib/hbase/hbase-client-0.98.6-cdh5.2.1.jar");
sparkContext.addJar("/usr/lib/hadoop/hadoop-common-2.5.0-cdh5.2.1.jar");

Thanks for taking a look!
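One thing I'm considering, in case the CDH4-built assembly really is the issue, is rebuilding the Spark 1.2 assembly myself against the CDH5 Hadoop version. I haven't tried this yet, and the exact version string is just my guess from my installed packages, but roughly:

```shell
# Rebuild Spark 1.2 against the CDH5 Hadoop artifacts instead of the
# stock CDH4 assembly (hadoop.version guessed from my installed packages)
cd spark-1.2.0
mvn -Pyarn -Dhadoop.version=2.5.0-cdh5.2.1 -DskipTests clean package
```

Does that sound like the right direction, or is there a prebuilt CDH5 assembly I should be using instead?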