Hi, I am trying to do something like the following in Spark:
JavaPairRDD<byte[], MyObject> eventRDD = hBaseRDD.mapToPair(
    new PairFunction<Tuple2<ImmutableBytesWritable, Result>, byte[], MyObject>() {
        @Override
        public Tuple2<byte[], MyObject> call(
                Tuple2<ImmutableBytesWritable, Result> tuple) throws Exception {
            return new Tuple2<byte[], MyObject>(tuple._1.get(), MyClass.get(tuple._2));
        }
    });

eventRDD.foreach(new VoidFunction<Tuple2<byte[], MyObject>>() {
    @Override
    public void call(Tuple2<byte[], MyObject> eventTuple2) throws Exception {
        processForEvent(eventTuple2._2);
    }
});

The processForEvent() flow does some processing and ultimately writes to an HBase table. But I am getting serialisation issues with the built-in Hadoop and HBase classes (such as ImmutableBytesWritable and Result). How do I solve this? Does using Kryo serialisation help in this case?

Thanks,
-Vibhor
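My understanding is that Spark ships each closure (and each shuffled record) with plain Java serialisation, so every captured or emitted object must implement java.io.Serializable, and HBase's Result and ImmutableBytesWritable do not. Here is a small stand-alone sketch of the round-trip check that fails; the two nested classes are hypothetical stand-ins for the HBase type and for the POJO I convert to, not the real classes:

```java
import java.io.*;

// Demonstrates the rule Spark enforces: objects crossing task boundaries
// must survive a java.io serialisation round trip, or the job fails with
// NotSerializableException. Both nested classes are stand-ins for
// illustration only.
public class ClosureSerializationDemo {

    // Stand-in for a non-serializable HBase class (e.g. Result).
    static class NotSerializableResult {
        byte[] value = {1, 2, 3};
    }

    // Stand-in for the plain POJO extracted before shuffling.
    static class SerializableEvent implements Serializable {
        byte[] value;
        SerializableEvent(byte[] v) { value = v; }
    }

    // Returns true if the object survives a Java serialisation round trip,
    // i.e. the same check Spark performs when it ships a record or closure.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The HBase-style object fails; the extracted POJO succeeds.
        System.out.println(canSerialize(new NotSerializableResult())); // false
        System.out.println(canSerialize(
            new SerializableEvent(new byte[]{1, 2, 3})));              // true
    }
}
```

This is why I extract the raw byte[] from ImmutableBytesWritable in the map above, but I am unsure whether the remaining errors come from Result leaking into the closure or from something captured by processForEvent().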