Does this help? (Note: `call` must return a value since the anonymous class implements `Function<JavaRDD<Customer>, Void>` — returning `null` satisfies the `Void` return type.)

final JavaHBaseContext hbaseContext = new JavaHBaseContext(javaSparkContext, conf);

customerModels.foreachRDD(new Function<JavaRDD<Customer>, Void>() {
    private static final long serialVersionUID = 1L;

    @Override
    public Void call(JavaRDD<Customer> currentRDD) throws Exception {
        JavaRDD<Promotions> customerWithPromotion =
            hbaseContext.mapPartition(currentRDD, new PromotionLookupFunction());
        customerWithPromotion.persist(StorageLevel.MEMORY_AND_DISK_SER());
        customerWithPromotion.foreachPartition(<promotions function>);
        return null;
    }
});
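A direct Java translation of the Scala pattern from the streaming guide uses anonymous inner classes: the outer function is passed to dstream.foreachRDD(...) and the inner one to rdd.foreachPartition(...), with one connection created and closed per partition. The sketch below is self-contained for illustration only: VoidFunction is a stand-in stub for Spark's org.apache.spark.api.java.function.VoidFunction, and Connection/createNewConnection are hypothetical, mirroring the placeholder in the guide. With Spark on the classpath you would pass the same partition-level function to rdd.foreachPartition inside foreachRDD instead of driving it by hand.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Stand-in for org.apache.spark.api.java.function.VoidFunction (stub).
interface VoidFunction<T> {
    void call(T t) throws Exception;
}

// Hypothetical connection type, mirroring createNewConnection() in the guide.
class Connection {
    static int opened = 0, closed = 0, sent = 0;
    static Connection createNewConnection() { opened++; return new Connection(); }
    void send(String record) { sent++; }
    void close() { closed++; }
}

public class ForeachPartitionSketch {
    public static void main(String[] args) throws Exception {
        // The partition-level function: open one connection per partition,
        // send every record, then close -- the pattern from the Scala snippet.
        VoidFunction<Iterator<String>> perPartition =
                new VoidFunction<Iterator<String>>() {
            @Override
            public void call(Iterator<String> partitionOfRecords) {
                Connection connection = Connection.createNewConnection();
                while (partitionOfRecords.hasNext()) {
                    connection.send(partitionOfRecords.next());
                }
                connection.close();
            }
        };

        // Drive it over two fake "partitions". In real Spark code this loop
        // is replaced by rdd.foreachPartition(perPartition) inside foreachRDD.
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b"), Arrays.asList("c"));
        for (List<String> p : partitions) {
            perPartition.call(p.iterator());
        }
        System.out.println(Connection.opened + " connections, "
                + Connection.sent + " records sent");
    }
}
```

The point of the pattern is that the connection is created inside the partition function, so it lives on the executor and is never serialized with the closure.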
> On 21-Oct-2015, at 10:55 AM, Nipun Arora <nipunarora2...@gmail.com> wrote:
>
> Hi All,
>
> Can anyone provide a design pattern for the following code shown in the
> Spark User Manual, in Java? I have the same exact use case, and for some
> reason the design pattern for Java is missing.
>
> Scala version taken from:
> http://spark.apache.org/docs/latest/streaming-programming-guide.html#design-patterns-for-using-foreachrdd
>
> dstream.foreachRDD { rdd =>
>   rdd.foreachPartition { partitionOfRecords =>
>     val connection = createNewConnection()
>     partitionOfRecords.foreach(record => connection.send(record))
>     connection.close()
>   }
> }
>
> I have googled for it and haven't really found a solution. This seems to
> be an important piece of information, especially for people who need to
> ship their code in Java because of constraints in the company (like me) :)
>
> I'd really appreciate any help
>
> Thanks
> Nipun