Does this help? (Note: `call` must return a value since the anonymous class implements `Function<JavaRDD<Customer>, Void>` — returning `null` satisfies the `Void` return type.)

final JavaHBaseContext hbaseContext = new JavaHBaseContext(javaSparkContext, conf);

customerModels.foreachRDD(new Function<JavaRDD<Customer>, Void>() {
    private static final long serialVersionUID = 1L;

    @Override
    public Void call(JavaRDD<Customer> currentRDD) throws Exception {
        JavaRDD<Promotions> customerWithPromotion =
            hbaseContext.mapPartition(currentRDD, new PromotionLookupFunction());
        customerWithPromotion.persist(StorageLevel.MEMORY_AND_DISK_SER());
        customerWithPromotion.foreachPartition(<promotions function>);
        return null;
    }
});
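A direct Java translation of the Scala pattern from the streaming guide uses anonymous inner classes: the outer function is passed to dstream.foreachRDD(...) and the inner one to rdd.foreachPartition(...), with one connection created and closed per partition. The sketch below is self-contained for illustration only: VoidFunction is a stand-in stub for Spark's org.apache.spark.api.java.function.VoidFunction, and Connection/createNewConnection are hypothetical, mirroring the placeholder in the guide. With Spark on the classpath you would pass the same partition-level function to rdd.foreachPartition inside foreachRDD instead of driving it by hand.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Stand-in for org.apache.spark.api.java.function.VoidFunction (stub).
interface VoidFunction<T> {
    void call(T t) throws Exception;
}

// Hypothetical connection type, mirroring createNewConnection() in the guide.
class Connection {
    static int opened = 0, closed = 0, sent = 0;
    static Connection createNewConnection() { opened++; return new Connection(); }
    void send(String record) { sent++; }
    void close() { closed++; }
}

public class ForeachPartitionSketch {
    public static void main(String[] args) throws Exception {
        // The partition-level function: open one connection per partition,
        // send every record, then close -- the pattern from the Scala snippet.
        VoidFunction<Iterator<String>> perPartition =
                new VoidFunction<Iterator<String>>() {
            @Override
            public void call(Iterator<String> partitionOfRecords) {
                Connection connection = Connection.createNewConnection();
                while (partitionOfRecords.hasNext()) {
                    connection.send(partitionOfRecords.next());
                }
                connection.close();
            }
        };

        // Drive it over two fake "partitions". In real Spark code this loop
        // is replaced by rdd.foreachPartition(perPartition) inside foreachRDD.
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b"), Arrays.asList("c"));
        for (List<String> p : partitions) {
            perPartition.call(p.iterator());
        }
        System.out.println(Connection.opened + " connections, "
                + Connection.sent + " records sent");
    }
}
```

The point of the pattern is that the connection is created inside the partition function, so it lives on the executor and is never serialized with the closure.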
> On 21-Oct-2015, at 10:55 AM, Nipun Arora <nipunarora2...@gmail.com> wrote:
>
> Hi All,
>
> Can anyone provide a design pattern for the following code shown in the
> Spark User Manual, in Java? I have the same exact use case, and for some
> reason the design pattern for Java is missing.
>
> Scala version taken from:
> http://spark.apache.org/docs/latest/streaming-programming-guide.html#design-patterns-for-using-foreachrdd
>
> dstream.foreachRDD { rdd =>
>   rdd.foreachPartition { partitionOfRecords =>
>     val connection = createNewConnection()
>     partitionOfRecords.foreach(record => connection.send(record))
>     connection.close()
>   }
> }
>
> I have googled for it and haven't really found a solution. This seems to
> be an important piece of information, especially for people who need to
> ship their code in Java because of constraints in the company (like me) :)
>
> I'd really appreciate any help
>
> Thanks
> Nipun