Hi,

The current implementation of the map function in Spark Streaming looks as below:
def map[U: ClassTag](mapFunc: T => U): DStream[U] = {
  new MappedDStream(this, context.sparkContext.clean(mapFunc))
}

It creates an instance of MappedDStream, which is a subclass of DStream. The same function can also be implemented using the transform API:

def map[U: ClassTag](mapFunc: T => U): DStream[U] =
  this.transform(rdd => rdd.map(mapFunc))

Both implementations look the same. If they are equivalent, is there any advantage to having a dedicated subclass of DStream? Why can't we just use the transform API?

Regards,
Madhukara Phatak
http://datamantra.io/
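To make the comparison concrete, here is a minimal sketch of how a caller would exercise both paths. The setup (local master, a socket source on port 9999, the object name MapVsTransform) is hypothetical and just for illustration; from the caller's side the two resulting streams should produce the same records:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object MapVsTransform {
  def main(args: Array[String]): Unit = {
    // Hypothetical local setup; assumes a text source on localhost:9999.
    val conf = new SparkConf().setMaster("local[2]").setAppName("MapVsTransform")
    val ssc = new StreamingContext(conf, Seconds(1))

    val lines = ssc.socketTextStream("localhost", 9999)

    // Built-in map: internally constructs a MappedDStream.
    val viaMap = lines.map(_.length)

    // The same operation expressed through transform, which wraps the
    // RDD-level function in its own DStream subclass instead.
    val viaTransform = lines.transform(rdd => rdd.map(_.length))

    viaMap.print()
    viaTransform.print()

    ssc.start()
    ssc.awaitTermination()
  }
}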