So after months and months, I finally started to try and tackle this, but my scala ability isn't up to it.
The problem is that, of course, even with the common interface, we don't want inter-operability between RDDs and DStreams. I looked into Monads, as per Ashish's suggestion, and I think I understand their relevance. But when done processing, one would still have to pull out the wrapped object, knowing what it was, and I don't see how to do that. I'm guessing there is a way to do this in scala, but I'm not seeing it. In detail, the requirement would be having something on the order of: abstract class DistributedCollection[T] { def [U] map(fcn: T => U): DistributedCollection[U] ... } class RDD extends DistrubutedCollection[T] { // Note the return type that doesn't quite match the interface def [U] map(fcn: T => U): RDD[U] ... } class DStream extends DistrubutedCollection[T] { // Note the return type that doesn't quite match the interface def [U] map(fcn: T => U): DStreamU] ... } Can anyone point me at a way to do this? Thanks, -Nathan On Thu, Dec 19, 2013 at 1:08 AM, Ashish Rangole <arang...@gmail.com> wrote: > I wonder if it will help to have a generic Monad container that wraps > either RDD or DStream and provides > map, flatmap, foreach and filter methods. > > case class DataMonad[A](data: A) { > def map[B]( f : A => B ) : DataMonad[B] = { > DataMonad( f( data ) ) > } > > def flatMap[B]( f : A => DataMonad[B] ) : DataMonad[B] = { > f( data ) > } > > def foreach ... > def withFilter ... > : > : > etc, something like that > } > > On Wed, Dec 18, 2013 at 10:42 PM, Reynold Xin <r...@apache.org> wrote: > >> >> On Wed, Dec 18, 2013 at 12:17 PM, Nathan Kronenfeld < >> nkronenf...@oculusinfo.com> wrote: >> >>> >>> >>> Since many of the functions exist in parallel between the two, I guess I >>> would expect something like: >>> >>> trait BasicRDDFunctions { >>> def map... >>> def reduce... >>> def filter... >>> def foreach... >>> } >>> >>> class RDD extends BasicRDDFunctions... >>> class DStream extends BasicRDDFunctions... >>> >> >> I like this idea. We should discuss more about it on the dev list. It >> would require refactoring some APIs, but does lead to better unification. >> > > -- Nathan Kronenfeld Senior Visualization Developer Oculus Info Inc 2 Berkeley Street, Suite 600, Toronto, Ontario M5A 4J5 Phone: +1-416-203-3003 x 238 Email: nkronenf...@oculusinfo.com