So after months and months, I finally started trying to tackle this, but
my Scala ability isn't up to it.
The problem is that, of course, even with the common interface, we don't
want inter-operability between RDDs and DStreams.
I looked into monads, as per Ashish's suggestion, and I think I understand
their relevance. But when done processing, one would still have to pull
out the wrapped object, knowing what it was, and I don't see how to do that.
I'm guessing there is a way to do this in Scala, but I'm not seeing it.
In detail, the requirement would be something along the lines of:
abstract class DistributedCollection[T] {
  def map[U](fcn: T => U): DistributedCollection[U]
  ...
}
class RDD[T] extends DistributedCollection[T] {
  // Note the return type that doesn't quite match the interface
  def map[U](fcn: T => U): RDD[U]
  ...
}
class DStream[T] extends DistributedCollection[T] {
  // Note the return type that doesn't quite match the interface
  def map[U](fcn: T => U): DStream[U]
  ...
}
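One possible encoding of this requirement, as a minimal sketch rather than
a proposal for the real API: give the interface a second, higher-kinded
type parameter naming the concrete collection, so that map returns that
concrete type. ToyRDD, ToyDStream, Shared and evenSquares are hypothetical
stand-ins backed by local collections; they are not Spark classes, and
whether this scales to the actual RDD and DStream APIs is an open question.
On Scala 2.12 and earlier, adding "import scala.language.higherKinds"
silences a feature warning.

```scala
// Common interface. Repr names the concrete collection type constructor,
// so map returns Repr[U] instead of a bare DistributedCollection[U].
trait DistributedCollection[T, Repr[X] <: DistributedCollection[X, Repr]] {
  def map[U](fcn: T => U): Repr[U]
  def filter(pred: T => Boolean): Repr[T]
}

// Hypothetical stand-in for RDD: a single local Vector.
final case class ToyRDD[T](items: Vector[T])
    extends DistributedCollection[T, ToyRDD] {
  def map[U](fcn: T => U): ToyRDD[U] = ToyRDD(items.map(fcn))
  def filter(pred: T => Boolean): ToyRDD[T] = ToyRDD(items.filter(pred))
}

// Hypothetical stand-in for DStream: a list of batches.
final case class ToyDStream[T](batches: List[Vector[T]])
    extends DistributedCollection[T, ToyDStream] {
  def map[U](fcn: T => U): ToyDStream[U] =
    ToyDStream(batches.map(_.map(fcn)))
  def filter(pred: T => Boolean): ToyDStream[T] =
    ToyDStream(batches.map(_.filter(pred)))
}

object Shared {
  // Written once against the interface; the caller gets back the same
  // concrete collection type that it passed in.
  def evenSquares[C[X] <: DistributedCollection[X, C]](in: C[Int]): C[Int] =
    in.filter(_ % 2 == 0).map(n => n * n)
}
```

With this shape, Shared.evenSquares(ToyRDD(Vector(1, 2, 3, 4))) is typed
as ToyRDD[Int], and the same call on a ToyDStream is typed as
ToyDStream[Int], so the two kinds of collection share code without
becoming interchangeable.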
Can anyone point me at a way to do this?
Thanks,
-Nathan
On Thu, Dec 19, 2013 at 1:08 AM, Ashish Rangole <[email protected]> wrote:
> I wonder if it will help to have a generic Monad container that wraps
> either RDD or DStream and provides
> map, flatmap, foreach and filter methods.
>
> case class DataMonad[A](data: A) {
>   def map[B]( f : A => B ) : DataMonad[B] = {
>     DataMonad( f( data ) )
>   }
>
>   def flatMap[B]( f : A => DataMonad[B] ) : DataMonad[B] = {
>     f( data )
>   }
>
>   def foreach ...
>   def withFilter ...
>   :
>   :
>   etc, something like that
> }
>
> On Wed, Dec 18, 2013 at 10:42 PM, Reynold Xin <[email protected]> wrote:
>
>>
>> On Wed, Dec 18, 2013 at 12:17 PM, Nathan Kronenfeld <
>> [email protected]> wrote:
>>
>>>
>>>
>>> Since many of the functions exist in parallel between the two, I guess I
>>> would expect something like:
>>>
>>> trait BasicRDDFunctions {
>>> def map...
>>> def reduce...
>>> def filter...
>>> def foreach...
>>> }
>>>
>>> class RDD extends BasicRDDFunctions...
>>> class DStream extends BasicRDDFunctions...
>>>
>>
>> I like this idea. We should discuss more about it on the dev list. It
>> would require refactoring some APIs, but does lead to better unification.
>>
>
>
--
Nathan Kronenfeld
Senior Visualization Developer
Oculus Info Inc
2 Berkeley Street, Suite 600,
Toronto, Ontario M5A 4J5
Phone: +1-416-203-3003 x 238
Email: [email protected]