An RDD (Resilient Distributed Dataset) is a fault-tolerant, distributed
collection of elements that can be operated on in parallel. It is the primary
data abstraction in Spark.
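
For the queue question below: there is no special queue-flavoured RDD, but
you can distribute an existing Scala collection as an RDD with
SparkContext.parallelize. A minimal sketch (assuming a SparkContext named
*sc* is already in scope, as in the shell):

    import scala.collection.immutable.Queue

    // A plain Scala queue -- this lives only in the driver program.
    val queue = Queue(1, 2, 3, 4, 5)

    // Queue is a Seq, so its elements can be distributed as an RDD.
    // The RDD preserves the element order, but it has no
    // enqueue/dequeue operations of its own.
    val queueRDD = sc.parallelize(queue)

    queueRDD.map(_ * 2).collect()   // Array(2, 4, 6, 8, 10)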

I would strongly suggest that you have a look at the following to get a
basic idea:

http://www.cs.berkeley.edu/~pwendell/strataconf/api/core/spark/RDD.html
http://spark.apache.org/docs/latest/quick-start.html#basics
https://www.usenix.org/conference/nsdi12/technical-sessions/presentation/zaharia
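
To make the unpersist error below concrete: cache() and unpersist() exist
only on RDDs, not on plain Scala values, so they become available only once
you have explicitly created an RDD. A rough sketch (again assuming *sc* is
in scope; the data and the loop are made up for illustration):

    val x: Int = 42            // a plain Scala Int; x.unpersist() will not compile

    for (i <- 1 to 3) {
      // Explicitly create an RDD via the SparkContext and cache it.
      val temp = sc.parallelize(1 to 1000000).map(_ + i).cache()
      println(temp.count())    // forces evaluation while the RDD is cached
      temp.unpersist()         // drop it from the cache at the end of each iteration
    }

If *temp* is produced by ordinary Scala operations (e.g. arithmetic on Ints),
it is just an Int, and unpersist is not defined on it.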

On Sat, Sep 13, 2014 at 12:06 AM, Deep Pradhan <pradhandeep1...@gmail.com>
wrote:

> Take this for example:
> I have declared a queue, *val queue = Queue.empty[Int]*, which is a pure
> Scala line in the program. I actually want the queue to be an RDD, but there
> is no direct method to create an RDD which is a queue, right? What are your
> thoughts on this?
> Does there exist something like *create an RDD which is a queue*?
>
> On Sat, Sep 13, 2014 at 8:43 AM, Hari Shreedharan <
> hshreedha...@cloudera.com> wrote:
>
>> No, Scala primitives remain primitives. Unless you create an RDD using
>> one of the many methods for doing so, you will not be able to access any
>> of the RDD methods. There is no automatic porting. Spark is just an
>> application as far as Scala is concerned; there is no special compilation
>> step (apart, of course, from the usual Scala and JIT compilation, etc.).
>>
>> On Fri, Sep 12, 2014 at 8:04 PM, Deep Pradhan <pradhandeep1...@gmail.com>
>> wrote:
>>
>>> I know that unpersist is a method on RDD.
>>> But my confusion is that, when we port our Scala programs to Spark,
>>> doesn't everything change to RDDs?
>>>
>>> On Fri, Sep 12, 2014 at 10:16 PM, Nicholas Chammas <
>>> nicholas.cham...@gmail.com> wrote:
>>>
>>>> unpersist is a method on RDDs. RDDs are abstractions introduced by
>>>> Spark.
>>>>
>>>> An Int is just a Scala Int. You can't call unpersist on Int in Scala,
>>>> and that doesn't change in Spark.
>>>>
>>>> On Fri, Sep 12, 2014 at 12:33 PM, Deep Pradhan <
>>>> pradhandeep1...@gmail.com> wrote:
>>>>
>>>>> There is one thing that I am confused about.
>>>>> Spark's code is implemented in Scala. Now, can we run any Scala code
>>>>> on the Spark framework? What will be the difference between executing
>>>>> the Scala code on a normal system and on Spark?
>>>>> The reason for my question is the following:
>>>>> I had a variable
>>>>> *val temp = <some operations>*
>>>>> This temp was being created inside a loop. To manually throw it out of
>>>>> the cache every time the loop ends, I was calling *temp.unpersist()*,
>>>>> but this returned an error saying *value unpersist is not a member of
>>>>> Int*, which means that temp is an Int.
>>>>> Can someone explain why I was not able to call *unpersist* on *temp*?
>>>>>
>>>>> Thank You
>>>>>
>>>>
>>>>
>>>
>>
>
