Re: Expanded docs for the various storage levels

2016-07-07 Thread Nicholas Chammas
JIRA is here: https://issues.apache.org/jira/browse/SPARK-16427

On Thu, Jul 7, 2016 at 3:18 PM Reynold Xin wrote:

> Please create a patch. Thanks!
>
>
> On Thu, Jul 7, 2016 at 12:07 PM, Nicholas Chammas <
> nicholas.cham...@gmail.com> wrote:
>
>> I’m looking at the docs here:
>>
>>
>> http://spark.apache.org/docs/1.6.2/api/python/pyspark.html#pyspark.StorageLevel
>> 
>>
>> A newcomer to Spark won’t understand the meaning of _2, or the meaning
>> of _SER (or its value), and won’t understand how exactly memory and disk
>> play together when something like MEMORY_AND_DISK is selected.
>>
>> Is there a place in the docs that expands on the storage levels a bit? If
>> not, shall we create a JIRA and expand this documentation? I don’t mind
>> taking on this task, though frankly I’m interested in this because I don’t
>> fully understand the differences myself. :)
>>
>> Nick


Re: Expanded docs for the various storage levels

2016-07-07 Thread Reynold Xin
Please create a patch. Thanks!


On Thu, Jul 7, 2016 at 12:07 PM, Nicholas Chammas <
nicholas.cham...@gmail.com> wrote:

> I’m looking at the docs here:
>
>
> http://spark.apache.org/docs/1.6.2/api/python/pyspark.html#pyspark.StorageLevel
> 
>
> A newcomer to Spark won’t understand the meaning of _2, or the meaning of
> _SER (or its value), and won’t understand how exactly memory and disk
> play together when something like MEMORY_AND_DISK is selected.
>
> Is there a place in the docs that expands on the storage levels a bit? If
> not, shall we create a JIRA and expand this documentation? I don’t mind
> taking on this task, though frankly I’m interested in this because I don’t
> fully understand the differences myself. :)
>
> Nick


Expanded docs for the various storage levels

2016-07-07 Thread Nicholas Chammas
I’m looking at the docs here:

http://spark.apache.org/docs/1.6.2/api/python/pyspark.html#pyspark.StorageLevel


A newcomer to Spark won’t understand the meaning of _2, or the meaning of
_SER (or its value), and won’t understand how exactly memory and disk play
together when something like MEMORY_AND_DISK is selected.

Is there a place in the docs that expands on the storage levels a bit? If
not, shall we create a JIRA and expand this documentation? I don’t mind
taking on this task, though frankly I’m interested in this because I don’t
fully understand the differences myself. :)

Nick
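
For reference, here is a rough sketch of how the suffixes in the level names
seem to map onto the flags behind them. It is plain Python and does not import
Spark: Level, LEVELS and describe are made-up names for illustration, the field
order follows the pyspark.StorageLevel constructor (useDisk, useMemory,
useOffHeap, deserialized, replication), and the flag values are my reading of
the 1.6.x Python API, so treat them as an assumption that may not hold for
other releases. As far as I can tell, _2 means each partition is kept on two
nodes, _SER means the data is kept in serialized form, and MEMORY_AND_DISK
means partitions that do not fit in memory are written to disk and read back
from there when needed.

```python
from collections import namedtuple

# Hypothetical stand-in for pyspark.StorageLevel; the field order follows its
# constructor: (useDisk, useMemory, useOffHeap, deserialized, replication).
Level = namedtuple(
    "Level", "use_disk use_memory use_off_heap deserialized replication")

# Flag values as understood for the 1.6.x Python API (an assumption, not
# checked against every release).
LEVELS = {
    "MEMORY_ONLY":           Level(False, True, False, True, 1),
    "MEMORY_ONLY_SER":       Level(False, True, False, False, 1),
    "MEMORY_AND_DISK":       Level(True, True, False, True, 1),
    "MEMORY_AND_DISK_SER_2": Level(True, True, False, False, 2),
    "DISK_ONLY_2":           Level(True, False, False, False, 2),
}


def describe(name):
    """Spell out what the parts of a storage level name stand for."""
    level = LEVELS[name]
    # Collect the storage tiers this level is allowed to use.
    places = [place for place, used in (("memory", level.use_memory),
                                        ("disk", level.use_disk)) if used]
    # _SER levels clear the deserialized flag.
    form = "deserialized" if level.deserialized else "serialized"
    return "%s: %s, stored in %s, replication=%d" % (
        name, form, " and ".join(places), level.replication)


for name in sorted(LEVELS):
    print(describe(name))
```

On a real RDD the level would be chosen with something like
rdd.persist(StorageLevel.MEMORY_AND_DISK_SER_2); the sketch above only models
the flags and does not talk to a cluster.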