pull/11956. This PR shows room for performance improvement for float/double values that are not compressed.
Kazuaki Ishizaki
From: linguin@gmail.com
To: Maciej Bryński
Cc: Spark dev list
Date: 2016/08/28 11:30
Subject: Re: Cache'ing performance
Hi,
How does
these pull requests.
Best Regards,
Kazuaki Ishizaki
From: Maciej Bryński
To: Spark dev list
Date: 2016/08/28 05:40
Subject: Cache'ing performance
Hi,
I did some benchmarking of the cache function today.
*RDD*
sc.parallelize(0 until Int.MaxValue).cache().count()
*Datasets*
spark.range(Int.MaxValue).cache().count()
For me, Datasets was 2 times slower.
Results (3 nodes, 20 cores and 48 GB RAM each):
*RDD - 6 s*
*Datasets - 13.5 s*
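For anyone wanting to reproduce this with explicit wall-clock numbers, here is a minimal sketch of timing both cache paths. It assumes a running SparkSession `spark` with its SparkContext `sc`; the `time` helper is my own addition, not part of the original benchmark:

```scala
// Hypothetical timing helper -- not from the original benchmark.
def time[T](label: String)(body: => T): T = {
  val start = System.nanoTime()
  val result = body // count() is an action, so the cache is materialized here
  println(f"$label: ${(System.nanoTime() - start) / 1e9}%.1f s")
  result
}

// The two cache paths compared above.
time("RDD")     { sc.parallelize(0 until Int.MaxValue).cache().count() }
time("Dataset") { spark.range(Int.MaxValue).cache().count() }
```

Note that the first count() pays the cost of building the cached representation, which is exactly what the 6 s vs 13.5 s comparison is measuring.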
Is that expected be