Hi
Can you share your Web UI, showing the task-level breakdown? I can see many
things that can be improved.
1. JavaRDD rdds = ...rdds.cache(); -> this caching is not needed, as
you are not reading the RDD for any action.
2. Instead of collecting as a list, if you can save as a text file, it would be
b
> Why are you caching both the RDD and the table?
I am trying to cache all the data to avoid bad performance on the first
query. Is that right?
> Which stage of the job is slow?
The query is run many times on one sqlContext, and each query execution
takes 1 second.
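
On the caching question above: RDD.cache() only marks the RDD for persistence, and the data is materialized by the first action that reads it, so the first read still pays the full cost. The sketch below is a plain-Java model of that behaviour, not Spark code. The names LazyCache, cache(), get() and loads() are hypothetical and exist only for illustration, and get() stands in for an action.

```java
import java.util.function.Supplier;

/** Plain-Java model of lazy caching (hypothetical class, not a Spark API). */
final class LazyCache<T> {
    private final Supplier<T> loader; // stands in for the expensive computation
    private boolean cacheRequested;   // set by cache(); nothing is computed yet
    private boolean materialized;     // true once a value has been stored
    private T value;
    private int loads;                // how many times the loader actually ran

    LazyCache(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Only marks the data for caching; it does not run the loader. */
    LazyCache<T> cache() {
        cacheRequested = true;
        return this;
    }

    /** Stands in for an action: runs the loader unless a cached value exists. */
    T get() {
        if (cacheRequested && materialized) {
            return value;
        }
        loads++;
        T result = loader.get();
        if (cacheRequested) {
            value = result;
            materialized = true;
        }
        return result;
    }

    int loads() {
        return loads;
    }
}

class LazyCacheDemo {
    public static void main(String[] args) {
        LazyCache<Integer> data = new LazyCache<>(() -> 21 * 2);
        data.cache();
        System.out.println(data.loads()); // 0: cache() alone computed nothing
        data.get();
        System.out.println(data.loads()); // 1: the first read pays the cost
        data.get();
        System.out.println(data.loads()); // 1: the second read hits the cache
    }
}
```

In this model, calling cache() on data that is read only once saves nothing, and the benefit appears only from the second read onwards.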
2015-04-23 11:33 GMT+03:00 ayan guha :
Quick questions: why are you caching both the RDD and the table?
Which stage of the job is slow?
On 23 Apr 2015 17:12, "Nikolay Tikhonov" wrote:
> Hi,
> I have a Spark SQL performance issue. My code contains a simple JavaBean:
>
> public class Person implements Externalizable {
> private int id;
>