Without knowing more about your application, it is hard to say. Maybe it
runs faster in local mode because there is no shuffling over the network?
The Spark UI is your best bet for finding out which stage is slowing things down.
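
If it helps alongside the UI, here is a rough sketch (plain Python, not a Spark API) that scans an executor log for long pauses between consecutive entries, like the 40 s pause in the log excerpt quoted further down. The function name find_gaps, the 10 s threshold, and the assumed prefix layout (level, then "YYYY-MM-DD HH:MM:SS") are my own assumptions based on that excerpt; adjust the regex if your log4j pattern differs.

```python
import re
from datetime import datetime

# Assumed prefix layout, taken from the quoted excerpt: "INFO  2018-08-23 12:46:58 ..."
TS = re.compile(r"^\s*[A-Z]+\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")

def find_gaps(lines, threshold_s=10):
    """Return (gap_seconds, previous_line, next_line) for each pause >= threshold_s.

    Lines that do not start with a level and timestamp are skipped.
    """
    gaps = []
    prev_ts, prev_line = None, None
    for line in lines:
        m = TS.match(line)
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        if prev_ts is not None:
            delta = (ts - prev_ts).total_seconds()
            if delta >= threshold_s:
                gaps.append((delta, prev_line.strip(), line.strip()))
        prev_ts, prev_line = ts, line
    return gaps

# The three entries from the quoted excerpt, each joined back onto one line.
sample = [
    "INFO  2018-08-23 12:46:58 Logging.scala:54 - org.apache.spark.storage.BlockManager: Found block rdd_1705_23 locally",
    "INFO  2018-08-23 12:47:38 Logging.scala:54 - org.apache.spark.storage.memory.MemoryStore: Block rdd_1692_7 stored as bytes in memory (estimated size 3.7 KB, free 1048.0 MB)",
    "INFO  2018-08-23 12:47:38 Logging.scala:54 - org.apache.spark.storage.BlockManager: Found block rdd_1692_7 locally",
]

for seconds, before, after in find_gaps(sample):
    print(f"{seconds:.0f}s pause\n  before: {before}\n  after:  {after}")
```

Running this over the logs of the slow executor should show which entries sit on either side of each pause, which you can then match against the task and stage timelines in the UI.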

On Fri 24 Aug, 2018, 3:26 PM Guillermo Ortiz, <konstt2...@gmail.com> wrote:

> Another test I just did was to run with local[X], and the problem
> doesn't happen. Communication problems?
>
> 2018-08-23 22:43 GMT+02:00 Guillermo Ortiz <konstt2...@gmail.com>:
>
>> It's a complex DAG before the point where I cache the RDD: there are some
>> joins, filters and maps before caching the data, but most of the time it
>> takes almost no time. I could understand it if processing or caching the
>> data took the same time every time. Besides, it seems random, and there
>> isn't any weird data in the input.
>>
>> Another test I tried was disabling caching, and I saw that all the
>> microbatches take the same time, so it seems to be related to
>> caching these RDDs.
>>
>> On Thu, Aug 23, 2018 at 15:29, Sonal Goyal (<sonalgoy...@gmail.com>)
>> wrote:
>>
>>> How are these small RDDs created? Could the blockage be in computing
>>> them rather than in caching them?
>>>
>>> Thanks,
>>> Sonal
>>> Nube Technologies <http://www.nubetech.co>
>>>
>>> <http://in.linkedin.com/in/sonalgoyal>
>>>
>>>
>>>
>>> On Thu, Aug 23, 2018 at 6:38 PM, Guillermo Ortiz <konstt2...@gmail.com>
>>> wrote:
>>>
>>>> I use Spark with caching via the persist method. I have several RDDs that
>>>> I cache, but some of them are pretty small (about 300 KB). Most of the time
>>>> it works well and the whole job usually takes 1 s, but sometimes it takes
>>>> about 40 s to store 300 KB in the cache.
>>>>
>>>> If I go to the Spark UI -> Cache, I can see the percentage
>>>> increasing until 83% (250 KB) and then it stops for a while. If I check
>>>> the event timeline in the Spark UI, I can see that when this happens there
>>>> is a node where tasks take a very long time. This can be any node in the
>>>> cluster; it's not always the same one.
>>>>
>>>> In the Spark executor logs I can see that it takes about 40 s to
>>>> store 3.7 KB when this problem occurs:
>>>>
>>>>     INFO  2018-08-23 12:46:58 Logging.scala:54 -
>>>> org.apache.spark.storage.BlockManager: Found block rdd_1705_23 locally
>>>>     INFO  2018-08-23 12:47:38 Logging.scala:54 -
>>>> org.apache.spark.storage.memory.MemoryStore: Block rdd_1692_7 stored as
>>>> bytes in memory (estimated size 3.7 KB, free 1048.0 MB)
>>>>     INFO  2018-08-23 12:47:38 Logging.scala:54 -
>>>> org.apache.spark.storage.BlockManager: Found block rdd_1692_7 locally
>>>>
>>>> I have tried with MEMORY_ONLY, MEMORY_AND_SER and so on with the same
>>>> results. I have checked disk I/O (although with MEMORY_ONLY I guess
>>>> that doesn't make sense) and I can't see any problem. This happens
>>>> randomly, but it could be in about 25% of the jobs.
>>>>
>>>> Any idea what could be happening?
>>>>
>>>
>>>
>
