Re: coalesce and executor memory

Christopher Brady Fri, 12 Feb 2016 17:35:23 -0800

Thank you for the responses. The map function just changes the format of the record slightly, so I don't think that would be the cause of the memory problem.

So if I have 3 cores per executor, I need to be able to fit 3 partitions per executor within whatever I specify for the executor memory? Is there a way I can programmatically find a number of partitions I can coalesce down to without running out of memory? Is there some documentation where this is explained?
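One rough way to approach the partition-count question is sketched below. It is a back-of-the-envelope estimate, not something Spark provides: the function name `min_partitions` and the parameters `usable_fraction` and `expansion_factor` are hypothetical, and their default values are assumptions you would need to replace with measurements from your own job.

```python
import math

def min_partitions(total_bytes, executor_memory_bytes, cores_per_executor,
                   usable_fraction=0.5, expansion_factor=3.0):
    """Rough lower bound on the partition count so that each partition
    processed concurrently fits in its core's share of executor memory.

    usable_fraction and expansion_factor are assumed values, not Spark settings.
    """
    # Memory a single task can count on: the usable share split across cores.
    per_task_budget = executor_memory_bytes * usable_fraction / cores_per_executor
    # Estimated in-memory size, assuming the data grows when deserialized.
    in_memory_bytes = total_bytes * expansion_factor
    return max(1, math.ceil(in_memory_bytes / per_task_budget))

GB = 1024 ** 3
# 100 GB of input, 6 GB executors, 3 cores each: 1 GB per task, 300 GB in memory.
print(min_partitions(100 * GB, 6 * GB, 3))  # prints 300
```

Coalescing to fewer partitions than this estimate would be the point where, under these assumptions, a single partition might no longer fit.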



On 02/12/2016 05:10 PM, Koert Kuipers wrote:

in spark, every partition needs to fit in the memory available to the core processing it.

as you coalesce you reduce number of partitions, increasing partition size. at some point the partition no longer fits in memory.
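The effect described above can be illustrated with a small simulation. This is a simplified sketch, not Spark's actual coalesce logic (which also takes data locality into account): the hypothetical helper `coalesced_sizes` merges contiguous input partitions into fewer buckets and reports how large each resulting partition becomes.

```python
def coalesced_sizes(sizes, num_partitions):
    """Merge contiguous partitions into num_partitions buckets and
    return the total size of each bucket (a simplified model of coalesce)."""
    # Without a shuffle, coalesce cannot increase the partition count.
    n = min(num_partitions, len(sizes))
    buckets = [0] * n
    for i, size in enumerate(sizes):
        # Map input partition i to its output bucket, keeping neighbours together.
        buckets[i * n // len(sizes)] += size
    return buckets

# Twelve 128 MB input partitions coalesced down to three.
print(coalesced_sizes([128] * 12, 3))  # prints [512, 512, 512]
```

The total data is unchanged, but each task now has to handle four times as much, which is why memory pressure rises as the target partition count drops.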

On Fri, Feb 12, 2016 at 4:50 PM, Silvio Fiorito <silvio.fior...@granturing.com> wrote:


    Coalesce essentially reduces parallelism, so fewer cores are
    getting more records. Be aware that it could also lead to loss of
    data locality, depending on how far you reduce. Depending on what
    you’re doing in the map operation, it could lead to OOM errors.
    Can you give more details as to what the code for the map looks like?




    On 2/12/16, 1:13 PM, "Christopher Brady" <christopher.br...@oracle.com> wrote:

    >Can anyone help me understand why using coalesce causes my executors to
    >crash with out of memory? What happens during coalesce that increases
    >memory usage so much?
    >
    >If I do:
    >hadoopFile -> sample -> cache -> map -> saveAsNewAPIHadoopFile
    >
    >everything works fine, but if I do:
    >hadoopFile -> sample -> coalesce -> cache -> map -> saveAsNewAPIHadoopFile
    >
    >my executors crash with out of memory exceptions.
    >
    >Is there any documentation that explains what causes the increased
    >memory requirements with coalesce? It seems to be less of a problem if I
    >coalesce into a larger number of partitions, but I'm not sure why this
    >is. How would I estimate how much additional memory the coalesce requires?
    >
    >Thanks.
