Are these Cloudera-specific acronyms? I'm not sure how Cloudera configures
Spark differently, but the number of nodes is clearly too small, given
that each app only uses a small amount of cores and RAM. So you may
consider increasing the number of nodes. When all these apps pile onto
a few nodes, the cluster manager/scheduler and/or the network becomes
overwhelmed...
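A quick back-of-envelope check (using the numbers from the original post below, which are assumptions pulled from that description, not measured values) shows why YARN happily admits all 30 apps at once, leaving them to contend for disk and network on only three worker nodes:

```python
# Capacity check for the scenario in the original post (numbers assumed
# from that post): YARN pool = 192 vcores / 240 GB; each app requests
# ~4 vcores / 5 GB (2G/2C driver + one 2G/2C executor, plus rounding).
yarn_vcores, yarn_mem_gb = 192, 240
app_vcores, app_mem_gb = 4, 5
n_apps = 30

total_vcores = n_apps * app_vcores   # 120
total_mem_gb = n_apps * app_mem_gb   # 150

# Both totals fit within the pool, so the scheduler runs all 30 apps
# concurrently -- the slowdown comes from contention, not from queueing.
print(total_vcores <= yarn_vcores)   # True
print(total_mem_gb <= yarn_mem_gb)   # True
```

With only 3 NodeManagers, that is roughly 10 apps per node all hitting the same local disks and NICs at once, which matches the "overwhelmed" explanation above.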
On 10/26/22 8:09 AM, Sean Owen wrote:
Resource contention. Now all the CPU and I/O are competing, which
probably slows things down.
On Wed, Oct 26, 2022, 5:37 AM eab...@163.com <eab...@163.com> wrote:
Hi All,
I have a CDH 5.16.2 Hadoop cluster with 1+3 nodes (64C/128G each; 1
NN/RM + 3 DN/NM), and YARN has 192C/240G available. I used the
following test scenario:
1. Spark app resources: 2G driver memory / 2 driver vcores / 1
executor / 2G executor memory / 2 executor vcores.
2. Each Spark app therefore uses 5G/4C on YARN.
3. First, I run a single Spark app; it takes 40s.
4. Then, I run 30 copies of the same Spark app at once, and each
app takes 80s on average.
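For reference, the per-app sizing in item 1 corresponds roughly to the following spark-submit flags (a sketch; the jar name is a placeholder, not taken from the post):

```shell
# Hypothetical submit command matching the sizing above:
# 2G/2C driver, one 2G/2C executor, running on YARN in cluster mode.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 2g \
  --driver-cores 2 \
  --num-executors 1 \
  --executor-memory 2g \
  --executor-cores 2 \
  your-app.jar
```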
So, I want to know why the runtime gap is so big, and how to
optimize it?
Thanks