[GitHub] [spark] tgravescs commented on pull request #29413: [SPARK-32597][CORE] Tune Event Drop in Async Event Queue

2020-08-18 Thread GitBox
tgravescs commented on pull request #29413: URL: https://github.com/apache/spark/pull/29413#issuecomment-675474599 From my experience: 1) For the dynamic allocation it definitely is, but in my opinion the right solution there is to pull the dynamic allocation manager into the core

[GitHub] [spark] tgravescs commented on pull request #29413: [SPARK-32597][CORE] Tune Event Drop in Async Event Queue

2020-08-17 Thread GitBox
tgravescs commented on pull request #29413: URL: https://github.com/apache/spark/pull/29413#issuecomment-675001276 >> Yes they can make but in next run, after they see some abrupt behaviour (application hung / resource wasted). But what if due to this extra threshold size there is no

[GitHub] [spark] tgravescs commented on pull request #29413: [SPARK-32597][CORE] Tune Event Drop in Async Event Queue

2020-08-17 Thread GitBox
tgravescs commented on pull request #29413: URL: https://github.com/apache/spark/pull/29413#issuecomment-674877554 >> We are already setting some value in cluster defaults. But as I said earlier also "There is no fixed size of the Queue which can be used in all the Spark Jobs and even

[GitHub] [spark] tgravescs commented on pull request #29413: [SPARK-32597][CORE] Tune Event Drop in Async Event Queue

2020-08-14 Thread GitBox
tgravescs commented on pull request #29413: URL: https://github.com/apache/spark/pull/29413#issuecomment-674069902 I was just saying, if you had one pool you took from you can't control it per queue. Unless of course you put in some sort of minimum or something per queue. In most of the

[GitHub] [spark] tgravescs commented on pull request #29413: [SPARK-32597][CORE] Tune Event Drop in Async Event Queue

2020-08-13 Thread GitBox
tgravescs commented on pull request #29413: URL: https://github.com/apache/spark/pull/29413#issuecomment-673665565 Yeah so you have described the effects of it dropping the events, which I know. The thing I want to know is why it dropped the events in your cases. I'm not sure the

[GitHub] [spark] tgravescs commented on pull request #29413: [SPARK-32597][CORE] Tune Event Drop in Async Event Queue

2020-08-13 Thread GitBox
tgravescs commented on pull request #29413: URL: https://github.com/apache/spark/pull/29413#issuecomment-673479231 Part of the problem with that is that in most cases some queues have higher priority, which is why they were split apart. You really want the executor management queue to

[GitHub] [spark] tgravescs commented on pull request #29413: [SPARK-32597][CORE] Tune Event Drop in Async Event Queue

2020-08-13 Thread GitBox
tgravescs commented on pull request #29413: URL: https://github.com/apache/spark/pull/29413#issuecomment-673474159 > We can think of making it per queue basis any such approach based on the user requirement like I have already done in this PR using the conf(spark.set.optmized.event.queue)

[GitHub] [spark] tgravescs commented on pull request #29413: [SPARK-32597][CORE] Tune Event Drop in Async Event Queue

2020-08-12 Thread GitBox
tgravescs commented on pull request #29413: URL: https://github.com/apache/spark/pull/29413#issuecomment-673097079 Sorry, I'm still not seeing any difference here than increasing the size of the current queue? If both are not really allocating memory for the entire amount until runtime

[GitHub] [spark] tgravescs commented on pull request #29413: [SPARK-32597][CORE] Tune Event Drop in Async Event Queue

2020-08-12 Thread GitBox
tgravescs commented on pull request #29413: URL: https://github.com/apache/spark/pull/29413#issuecomment-672862588 So I definitely get the point here and it would definitely be nice to handle this better somehow, but if you are setting it to be queue size + some threshold then the

[GitHub] [spark] tgravescs commented on pull request #29413: [SPARK-32597][CORE] Tune Event Drop in Async Event Queue

2020-08-12 Thread GitBox
tgravescs commented on pull request #29413: URL: https://github.com/apache/spark/pull/29413#issuecomment-672856870 So one question here: you are seeing events dropped that cause hangs in the application code? What queue were they in?