Github user raajay commented on the issue:
https://github.com/apache/spark/pull/18690
I understand. My previous comment was just a clarification in response to your
question: "I'm not sure how does this code work in your changes?". I will close
this PR; the JIRA is already closed.
---
Github user raajay closed the pull request at:
https://github.com/apache/spark/pull/18690
---
Github user raajay commented on the issue:
https://github.com/apache/spark/pull/18690
@jerryshao My CustomSink has the report function defined. What I did not
have was an equivalent of JmxReporter defined in my CustomSink. The reporter
essentially invokes the sink's report function periodically
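The reporter pattern described in the comment above can be sketched as follows. This is a hypothetical, self-contained illustration, not Spark's actual Sink API or Dropwizard's JmxReporter: it assumes a sink that exposes a `report()` method, with a scheduled reporter invoking it at a fixed period, analogous to what JmxReporter does for JmxSink.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sink: exposes report(), which pushes current metric
// values to some backing store. Here we just count invocations.
class CustomSink {
    final AtomicInteger reportCount = new AtomicInteger();

    void report() {
        reportCount.incrementAndGet();
    }
}

// Hypothetical reporter: periodically invokes the sink's report(),
// the piece the comment above notes was missing from the custom sink.
class ScheduledSinkReporter {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final CustomSink sink;

    ScheduledSinkReporter(CustomSink sink) {
        this.sink = sink;
    }

    void start(long periodMillis) {
        scheduler.scheduleAtFixedRate(sink::report, 0, periodMillis,
                TimeUnit.MILLISECONDS);
    }

    void stop() {
        scheduler.shutdown();
    }
}
```

Bundling the scheduling into the sink (or a reporter it owns) keeps the sink self-driving, which is the role JmxReporter and the CSV reporter play for the built-in sinks.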
Github user raajay commented on the issue:
https://github.com/apache/spark/pull/18690
We were using a custom sink rather than the JmxSink for gathering metrics.
The sink did NOT have a "reporter" like the ones JmxSink or CsvSink have. I
guess a cleaner design is to
Github user raajay commented on the issue:
https://github.com/apache/spark/pull/18683
maxSizeInFlight can be large (~100-200 MB) when (a) the available memory at
the reducer is high, or (b) the reducer spends most of its time waiting for
fetch requests. In such cases, using a hard-coded
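The sizing arithmetic at issue can be sketched as follows. In Spark's ShuffleBlockFetcherIterator, the per-request target size is derived by dividing the in-flight budget (spark.reducer.maxSizeInFlight, default 48m) by a hard-coded parallelism of 5; the `parallelFetches` parameter below is the knob this PR proposes to make configurable, and the class name is illustrative only.

```java
// Illustrative sketch of the fetch-request sizing discussed above:
// the in-flight byte budget is split into a fixed number of parallel
// fetch requests, so a larger budget with a fixed divisor yields
// proportionally larger (not more numerous) requests.
class FetchRequestSizing {
    static long targetRequestSize(long maxSizeInFlightBytes, int parallelFetches) {
        // Clamp to at least 1 byte so tiny budgets still make progress.
        return Math.max(maxSizeInFlightBytes / parallelFetches, 1L);
    }
}
```

For example, with the default 48 MB budget and the hard-coded divisor of 5, each request targets roughly 9.6 MB; raising the budget to 200 MB while keeping the divisor fixed pushes each request to about 40 MB, which motivates exposing the parallelism as a configuration.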
GitHub user raajay opened a pull request:
https://github.com/apache/spark/pull/18690
[SPARK-21334][CORE] Add metrics reporting service to External Shuffle Server
## What changes were proposed in this pull request?
Add a metrics reporting service that periodically reports
GitHub user raajay opened a pull request:
https://github.com/apache/spark/pull/18683
[SPARK-21474][CORE] Make number of parallel fetches from a reducer
configurable
## What changes were proposed in this pull request?
Currently the number of parallel fetches is hard-coded