[GitHub] spark issue #19978: [SPARK-22784][CORE][WIP] Configure reading buffer size i...

2018-01-23 Thread MikhailErofeev
Github user MikhailErofeev commented on the issue: https://github.com/apache/spark/pull/19978 @srowen, yes, the processing is no longer IO-bound after backporting SPARK-20923 --- - To unsubscribe, e-mail: reviews

[GitHub] spark pull request #19978: [SPARK-22784][CORE][WIP] Configure reading buffer...

2018-01-23 Thread MikhailErofeev
Github user MikhailErofeev closed the pull request at: https://github.com/apache/spark/pull/19978

[GitHub] spark issue #19978: [SPARK-22784][CORE] Configure reading buffer size in Spa...

2017-12-16 Thread MikhailErofeev
Github user MikhailErofeev commented on the issue: https://github.com/apache/spark/pull/19978 @squito Your guess was right, and I can remove these blocks by applying https://issues.apache.org/jira/browse/SPARK-20923. I will test the performance after this patch and refine or close

[GitHub] spark issue #19978: [SPARK-22784][CORE] Configure reading buffer size in Spa...

2017-12-15 Thread MikhailErofeev
Github user MikhailErofeev commented on the issue: https://github.com/apache/spark/pull/19978 Thanks for the constructive feedback. Here is my benchmark for a step of 1MB. During this run the speedup was 23%; I think there was some interference on my workstation. ``` 2048

[GitHub] spark issue #19978: [SPARK-22784][CORE] Configure reading buffer size in Spa...

2017-12-14 Thread MikhailErofeev
Github user MikhailErofeev commented on the issue: https://github.com/apache/spark/pull/19978 I don't mind just setting it to a higher value. Moreover, the current default value (2048) is small in any case. For my log files, a 30M buffer was the best value (a bigger one did

[GitHub] spark pull request #19978: [SPARK-22784][CORE] Configure reading buffer size...

2017-12-14 Thread MikhailErofeev
GitHub user MikhailErofeev opened a pull request: https://github.com/apache/spark/pull/19978 [SPARK-22784][CORE] Configure reading buffer size in Spark History Server ## What changes were proposed in this pull request? Added debug logging of time spent and line size for each job