Interesting. For #3:
bq. reading data from,

I guess you meant reading from disk.

On Wed, Apr 20, 2016 at 10:45 AM, atootoonchian <a...@levyx.com> wrote:

> The current Spark logging mechanism can be improved by adding the following
> parameters. They will help in understanding system bottlenecks and provide
> useful guidelines for Spark application developers to design an optimized
> application.
>
> 1. Shuffle Read Local Time: Time for a task to read shuffle data from local
> storage.
> 2. Shuffle Read Remote Time: Time for a task to read shuffle data from a
> remote node.
> 3. Distribution of processing time between computation, I/O, and network: Show
> the distribution of each task's processing time between computation, reading
> data from, and reading data from network.
> 4. Average I/O bandwidth: Average I/O throughput for each task when
> it fetches data from disk.
> 5. Average network bandwidth: Average network throughput for each task when
> it fetches data from remote nodes.
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/Improving-system-design-logging-in-spark-tp17291.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.
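For metrics 4 and 5, a minimal sketch of how the averages could be derived, assuming per-task byte and time counters are already being recorded (the field names below are made up for illustration; they are not actual Spark TaskMetrics fields):

```python
# Hypothetical per-task measurements: bytes read and seconds spent,
# split into local (disk) and remote (network) shuffle reads.
tasks = [
    {"local_bytes": 64 * 1024**2, "local_read_s": 0.5,
     "remote_bytes": 128 * 1024**2, "remote_read_s": 2.0},
    {"local_bytes": 32 * 1024**2, "local_read_s": 0.25,
     "remote_bytes": 256 * 1024**2, "remote_read_s": 4.0},
]

def avg_bandwidth_mb_s(tasks, bytes_key, time_key):
    """Average per-task throughput in MB/s for the given byte/time fields."""
    rates = [t[bytes_key] / t[time_key] / 1024**2 for t in tasks]
    return sum(rates) / len(rates)

io_bw = avg_bandwidth_mb_s(tasks, "local_bytes", "local_read_s")    # metric 4
net_bw = avg_bandwidth_mb_s(tasks, "remote_bytes", "remote_read_s") # metric 5
print(f"avg I/O bandwidth: {io_bw:.0f} MB/s, "
      f"avg network bandwidth: {net_bw:.0f} MB/s")
# → avg I/O bandwidth: 128 MB/s, avg network bandwidth: 64 MB/s
```

Averaging per-task rates (rather than dividing total bytes by total time) keeps slow stragglers visible in the metric, which seems closer to the per-task framing of the proposal.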