Kay Ousterhout created SPARK-6066:
-------------------------------------

             Summary: Metadata in event log makes it very difficult for external libraries to parse event log
                 Key: SPARK-6066
                 URL: https://issues.apache.org/jira/browse/SPARK-6066
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.3.0
            Reporter: Kay Ousterhout
            Assignee: Andrew Or
            Priority: Blocker

The fix for SPARK-2261 added a line at the beginning of the event log that encodes metadata. This line makes it much more difficult to parse the event logs from external libraries (like https://github.com/kayousterhout/trace-analysis, which is used by folks at Berkeley) because:

(1) The metadata is not written as JSON, unlike the rest of the file.
(2) More annoyingly, if the file is compressed, the metadata is not compressed. This has two side effects: first, you can't just decompress the file from the command line and then look at the logs, because the file is in this weird half-compressed format; second, external tools that parse these logs now also need to handle this format.

We should fix this before the 1.3 release, because otherwise we'll have to add a bunch more backward-compatibility code to handle this format!
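
To make point (2) concrete, here is a minimal sketch of what an external parser has to do with a half-compressed file, compared with a file that is compressed from the first byte. Everything in it is an assumption for illustration: the header text is hypothetical (this report does not give the real layout), gzip stands in for whatever codec the log actually uses, and the sample events are made up.

```python
import gzip
import json

# Hypothetical metadata line. The real header layout is not given in this report.
HEADER = b"hypothetical-metadata version=1.3.0 codec=gzip\n"


def build_half_compressed_log(events):
    """Return an uncompressed header line followed by gzip-compressed JSON lines."""
    body = "".join(json.dumps(e) + "\n" for e in events).encode("utf-8")
    return HEADER + gzip.compress(body)


def parse_half_compressed_log(raw):
    """Split off the plain-text first line, then decompress and parse the rest."""
    header, sep, compressed = raw.partition(b"\n")
    if not sep:
        raise ValueError("no header line found")
    body = gzip.decompress(compressed).decode("utf-8")
    events = [json.loads(line) for line in body.splitlines() if line]
    return header.decode("utf-8"), events


def parse_uniform_log(raw):
    """Parse a log that is compressed JSON lines from the first byte onward."""
    body = gzip.decompress(raw).decode("utf-8")
    return [json.loads(line) for line in body.splitlines() if line]


if __name__ == "__main__":
    # Made-up sample events, one JSON object per line.
    sample = [{"Event": "ExampleStart", "App Name": "demo"}, {"Event": "ExampleEnd"}]
    raw = build_half_compressed_log(sample)

    # A standard decompressor rejects the file because of the plain-text prefix.
    try:
        parse_uniform_log(raw)
    except OSError as err:
        print("plain decompression failed:", err)

    # The special-case parser has to know about the header to get at the events.
    header, events = parse_half_compressed_log(raw)
    print(header)
    print(events)
```

The special-case splitting in `parse_half_compressed_log` is the extra work every external tool would need to carry. If the metadata were instead written as a JSON line inside the compressed stream, `parse_uniform_log` alone would be enough.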