This looks like a bug in the master branch of Spark, related to some recent changes to EventLoggingListener. You can reproduce this bug on a fresh Spark checkout by running

./bin/spark-shell --conf spark.eventLog.enabled=true --conf spark.eventLog.dir=/tmp/nonexistent-dir

where /tmp/nonexistent-dir is a directory that doesn't exist and /tmp exists.

It looks like older versions of EventLoggingListener would create the directory if it didn't exist. I think the issue here is that the error-checking code is overzealous and catches some non-error conditions, too; I've filed https://issues.apache.org/jira/browse/SPARK-5311 to investigate this.

On Sun, Jan 18, 2015 at 1:59 PM, Ganon Pierce <ganon.pie...@me.com> wrote:

> I posted about the Application WebUI error (specifically application WebUI
> not the master WebUI generally) and have spent at least a few hours a day
> for over a week trying to resolve it so I’d be very grateful for any
> suggestions. It is quite troubling that I appear to be the only one
> encountering this issue and I’ve tried to include everything here which
> might be relevant (sorry for the length). Please see the thread "Current
> Build Gives HTTP ERROR"
> https://www.mail-archive.com/user@spark.apache.org/msg18752.html to see
> specifics about the application webUI issue and the master log.
>
>
> Environment:
>
> I’m doing my spark builds and application programming in scala locally on
> my macbook pro in eclipse, using modified ec2 launch scripts to launch my
> cluster, uploading my spark builds and models to s3, and uploading
> applications to and submitting them from ec2. I’m using java 8 locally and
> also installing and using java 8 on my ec2 instances (which works with
> spark 1.2.0). I have a windows machine at home (macbook is work machine),
> but have not yet attempted to launch from there.
>
>
> Errors:
>
> I’ve built two different recent git versions of spark both multiple times,
> and when running applications both have produced an Application WebUI error
> and this exception:
>
> Exception in thread "main" java.lang.IllegalArgumentException: Log directory /tmp/spark-events does not exist.
>
> While both will display the master webUI just fine including
> running/completed applications, registered workers etc, when I try to
> access a running or completed application’s WebUI by clicking their
> respective link, I receive a server error. When I manually create the above
> log directory, the exception goes away, but the WebUI problem does not. I
> don’t have any strong evidence, but I suspect these errors and whatever is
> causing them are related.
>
>
> Why and How of Modifications to Launch Scripts for Installation of
> Unreleased Spark Versions:
>
> When using a prebuilt version of spark on my cluster everything works
> except the new methods I need, which I had previously added to my custom
> version of spark and used by building the spark-assembly.jar locally and
> then replacing the assembly file produced through the 1.1.0 ec2 launch
> scripts. However, since my pull request was accepted and can now be found
> in the apache/spark repository along with some additional features I’d like
> to use and because I’d like a more elegant permanent solution for launching
> a cluster and installing unreleased versions of spark to my ec2 clusters,
> I’ve modified the included ec2 launch scripts in this way (credit to gen
> tang here:
> https://www.mail-archive.com/user@spark.apache.org/msg18761.html):
>
> 1. Clone the most recent git version of spark
> 2. Use the make-dist script
> 3. Tar the dist folder and upload the resulting
>    spark-1.3.0-snapshot-hadoop1.tgz to s3 and change file permissions
> 4. Fork the mesos/spark-ec2 repository and modify the spark/init.sh script
>    to do a wget of my hosted distribution instead of spark’s stable release
> 5. Modify my spark_ec2.py script to point to my repository.
> 6. Modify my spark_ec2.py script to install java 8 on my ec2 instances.
>    (This works and does not produce the above stated errors when using a
>    stable release like 1.2.0).
>
>
> Additional Possibly Related Info:
>
> As far as I can tell (I went through line by line), when I launch my
> recent build vs when I launch the most recent stable release the console
> prints almost identical INFO and WARNINGS except where you would expect
> things to be different e.g. version numbers. I’ve noted that after launch
> the prebuilt stable version does not have a /tmp/spark-events directory,
> but it is created when the application is launched, while it is never
> created in my build. Further, in my unreleased builds the application logs
> that I find are always stored as .inprogress files (when I set the logging
> directory to /root/ or add the /tmp/spark-events directory manually) even
> after completion, which I believe is supposed to change to .completed (or
> something similar) when the application finishes.
>
>
> Thanks for any help!
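
Until SPARK-5311 is resolved, creating the event log directory before launching should avoid the IllegalArgumentException, as noted in the quoted message (it may not fix the application WebUI error). A minimal sketch, assuming the log directory is on the local filesystem; the ensure_event_log_dir helper and the EVENT_LOG_DIR variable are hypothetical names used only for illustration:

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): create the event log directory
# if it is missing, then confirm it exists and is writable.
ensure_event_log_dir() {
    dir="$1"
    mkdir -p "$dir" || return 1
    [ -d "$dir" ] && [ -w "$dir" ]
}

# Assumption: /tmp/spark-events is the directory named in the exception;
# override EVENT_LOG_DIR if spark.eventLog.dir points elsewhere.
EVENT_LOG_DIR="${EVENT_LOG_DIR:-/tmp/spark-events}"
ensure_event_log_dir "$EVENT_LOG_DIR"

# Then launch with the same flags as in the reproduction above, for example:
# ./bin/spark-shell --conf spark.eventLog.enabled=true --conf spark.eventLog.dir="$EVENT_LOG_DIR"
```

On a cluster the directory would presumably need to exist wherever the driver writes its event log, so the same step may have to run on that machine.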