I didn't know about the location of those logs, so thanks for pointing that out. But no, there's nothing useful in them.
Here's an example:

log4j:WARN No appenders could be found for logger (akka.event.slf4j.Slf4jEventHandler).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Spark Executor Command: "java" "-cp" ":/opt/geo/midas/spark/spark/conf:/opt/geo/midas/spark/spark/assembly/target/scala-2.9.3/spark-assembly-0.8.0-SNAPSHOT-hadoop2.0.0-cdh4.3.0.jar" "-Xms512M" "-Xmx512M" "org.apache.spark.executor.StandaloneExecutorBackend" "akka://[email protected]:60874/user/StandaloneScheduler" "3" "rd.abc.xyz.com" "4"

I'm able to write to HDFS from the spark-shell just fine. It's only this separate application mode that's not working.

On Sep 13, 2013, at 10:52 AM, Matei Zaharia <[email protected]> wrote:

> Have you looked at the stdout and stderr files created for the job on the
> worker nodes? By default they're in the "work" directory under SPARK_HOME.
>
> In my experience this either means no write permissions to the filesystem, or
> no Java found.
>
> Matei
>
> On Sep 12, 2013, at 10:59 PM, Vipul Pandey <[email protected]> wrote:
>
>> - Master Branch
>> - Standalone Mode
>>
>> I'm able to run some basic commands on the Spark Shell. But when I
>> package the same commands in a Scala object in my app and run it separately
>> - it just fails within a second without giving any reason.
>>
>> This is what I see in the master logs:
>>
>> 13/09/12 22:20:22 INFO Master: Registering app indexXformation
>> 13/09/12 22:20:22 INFO Master: Registered app indexXformation with ID
>> app-20130912222022-0010
>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/0
>> on worker worker-20130911184654.xyz.abc..com-57772
>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/1
>> on worker worker-20130912180752-abc-vm0105.xyz.com-43175
>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/2
>> on worker worker-20130911184654-abc-vm0108.xyz.com-39247
>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/3
>> on worker worker-20130912183838-abc-vm0105.xyz.com-43175
>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/4
>> on worker worker-20130911184654-abc-vm0109.xyz.com-57730
>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/5
>> on worker worker-20130911184654-abc-vm0105.xyz.com-43175
>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/6
>> on worker worker-20130911184654-abc-vm0106.xyz.com-43044
>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/0
>> because it is FAILED
>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/7
>> on worker worker-20130911184654-abc-vm0107.xyz.com-57772
>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/1
>> because it is FAILED
>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/8
>> on worker worker-20130912180752-abc-vm0105.xyz.com-43175
>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/2
>> because it is FAILED
>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/9
>> on worker worker-20130911184654-abc-vm0108.xyz.com-39247
>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/4
>> because it is FAILED
>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/10
>> on worker worker-20130911184654-abc-vm0109.xyz.com-57730
>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/3
>> because it is FAILED
>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/11
>> on worker worker-20130912183838-abc-vm0105.xyz.com-43175
>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/6
>> because it is FAILED
>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/12
>> on worker worker-20130911184654-abc-vm0106.xyz.com-43044
>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/7
>> because it is FAILED
>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/13
>> on worker worker-20130911184654-abc-vm0107.xyz.com-57772
>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/5
>> because it is FAILED
>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/14
>> on worker worker-20130911184654-abc-vm0105.xyz.com-43175
>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/8
>> because it is FAILED
>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/15
>> on worker worker-20130912180752-abc-vm0105.xyz.com-43175
>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/9
>> because it is FAILED
>> 13/09/12 22:20:22 ERROR Master: Application indexXformation with ID
>> app-20130912222022-0010 failed 10 times, removing it
>> 13/09/12 22:20:22 INFO Master: Removing app app-20130912222022-0010
>>
>> and the slave logs say nothing at all.
>>
>> I read lines from a file and transform them into a different form. The
>> "transformedRDD".first runs just fine and prints out the first value but
>> RDD.count just fails without any reason.
>> I'm unable to find out why. I'm deploying the correct spark jar file in my
>> project as well.
>> Any clues anyone?
>> again, this is the master branch and in standalone mode.
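
For anyone hitting the same thing: below is a minimal sketch, in Python, of one way to go through the executor stdout and stderr files mentioned above in a single pass on a worker node. It assumes only that those files sit somewhere under the "work" directory of SPARK_HOME, as described in the quoted reply. The function name tail_executor_logs and the max_lines parameter are made up for illustration and are not part of Spark.

```python
import os


def tail_executor_logs(work_dir, max_lines=20):
    """Return {path: last max_lines lines} for each stdout/stderr file under work_dir."""
    tails = {}
    # Walk the whole tree so the sketch does not depend on the exact
    # sub-directory layout the worker uses below the "work" directory.
    for root, _dirs, files in os.walk(work_dir):
        for name in files:
            if name in ("stdout", "stderr"):
                path = os.path.join(root, name)
                with open(path, "r", errors="replace") as handle:
                    lines = handle.read().splitlines()
                tails[path] = lines[-max_lines:]
    return tails


if __name__ == "__main__":
    # Assumption: SPARK_HOME is set in the environment; otherwise fall back
    # to a "work" directory under the current directory.
    work_dir = os.path.join(os.environ.get("SPARK_HOME", "."), "work")
    for path, lines in sorted(tail_executor_logs(work_dir).items()):
        print("==> %s <==" % path)
        print("\n".join(lines) if lines else "(empty)")
```

Run it on each worker node. If every stderr file contains only the log4j warnings and every stdout file only the executor command, as in the example above, the files themselves do not explain the failure and the permissions and Java checks suggested in the quoted reply are the next thing to look at.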
