Not that I can think of. If you have the Spark History Server running, that may be another place to look.
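
Note the History Server will only have something to show for your application if event logging was enabled when it ran. As a rough sketch, the relevant spark-defaults.conf entries look like this (the HDFS path is only an example; point both properties at whatever directory your cluster uses):

    # example path only -- both properties should point at the same directory
    spark.eventLog.enabled           true
    spark.eventLog.dir               hdfs:///spark-logs
    spark.history.fs.logDirectory    hdfs:///spark-logs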
On Mon, Jul 31, 2017 at 9:48 AM, John Zeng <johnz...@hotmail.com> wrote:

> Hi, Ayan,
>
> Thanks for the suggestion. I did that and got the following weird message
> even though I enabled log aggregation:
>
> [root@john1 conf]# yarn logs -applicationId application_1501197841826_0013
> 17/07/30 16:45:06 INFO client.RMProxy: Connecting to ResourceManager at
> john1.dg/192.168.6.90:8032
> /tmp/logs/root/logs/application_1501197841826_0013 does not exist.
> Log aggregation has not completed or is not enabled.
>
> Any other way to see my logs?
>
> Thanks
>
> John
>
> ------------------------------
> *From:* ayan guha <guha.a...@gmail.com>
> *Sent:* Sunday, July 30, 2017 10:34 PM
> *To:* John Zeng; Riccardo Ferrari
> *Cc:* User
> *Subject:* Re: Logging in RDD mapToPair of Java Spark application
>
> Hi
>
> As you are using YARN log aggregation, YARN moves all the logs to HDFS
> after the application completes.
>
> You can use the following command to get the logs:
> yarn logs -applicationId <your application id>
>
> On Mon, 31 Jul 2017 at 3:17 am, John Zeng <johnz...@hotmail.com> wrote:
>
>> Thanks, Riccardo, for the valuable info.
>>
>> Following your guidance, I looked at the Spark UI and figured out that
>> the default log location for executors is 'yarn/container-logs'. I ran
>> my Spark app again and I can see that a new folder was created for it:
>>
>> [root@john2 application_1501197841826_0013]# ls -l
>> total 24
>> drwx--x--- 2 yarn yarn 4096 Jul 30 10:07 container_1501197841826_0013_01_000001
>> drwx--x--- 2 yarn yarn 4096 Jul 30 10:08 container_1501197841826_0013_01_000002
>> drwx--x--- 2 yarn yarn 4096 Jul 30 10:08 container_1501197841826_0013_01_000003
>> drwx--x--- 2 yarn yarn 4096 Jul 30 10:08 container_1501197841826_0013_02_000001
>> drwx--x--- 2 yarn yarn 4096 Jul 30 10:08 container_1501197841826_0013_02_000002
>> drwx--x--- 2 yarn yarn 4096 Jul 30 10:08 container_1501197841826_0013_02_000003
>>
>> But when I tried to look into their contents, they were gone and there
>> were no files at all in the same place:
>>
>> [root@john2 application_1501197841826_0013]# vi container_1501197841826_0013_*
>> [root@john2 application_1501197841826_0013]# ls -l
>> total 0
>> [root@john2 application_1501197841826_0013]# pwd
>> /yarn/container-logs/application_1501197841826_0013
>>
>> I believe Spark moves these logs to a different place. But where are
>> they?
>>
>> Thanks
>>
>> John
>>
>> ------------------------------
>> *From:* Riccardo Ferrari <ferra...@gmail.com>
>> *Sent:* Saturday, July 29, 2017 8:18 PM
>> *To:* johnzengspark
>> *Cc:* User
>> *Subject:* Re: Logging in RDD mapToPair of Java Spark application
>>
>> Hi John,
>>
>> The reason you don't see the second sysout line is that it is executed
>> on a different JVM (i.e. driver vs. executor). The second sysout line
>> should be available through the executor logs. Check the Executors tab.
>>
>> There are alternative approaches to managing log centralization, but it
>> really depends on what your requirements are.
>>
>> Hope it helps,
>>
>> On Sat, Jul 29, 2017 at 8:09 PM, johnzengspark <johnz...@hotmail.com>
>> wrote:
>>
>>> Hi, All,
>>>
>>> Although there are lots of discussions related to logging in this news
>>> group, I did not find an answer to my specific question, so I am posting
>>> mine with the hope that this will not be a duplicate question.
>>>
>>> Here is my simplified Java testing Spark app:
>>>
>>> import org.apache.spark.SparkConf;
>>> import org.apache.spark.api.java.JavaRDD;
>>> import org.apache.spark.api.java.JavaSparkContext;
>>> import org.apache.spark.api.java.function.PairFunction;
>>> import scala.Tuple2;
>>>
>>> public class SparkJobEntry {
>>>     public static void main(String[] args) {
>>>         // Following line is in stdout from the JobTracker UI
>>>         System.out.println("argc=" + args.length);
>>>
>>>         SparkConf conf = new SparkConf().setAppName("TestSparkApp");
>>>         JavaSparkContext sc = new JavaSparkContext(conf);
>>>         JavaRDD<String> fileRDD = sc.textFile(args[0]);
>>>
>>>         fileRDD.mapToPair(new PairFunction<String, String, String>() {
>>>
>>>             private static final long serialVersionUID = 1L;
>>>
>>>             @Override
>>>             public Tuple2<String, String> call(String input) throws Exception {
>>>                 // Following line is NOT in stdout from the JobTracker UI
>>>                 System.out.println("This line should be printed in stdout");
>>>                 // Other code removed from here to make things simple
>>>                 return new Tuple2<String, String>("1", "Testing data");
>>>             }
>>>         }).saveAsTextFile(args[0] + ".results");
>>>     }
>>> }
>>>
>>> What I expected from the JobTracker UI was to see both stdout lines: the
>>> first line is "argc=2" and the second line is "This line should be
>>> printed in stdout". But I only see the first line, which is outside of
>>> the 'mapToPair'. I have actually verified that my 'mapToPair' is called
>>> and that the statements after the second logging line were executed. The
>>> only issue for me is why the second logging line is not in the JobTracker
>>> UI.
>>>
>>> Appreciate your help.
>>>
>>> Thanks
>>>
>>> John
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Logging-in-RDD-mapToPair-of-Java-Spark-application-tp29007.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>>
>>
> --
> Best Regards,
> Ayan Guha

--
Best Regards,
Ayan Guha
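
One more thought on the original question: the println inside 'mapToPair' does execute, but it goes to the stdout file of whichever executor container ran the task, never to the driver's output, so it will only ever turn up in the container logs (or in 'yarn logs' output once aggregation has completed). If the goal is messages with timestamps and levels in those same executor logs, a common alternative is to log through the log4j that ships with Spark. A minimal sketch of the same job, assuming stock log4j 1.x and an arbitrary logger name; the logger is looked up inside call() because log4j's Logger is not serializable and must not be captured by the closure:

    import org.apache.log4j.Logger;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.PairFunction;
    import scala.Tuple2;

    public class SparkJobEntry {
        public static void main(String[] args) {
            JavaSparkContext sc = new JavaSparkContext(
                    new SparkConf().setAppName("TestSparkApp"));

            sc.textFile(args[0])
              .mapToPair(new PairFunction<String, String, String>() {
                  private static final long serialVersionUID = 1L;

                  @Override
                  public Tuple2<String, String> call(String input) throws Exception {
                      // Looked up per call, on the executor, so nothing
                      // non-serializable rides along with the closure;
                      // getLogger caches, so repeated lookups are cheap.
                      Logger log = Logger.getLogger("SparkJobEntry.map");
                      log.info("Processing one record on an executor");
                      return new Tuple2<String, String>("1", "Testing data");
                  }
              })
              .saveAsTextFile(args[0] + ".results");

            sc.stop();
        }
    }

These messages land in each executor's stderr, which is browsable from the Executors tab of the Spark UI while the application runs.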