Re: Where can I find logs set inside RDD processing functions?
YARN log aggregation is enabled, and the logs I get through yarn logs -applicationId your_application_id are no different from what I see in the YARN application tracking URL. They still don't contain the logs above.

On Fri, Feb 6, 2015 at 3:36 PM, Petar Zecevic petar.zece...@gmail.com wrote:

You can enable YARN log aggregation (set yarn.log-aggregation-enable to true) and run yarn logs -applicationId your_application_id after your application finishes. Or you can look at the logs directly in HDFS, under /tmp/logs/<user>/logs/<applicationid>/<hostname>.

On 6.2.2015. 19:50, nitinkak001 wrote:

I am trying to debug my mapPartitions function. Here is the code. I am trying to log in two ways, with log.info() and with println(). I am running in yarn-cluster mode. While I can see the logs from the driver code, I am not able to see logs from the map/mapPartitions functions in the application tracking URL. Where can I find the logs?

    var outputRDD = partitionedRDD.mapPartitions(p => {
      val outputList = new ArrayList[scala.Tuple3[Long, Long, Int]]
      p.map({ case (key, value) => {
        log.info("Inside map")
        println("Inside map")
        for (i <- 0 until outputTuples.size()) {
          val outputRecord = outputTuples.get(i)
          if (outputRecord != null) {
            outputList.add((outputRecord.getCurrRecordProfileID(),
              outputRecord.getWindowRecordProfileID,
              outputRecord.getScore()))
          }
        }
      }})
      outputList.iterator()
    })

Here is my log4j.properties:

    log4j.rootCategory=INFO, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.target=System.err
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

    # Settings to quiet third party logs that are too verbose
    log4j.logger.org.eclipse.jetty=WARN
    log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
    log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
    log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO

-- View this
message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Where-can-I-find-logs-set-inside-RDD-processing-functions-tp21537.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
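[Editor's note: a likely reason no "Inside map" lines appear anywhere is that the quoted code never consumes the iterator returned by p.map — map on a Scala Iterator is lazy, so its body (including the log calls) never executes before outputList.iterator() is returned. A minimal sketch in plain Scala, no Spark required; the tuple payload here is illustrative only:]

```scala
// Plain-Scala sketch of the pitfall in the quoted mapPartitions code:
// Iterator.map is lazy, so side effects in its body (the log lines) never
// run unless the mapped iterator is consumed. foreach is eager and fixes it.
import scala.collection.mutable.ArrayBuffer

def processPartition(p: Iterator[(Long, Long)]): Iterator[(Long, Long, Int)] = {
  val outputList = ArrayBuffer.empty[(Long, Long, Int)]
  p.foreach { case (key, value) =>
    // On a cluster, this println goes to the executor's stdout log,
    // not to the driver console or the driver section of the tracking UI.
    println(s"Inside partition function: key=$key")
    outputList += ((key, value, 1))
  }
  outputList.iterator
}
```

[With the original p.map, the equivalent foreach-free version would return an empty list; consuming eagerly makes the side effects and the accumulated output deterministic.]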
Re: Where can I find logs set inside RDD processing functions?
To add to what Petar said: when YARN log aggregation is enabled, consider specifying yarn.nodemanager.remote-app-log-dir, which is where the aggregated logs are saved.

Cheers

On Fri, Feb 6, 2015 at 12:36 PM, Petar Zecevic petar.zece...@gmail.com wrote:
[Petar's answer and the original question, quoted in full earlier in the thread; snipped]
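[Editor's note: the two properties mentioned in this thread are set in yarn-site.xml. A sketch of how they might look — the directory value is just this thread's example, and defaults vary by Hadoop distribution:]

```xml
<!-- yarn-site.xml: settings discussed in this thread (values are examples) -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <!-- HDFS directory where aggregated container logs are stored -->
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/tmp/logs</value>
</property>
```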
Re: Where can I find logs set inside RDD processing functions?
You can enable YARN log aggregation (set yarn.log-aggregation-enable to true) and run yarn logs -applicationId your_application_id after your application finishes. Or you can look at the logs directly in HDFS, under /tmp/logs/<user>/logs/<applicationid>/<hostname>.

On 6.2.2015. 19:50, nitinkak001 wrote:
[original question and code, quoted in full earlier in the thread; snipped]
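[Editor's note: a related stumbling block when logging from inside RDD closures is that log4j Logger instances are not serializable, so capturing one from the driver into the closure fails or misbehaves. One common pattern is to resolve the logger lazily on the executor. A dependency-free sketch — java.util.logging stands in for log4j here so it runs anywhere, and all names are illustrative:]

```scala
import java.util.logging.Logger

// Holder object: the logger is created on whichever JVM first touches it
// (i.e., on each executor); @transient keeps it out of any serialized state.
object ExecutorLog extends Serializable {
  @transient lazy val log: Logger = Logger.getLogger("partition-fn")
}

// A partition function that logs on the executor and tags each record.
// In yarn-cluster mode this log line lands in the executor's container log,
// which is what "yarn logs -applicationId ..." collects after the run.
def tagRecords(it: Iterator[(Long, Long)]): Iterator[(Long, Long, Int)] = {
  ExecutorLog.log.info("processing a partition")
  it.map { case (k, v) => (k, v, 1) }
}
```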
Re: Where can I find logs set inside RDD processing functions?
yarn.nodemanager.remote-app-log-dir is set to /tmp/logs.

On Fri, Feb 6, 2015 at 4:14 PM, Ted Yu yuzhih...@gmail.com wrote:
[Ted's suggestion, with Petar's answer and the original question, quoted in full earlier in the thread; snipped]
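[Editor's note: putting the thread's pieces together, the aggregated logs should end up under the remote-app-log-dir reported above, in the per-user layout Petar described. A small sketch of that path layout — the exact structure varies across Hadoop versions, so treat this as an assumption mirroring the thread's example only:]

```scala
// Builds the aggregated-log HDFS directory described in this thread:
// <remote-app-log-dir>/<user>/logs/<applicationId>, with one log file
// per node inside it. Layout mirrors the thread's example, not a spec.
def aggregatedLogDir(remoteAppLogDir: String, user: String, appId: String): String =
  s"$remoteAppLogDir/$user/logs/$appId"
```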