Re: Where can I find logs set inside RDD processing functions?

2015-02-06 Thread Nitin kak
YARN log aggregation is enabled, and the logs I get through
yarn logs -applicationId your_application_id
are no different from what I see in the YARN application tracking URL. They
still don't contain the logs above.

On Fri, Feb 6, 2015 at 3:36 PM, Petar Zecevic petar.zece...@gmail.com
wrote:


 You can enable YARN log aggregation (set yarn.log-aggregation-enable to true)
 and execute the command
 yarn logs -applicationId your_application_id
 after your application finishes.

 Or you can look at them directly in HDFS under
 /tmp/logs/user/logs/applicationid/hostname

 On 6.2.2015. 19:50, nitinkak001 wrote:

 I am trying to debug my mapPartitions function. Here is the code. I am
 trying to log in two ways, using log.info() and println(). I am running in
 yarn-cluster mode. While I can see the logs from the driver code, I am not
 able to see logs from the map/mapPartitions functions in the application
 tracking URL. Where can I find those logs?

 var outputRDD = partitionedRDD.mapPartitions(p => {
   val outputList = new ArrayList[scala.Tuple3[Long, Long, Int]]
   p.map({ case (key, value) => {
     log.info("Inside map")
     println("Inside map")
     // requires java.util.ArrayList; outputTuples is defined elsewhere
     // in the original code
     for (i <- 0 until outputTuples.size()) {
       val outputRecord = outputTuples.get(i)
       if (outputRecord != null) {
         outputList.add((outputRecord.getCurrRecordProfileID(),
           outputRecord.getWindowRecordProfileID, outputRecord.getScore()))
       }
     }
   }})
   outputList.iterator()
 })

 Here is my log4j.properties

 log4j.rootCategory=INFO, console
 log4j.appender.console=org.apache.log4j.ConsoleAppender
 log4j.appender.console.target=System.err
 log4j.appender.console.layout=org.apache.log4j.PatternLayout
 log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

 # Settings to quiet third party logs that are too verbose
 log4j.logger.org.eclipse.jetty=WARN
 log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
 log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
 log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO









Re: Where can I find logs set inside RDD processing functions?

2015-02-06 Thread Ted Yu
To add to what Petar said: when YARN log aggregation is enabled, consider
specifying yarn.nodemanager.remote-app-log-dir, which is where the
aggregated logs are saved.
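
For illustration, a minimal yarn-site.xml sketch of the two settings
mentioned in this thread (the directory value is only an example):

  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/tmp/logs</value>
  </property>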

Cheers

On Fri, Feb 6, 2015 at 12:36 PM, Petar Zecevic petar.zece...@gmail.com
wrote:


 You can enable YARN log aggregation (set yarn.log-aggregation-enable to true)
 and execute the command
 yarn logs -applicationId your_application_id
 after your application finishes.

 Or you can look at them directly in HDFS under
 /tmp/logs/user/logs/applicationid/hostname


 On 6.2.2015. 19:50, nitinkak001 wrote:

 I am trying to debug my mapPartitions function. Here is the code. I am
 trying to log in two ways, using log.info() and println(). I am running in
 yarn-cluster mode. While I can see the logs from the driver code, I am not
 able to see logs from the map/mapPartitions functions in the application
 tracking URL. Where can I find those logs?

 var outputRDD = partitionedRDD.mapPartitions(p => {
   val outputList = new ArrayList[scala.Tuple3[Long, Long, Int]]
   p.map({ case (key, value) => {
     log.info("Inside map")
     println("Inside map")
     // requires java.util.ArrayList; outputTuples is defined elsewhere
     // in the original code
     for (i <- 0 until outputTuples.size()) {
       val outputRecord = outputTuples.get(i)
       if (outputRecord != null) {
         outputList.add((outputRecord.getCurrRecordProfileID(),
           outputRecord.getWindowRecordProfileID, outputRecord.getScore()))
       }
     }
   }})
   outputList.iterator()
 })

 Here is my log4j.properties

 log4j.rootCategory=INFO, console
 log4j.appender.console=org.apache.log4j.ConsoleAppender
 log4j.appender.console.target=System.err
 log4j.appender.console.layout=org.apache.log4j.PatternLayout
 log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

 # Settings to quiet third party logs that are too verbose
 log4j.logger.org.eclipse.jetty=WARN
 log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
 log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
 log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO








Where can I find logs set inside RDD processing functions?

2015-02-06 Thread nitinkak001
I am trying to debug my mapPartitions function. Here is the code. I am
trying to log in two ways, using log.info() and println(). I am running in
yarn-cluster mode. While I can see the logs from the driver code, I am not
able to see logs from the map/mapPartitions functions in the application
tracking URL. Where can I find those logs?

var outputRDD = partitionedRDD.mapPartitions(p => {
  val outputList = new ArrayList[scala.Tuple3[Long, Long, Int]]
  p.map({ case (key, value) => {
    log.info("Inside map")
    println("Inside map")
    // requires java.util.ArrayList; outputTuples is defined elsewhere
    // in the original code
    for (i <- 0 until outputTuples.size()) {
      val outputRecord = outputTuples.get(i)
      if (outputRecord != null) {
        outputList.add((outputRecord.getCurrRecordProfileID(),
          outputRecord.getWindowRecordProfileID, outputRecord.getScore()))
      }
    }
  }})
  outputList.iterator()
})

Here is my log4j.properties

log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Settings to quiet third party logs that are too verbose
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
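
For reference, a minimal self-contained sketch of the same pattern (names
taken from the code above; the logger is created inside the closure so it
is instantiated on each executor instead of being serialized from the
driver):

  val logged = partitionedRDD.mapPartitions { iter =>
    // created on the executor, so the logger never has to be serialized
    val log = org.apache.log4j.Logger.getLogger("partition-logger")
    iter.map { case (key, value) =>
      log.info("Inside map")  // ends up in the executor's container log
      println("Inside map")   // ends up in the executor's stdout file
      (key, value)
    }
  }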







Re: Where can I find logs set inside RDD processing functions?

2015-02-06 Thread Petar Zecevic


You can enable YARN log aggregation (set yarn.log-aggregation-enable to
true) and execute the command

yarn logs -applicationId your_application_id

after your application finishes.

Or you can look at them directly in HDFS under
/tmp/logs/user/logs/applicationid/hostname
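
A quick shell sketch of that flow (the application id and user name are
placeholders, and the HDFS layout assumes the default remote-app-log-dir
of /tmp/logs):

  # fetch the aggregated container logs once the application has finished
  yarn logs -applicationId your_application_id > app.log

  # or browse the raw aggregated files in HDFS
  hdfs dfs -ls /tmp/logs/your_user/logs/your_application_id/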


On 6.2.2015. 19:50, nitinkak001 wrote:

I am trying to debug my mapPartitions function. Here is the code. I am
trying to log in two ways, using log.info() and println(). I am running in
yarn-cluster mode. While I can see the logs from the driver code, I am not
able to see logs from the map/mapPartitions functions in the application
tracking URL. Where can I find those logs?

var outputRDD = partitionedRDD.mapPartitions(p => {
  val outputList = new ArrayList[scala.Tuple3[Long, Long, Int]]
  p.map({ case (key, value) => {
    log.info("Inside map")
    println("Inside map")
    // requires java.util.ArrayList; outputTuples is defined elsewhere
    // in the original code
    for (i <- 0 until outputTuples.size()) {
      val outputRecord = outputTuples.get(i)
      if (outputRecord != null) {
        outputList.add((outputRecord.getCurrRecordProfileID(),
          outputRecord.getWindowRecordProfileID, outputRecord.getScore()))
      }
    }
  }})
  outputList.iterator()
})

Here is my log4j.properties

log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Settings to quiet third party logs that are too verbose
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO







Re: Where can I find logs set inside RDD processing functions?

2015-02-06 Thread Nitin kak
yarn.nodemanager.remote-app-log-dir is set to /tmp/logs.
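
A sketch of how the aggregated output can then be searched for the
statements in question (the application id is a placeholder):

  yarn logs -applicationId your_application_id | grep "Inside map"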

On Fri, Feb 6, 2015 at 4:14 PM, Ted Yu yuzhih...@gmail.com wrote:

 To add to what Petar said: when YARN log aggregation is enabled, consider
 specifying yarn.nodemanager.remote-app-log-dir, which is where the
 aggregated logs are saved.

 Cheers

 On Fri, Feb 6, 2015 at 12:36 PM, Petar Zecevic petar.zece...@gmail.com
 wrote:


 You can enable YARN log aggregation (set yarn.log-aggregation-enable to true)
 and execute the command
 yarn logs -applicationId your_application_id
 after your application finishes.

 Or you can look at them directly in HDFS under
 /tmp/logs/user/logs/applicationid/hostname


 On 6.2.2015. 19:50, nitinkak001 wrote:

 I am trying to debug my mapPartitions function. Here is the code. I am
 trying to log in two ways, using log.info() and println(). I am running in
 yarn-cluster mode. While I can see the logs from the driver code, I am not
 able to see logs from the map/mapPartitions functions in the application
 tracking URL. Where can I find those logs?

 var outputRDD = partitionedRDD.mapPartitions(p => {
   val outputList = new ArrayList[scala.Tuple3[Long, Long, Int]]
   p.map({ case (key, value) => {
     log.info("Inside map")
     println("Inside map")
     // requires java.util.ArrayList; outputTuples is defined elsewhere
     // in the original code
     for (i <- 0 until outputTuples.size()) {
       val outputRecord = outputTuples.get(i)
       if (outputRecord != null) {
         outputList.add((outputRecord.getCurrRecordProfileID(),
           outputRecord.getWindowRecordProfileID, outputRecord.getScore()))
       }
     }
   }})
   outputList.iterator()
 })

 Here is my log4j.properties

 log4j.rootCategory=INFO, console
 log4j.appender.console=org.apache.log4j.ConsoleAppender
 log4j.appender.console.target=System.err
 log4j.appender.console.layout=org.apache.log4j.PatternLayout
 log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

 # Settings to quiet third party logs that are too verbose
 log4j.logger.org.eclipse.jetty=WARN
 log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
 log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
 log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO



