Re: RFC: varargs in Logging.scala?
On Thu, Apr 10, 2014 at 5:46 PM, Michael Armbrust mich...@databricks.com wrote: ... all of the suffer from the fact that the log message needs to be built even though it might not be used. This is not true of the current implementation (and this is actually why Spark has a logging trait instead of just using a logger directly.) If you look at the original function signatures: protected def logDebug(msg: = String) ... The = implies that we are passing the msg by name instead of by value. Under the covers, scala is creating a closure that can be used to calculate the log message, only if its actually required. Hah. Interesting. Guess it's my noob Scala hat showing off. I saw the PR about using scala-logging before, but didn't pay too close attention to it. Thanks for the info guys! -- Marcelo
Re: RFC: varargs in Logging.scala?
Another usage that's nice is: logDebug { val timeS = timeMillis/1000.0 sTime: $timeS } which can be useful for more complicated expressions. On Thu, Apr 10, 2014 at 5:55 PM, Michael Armbrust mich...@databricks.comwrote: BTW... You can do calculations in string interpolation: sTime: ${timeMillis / 1000}s Or use format strings. fFloat with two decimal places: $floatValue%.2f More info: http://docs.scala-lang.org/overviews/core/string-interpolation.html On Thu, Apr 10, 2014 at 5:46 PM, Michael Armbrust mich...@databricks.com wrote: Hi Marcelo, Thanks for bringing this up here, as this has been a topic of debate recently. Some thoughts below. ... all of the suffer from the fact that the log message needs to be built even though it might not be used. This is not true of the current implementation (and this is actually why Spark has a logging trait instead of just using a logger directly.) If you look at the original function signatures: protected def logDebug(msg: = String) ... The = implies that we are passing the msg by name instead of by value. Under the covers, scala is creating a closure that can be used to calculate the log message, only if its actually required. This does result is a significant performance improvement, but still requires allocating an object for the closure. The bytecode is really something like this: val logMessage = new Function0() { def call() = Log message + someExpensiveComputation() } log.debug(logMessage) In Catalyst and Spark SQL we are using the scala-logging package, which uses macros to automatically rewrite all of your log statements. You write: logger.debug(sLog message $someExpensiveComputation) You get: if(logger.debugEnabled) { val logMsg = Log message + someExpensiveComputation() logger.debug(logMsg) } IMHO, this is the cleanest option (and is supported by Typesafe). Based on a micro-benchmark, it is also the fastest: std logging: 19885.48ms spark logging 914.408ms scala logging 729.779ms Once the dust settles from the 1.0 release, I'd be in favor of standardizing on scala-logging. Michael
Re: RFC: varargs in Logging.scala?
Hi Marcelo, Thanks for bringing this up here, as this has been a topic of debate recently. Some thoughts below. ... all of the suffer from the fact that the log message needs to be built even though it might not be used. This is not true of the current implementation (and this is actually why Spark has a logging trait instead of just using a logger directly.) If you look at the original function signatures: protected def logDebug(msg: = String) ... The = implies that we are passing the msg by name instead of by value. Under the covers, scala is creating a closure that can be used to calculate the log message, only if its actually required. This does result is a significant performance improvement, but still requires allocating an object for the closure. The bytecode is really something like this: val logMessage = new Function0() { def call() = Log message + someExpensiveComputation() } log.debug(logMessage) In Catalyst and Spark SQL we are using the scala-logging package, which uses macros to automatically rewrite all of your log statements. You write: logger.debug(sLog message $someExpensiveComputation) You get: if(logger.debugEnabled) { val logMsg = Log message + someExpensiveComputation() logger.debug(logMsg) } IMHO, this is the cleanest option (and is supported by Typesafe). Based on a micro-benchmark, it is also the fastest: std logging: 19885.48ms spark logging 914.408ms scala logging 729.779ms Once the dust settles from the 1.0 release, I'd be in favor of standardizing on scala-logging. Michael
Re: RFC: varargs in Logging.scala?
BTW... You can do calculations in string interpolation: sTime: ${timeMillis / 1000}s Or use format strings. fFloat with two decimal places: $floatValue%.2f More info: http://docs.scala-lang.org/overviews/core/string-interpolation.html On Thu, Apr 10, 2014 at 5:46 PM, Michael Armbrust mich...@databricks.comwrote: Hi Marcelo, Thanks for bringing this up here, as this has been a topic of debate recently. Some thoughts below. ... all of the suffer from the fact that the log message needs to be built even though it might not be used. This is not true of the current implementation (and this is actually why Spark has a logging trait instead of just using a logger directly.) If you look at the original function signatures: protected def logDebug(msg: = String) ... The = implies that we are passing the msg by name instead of by value. Under the covers, scala is creating a closure that can be used to calculate the log message, only if its actually required. This does result is a significant performance improvement, but still requires allocating an object for the closure. The bytecode is really something like this: val logMessage = new Function0() { def call() = Log message + someExpensiveComputation() } log.debug(logMessage) In Catalyst and Spark SQL we are using the scala-logging package, which uses macros to automatically rewrite all of your log statements. You write: logger.debug(sLog message $someExpensiveComputation) You get: if(logger.debugEnabled) { val logMsg = Log message + someExpensiveComputation() logger.debug(logMsg) } IMHO, this is the cleanest option (and is supported by Typesafe). Based on a micro-benchmark, it is also the fastest: std logging: 19885.48ms spark logging 914.408ms scala logging 729.779ms Once the dust settles from the 1.0 release, I'd be in favor of standardizing on scala-logging. Michael