Hi Marcelo,

Thanks for bringing this up here, as this has been a topic of debate
recently.  Some thoughts below.

> ... all of them suffer from the fact that the log message needs to be
> built even though it might not be used.

This is not true of the current implementation (and this is actually why
Spark has a logging trait instead of just using a logger directly.)

If you look at the original function signatures:

protected def logDebug(msg: => String) ...


The => implies that we are passing the msg by name instead of by value.
Under the covers, Scala is creating a closure that can be used to compute
the log message, but only if it's actually required.  This does result in a
significant performance improvement, but it still requires allocating an
object for the closure.  The bytecode is really something like this:

val logMessage = new Function0[String] {
  def apply() = "Log message" + someExpensiveComputation()
}
log.debug(logMessage)


In Catalyst and Spark SQL we are using the scala-logging package, which
uses macros to automatically rewrite all of your log statements.

You write: logger.debug(s"Log message $someExpensiveComputation")

You get:

if (logger.debugEnabled) {
  val logMsg = "Log message" + someExpensiveComputation()
  logger.debug(logMsg)
}
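You can check by hand that this guarded form builds the message neither as a
string nor as a closure when the level is off.  A minimal sketch (GuardDemo
and its members are illustrative names, not the real scala-logging API):

```scala
object GuardDemo {
  var built = 0
  var debugEnabled = false

  def expensive(): String = { built += 1; "details" }

  // The shape the macro expands to: the level check wraps the whole
  // statement, so the concatenation never happens while DEBUG is off.
  def debugGuarded(): Unit =
    if (debugEnabled) {
      val logMsg = "Log message" + expensive()
      println(logMsg)
    }
}
```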

IMHO, this is the cleanest option (and is supported by Typesafe).  Based on
a micro-benchmark, it is also the fastest:

std logging:   19885.48ms
spark logging:   914.408ms
scala logging:   729.779ms

Once the dust settles from the 1.0 release, I'd be in favor of
standardizing on scala-logging.

Michael
