Hi, I'd like to discuss two things regarding logging and debugging:

1) Crunch currently ships a log4j.properties which can take precedence over users' log4j.properties, depending on classpath order. Libraries should never ship logging config, as it forces users to repackage Crunch if they want to use their own. Our Nexus at work has a nice collection of repackaged libs.

2) Discussion about Pipeline.enableDebug() came up in CRUNCH-70. I believe it really shouldn't mess with logging configuration. Right now it bypasses the commons-logging facade and accesses log4j directly, causing a compile-time dependency on log4j. It changes VM-wide state beyond Crunch, as other Hadoop-related code executed afterwards will see the changed logging config, too. And, most importantly, it's the responsibility of the operations team, not the developer, to configure logging. Admins are used to log4j.properties; we shouldn't invent another non-standard way of doing things that overrides the usual way.

My vote for 1) would be to remove our log4j.properties.

For 2) I think the best solution would be to offer an example log4j.properties in our documentation (section "Debugging Your Pipelines" or something) that has the effect Pipeline.enableDebug() has now.

Regards,
Matthias
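
P.S. To make 2) a bit more concrete, here is a rough sketch of what such an example log4j.properties could look like. I have not checked it against what Pipeline.enableDebug() actually does today, so treat the logger names (org.apache.crunch, org.apache.hadoop), the chosen levels, the appender name "console" and the layout pattern as assumptions for illustration only:

```properties
# Sketch only: logger names, levels and pattern are assumptions,
# not a faithful copy of what Pipeline.enableDebug() configures.

# Send everything at INFO and above to the console by default.
log4j.rootLogger=INFO, console

log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2}: %m%n

# Verbose output for Crunch itself (package name assumed).
log4j.logger.org.apache.crunch=DEBUG

# Keep Hadoop quieter so the Crunch output stays readable.
log4j.logger.org.apache.hadoop=WARN
```

Users or admins would put this on their own classpath, or point to it with -Dlog4j.configuration=..., so nothing in Crunch has to touch log4j at runtime.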
