Optionally skip log4j configuration
-----------------------------------

                 Key: CASSANDRA-3061
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3061
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
    Affects Versions: 0.8.4
            Reporter: Aaron Morton
            Assignee: Aaron Morton
            Priority: Minor


From this thread:
http://groups.google.com/group/brisk-users/browse_thread/thread/3a18f4679673bea8

When Brisk accesses Cassandra classes inside a Hadoop task JVM, the 
AbstractCassandraDaemon uses a log4j PropertyConfigurator to set up Cassandra 
logging. This closes all the existing appenders, including the TaskLogAppender 
for the Hadoop task. They are not opened again because they are not in the 
Cassandra config. 

log4j has Logger Repositories to handle multiple configs in the same process, 
but there is a bit of suck involved in making a RepositorySelector. 

Two examples...
http://www.mail-archive.com/log4j-dev@jakarta.apache.org/msg02972.html
http://docs.redhat.com/docs/en-US/JBoss_Enterprise_Application_Platform/4.2/html/Getting_Started_Guide/logging.log4j.reposelect.html

Basically, all the selector has access to is thread-local storage, and it 
looks like people normally get the class loader from the current thread. A 
thread inherits its class loader from the thread that created it, unless 
otherwise specified. 

We have code in the same thread that uses both Hadoop and Cassandra classes, 
so a selector keyed on the thread's class loader could not tell them apart. 
This could be a dead end.  
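
To illustrate the problem, here is a simplified model in plain Java. It is not 
log4j code: ContextLoaderSelector and Repo are hypothetical stand-ins for a 
log4j RepositorySelector and LoggerRepository, and the lookup by context class 
loader is only the approach that seems common, not anything Cassandra does. 

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of a selector that picks a logging repository
// based on the calling thread's context class loader.
final class ContextLoaderSelector {

    // Stand-in for a log4j LoggerRepository (illustration only).
    static final class Repo {
        final String name;

        Repo(String name) {
            this.name = name;
        }
    }

    private final Map<ClassLoader, Repo> repos = new ConcurrentHashMap<>();
    private final Repo defaultRepo = new Repo("default");

    // Associate a repository with a class loader.
    void register(ClassLoader loader, Repo repo) {
        repos.put(loader, repo);
    }

    // The only input is the current thread, so every caller on the same
    // thread gets the same repository, whichever library it belongs to.
    Repo select() {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            return defaultRepo;
        }
        return repos.getOrDefault(loader, defaultRepo);
    }
}
```

Because select() never looks at who is calling, Hadoop and Cassandra code 
running on one task thread would always share a repository under this scheme. 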

As a workaround I've added a cassandra.log4j.configure JVM param and made the 
AbstractCassandraDaemon skip the log4j config if it's false. My job completes, 
and in the Hadoop task log file I can see the Cassandra code logging an extra 
message I added...

2011-08-19 15:56:06,442 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: 
Metrics system not started: Cannot locate configuration: tried 
hadoop-metrics2-maptask.properties, hadoop-metrics2.properties
2011-08-19 15:56:06,776 INFO 
org.apache.cassandra.service.AbstractCassandraDaemon: Logging initialized 
externally
2011-08-19 15:56:07,332 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 0
 
The param has to be passed to the task JVM, so the Hadoop mapred-site.xml 
needs to be modified as follows:

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx256m -Dcassandra.log4j.configure=false</value>
  <description>
    Tune your mapred jvm arguments for best performance. 
    Also see documentation from jvm vendor.
  </description>
</property>

It's not pretty but it works. In my extra log4j logging I can see the second 
reset() call is gone.  
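
The workaround above can be sketched roughly as follows. This is a minimal 
illustration, not the attached patch: Log4jConfigGate and initLogging are 
hypothetical names, the default of "true" when the property is unset is an 
assumption, and the Runnable stands in for whatever PropertyConfigurator call 
the daemon actually makes. 

```java
// Hypothetical sketch of gating log4j setup on a JVM system property.
final class Log4jConfigGate {

    // Property name taken from the workaround described above.
    static final String PROPERTY = "cassandra.log4j.configure";

    // Assumption: configure by default, skip only when explicitly disabled.
    static boolean shouldConfigure() {
        return Boolean.parseBoolean(System.getProperty(PROPERTY, "true"));
    }

    // Runs the supplied configurator unless the property disables it.
    // Returns true if configuration ran, false if it was skipped.
    static boolean initLogging(Runnable configurator) {
        if (!shouldConfigure()) {
            // Logging is assumed to be set up by the host process
            // (here, the Hadoop task), so existing appenders stay open.
            return false;
        }
        configurator.run();
        return true;
    }
}
```

With -Dcassandra.log4j.configure=false on the task JVM, initLogging returns 
false without touching the existing appenders. 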


The change to the Hadoop TaskLogAppender also stops the NPE, but some log 
messages may still be lost:
https://issues.apache.org/jira/browse/HADOOP-7556

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
