Re: 8029281: Synchronization issues in Logger and LogManager
On 12/3/13 8:36 AM, Peter Levart wrote: Hi Daniel, 1st sorry for the delay. I promised looking at the patch, but was then distracted by other things. I think that synchronization in LogManager is correct now. The fact that Mandy thinks so is also reassuring. Yes - that's usually a good sign :-) I wonder if calling LoggerWeakRef.dispose() from LoggerContext.addLocalLoger() is needed now that you check the LogNode.loggerRef before clearing in LoggerWeakRef.dispose(): Yes - we could probably call dispose() only when draining the queue - but I still have the feeling that calling it early is better. It makes sure that everything is properly cleaned-up before we modify it. I also like that we now have a single entry point - 'dispose()' - to clean up after a gc'ed logger rather than doing half of the job now and leaving the other half for later (and trusting that the later code will not do anything stupid). Thank you again for your reviews and feedback - they are very helpful! -- daniel if (n.loggerRef == this) { n.loggerRef = null; // clear LogNode's weak ref to us } The LoggerContext.addLocalLoger() will already overwrite the namedLoggers Hashtable entry for the logger's name with new LoggerWeakRef: namedLoggers.put(name, ref); ...and it will already overwrite the LogNode.loggerRef with new LoggerWeakRef: // Find the new node and its parent. LogNode node = getNode(name); node.loggerRef = ref; So the only effect of calling LoggerWeakRef.dispose() from LoggerContext.addLocalLoger() is a more prompt unlinking of LoggerWeakRef child, which has already been cleared but not yet disposed(), from parent logger. This will eventually happen in a call to LogManager.drainLoggerRefQueueBounded(), called from public LogManager.addLogger(), which is the only entry point to LoggerContext.addLocalLoger(). But it's not wrong as it is now when you also ensure more prompt unlinking of cleared LoggerWeakRef and code is more uniform and less fragile then it would be if you created and called a hypothetical disposeFromParent() which only took care of unlinking from parent logger... Regards, Peter On 11/29/2013 12:41 PM, Daniel Fuchs wrote: Hi, Here is a new revision that includes Peter's findings. Thanks again for that Peter! http://cr.openjdk.java.net/~dfuchs/webrev_8029281/webrev.01/ It has passed all jdk_core tests and running the new tests in a loop hasn't triggered any of the issues that appeared before the fix. -- daniel On 11/28/13 3:26 PM, Daniel Fuchs wrote: On 11/27/13 10:34 PM, Peter Levart wrote: Hi Daniel, I have started looking at the LogManager change first, and here's the 1st race I found... Hi Peter, Thanks a lot for your feedback! see below... The new method LoggerWeakRef.dispose() can be called from 3 places: - LoggerContext.addLocalLogger - LoggerContext.findLogger - LogManager.drainLoggerRefQueueBounded which is called from public method LogManager.addLogger The 1st and 2nd are guarded by LoggerContext.this lock, but the 3rd is unguarded. What can happen is the following: Thread1: LogManager.addLogger(someLogger) LogManager.drainLoggerRefQueueBounded() // finds some enqueued LoggerWeakRef for name == 'X.Y' and... LoggerWeakRef.dispose() // this is 1st call to dispose, so... synchronized (LoggerWeakRef.this) { disposed = true; } ---context switch--- Thread2: LogManager.addLogger(logger{name=='X.Y'}) LogManager.drainLoggerRefQueueBounded() // no enqueued refs... LoggerContext.addLocalLogger(logger{name=='X.Y'}, true) LoggerWeakRef ref = namedLoggers.get(name); // returns a non-null ref ('X.Y' still points to a cleared weak ref) if (ref.get() == null) { ref.dispose(); // this is a 2nd call to this ref, so it returns quickly } // ... namedLoggers gets new entry, etc... LogNode node = getNode(name); // get node for 'X.Y' *node.loggerRef = ref;* ---context switch--- Thread1: (still in LoggerWeakRef.dispose()) if (node != null) { *node.loggerRef = null;* // clear LogNode's weak ref to us *!!! NOT QUITE !!!* ... so Thread1 erroneously clears the reference from Node to new LoggerWeakRef that was just added by Thread2 ... Damn. I believe that you're right! I overlooked the fact that LogNode is reused. Locking on LogNode.context is indeed a good way of solving the issue. The only method that we will call from within the lock is LogNode.context.rmoveLoggerRef(name this) - and that already requires a lock on LogNode.context so it would be safe. I have also double checked the places where LogNode.loggerRef is set, and all those places (except in dispose() so far) are already from within a
Re: 8029281: Synchronization issues in Logger and LogManager
Hi Mandy, I have updated the webrev taking your comments into account. http://cr.openjdk.java.net/~dfuchs/webrev_8029281/webrev.02/ Changes are small and outlined below: On 12/3/13 1:49 AM, Mandy Chung wrote: On 11/29/2013 3:41 AM, Daniel Fuchs wrote: Hi, Here is a new revision that includes Peter's findings. Thanks again for that Peter! http://cr.openjdk.java.net/~dfuchs/webrev_8029281/webrev.01/ LogManager.java L156: make props volatile is good. I think the reason why LoggerWeakRef.node and parentRef don't need to be volatile because of the synchronized block to check and set disposed flag that ensures that only one thread will be doing the node and parentRef cleanup. I agree that the two lines should be done with synchronized on the LoggerContext. node.context.removeLogger(name); node.loggerRef = null; // clear LogNode's weak ref to us I don't see anything obvious wrong with this patch. Have you considered having the dispose() method simply synchronizes on node.context (probably need to get the LoggerContext when instantiating LoggerWeakRef and break dispose method into two - one for the node and the other for the parentRef)?I'm not proposing to do this in JDK 8 since we are ramping down and get a low risk fix in. It may be something to revisit for jdk 9 to a cleaner version. I believe that nulling the instance variables in LoggerWeakRef is probably no longer needed - since we now should be calling dispose only once - but I didn't want to change the code too much. I agree it would be better to revisit that in 9. Logger.java The assert this.loggerBundle == NO_RESOURCE_BUNDLE; in the Logger constructors doesn't seem to be needed. OK - removed. Having an immutable LoggerBundle class is good to ensure the resourceBundleName and resourceBundle are consistent. L1930: is lb.userBundle and lb.resourceBundleName null when getting to this line? Might be better to rename setupResourceInfo to setResourceBundleName (that creates a LoggerBundle with a null RB). On the other hand, setResourceBundle will instantiate a LoggerBundle with non-null rbname and RB. It wasn't obvious to me what does what. Right. Let's keep the old name for now. We can revisit the naming in 9. You right that when we reach the last line lb.userBundle must be null, otherwise we would have either returned or thrown an exception, since lb.userBundle != null = lb.resourceBundleName != null. Therefore I added an assert lb.userBundle == null and replaced the call to LoggerBundle.get(name, lb.userBundle) by LoggerBundle.get(name, null) Hopefully it will make what happens more obvious. 2137 final String rbName = getResourceBundleName(); 2138 if (!SYSTEM_LOGGER_RB_NAME.equals(rbName) 2139 !SYSTEM_LOGGER_RB_NAME.equals(b.getBaseBundleName())) { 2140 return LoggerBundle.get(rbName, b); 2141 } else { 2142 return SYSTEM_BUNDLE; 2143 } To get to the above line, either lb.userBundle is null or getResourceBundle() isoverridden. L2126 already checks that this logger does not associate with the system rbname. It seems to me that the special case here is when the Logger subclass overrides getResourceBundleName() to returnSYSTEM_LOGGER_RB_NAME or override getResourceBundle to return the sun.util.logging.resources.logging RB. I tink that we might just simplify the above and just return LoggerBundle.get(rbName, b) and no need to worry about those overridden case which does no use to such a Logger subclass with unmatched rbname and RB. Same comment applied to L2157-2165. Ah. I'm glad to hear that. This is a great simplification. done. -- daniel thanks Mandy
Re: 8029281: Synchronization issues in Logger and LogManager
On 12/3/2013 10:57 AM, Daniel Fuchs wrote: Hi Mandy, I have updated the webrev taking your comments into account. http://cr.openjdk.java.net/~dfuchs/webrev_8029281/webrev.02/ Looks much cleaner. Thanks for the update. Nit LogManager.java L1050: formatting - one extra space. No need to generate a new webrev. Mandy Changes are small and outlined below: On 12/3/13 1:49 AM, Mandy Chung wrote: On 11/29/2013 3:41 AM, Daniel Fuchs wrote: Hi, Here is a new revision that includes Peter's findings. Thanks again for that Peter! http://cr.openjdk.java.net/~dfuchs/webrev_8029281/webrev.01/ LogManager.java L156: make props volatile is good. I think the reason why LoggerWeakRef.node and parentRef don't need to be volatile because of the synchronized block to check and set disposed flag that ensures that only one thread will be doing the node and parentRef cleanup. I agree that the two lines should be done with synchronized on the LoggerContext. node.context.removeLogger(name); node.loggerRef = null; // clear LogNode's weak ref to us I don't see anything obvious wrong with this patch. Have you considered having the dispose() method simply synchronizes on node.context (probably need to get the LoggerContext when instantiating LoggerWeakRef and break dispose method into two - one for the node and the other for the parentRef)?I'm not proposing to do this in JDK 8 since we are ramping down and get a low risk fix in. It may be something to revisit for jdk 9 to a cleaner version. I believe that nulling the instance variables in LoggerWeakRef is probably no longer needed - since we now should be calling dispose only once - but I didn't want to change the code too much. I agree it would be better to revisit that in 9. Logger.java The assert this.loggerBundle == NO_RESOURCE_BUNDLE; in the Logger constructors doesn't seem to be needed. OK - removed. Having an immutable LoggerBundle class is good to ensure the resourceBundleName and resourceBundle are consistent. L1930: is lb.userBundle and lb.resourceBundleName null when getting to this line? Might be better to rename setupResourceInfo to setResourceBundleName (that creates a LoggerBundle with a null RB). On the other hand, setResourceBundle will instantiate a LoggerBundle with non-null rbname and RB. It wasn't obvious to me what does what. Right. Let's keep the old name for now. We can revisit the naming in 9. You right that when we reach the last line lb.userBundle must be null, otherwise we would have either returned or thrown an exception, since lb.userBundle != null = lb.resourceBundleName != null. Therefore I added an assert lb.userBundle == null and replaced the call to LoggerBundle.get(name, lb.userBundle) by LoggerBundle.get(name, null) Hopefully it will make what happens more obvious. 2137 final String rbName = getResourceBundleName(); 2138 if (!SYSTEM_LOGGER_RB_NAME.equals(rbName) 2139 !SYSTEM_LOGGER_RB_NAME.equals(b.getBaseBundleName())) { 2140 return LoggerBundle.get(rbName, b); 2141 } else { 2142 return SYSTEM_BUNDLE; 2143 } To get to the above line, either lb.userBundle is null or getResourceBundle() isoverridden. L2126 already checks that this logger does not associate with the system rbname. It seems to me that the special case here is when the Logger subclass overrides getResourceBundleName() to returnSYSTEM_LOGGER_RB_NAME or override getResourceBundle to return the sun.util.logging.resources.logging RB. I tink that we might just simplify the above and just return LoggerBundle.get(rbName, b) and no need to worry about those overridden case which does no use to such a Logger subclass with unmatched rbname and RB. Same comment applied to L2157-2165. Ah. I'm glad to hear that. This is a great simplification. done. -- daniel thanks Mandy
Re: 8029281: Synchronization issues in Logger and LogManager
On 11/29/2013 3:41 AM, Daniel Fuchs wrote: Hi, Here is a new revision that includes Peter's findings. Thanks again for that Peter! http://cr.openjdk.java.net/~dfuchs/webrev_8029281/webrev.01/ LogManager.java L156: make props volatile is good. I think the reason why LoggerWeakRef.node and parentRef don't need to be volatile because of the synchronized block to check and set disposed flag that ensures that only one thread will be doing the node and parentRef cleanup. I agree that the two lines should be done with synchronized on the LoggerContext. node.context.removeLogger(name); node.loggerRef = null; // clear LogNode's weak ref to us I don't see anything obvious wrong with this patch. Have you considered having the dispose() method simply synchronizes on node.context (probably need to get the LoggerContext when instantiating LoggerWeakRef and break dispose method into two - one for the node and the other for the parentRef)?I'm not proposing to do this in JDK 8 since we are ramping down and get a low risk fix in. It may be something to revisit for jdk 9 to a cleaner version. Logger.java The assert this.loggerBundle == NO_RESOURCE_BUNDLE; in the Logger constructors doesn't seem to be needed. Having an immutable LoggerBundle class is good to ensure the resourceBundleName and resourceBundle are consistent. L1930: is lb.userBundle and lb.resourceBundleName null when getting to this line? Might be better to rename setupResourceInfo to setResourceBundleName (that creates a LoggerBundle with a null RB). On the other hand, setResourceBundle will instantiate a LoggerBundle with non-null rbname and RB. It wasn't obvious to me what does what. 2137 final String rbName = getResourceBundleName(); 2138 if (!SYSTEM_LOGGER_RB_NAME.equals(rbName) 2139 !SYSTEM_LOGGER_RB_NAME.equals(b.getBaseBundleName())) { 2140 return LoggerBundle.get(rbName, b); 2141 } else { 2142 return SYSTEM_BUNDLE; 2143 } To get to the above line, either lb.userBundle is null or getResourceBundle() isoverridden. L2126 already checks that this logger does not associate with the system rbname. It seems to me that the special case here is when the Logger subclass overrides getResourceBundleName() to returnSYSTEM_LOGGER_RB_NAME or override getResourceBundle to return the sun.util.logging.resources.logging RB. I tink that we might just simplify the above and just return LoggerBundle.get(rbName, b) and no need to worry about those overridden case which does no use to such a Logger subclass with unmatched rbname and RB. Same comment applied to L2157-2165. thanks Mandy
Re: 8029281: Synchronization issues in Logger and LogManager
Hi Daniel, 1st sorry for the delay. I promised looking at the patch, but was then distracted by other things. I think that synchronization in LogManager is correct now. The fact that Mandy thinks so is also reassuring. I wonder if calling LoggerWeakRef.dispose() from LoggerContext.addLocalLoger() is needed now that you check the LogNode.loggerRef before clearing in LoggerWeakRef.dispose(): if (n.loggerRef == this) { n.loggerRef = null; // clear LogNode's weak ref to us } The LoggerContext.addLocalLoger() will already overwrite the namedLoggers Hashtable entry for the logger's name with new LoggerWeakRef: namedLoggers.put(name, ref); ...and it will already overwrite the LogNode.loggerRef with new LoggerWeakRef: // Find the new node and its parent. LogNode node = getNode(name); node.loggerRef = ref; So the only effect of calling LoggerWeakRef.dispose() from LoggerContext.addLocalLoger() is a more prompt unlinking of LoggerWeakRef child, which has already been cleared but not yet disposed(), from parent logger. This will eventually happen in a call to LogManager.drainLoggerRefQueueBounded(), called from public LogManager.addLogger(), which is the only entry point to LoggerContext.addLocalLoger(). But it's not wrong as it is now when you also ensure more prompt unlinking of cleared LoggerWeakRef and code is more uniform and less fragile then it would be if you created and called a hypothetical disposeFromParent() which only took care of unlinking from parent logger... Regards, Peter On 11/29/2013 12:41 PM, Daniel Fuchs wrote: Hi, Here is a new revision that includes Peter's findings. Thanks again for that Peter! http://cr.openjdk.java.net/~dfuchs/webrev_8029281/webrev.01/ It has passed all jdk_core tests and running the new tests in a loop hasn't triggered any of the issues that appeared before the fix. -- daniel On 11/28/13 3:26 PM, Daniel Fuchs wrote: On 11/27/13 10:34 PM, Peter Levart wrote: Hi Daniel, I have started looking at the LogManager change first, and here's the 1st race I found... Hi Peter, Thanks a lot for your feedback! see below... The new method LoggerWeakRef.dispose() can be called from 3 places: - LoggerContext.addLocalLogger - LoggerContext.findLogger - LogManager.drainLoggerRefQueueBounded which is called from public method LogManager.addLogger The 1st and 2nd are guarded by LoggerContext.this lock, but the 3rd is unguarded. What can happen is the following: Thread1: LogManager.addLogger(someLogger) LogManager.drainLoggerRefQueueBounded() // finds some enqueued LoggerWeakRef for name == 'X.Y' and... LoggerWeakRef.dispose() // this is 1st call to dispose, so... synchronized (LoggerWeakRef.this) { disposed = true; } ---context switch--- Thread2: LogManager.addLogger(logger{name=='X.Y'}) LogManager.drainLoggerRefQueueBounded() // no enqueued refs... LoggerContext.addLocalLogger(logger{name=='X.Y'}, true) LoggerWeakRef ref = namedLoggers.get(name); // returns a non-null ref ('X.Y' still points to a cleared weak ref) if (ref.get() == null) { ref.dispose(); // this is a 2nd call to this ref, so it returns quickly } // ... namedLoggers gets new entry, etc... LogNode node = getNode(name); // get node for 'X.Y' *node.loggerRef = ref;* ---context switch--- Thread1: (still in LoggerWeakRef.dispose()) if (node != null) { *node.loggerRef = null;* // clear LogNode's weak ref to us *!!! NOT QUITE !!!* ... so Thread1 erroneously clears the reference from Node to new LoggerWeakRef that was just added by Thread2 ... Damn. I believe that you're right! I overlooked the fact that LogNode is reused. Locking on LogNode.context is indeed a good way of solving the issue. The only method that we will call from within the lock is LogNode.context.rmoveLoggerRef(name this) - and that already requires a lock on LogNode.context so it would be safe. I have also double checked the places where LogNode.loggerRef is set, and all those places (except in dispose() so far) are already from within a lock on LoggerContext. I think I would prefer however to keep the 'disposed' flag - and to check whether LogNode.loggerRef == this before clearing it. Given that - as I said - LogNode.loggerRef is always modified from within a lock on LoggerContext I think the following would work: void dispose() { synchronized(this) { if (disposed) return; disposed = true; } final LogNode n = node; if (n != null) { synchronized (n.context) { n.context.removeLoggerRef(name, this); name = null; if (n.loggerRef == this)
Re: 8029281: Synchronization issues in Logger and LogManager
Hi, Here is a new revision that includes Peter's findings. Thanks again for that Peter! http://cr.openjdk.java.net/~dfuchs/webrev_8029281/webrev.01/ It has passed all jdk_core tests and running the new tests in a loop hasn't triggered any of the issues that appeared before the fix. -- daniel On 11/28/13 3:26 PM, Daniel Fuchs wrote: On 11/27/13 10:34 PM, Peter Levart wrote: Hi Daniel, I have started looking at the LogManager change first, and here's the 1st race I found... Hi Peter, Thanks a lot for your feedback! see below... The new method LoggerWeakRef.dispose() can be called from 3 places: - LoggerContext.addLocalLogger - LoggerContext.findLogger - LogManager.drainLoggerRefQueueBounded which is called from public method LogManager.addLogger The 1st and 2nd are guarded by LoggerContext.this lock, but the 3rd is unguarded. What can happen is the following: Thread1: LogManager.addLogger(someLogger) LogManager.drainLoggerRefQueueBounded() // finds some enqueued LoggerWeakRef for name == 'X.Y' and... LoggerWeakRef.dispose() // this is 1st call to dispose, so... synchronized (LoggerWeakRef.this) { disposed = true; } ---context switch--- Thread2: LogManager.addLogger(logger{name=='X.Y'}) LogManager.drainLoggerRefQueueBounded() // no enqueued refs... LoggerContext.addLocalLogger(logger{name=='X.Y'}, true) LoggerWeakRef ref = namedLoggers.get(name); // returns a non-null ref ('X.Y' still points to a cleared weak ref) if (ref.get() == null) { ref.dispose(); // this is a 2nd call to this ref, so it returns quickly } // ... namedLoggers gets new entry, etc... LogNode node = getNode(name); // get node for 'X.Y' *node.loggerRef = ref;* ---context switch--- Thread1: (still in LoggerWeakRef.dispose()) if (node != null) { *node.loggerRef = null;* // clear LogNode's weak ref to us *!!! NOT QUITE !!!* ... so Thread1 erroneously clears the reference from Node to new LoggerWeakRef that was just added by Thread2 ... Damn. I believe that you're right! I overlooked the fact that LogNode is reused. Locking on LogNode.context is indeed a good way of solving the issue. The only method that we will call from within the lock is LogNode.context.rmoveLoggerRef(name this) - and that already requires a lock on LogNode.context so it would be safe. I have also double checked the places where LogNode.loggerRef is set, and all those places (except in dispose() so far) are already from within a lock on LoggerContext. I think I would prefer however to keep the 'disposed' flag - and to check whether LogNode.loggerRef == this before clearing it. Given that - as I said - LogNode.loggerRef is always modified from within a lock on LoggerContext I think the following would work: void dispose() { synchronized(this) { if (disposed) return; disposed = true; } final LogNode n = node; if (n != null) { synchronized (n.context) { n.context.removeLoggerRef(name, this); name = null; if (n.loggerRef == this) { n.loggerRef = null; } node = null; } } if (parentRef != null) { Logger parent = parentRef.get(); if (parent != null) { parent.removeChildLogger(this); } parentRef = null; // clear our weak ref to the parent Logger } } I don't think 'node' or 'parentRef' need to be volatile. That - said - we could still declare them volatile to make it obvious that the code in 'dispose()' will not see a stale value (I don't think it can, but having the variables volatile would make it obvious). If we reach 'dispose()' then it does mean that the LoggerWeakRef is no longer used - so I believe the real thing we need to guard against is actually modifications of the LogNode state - rather modification of the LoggerWeakRef state. Therefore I believe that the important thing is to guarantee that all modifications to LogNode.loggerRef happen from within a lock on LoggerContext. I'd also like to keep the disposed flag because it makes it immediately obvious that the method is processed only once (so we haven't to worry about making it re-entrant at every step). Would you agree with the code above? best regards, and congratulations for catching this! -- daniel One way to fix this is to synchronize the clearing of node.logerRef on node.context too. I would make LoggerWeakRef.node and parentRef fields volatile and do the following: final class LoggerWeakRef extends WeakReferenceLogger { private String name; // for namedLoggers cleanup private volatile LogNode node; // for loggerRef cleanup private
Re: 8029281: Synchronization issues in Logger and LogManager
On 11/27/13 10:34 PM, Peter Levart wrote: Hi Daniel, I have started looking at the LogManager change first, and here's the 1st race I found... Hi Peter, Thanks a lot for your feedback! see below... The new method LoggerWeakRef.dispose() can be called from 3 places: - LoggerContext.addLocalLogger - LoggerContext.findLogger - LogManager.drainLoggerRefQueueBounded which is called from public method LogManager.addLogger The 1st and 2nd are guarded by LoggerContext.this lock, but the 3rd is unguarded. What can happen is the following: Thread1: LogManager.addLogger(someLogger) LogManager.drainLoggerRefQueueBounded() // finds some enqueued LoggerWeakRef for name == 'X.Y' and... LoggerWeakRef.dispose() // this is 1st call to dispose, so... synchronized (LoggerWeakRef.this) { disposed = true; } ---context switch--- Thread2: LogManager.addLogger(logger{name=='X.Y'}) LogManager.drainLoggerRefQueueBounded() // no enqueued refs... LoggerContext.addLocalLogger(logger{name=='X.Y'}, true) LoggerWeakRef ref = namedLoggers.get(name); // returns a non-null ref ('X.Y' still points to a cleared weak ref) if (ref.get() == null) { ref.dispose(); // this is a 2nd call to this ref, so it returns quickly } // ... namedLoggers gets new entry, etc... LogNode node = getNode(name); // get node for 'X.Y' *node.loggerRef = ref;* ---context switch--- Thread1: (still in LoggerWeakRef.dispose()) if (node != null) { *node.loggerRef = null;* // clear LogNode's weak ref to us *!!! NOT QUITE !!!* ... so Thread1 erroneously clears the reference from Node to new LoggerWeakRef that was just added by Thread2 ... Damn. I believe that you're right! I overlooked the fact that LogNode is reused. Locking on LogNode.context is indeed a good way of solving the issue. The only method that we will call from within the lock is LogNode.context.rmoveLoggerRef(name this) - and that already requires a lock on LogNode.context so it would be safe. I have also double checked the places where LogNode.loggerRef is set, and all those places (except in dispose() so far) are already from within a lock on LoggerContext. I think I would prefer however to keep the 'disposed' flag - and to check whether LogNode.loggerRef == this before clearing it. Given that - as I said - LogNode.loggerRef is always modified from within a lock on LoggerContext I think the following would work: void dispose() { synchronized(this) { if (disposed) return; disposed = true; } final LogNode n = node; if (n != null) { synchronized (n.context) { n.context.removeLoggerRef(name, this); name = null; if (n.loggerRef == this) { n.loggerRef = null; } node = null; } } if (parentRef != null) { Logger parent = parentRef.get(); if (parent != null) { parent.removeChildLogger(this); } parentRef = null; // clear our weak ref to the parent Logger } } I don't think 'node' or 'parentRef' need to be volatile. That - said - we could still declare them volatile to make it obvious that the code in 'dispose()' will not see a stale value (I don't think it can, but having the variables volatile would make it obvious). If we reach 'dispose()' then it does mean that the LoggerWeakRef is no longer used - so I believe the real thing we need to guard against is actually modifications of the LogNode state - rather modification of the LoggerWeakRef state. Therefore I believe that the important thing is to guarantee that all modifications to LogNode.loggerRef happen from within a lock on LoggerContext. I'd also like to keep the disposed flag because it makes it immediately obvious that the method is processed only once (so we haven't to worry about making it re-entrant at every step). Would you agree with the code above? best regards, and congratulations for catching this! -- daniel One way to fix this is to synchronize the clearing of node.logerRef on node.context too. I would make LoggerWeakRef.node and parentRef fields volatile and do the following: final class LoggerWeakRef extends WeakReferenceLogger { private String name; // for namedLoggers cleanup private volatile LogNode node; // for loggerRef cleanup private volatile WeakReferenceLogger parentRef; // for kids cleanup LoggerWeakRef(Logger logger) { super(logger, loggerRefQueue); name = logger.getName(); // save for namedLoggers cleanup } // dispose of this LoggerWeakRef object void dispose() { LogNode node = this.node; if
Re: 8029281: Synchronization issues in Logger and LogManager
Hi Daniel, I have started looking at the LogManager change first, and here's the 1st race I found... The new method LoggerWeakRef.dispose() can be called from 3 places: - LoggerContext.addLocalLogger - LoggerContext.findLogger - LogManager.drainLoggerRefQueueBounded which is called from public method LogManager.addLogger The 1st and 2nd are guarded by LoggerContext.this lock, but the 3rd is unguarded. What can happen is the following: Thread1: LogManager.addLogger(someLogger) LogManager.drainLoggerRefQueueBounded() // finds some enqueued LoggerWeakRef for name == 'X.Y' and... LoggerWeakRef.dispose() // this is 1st call to dispose, so... synchronized (LoggerWeakRef.this) { disposed = true; } ---context switch--- Thread2: LogManager.addLogger(logger{name=='X.Y'}) LogManager.drainLoggerRefQueueBounded() // no enqueued refs... LoggerContext.addLocalLogger(logger{name=='X.Y'}, true) LoggerWeakRef ref = namedLoggers.get(name); // returns a non-null ref ('X.Y' still points to a cleared weak ref) if (ref.get() == null) { ref.dispose(); // this is a 2nd call to this ref, so it returns quickly } // ... namedLoggers gets new entry, etc... LogNode node = getNode(name); // get node for 'X.Y' *node.loggerRef = ref;* ---context switch--- Thread1: (still in LoggerWeakRef.dispose()) if (node != null) { *node.loggerRef = null;* // clear LogNode's weak ref to us *!!! NOT QUITE !!!* ... so Thread1 erroneously clears the reference from Node to new LoggerWeakRef that was just added by Thread2 ... One way to fix this is to synchronize the clearing of node.logerRef on node.context too. I would make LoggerWeakRef.node and parentRef fields volatile and do the following: final class LoggerWeakRef extends WeakReferenceLogger { private String name; // for namedLoggers cleanup private volatile LogNode node; // for loggerRef cleanup private volatile WeakReferenceLogger parentRef; // for kids cleanup LoggerWeakRef(Logger logger) { super(logger, loggerRefQueue); name = logger.getName(); // save for namedLoggers cleanup } // dispose of this LoggerWeakRef object void dispose() { LogNode node = this.node; if (node != null) { synchronized (node.context) { node = this.node; // double-checked locking if (node != null) { // if we have a LogNode, then we were a named Logger // so clear namedLoggers weak ref to us node.context.removeLoggerRef(name, this); name = null; // clear our ref to the Logger's name node.loggerRef = null; // clear LogNode's weak ref to us this.node = null; // clear our ref to LogNode } } } WeakReferenceLogger parentRef = this.parentRef; if (parentRef != null) { // this LoggerWeakRef has or had a parent Logger this.parentRef = null; // clear our weak ref to the parent Logger (racy, but harmless) Logger parent = parentRef.get(); if (parent != null) { // the parent Logger is still there so clear the // parent Logger's weak ref to us parent.removeChildLogger(this); } } } ... What do you think? That's all for today. Will check the rest of patch tomorrow. Regards, Peter On 11/27/2013 08:23 PM, Daniel Fuchs wrote: Hi, Please find below a webrev for: 8029281: Synchronization issues in Logger and LogManager https://bugs.openjdk.java.net/browse/JDK-8029281: I believe this changeset will also fix JDK-8027670 and JDK-8029092 - which I have thus closed as duplicates of JDK-8029281. webrev: http://cr.openjdk.java.net/~dfuchs/webrev_8029281/webrev.00/ Now some comments on the fix: LogManager: When a logger was garbage collected, and then a new logger of the same name requested, addlocalLogger used to call LoggerContext.removeLogger to remove the stale reference and replace it by a new one. But sometime, the stale reference was still in the reference queue, which caused the *new* logger to be wrongly removed later when the reference queue was drained. LogManager was also missing some synchronization for its 'props' variable. Logger: userBundle resourceBundleName were sometimes accessed within a synchronized block, and sometimes without. In particular the getters were not synchronized, which could cause race conditions because an other thread