On 01/10/2013 10:05 AM, shanliang wrote: > Jaroslav Bachorik wrote: >> On 01/09/2013 03:25 PM, shanliang wrote: >> >>> Jaroslav Bachorik wrote: >>> >>>> On 01/09/2013 02:44 PM, shanliang wrote: >>>> >>>> >>>>> Let's forget the JMX implementation at first. If an MBean is >>>>> unregistered, a user at client side calls "removeNotificationListener" >>>>> on the MBean, what should happen? if the user calls "isRegistered" on >>>>> the MBean, what should happen? >>>>> >>>>> I have done 2 tests, I used only one thread: >>>>> >>>>> 1) >>>>> ...... >>>>> localServer.unregisterMBean(myMBean); >>>>> boolean isRegistered = remoteClientServer.isRegistered(myMBean)); >>>>> >>>>> I got isRegistered = false; >>>>> >>>>> 2) >>>>> ...... >>>>> localServer.unregisterMBean(myMBean); >>>>> System.out.println("isRegistered = >>>>> "+remoteClientServer.sRegistered(myMBean)); >>>>> remoteClientServer.removeNotificationListener(myMBean, listener); >>>>> >>>>> I did not get an exception. >>>>> >>>>> The 1) told that the client could know the MBean was unregistered, >>>>> then >>>>> the client should throw an exception for the call of >>>>> "removeNotificationListener" in 2). >>>>> >>>> Yes, but then it would not test the listener leakage as it was supposed >>>> to test but rather the fact that the client throws the appropriate >>>> exception. The fact that the mbean was unregistered does not >>>> necessarily >>>> mean that the listeners were released. The main problem remains - the >>>> listeners are being cleaned-up asynchronously and the clean-up process >>>> might race against the other uses of the JMX API. >>>> >>> client.removeNotificationListener is not a right way here to test >>> listener leak, we could use some other ways, for example we keep the >>> listener in a weak reference, then after the mbean is removed, the weak >>> reference should be empty after some time. Another way is like >>> DeadListenerTest does to check whether clean has done at server side: >>> use reflection to get the "listenerMap" at server side and make sure it >>> is empty, but this need to add a private method to the class >>> ClientNotifForwarder. >>> >> >> There will still be problems with timing. You need either to wait for >> the GC to kick in to clean up the weak ref. And the listenerMap will not >> be purged of the unregistered MBean listeners until the notification is >> generated, processed on the ClientNotificationForwarder and forwarded to >> the server. So there goes the timing issue again. >> >> The problem is that the "unregisterMBean" operation does not guarantee >> that the listeners have been unregistered at the time it returns. So, >> one way or the other we will need to wait an arbitrary amount of time >> before checking for the memory leak. >> > Yes we need to wait, but you can use a cycle like: > long maxWaitingTime = 3000; > long startTime = System.currentTimeMillis(); > while ( weakReference.get != null > && System.currentTimeMillis() < startTime + > maxWaitingTime) { > System.gc(); > Thread.sleep(100); > System.gc(); > } > > if (weakReference.get != null) { > // failed > } >
Sounds reasonable. I'll update the test. -JB- > Shanliang >> -JB- >> >> >>> I think we have 3 things to do here: >>> 1) modify the test to not use removeNotificationListener for testing >>> listener leak >>> 2) create a new bug about a client does not throw an exception after an >>> mbean is unregistered >>> 3) create a bug about a client does not throw a same exception as at >>> server side. >>> >>> I will do 2) and 3), if you like you can continue 1), it might need to >>> do fix also in the JMX implementation. >>> >>> Shanliang >>> >>>> >>>> >>>>> The test "DeadListenerTest" got passed in some machines because of the >>>>> timeout for waiting a notification. I think its failure just tells >>>>> a new >>>>> bug. >>>>> >>>>> To set a longer timeout just hides the real bug, and the test might >>>>> fail >>>>> again one day if running condition is changed and you might need >>>>> longer >>>>> timeout again. >>>>> >>>> Yes, I agree with you that extending the timeout just lessens the >>>> likelihood of the race condition and does not prevent it. >>>> >>>> >>>> >>>>> Shanliang >>>>> >>>>> Jaroslav Bachorik wrote: >>>>> >>>>>> On 01/09/2013 11:08 AM, shanliang wrote: >>>>>> >>>>>> >>>>>>> Jaroslav Bachorik wrote: >>>>>>> >>>>>>>> On 01/09/2013 09:40 AM, shanliang wrote: >>>>>>>> >>>>>>>> >>>>>>>>> I still have no idea why the test failed, but I do not see why a >>>>>>>>> longer >>>>>>>>> timeout can fix the test. Have you reproduced the problem and >>>>>>>>> tested >>>>>>>>> your fix? if yes then possible the long timeout hided a real >>>>>>>>> problem. >>>>>>>>> >>>>>>>> Yes, I can reproduce the problem (using the fastbuild bits and >>>>>>>> -Xcomp >>>>>>>> switch) and verify that the fix makes the test pass. >>>>>>>> >>>>>>>> The ClientNotifForwarder scans the notifications for >>>>>>>> MBeanServerNotification.UNREGISTRATION_NOTIFICATION and removes the >>>>>>>> appropriate notification listeners in a separate thread. Thus, >>>>>>>> calling >>>>>>>> "removeNotificationListener" on the main thread is prone to racing. >>>>>>>> >>>>>>> It is true that ClientNotifForwarder scans the notifications for >>>>>>> MBeanServerNotification.UNREGISTRATION_NOTIFICATION and removes the >>>>>>> appropriate notification listeners in a separate thread. This is >>>>>>> for a >>>>>>> client connection to do clean if a user never calls >>>>>>> removeNotificationListener. >>>>>>> >>>>>>> But calling directly removeNotificationListener from a client should >>>>>>> still get exception if the clean has not been done. As I said, if >>>>>>> the >>>>>>> client checked and found the listener was still there, then the >>>>>>> client >>>>>>> sent a request to its server to remove the listener at server side, >>>>>>> the >>>>>>> server should find that the MBean in question was not registered, >>>>>>> so the >>>>>>> server should throw an exception. The bug might be here. >>>>>>> >>>>>> This won't work. The server side listeners are removed upon receiving >>>>>> the "unregistered" notification which is delivered from the >>>>>> ClientNotificationForwarder and it may have not run yet (since it >>>>>> runs >>>>>> in a separate executor thread). The result is that the attempt to >>>>>> remove >>>>>> the notification listener on the server will succeed as well failing >>>>>> the >>>>>> test subsequently. >>>>>> >>>>>> -JB- >>>>>> >>>>>> >>>>>> >>>>>>> Shanliang >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> The timeout you made longer was used to wait a notification which >>>>>>>>> should >>>>>>>>> never arrive. >>>>>>>>> >>>>>>>> Well, it can be used to allow more time to process the "unregister" >>>>>>>> notification, too. >>>>>>>> >>>>>>>> When I think more of this I am more inclined to fix the race >>>>>>>> condition. >>>>>>>> An updated webrev will follow. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> To remove a listener from a client side, we did: >>>>>>>>> 1) at client side, check whether it was added in the client side >>>>>>>>> 2) at server side, check whether the MBean in question was >>>>>>>>> registered in >>>>>>>>> the MBeanServer (!!!) >>>>>>>>> 3) at server side, check whether the listener was added. >>>>>>>>> >>>>>>>>> So 2) tells that we did not rely on a "unregister" notification. >>>>>>>>> Anyway, >>>>>>>>> if you use a SAME thread to call "unregister" operation to >>>>>>>>> unregister an >>>>>>>>> mbean, then any following call (without any time break) to use the >>>>>>>>> mbean >>>>>>>>> should fail, like "removeNotificationListener", "isRegistered" >>>>>>>>> etc. >>>>>>>>> >>>>>>>>> I do see a bug here, if we remove a listener from a non-existing >>>>>>>>> MBeam, >>>>>>>>> we get "ListenerNotFoundException" at a client side, but get >>>>>>>>> "InstanceNotFoundException" at server side, I think we should >>>>>>>>> create a >>>>>>>>> bug, because both implemented the same interface >>>>>>>>> MBeanServerConnection. >>>>>>>>> >>>>>>>> Yes, it is rather inconsistent. >>>>>>>> >>>>>>>> -JB- >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> Shanliang >>>>>>>>> >>>>>>>>> Jaroslav Bachorik wrote: >>>>>>>>> >>>>>>>>>> Looking for review and a sponsor. >>>>>>>>>> >>>>>>>>>> Webrev at http://cr.openjdk.java.net/~jbachorik/7170447/webrev.00 >>>>>>>>>> >>>>>>>>>> In this issue the timing is the problem. >>>>>>>>>> MBeanServer.unregisterMBean() >>>>>>>>>> fires the "unregister" notification which is sent to the server >>>>>>>>>> asynchronously. Thus it may happen that the "unregister" >>>>>>>>>> notification >>>>>>>>>> has not been yet processed at the time of invoking >>>>>>>>>> removeNotificationListener() and the notification listeners >>>>>>>>>> hasn't >>>>>>>>>> been >>>>>>>>>> cleaned up leading to the test failure. >>>>>>>>>> >>>>>>>>>> There is no synchronization between the client and the server and >>>>>>>>>> such >>>>>>>>>> race condition can occur occasionally. Normally, the execution is >>>>>>>>>> fast >>>>>>>>>> enough to behave like the "unregister" notification is processed >>>>>>>>>> synchronously with the unregisterMBean() operation but it seems >>>>>>>>>> that >>>>>>>>>> using fastdebug Server VM bits with the -Xcomp option strains >>>>>>>>>> the CPU >>>>>>>>>> enough to make this problem appear. >>>>>>>>>> >>>>>>>>>> There is no proper fix for this - the only thing that work is >>>>>>>>>> waiting a >>>>>>>>>> bit longer in the main thread to give the notification processing >>>>>>>>>> thread >>>>>>>>>> some time to clean up the listeners. >>>>>>>>>> >>>>>>>>>> Regards, >>>>>>>>>> >>>>>>>>>> -JB- >>>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >> > >