Re: [8] Review request for 7186109: Simplify lock machinery for PostEventQueue & EventQueue

Artem Ananiev Thu, 13 Sep 2012 08:22:26 -0700

Hi, Peter,

thank you very much for deep analysis! It makes everybody betterunderstand what's going, what's wrong and what can be improved.

Your particular example below is absolutely valid: there are non-zerochances e2 is posted to EventQueue before e1. I don't see any ways tofix that except to pass the event to post to flushPendingEvents().

However, this is not a side effect of the current fix. Before Oleg'schanges, e2 is still possible to land in EventQueue before e1. Thereforeit should be considered a separate problem and fixed separately.


Oleg,

the latest .5 version of the fix looks fine to me.

Thanks,

Artem

On 9/12/2012 9:46 AM, Peter Levart wrote:

Hi Oleg,

The reasoning in my previous message, that I made just before going to bed,
was not entirely correct. I made a mistake which was obvious in the morning.
But even if .notifyAll() is correctly interpreted, the results are similar.
So, once again:

Imagine two threads (t1, t2) calling concurrently EventQueue.postEvent() -
each with it's own event (e1, e2):

It can happen that t1:

- acquires the monitor in PostEventQueue 1st,
- establishes the flushThread variable,
- takes 1st snapshot of events,
- releases the monitor,
- flushes 1st snapshot of events,
- re-acquires the monitor,
- clears the flushThread variable,
- broadcasts via "notifyAll()",
- releases the monitor, ...

when t2 that was waiting, wakes up:

- acquires the monitor
- establishes the flushThread variable,
- takes a 2nd snapshot of events,
- releases the monitor,
- flushes 2nd snapshot of events,
- re-acquires the monitor,
- clears the flushThread variable,
- broadcasts via "notifyAll()",
- releases the monitor,
- posts e2 to the EventQueue, ...

and only after that t1 gets a chance to:

- post e1 to the EventQueue


The order of events that end up in EventQueue would still be as follows:

1st snapshot (taken by t1) of toolkit events,
2nd snapshot (taken by t2) of toolkit events,
e2 (posted by t2),
e1 (posted by t1)

where "2nd snapshot" are toolkit events that were "born" after e1.

Probability of such a scenario is very low, but a higher probability is that
at least some events from 2nd snapshot get inserted into EventQueue before e1.

The point I was trying to make is that as it is done currently, events that
end up in the EventQueue are only roughly ordered according to the time at
which they are posted to the queues by different threads. Ordering among
threads is not specified (the events don't have a timestamp or a sequence
allocated) Ordering is only maintained among events posted from the same
thread.

Different approaches are only more or less fair as to the ordering by "real"
time.

Regards, Peter

On Wednesday, September 12, 2012 12:02:17 AM Peter Levart wrote:

Hi Oleg,

On Tuesday, September 11, 2012 04:03:01 PM Oleg Pekhovskiy wrote:

Hi Peter,

please, see my comments below...

Thanks,
Oleg

9/11/2012 9:42 AM, Peter Levart wrote:

On Tuesday, September 11, 2012 01:25:39 AM Oleg Pekhovskiy wrote:

Hi Peter,

your idea might work if drainTo() is used before while-cycle,
otherwise the order of events could be broken sometimes.


I don't know how. The flush() is synchronized! No two threads can
execute
while-poll at the same time. And LinkedBlockingQueue is used as FIFO, so
it does not matter if toolkit thread posts any events concurrently - the
order of events is the same.


Imagine the situation when flushing caused by EventQueue.postEvent()
(called from any thread) is in progress.
Toolkit thread calls PostEventQueue.postEvent() while poll() is being
calling over and over again.
As a result the message posted by Toolkit thread would be posted to
EventQueue within current flushing cycle.
That would happen before posting of the event that caused flushing (see
EventQueue.postEvent()) and would violate the order of messages.


There's no absolute guarantee anyway since there's allways a window between
SunToolkit.flushPendingEvents() and acquire-ing a pushPopLock where anything
can happen...

Imagine two threads (t1, t2) calling concurrently EventQueue.postEvent() -
each with it's own event (e1, e2):

It can happen that t1:

- acquires the monitor in PostEventQueue 1st,
- establishes the flushThread variable,
- takes 1st snapshot of events,
- releases the monitor,
- flushes 1st snapshot of events,
- re-acquires the monitor,
- clears the flushThread variable,
- broadcasts via "notifyAll()" which temporarily releases the monitor, ...

when t2 that was waiting, wakes up:

- acquires the monitor
- establishes the flushThread variable,
- takes a 2nd snapshot of events,
- releases the monitor,
- flushes 2nd snapshot of events,
- re-acquires the monitor,
- clears the flushThread variable,
- broadcasts via "notifyAll()" which temporarily releases the monitor,
- reacquires the monitor,
- releases the monitor
- posts e2 to the EventQueue, ...

and only after that t1 gets a chance to:

- re-acquire the monitor
- release the monitor
- post e1 to the EventQueue


The order of events that end up in EventQueue is then as follows:

1st snapshot (taken by t1) of toolkit events,
2nd snapshot (taken by t2) of toolkit events,
e2 (posted by t2),
e1 (posted by t1)

where "2nd snapshot" are toolkit events that were "born" after e1.


Nobody guarantees that the event posted by EventQueue.postEvent will be the
1st event to arrive into the EventQueue after the snapshot of toolkit events
taken by the same thread at just the same time.

This order could only be maintained if SunToolkit.flushPendingEvents() was
allways called as part of the critical section in the EventQueue guarded by
pushPopLock. But then this would have to be guaranteed allways or there is a
chance of deadlock.

(It's true that flushing is not done while holding a monitor on
PostEventQueue any more, but flushingThread variable and a wait loop at the
begginig of flush() actualy constitue a self-made mutex lock which guards
the section containing flushing loop and can equally so participate in a
dead-lock situation.)

Since SunToolkit.flushPendingEvents() is public, anyone can call it, so this
is dangerous...

Regards, Peter

P.S. I'm just thinking loud:

What about a scheme where there was a special "copying" thread employed for
transfering events from PostEventQueue to EventQueue? No
SunToolkit.flushPendingEvents() method is needed any more then.

Also, usage of LinkedBlockingQueue could lead to performance decrease


Internally it uses ReentrantLock, which in flush while-poll loop is
acquired once per poll. Uncontended acquire is just a CAS. I don't think
that in this context (GUI events) it presents any difference. So any
approach is good-enough.>

(especially with drainTo()).

The class has more complicated logic entailing pitfalls in future.


That might be true:

http://stackoverflow.com/questions/12349881/why-linkedblockingqueuepoll-
ma
y-hang-up


I'm not denying that CAS in most cases is faster than locking. But if
you look at the implementation of LinkedBlockingQueue you'll see some
unnecessary things (for our case) as this class is quite common.

Thanks,
Oleg


Regards, Peter

08.09.2012 21:39, Peter Levart wrote:

Hi Oleg,

I still think that there is room for simplification. Now that the
flush
can be synchronized again (because EventQueue.detatchDispatchThread is
not calling SunToolkit.PostEventQueue's synchronized methods while
holding pushPopLock), we just have to make sure that toolkit thread is
not blocked by flushing.

Here's a simplified PostEventQueue that does that:

class PostEventQueue {

      private final Queue<AWTEvent>  queue = new
      LinkedBlockingQueue<>();
      //
      unbounded private final EventQueue eventQueue;

      private boolean isFlushing;

      PostEventQueue(EventQueue eq) {

         eventQueue = eq;

      }

      public synchronized void flush() {

         if (isFlushing)

            return;

         isFlushing = true;
         try {

            AWTEvent event;
            while ((event = queue.poll()) != null)

               eventQueue.postEvent(event);

         }
         finally {

            isFlushing = false;

         }

      }

      /*
      * Enqueue an AWTEvent to be posted to the Java EventQueue.
      */
      void postEvent(AWTEvent event) {

         queue.offer(event);
         SunToolkit.wakeupEventQueue(eventQueue, event.getSource() ==
         AWTAutoShutdown.getInstance());>

      }

}



This implementation also finishes the flush in the presence of
interrupts...

Peter

On Saturday, September 08, 2012 04:09:40 AM Oleg Pekhovskiy wrote:

Artem, Anthony, David,

please review the next version of fix:
http://cr.openjdk.java.net/~bagiras/8/7186109.4/

What's new in comparison with the previous version:

1. Removed "isThreadLocalFlushing" and "isFlashing", created
"flushThread" instead
(stores the thread that currently performs flushing).
2. Implemented both recursion and multi-thread block on "flushThread"
only.
3. Added Thread.interrupt() to the catch block as outside we would
like
to know what happened in PostEventQueue.flush().
We are not able to rethrow the exception because we couldn't change
the
signature (by adding "throwns")
for all related methods (e.g. EventQueue.postEvent()).

The fix successfully passed
"closed/java/awt/EventQueue/PostEventOrderingTest.java" test.

IMHO, the code became clearer.

Looking forward to your comments!

Thanks,
Oleg

8/30/2012 9:19 PM, Oleg Pekhovskiy wrote:

There are also other 2 methods that will require 'throws
InterruptedException' or try-catch:
1. EventQueue.postEvent()
2. EventQueue.removeSourceEvents()

Thanks,
Oleg

8/30/2012 9:01 PM, Oleg Pekhovskiy wrote:

Hi,

I got another idea preparing the next version of fix.

Previously we didn't catch InterruptedException and stack unwinding
took place right up to
try-catch section in EventDispatchThread.pumpOneEventForFilters().

So seems like it would be correct not eating that exception in
PostEventQueue.flush()
but just check the state using isInterrupted() method and add
'throws
InterruptedException'
to PostEventQueue.flush() and SunToolkit.flushPendingEvents()
methods.

Thoughts?

Thanks,
Oleg

8/30/2012 5:20 PM, Anthony Petrov wrote:

I see. After giving it more thought I don't see an easy solution
for
this issue, too. As long as we guarantee that the EDT is recreated
should more events be posted, I don't see a big problem with this.
So let's stick with the "Minimize..." approach then.

On 08/30/12 00:18, Oleg Pekhovskiy wrote:

Hi Anthony,

I see your concerns.

As PostEventQueue.flush() method left unsynchronized,
we potentially could return PostEventQueue.noEvents()
and return check in EventQueue.detachDispatchThread()
back to the condition.
But is could increase the possibility of deadlock in future
(with PostEventQueue&  pushPopLock).

Artem, what do you think?

Thanks,
Oleg

29.08.2012 15:22, Anthony Petrov wrote:

On 8/29/2012 3:08 PM, Anthony Petrov wrote:

Hi Oleg,

I'm still concerned about the following:

detachDispatchThread()
{
flush();
lock();
// possibly detach
unlock();
}

at EventQueue.java. What if an even get posted to the queue
after
the


A typo: s/even get/event gets/.

flush() returns but before we even acquired the lock? We may
still
end up with a situation when we detach the thread while in fact
there
are some pending events present, which actually contradicts the
current logic of the detach() method. I see that you say
"Minimize
discard possibility" in the comment instead of "Prevent ...",
but
I
feel uncomfortable with this actually.

What exactly prevents us from adding some synchronization to
ensure
that the detaching only happens when there's really no pending
events?

SunToolkit.java:

2120 Boolean b = isThreadLocalFlushing.get();
2121 if (b != null&&  b) {
2122 return;
2123 }
2124 2125 isThreadLocalFlushing.set(true);
2126 try {


How does access to the isThreadLocalFlushing synchronized? What
happens if the flush() is being invoked from two different
threads
for the same post event queue? Why do we have two "isFlushing"
flags?
Can we collapse them into one? Why is the isFlushing set/reset
in
two
disjunct synchronized(){} blocks?

Overall, I find the current synchronization scheme in flush()
very,
*very* (and I mean it) confusing. Can we simplify it somehow?

On 8/28/2012 6:33 PM, Oleg Pekhovskiy wrote:

Hi Artem, Anthony,

thank you for your proposals!

We with Artem also had off-line discussion,
so as a result I prepared improved version of fix:
http://cr.openjdk.java.net/~bagiras/8/7186109.3/

What was done:
1. EventQueue.detachDispatchThread(): moved
SunToolkit.flushPnedingEvents() above the comments and added a
separate comment to it.
2. Moved SunToolkitSubclass.flushPendingEvents(AppContext)
method to
SunToolkit. Deleted SunToolkitSubclass.
3. Moved isFlushingPendingEvents to PostEventQueue with the
new
name
- isThreadLocalFlushing and made it ThreadLocal.
4. Left PostEventQueue.flush() unsynchronized and created
wait()-notifyAll() synchronization mechanism to avoid blocking
of
PostEventQueue.postEvent().

Looking forward to your comments!

Thanks,
Oleg

20.08.2012 20:20, Artem Ananiev wrote:

Hi, Oleg,

here are a few comments:

1. What is the reason of keeping "isFlushingPendingEvents" in
SunToolkit, given that PEQ.flush() is synchronized (and
therefore
serialized) anyway?

2. flushPendingEvents(AppContext) may be moved directly to
SunToolkit, so we don't need a separate sun-class for that.

3. EQ.java:1035-1040 - this comment is obsolete and must be
replaced by another one.

Thanks,

Artem

On 8/17/2012 4:49 PM, Oleg Pekhovskiy wrote:

Hi!

Please review the fix for CR:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7186109

Webrev:
http://cr.openjdk.java.net/~bagiras/8/7186109.1/

The following changes were made:
1. Removed flushLock from SunToolkit.flushPendingEvent()
2. Returned method PostEventQueue.flush() as 'synchronized'
back
3. Added call of SunToolkit.flushPendingEvents() to
EventQueue.detachDispatchThread(),
right before pushPopLock.lock()
4. Removed !SunToolkit.isPostEventQueueEmpty() check from
EventQueue.detachDispatchThread()
5. Removed SunToolkit.isPostEventQueueEmpty()&
PostEventQueue.noEvents();

Thanks,
Oleg
<http://cr.openjdk.java.net/%7Ebagiras/8/7186109.1/>

Re: [8] Review request for 7186109: Simplify lock machinery for PostEventQueue & EventQueue

Reply via email to