On 9/8/2015 6:13 PM, Sergey Bylokhov wrote:
On 08.09.15 18:02, Semyon Sadetsky wrote:


On 9/8/2015 2:06 PM, Sergey Bylokhov wrote:
On 08.09.15 13:42, Semyon Sadetsky wrote:


On 9/8/2015 1:31 PM, Sergey Bylokhov wrote:
On 08.09.15 13:15, Semyon Sadetsky wrote:


On 9/8/2015 12:33 PM, Sergey Bylokhov wrote:
On 08.09.15 12:04, Semyon Sadetsky wrote:


On 9/7/2015 6:56 PM, Sergey Bylokhov wrote:
On 07.09.15 16:41, Semyon Sadetsky wrote:


On 9/7/2015 3:28 PM, Sergey Bylokhov wrote:
On 03.09.15 17:43, Semyon Sadetsky wrote:


On 8/5/2015 2:33 PM, Sergey Bylokhov wrote:
On 05.08.15 14:20, Semyon Sadetsky wrote:


On 8/5/2015 1:39 PM, Sergey Bylokhov wrote:
On 05.08.15 13:18, Semyon Sadetsky wrote:


On 8/5/2015 12:27 PM, Sergey Bylokhov wrote:
On 04.08.15 14:54, Semyon Sadetsky wrote:
On 8/3/2015 6:05 PM, Sergey Bylokhov wrote:
Hi, Semyon
Did you try to change dwMilliseconds from INFINITE to the
timeout(10 seconds by default?) which is passed to the
method?
It does not help? Because even when dnd is not used we
should
not wait event for infinite time.
It would not help to fix the issue because 10 seconds is
too big
interval. But for consistency it is not bad to have a time
limit.
http://cr.openjdk.java.net/~ssadetsky/8132664/webrev.01/
Note that syncNativeQueue is intended to wait until the
native
event queue is flushed or until timeout will expire. So
even if
timeout is expired we collect the native events during this
period
of time.Can you double check that the event counter is
incremented
during dnd? I do not know how we block the toolkit thread, probably we create some nested loops which ignore our event
posted
from syncNativeQueue, can we change that?
Yes, this is an internal secondary loop which waits for
mouse
release event.
Can we change the condition and process the sync event in
this
loop?
Why? Will receive all events on the toolkit thread when
doDragDrop
returns.

When how we get dragenter/exit events?

On the platform side they are not events but callbacks which are
converted into events on the java side and added to the
queue. So
they
will be detected by isEQEmpty() and waitForIdle() will be
repeated.

This is not directly related to this fix. Most of our
callbacks/events
posts events to the EQ, but there is a possibility for a lag
between
callback and a post events, and this is why the
syncNativeQueue was
added.
Let's return to the beginning: the syncNativeQueue method
according to
its specifications should try to flush the native system,
track the
native activity and should not wait more than timeout parameter.

It is not possible to send sync events to the DD loop. Only
mouse/keyboard event can return control to the app. I would not
emulate
them in the syncnative.
Always waiting the max time is not good. Maybe wait a bit and
check if
no callbacks happened?

It will be good to check that callbacks were occurred in the
same way
as native events. This can be done in this fix. As for question
about
wait or not when timeout is over, is a shared problem, which affect
the osx also. It can be fixed separately. I guess the logic in
realSync can be tweaked and the default timeout(10*4 sec) is not a
optimal as a default value, 2 or 3 seconds without any events
should
be enough.
Could you explain your logic?

My logic is to follow the specification.
I cannot accept this as an answer because it is an abstract
declaration.
Please, try to be more specific.

Just implement as I described it above.

Sorry?

Just change implementation of syncNativeQueue to don't block forever
if timeout parameter is positive. And wait forever in case of negative
timeout. This is enough to fix for now, now need to add some
workarounds for dnd, since the realsync itself should be reworked.

Sergey, I'm a bit tired of this level of explanations.

Explanations is simple just implement the method according specification, since the problem is general and is not specific it should be fixed on the shared level.

An internal method javadoc can be changed any time.
What do you mean under "for now" and why the realsync itself should be
reworked?

For now means for now. There are a few fixes which were pushed to the jdk9 in to realsync area, as well as a few fixes to the tests (regression/sqe), and we already got a bunch of bugs. And "for now" we should not implement any workarounds since we know that the shared code contains errors.
Can you give me the bug number you are talking about?

The solution fixes the problem. Test is passed.


Yes, the test still fail after the fix, the reason will be not a lock but a slow execution.

It is not true. I just rerun the test with updated jdk9 client codebase under windows and it passed in 17 seconds:

----------messages:(3/117)----------
command: main DefaultNoDrop
reason: User specified action: run main DefaultNoDrop
elapsed time (seconds): 17.339
----------System.out:(8/177)----------
invoking: onEDT1
invoking: onEDT10
invoking: onBackgroundThread20
invoking: onEDT30
invoking: onEDT40
invoking: onBackgroundThread50
invoking: onEDT60
invoking: onEDT70
----------System.err:(1/16)----------
STATUS:Passed.


If you cannot explain in
words why it does not work for you, maybe you can provide a test case
which fails with the proposed solution?


I can only make guesses what specification section you mean. For
example, the next is waitForIdle() method's spec:
     /**
* Waits until all events currently on the event queue have been
processed.
      * @throws  IllegalThreadStateException if called on the AWT
event
dispatching thread
      */
What does specifically contradicts to this specification?

Check the documentation of the realSync method.

realSync is an internal method. It is not a part of the specification.
It has this description of the timeout parameter:
      * @param timeout the maximum time to wait in milliseconds,
negative means "forever".

That is as it used in the fix. No?


Why to wait short wait period doesn't work
if we are in DD and there are no callbacks?

"No callbacks" during what period? My point: this period should be
passed to this method.
Period is passed to the method.
Java_sun_awt_windows_WToolkit_syncNativeQueue(JNIEnv *env, jobject
self,
jlong timeout)
It's meaning is a _maximum_ time the method should wait. Before the
fix
it was ignored and the method never returned.
With the fix the timeout is taken into account but this aspect of the
fix is not related to the original issue and it cannot fix the issue
alone.
Or you meant another method? Again, please be more specific otherwise
this discussion can be infinite.

No callbacks means no
activity and in analogous situations the method returns true.



I see a clear disadvantage: even with 2-3 second wait period the
test
will executes much longer than on Linux platform because D&D is
triggered every 3 mouse drags . Also other D&D tests suites will
demand
much more time to pass the windows platform. Users can report
bugs if
AWT Robot starts to produce multi-seconds delay after drag events.
Maybe you can provide an example scenario which demonstrates that
it is
really necessary to pay this price?





Event counter is not changed during toolkit thread
blocking of
cause. Not sure that we can change that. But since toolkit
queue is
blocked we can assume that we are synced.

The timeout value is maintained on the shared level and
actually
this test will fail with timeout on osx as well
JDK-7185258.
The
test will fail even if the time out will be changed to
±100ms,
because it call realsync on each pixel move, ±200 times.
This
problem can be fixed in different ways like tweak of
timeouts and
numbers of iterations, or changing the test to call w4idle
only
after the latest move(actually I think this method can be
moved to
the robot class).

So I still think that the right fix for a deadlock,
which is
subject of this bug, is simply change the
syncNativeQueue and
waits using a timeout if it is positive, and waits
forever if
timeout is negative (the same bug on osx JDK-8080504).
I'm not sure that waiting brings any value. What do you
propose to
return if it timed out? The event counter will not be
changed
regardless of waiting.
But it should be changed, because we get native events
from the
system during dnd and in each such callback we should update
this
counter. If callbacks were not called=>counter was not
updated
then
sync assume that currently we do not process events. If
callbacks
were called then sync assume that we have events in the
native
queue
and should try to sync again on the next iteration.
No. Events are not processed while toolkit is blocked in
doDragDrop(). The application state is frozen on that period.
That is
Windows approach.

With such waiting the test will fail because of either jtreg
timeout
Default timeout is 120 seconds for everything, the test
try to
sync
the queue 200 of times after each move, so yes it can fail
with
timeout even if spend in nativesync 200 ms, the possible
solutions
were in my previous email.
either InfiniteLoop exception.
This exception will be disabled by default lately in jdk9
timeframe,
right now it helps to find some related issues.



On 03.08.15 17:26, Semyon Sadetsky wrote:
Hello,

Please review fix for JDK9:

bug: https://bugs.openjdk.java.net/browse/JDK-8132664
webrev:
http://cr.openjdk.java.net/~ssadetsky/8132664/webrev.00/

DoDragDrop() is blocking, so upon drag operation is
triggered
the toolkit thread is blocked and the WM_AWT_WAIT
cannot be
processed which in its turn blocks the AWT robot.
The solution is to escape AWT robot waiting in
syncNativeQueue() if drag operation is in progress.

--Semyon






























Reply via email to