
Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-20 Thread gregory hainaut

On Thu, 20 Apr 2017 10:56:34 +0900
Michel Dänzer  wrote:

> On 20/04/17 01:43 AM, gregory hainaut wrote:
> > Hello All,
> > 
> > I ported PCSX2 to xcb (at least the non-glx part). Crash is gone :)
> > So I can send the v5 with the hash delete fix.
> > 
> > However, Mesa might receive crash bug report when glthread is enabled
> > on a random app that doesn't use xcb/XinitThread properly.
> 
> This means it's still a bug in Mesa, you're just working around it in
> your application.
> 
> As we've explained, Mesa's glthread cannot make any libX11 API calls.
>
 
Yes. Unfortunately, the crash is still there (it just seems to happen less often).

 
> > Maybe it would be better to always enable the XInitThread mode by default 
> > on the X11 library.
> > If performance of X11 is critical, it would be better to switch to xcb 
> > anyway.
> 
> There has been talk about making that change, but there's not even a
> specific plan yet for making it happen upstream. It doesn't change the
> situation with currently shipping libX11.
> 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-19 Thread Michel Dänzer

On 20/04/17 01:43 AM, gregory hainaut wrote:
> Hello All,
> 
> I ported PCSX2 to xcb (at least the non-glx part). Crash is gone :)
> So I can send the v5 with the hash delete fix.
> 
> However, Mesa might receive crash bug report when glthread is enabled
> on a random app that doesn't use xcb/XinitThread properly.

This means it's still a bug in Mesa, you're just working around it in
your application.

As we've explained, Mesa's glthread cannot make any libX11 API calls.

> Maybe it would be better to always enable the XInitThread mode by default on 
> the X11 library.
> If performance of X11 is critical, it would be better to switch to xcb anyway.

There has been talk about making that change, but there's not even a
specific plan yet for making it happen upstream. It doesn't change the
situation with currently shipping libX11.

-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-19 Thread gregory hainaut

Hello All,

I ported PCSX2 to xcb (at least the non-glx part). Crash is gone :)
So I can send the v5 with the hash delete fix.

However, Mesa might receive crash bug reports when glthread is enabled
for a random app that doesn't use xcb or XInitThreads properly.

Maybe it would be better to always enable the XInitThreads mode by default
in the X11 library.
If X11 performance is critical, it would be better to switch to xcb anyway.

Cheers,
Gregory




Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-18 Thread Marek Olšák

All GL calls that might use libX11 must not be asynchronous within glthread.

Marek
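
As a rough illustration of the rule above, here is a simplified,
single-threaded model of a command dispatcher. It is not Mesa's glthread
code: struct dispatcher, submit(), finish_pending() and the may_use_x11
flag are hypothetical names made up for this sketch. Commands that might
reach libX11 are never queued; the pending queue is drained first and the
command then runs on the calling thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not Mesa code. */
enum { QUEUE_CAPACITY = 64, LOG_CAPACITY = 256 };

struct command {
    int id;
    int may_use_x11;   /* nonzero: must not run asynchronously */
};

/* Must be zero-initialized before use. */
struct dispatcher {
    struct command queue[QUEUE_CAPACITY]; /* pending asynchronous commands */
    size_t queued;
    int executed[LOG_CAPACITY];           /* command ids in execution order */
    size_t executed_count;
    size_t sync_count;                    /* how often the caller had to wait */
};

static void run_now(struct dispatcher *d, const struct command *cmd)
{
    assert(d->executed_count < LOG_CAPACITY);
    d->executed[d->executed_count++] = cmd->id;
}

/* Stand-in for "block until the worker thread has drained the queue". */
static void finish_pending(struct dispatcher *d)
{
    for (size_t i = 0; i < d->queued; i++)
        run_now(d, &d->queue[i]);
    d->queued = 0;
    d->sync_count++;
}

static void submit(struct dispatcher *d, struct command cmd)
{
    if (cmd.may_use_x11) {
        /* Drain first so ordering is preserved, then run synchronously. */
        finish_pending(d);
        run_now(d, &cmd);
        return;
    }
    if (d->queued == QUEUE_CAPACITY)
        finish_pending(d);
    d->queue[d->queued++] = cmd;
}
```

In this model, two ordinary commands stay queued until a command flagged
with may_use_x11 arrives; that one forces a single sync and all three run
in submission order.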


Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-18 Thread Gregory Hainaut

Hello Michel,

Ah yes, I completely forgot about XInitThreads; that must be it. I
don't know how Nvidia manages to solve/force it. Anyway, I will fix my
application.

Thank you for the info.


Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-18 Thread Michel Dänzer

On 18/04/17 05:04 PM, gregory hainaut wrote:
> On Tue, 18 Apr 2017 08:51:24 +0200
> gregory hainaut  wrote:
> 
>> On Mon, 17 Apr 2017 11:17:42 +0900
>> Michel Dänzer  wrote:
>>
>>> On 15/04/17 05:08 PM, gregory hainaut wrote:
 On Sat, 15 Apr 2017 00:50:15 +0200
 Dieter Nützel  wrote:

> Am 14.04.2017 07:53, schrieb gregory hainaut:
>> On Fri, 14 Apr 2017 05:20:38 +0200
>> Dieter Nützel  wrote:
>>
>>> Am 14.04.2017 02:06, schrieb Dieter Nützel:
 Hello Gregory,

 have you tested this with Mesa-demos/tests/pbo 'b' (benchmark)?
 It result in crazy numbers and do not 'return' (one core stays @ 100%).
>>>
>>> This is related to 'mesa_glthread=true'.
>>> If I disable (unset) it, all is fine after 'b' benchmark and 'pbo' 
>>> exit
>>> with ESC as expeted.
>>> Crazy numbers stay, maybe counter overrun due to BIG numbers? ;-)
>>>
>>> Hope that helps.
>>>
>>> Dieter
>>
>> Hello Dieter,
>>
>> I tested the demo. There is a pseudo unrelated bug on the exit of the
>> application.
>>
>> Mesa 17.1.0-devel implementation error: In _mesa_DeleteHashTable,
>> found non-freed data
>>
>> I will add a call to a _mesa_HashDeleteAll to fix it.
>> i.e. _mesa_HashDeleteAll(shared->ShadowBufferObjects, dummy_cb, ctx);
>>
>> Now let's go back to the test behavior. The benchmarks will send 4s of
>> asynchronous PBO transfer commands. And then will sync gl_thread which
>> mean the application thread will be blocked until all PBO transfers are
>> done. Gl_thread is faster to dispatch command so you will need to wait
>> more before the thread goes back to real life.
>>
>> On my side, I need to wait around 45 seconds for 6 millions of 
>> commands.
>> Result:  6,440,627 reads (gl thread on + PBO patches)
>> Result:274,960 reads (gl thread off)
>>
>> In your case, "Result:  77,444,412 reads", I hope you're patient.
>> I think you must wait at least 10 minutes.
>
> Now, I was patient...
> Tried 2 times but after ~20 minutes I've killed it at first and attached 
> gdb at it during second run.
>
> 0x7fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from 
> /lib64/libpthread.so.0
> (gdb) bt
> #0  0x7fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from 
> /lib64/libpthread.so.0
> #1  0x7fbda5359453 in ?? () from /usr/local/lib/dri/r600_dri.so
> #2  0x7fbda53661f4 in ?? () from /usr/local/lib/dri/r600_dri.so
> #3  0x00401e18 in ?? ()
> #4  0x004028c7 in ?? ()
> #5  0x7fbda9925781 in fghRedrawWindow () from 
> /usr/lib64/libglut.so.3
> #6  0x7fbda9925c08 in ?? () from /usr/lib64/libglut.so.3
> #7  0x7fbda9926cf9 in fgEnumWindows () from /usr/lib64/libglut.so.3
> #8  0x7fbda9925ce4 in glutMainLoopEvent () from 
> /usr/lib64/libglut.so.3
> #9  0x7fbda9925d85 in glutMainLoop () from /usr/lib64/libglut.so.3
> #10 0x004019fc in ?? ()
> #11 0x7fbda957e541 in __libc_start_main () from /lib64/libc.so.6
> #12 0x00401afa in ?? ()
>
> Should I do more or not worth it?
>
> Dieter

 Hello Dieter,

 To be honest, I don't konw how much time you need to wait. 77 millions of
 PBO transfer is quite huge. It depends on CPU/Memory/PCIe/VRAM/GPU speed.

 Hum based on the image size (194*188*4), you need to approximately transfer
 10522 GB of data from your GPU... Which is likely around 20 minutes if
 PCIe run at full speed. Honestly I will let the application in background
 for a couple of hours.
>>>
>>> Basically, the application needs to be fixed not to emit an unlimited
>>> number of PBO transfers without doing anything which requires
>>> synchronizing to the transfers.
>>>
>>>
>>
>> Hello Michel, Timothy, Marek
>>
>> Yes, I think it should limit the number of transfer to a million. And
>> also uses fence to measure the PBO transfer.
>>
>>
>> However, I have found others crashes on PCSX2 with those patches. It
>> seems related to synchronization issue with GLX/DRI/X11. This series
>> removes most of the gl sync for PCSX2. So any missing sync will trigger
>> a crash. Or I got a not obvious bug in my patches.
>>
>>
>> Please find a backtrace below of a crash during a draw. I manage to get a 
>> similar backtrace (i.e. 
>> same exception in _XReply/dequeue_pending_request) when I call XGetGeometry.
>>
>>
>> #4  0xf61ec777 in __GI___assert_fail (assertion=0xf6122099 
>> "!xcb_xlib_unknown_req_in_deq", file=0xf6122067 "../../src/xcb_io.c", 
>> line=179, function=0xf612248d <__PRETTY_FUNCTION__.14063> 
>> "dequeue_pending_request")
>> at assert.c:101
>> #5  0xf60abbcd in dequeue_pending_request (dpy=, 
>> req=) at ../../src/xcb_io.c:185
>> #6

Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-18 Thread gregory hainaut

On Tue, 18 Apr 2017 08:51:24 +0200
gregory hainaut  wrote:

> On Mon, 17 Apr 2017 11:17:42 +0900
> Michel Dänzer  wrote:
> 
> > On 15/04/17 05:08 PM, gregory hainaut wrote:
> > > On Sat, 15 Apr 2017 00:50:15 +0200
> > > Dieter Nützel  wrote:
> > > 
> > >> Am 14.04.2017 07:53, schrieb gregory hainaut:
> > >>> On Fri, 14 Apr 2017 05:20:38 +0200
> > >>> Dieter Nützel  wrote:
> > >>>
> >  Am 14.04.2017 02:06, schrieb Dieter Nützel:
> > > Hello Gregory,
> > >
> > > have you tested this with Mesa-demos/tests/pbo 'b' (benchmark)?
> > > It result in crazy numbers and do not 'return' (one core stays @ 
> > > 100%).
> > 
> >  This is related to 'mesa_glthread=true'.
> >  If I disable (unset) it, all is fine after 'b' benchmark and 'pbo' 
> >  exit
> >  with ESC as expeted.
> >  Crazy numbers stay, maybe counter overrun due to BIG numbers? ;-)
> > 
> >  Hope that helps.
> > 
> >  Dieter
> > >>>
> > >>> Hello Dieter,
> > >>>
> > >>> I tested the demo. There is a pseudo unrelated bug on the exit of the
> > >>> application.
> > >>>
> > >>> Mesa 17.1.0-devel implementation error: In _mesa_DeleteHashTable,
> > >>> found non-freed data
> > >>>
> > >>> I will add a call to a _mesa_HashDeleteAll to fix it.
> > >>> i.e. _mesa_HashDeleteAll(shared->ShadowBufferObjects, dummy_cb, ctx);
> > >>>
> > >>> Now let's go back to the test behavior. The benchmarks will send 4s of
> > >>> asynchronous PBO transfer commands. And then will sync gl_thread which
> > >>> mean the application thread will be blocked until all PBO transfers are
> > >>> done. Gl_thread is faster to dispatch command so you will need to wait
> > >>> more before the thread goes back to real life.
> > >>>
> > >>> On my side, I need to wait around 45 seconds for 6 millions of 
> > >>> commands.
> > >>> Result:  6,440,627 reads (gl thread on + PBO patches)
> > >>> Result:274,960 reads (gl thread off)
> > >>>
> > >>> In your case, "Result:  77,444,412 reads", I hope you're patient.
> > >>> I think you must wait at least 10 minutes.
> > >>
> > >> Now, I was patient...
> > >> Tried 2 times but after ~20 minutes I've killed it at first and attached 
> > >> gdb at it during second run.
> > >>
> > >> 0x7fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from 
> > >> /lib64/libpthread.so.0
> > >> (gdb) bt
> > >> #0  0x7fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from 
> > >> /lib64/libpthread.so.0
> > >> #1  0x7fbda5359453 in ?? () from /usr/local/lib/dri/r600_dri.so
> > >> #2  0x7fbda53661f4 in ?? () from /usr/local/lib/dri/r600_dri.so
> > >> #3  0x00401e18 in ?? ()
> > >> #4  0x004028c7 in ?? ()
> > >> #5  0x7fbda9925781 in fghRedrawWindow () from 
> > >> /usr/lib64/libglut.so.3
> > >> #6  0x7fbda9925c08 in ?? () from /usr/lib64/libglut.so.3
> > >> #7  0x7fbda9926cf9 in fgEnumWindows () from /usr/lib64/libglut.so.3
> > >> #8  0x7fbda9925ce4 in glutMainLoopEvent () from 
> > >> /usr/lib64/libglut.so.3
> > >> #9  0x7fbda9925d85 in glutMainLoop () from /usr/lib64/libglut.so.3
> > >> #10 0x004019fc in ?? ()
> > >> #11 0x7fbda957e541 in __libc_start_main () from /lib64/libc.so.6
> > >> #12 0x00401afa in ?? ()
> > >>
> > >> Should I do more or not worth it?
> > >>
> > >> Dieter
> > > 
> > > Hello Dieter,
> > > 
> > > To be honest, I don't konw how much time you need to wait. 77 millions of
> > > PBO transfer is quite huge. It depends on CPU/Memory/PCIe/VRAM/GPU speed.
> > > 
> > > Hum based on the image size (194*188*4), you need to approximately 
> > > transfer
> > > 10522 GB of data from your GPU... Which is likely around 20 minutes if
> > > PCIe run at full speed. Honestly I will let the application in background
> > > for a couple of hours.
> > 
> > Basically, the application needs to be fixed not to emit an unlimited
> > number of PBO transfers without doing anything which requires
> > synchronizing to the transfers.
> > 
> > 
> 
> Hello Michel, Timothy, Marek
> 
> Yes, I think it should limit the number of transfer to a million. And
> also uses fence to measure the PBO transfer.
> 
> 
> However, I have found others crashes on PCSX2 with those patches. It
> seems related to synchronization issue with GLX/DRI/X11. This series
> removes most of the gl sync for PCSX2. So any missing sync will trigger
> a crash. Or I got a not obvious bug in my patches.
> 
> 
> Please find a backtrace below of a crash during a draw. I manage to get a 
> similar backtrace (i.e. 
> same exception in _XReply/dequeue_pending_request) when I call XGetGeometry.
> 
> 
> #4  0xf61ec777 in __GI___assert_fail (assertion=0xf6122099 
> "!xcb_xlib_unknown_req_in_deq", file=0xf6122067 "../../src/xcb_io.c", 
> line=179, function=0xf612248d <__PRETTY_FUNCTION__.14063> 
> "dequeue_pending_request")
> at assert.c:101
> #5  0xf60abbcd in dequeue_pending_request (dpy=, 
> req=) at

Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-18 Thread Michel Dänzer

On 18/04/17 03:51 PM, gregory hainaut wrote:
> 
> However, I have found others crashes on PCSX2 with those patches. It
> seems related to synchronization issue with GLX/DRI/X11. This series
> removes most of the gl sync for PCSX2. So any missing sync will trigger
> a crash. Or I got a not obvious bug in my patches.
> 
> 
> Please find a backtrace below of a crash during a draw. I manage to get a 
> similar backtrace (i.e. 
> same exception in _XReply/dequeue_pending_request) when I call XGetGeometry.
> 
> 
> #4  0xf61ec777 in __GI___assert_fail (assertion=0xf6122099 
> "!xcb_xlib_unknown_req_in_deq", file=0xf6122067 "../../src/xcb_io.c", 
> line=179, function=0xf612248d <__PRETTY_FUNCTION__.14063> 
> "dequeue_pending_request")
> at assert.c:101
> #5  0xf60abbcd in dequeue_pending_request (dpy=, 
> req=) at ../../src/xcb_io.c:185
> #6  0xf60aca17 in _XReply (dpy=0xe8fdde80, rep=0xcd46b910, extra=6, 
> discard=0) at ../../src/xcb_io.c:639
> #7  0xf3bba8df in DRI2GetBuffersWithFormat (dpy=0xe8fdde80, 
> drawable=83886261, width=0xd8ba11e8, height=0xd8ba11ec, 
> attachments=0xcd46ba38, count=1, outCount=0xcd46ba24) at dri2.c:485
> #8  0xf3bbac45 in dri2GetBuffersWithFormat (driDrawable=0xd8ba11d0, 
> width=0xd8ba11e8, height=0xd8ba11ec, attachments=0xcd46ba38, count=1, 
> out_count=0xcd46ba24, loaderPrivate=0xf225df10) at dri2_glx.c:894
> #9  0xd555e121 in dri2_drawable_get_buffers (count=, 
> atts=0xa15f8b20, drawable=0xa2e50a00) at dri2.c:285
> #10 dri2_allocate_textures (ctx=0xd8b98810, drawable=0xa2e50a00, 
> statts=0xa15f8b20, statts_count=2) at dri2.c:480
> #11 0xd5557bc0 in dri_st_framebuffer_validate (stctx=0x9df20900, 
> stfbi=0xa2e50a00, statts=0xa15f8b20, count=2, out=0xcd46bb80) at 
> dri_drawable.c:83
> #12 0xd533ae8a in st_framebuffer_validate (stfb=stfb@entry=0xa15f8780, 
> st=st@entry=0x9df20900) at state_tracker/st_manager.c:189
> 
> 
> I don't have any clue about the GLX/DRI/X11 interaction with OpenGL. If
> anyone has any idea, feel free to share :)

Calling libX11 APIs (such as _XReply) for the same Display* from
multiple threads is only safe if XInitThreads was called (and completed)
before any other libX11 APIs were called.

Since Mesa cannot enforce this, the only safe course of action is to
only call libX11 APIs from one thread (at least for the time being;
there are plans to make libX11 always behave as if XInitThreads was
called first, but I'm not sure when it'll happen).


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-18 Thread gregory hainaut

On Mon, 17 Apr 2017 11:17:42 +0900
Michel Dänzer  wrote:

> On 15/04/17 05:08 PM, gregory hainaut wrote:
> > On Sat, 15 Apr 2017 00:50:15 +0200
> > Dieter Nützel  wrote:
> > 
> >> Am 14.04.2017 07:53, schrieb gregory hainaut:
> >>> On Fri, 14 Apr 2017 05:20:38 +0200
> >>> Dieter Nützel  wrote:
> >>>
> >>>> Am 14.04.2017 02:06, schrieb Dieter Nützel:
> >>>>> Hello Gregory,
> >>>>>
> >>>>> have you tested this with Mesa-demos/tests/pbo 'b' (benchmark)?
> >>>>> It results in crazy numbers and does not 'return' (one core stays @ 100%).
> >>>>
> >>>> This is related to 'mesa_glthread=true'.
> >>>> If I disable (unset) it, all is fine after 'b' benchmark and 'pbo'
> >>>> exit
> >>>> with ESC as expected.
> >>>> Crazy numbers stay, maybe counter overrun due to BIG numbers? ;-)
> >>>>
> >>>> Hope that helps.
> >>>>
> >>>> Dieter
> >>>
> >>> Hello Dieter,
> >>>
> >>> I tested the demo. There is a pseudo unrelated bug on the exit of the
> >>> application.
> >>>
> >>> Mesa 17.1.0-devel implementation error: In _mesa_DeleteHashTable,
> >>> found non-freed data
> >>>
> >>> I will add a call to a _mesa_HashDeleteAll to fix it.
> >>> i.e. _mesa_HashDeleteAll(shared->ShadowBufferObjects, dummy_cb, ctx);
> >>>
> >>> Now let's go back to the test behavior. The benchmarks will send 4s of
> >>> asynchronous PBO transfer commands. And then will sync gl_thread which
> >>> mean the application thread will be blocked until all PBO transfers are
> >>> done. Gl_thread is faster to dispatch command so you will need to wait
> >>> more before the thread goes back to real life.
> >>>
> >>> On my side, I need to wait around 45 seconds for 6 millions of 
> >>> commands.
> >>> Result:  6,440,627 reads (gl thread on + PBO patches)
> >>> Result:274,960 reads (gl thread off)
> >>>
> >>> In your case, "Result:  77,444,412 reads", I hope you're patient.
> >>> I think you must wait at least 10 minutes.
> >>
> >> Now, I was patient...
> >> Tried 2 times but after ~20 minutes I've killed it at first and attached 
> >> gdb at it during second run.
> >>
> >> 0x7fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from 
> >> /lib64/libpthread.so.0
> >> (gdb) bt
> >> #0  0x7fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from 
> >> /lib64/libpthread.so.0
> >> #1  0x7fbda5359453 in ?? () from /usr/local/lib/dri/r600_dri.so
> >> #2  0x7fbda53661f4 in ?? () from /usr/local/lib/dri/r600_dri.so
> >> #3  0x00401e18 in ?? ()
> >> #4  0x004028c7 in ?? ()
> >> #5  0x7fbda9925781 in fghRedrawWindow () from 
> >> /usr/lib64/libglut.so.3
> >> #6  0x7fbda9925c08 in ?? () from /usr/lib64/libglut.so.3
> >> #7  0x7fbda9926cf9 in fgEnumWindows () from /usr/lib64/libglut.so.3
> >> #8  0x7fbda9925ce4 in glutMainLoopEvent () from 
> >> /usr/lib64/libglut.so.3
> >> #9  0x7fbda9925d85 in glutMainLoop () from /usr/lib64/libglut.so.3
> >> #10 0x004019fc in ?? ()
> >> #11 0x7fbda957e541 in __libc_start_main () from /lib64/libc.so.6
> >> #12 0x00401afa in ?? ()
> >>
> >> Should I do more or not worth it?
> >>
> >> Dieter
> > 
> > Hello Dieter,
> > 
> > To be honest, I don't know how much time you need to wait. 77 million
> > PBO transfers is quite huge. It depends on CPU/Memory/PCIe/VRAM/GPU speed.
> > 
> > Hmm, based on the image size (194*188*4), you need to transfer approximately
> > 10522 GB of data from your GPU... which is likely around 20 minutes if
> > PCIe runs at full speed. Honestly, I would leave the application in the
> > background for a couple of hours.
> 
> Basically, the application needs to be fixed not to emit an unlimited
> number of PBO transfers without doing anything which requires
> synchronizing to the transfers.
> 
> 
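
The figures quoted above can be checked with a few lines of C. This is only
a worked-arithmetic sketch: the helper names are made up, the "10522 GB" is
read as binary gigabytes (2^30 bytes), and the drain-time estimate assumes
the time scales linearly with the number of queued commands, which the
thread does not establish.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes moved by `reads` read-backs of a width x height image. */
static uint64_t total_transfer_bytes(uint64_t width, uint64_t height,
                                     uint64_t bytes_per_pixel, uint64_t reads)
{
    return width * height * bytes_per_pixel * reads;
}

/* Whole GiB (2^30 bytes), rounded down. */
static uint64_t bytes_to_gib(uint64_t bytes)
{
    return bytes >> 30;
}

/* Linear extrapolation from a measured drain time (an assumption). */
static double estimate_drain_seconds(uint64_t commands,
                                     uint64_t ref_commands,
                                     double ref_seconds)
{
    assert(ref_commands > 0);
    return (double)commands * ref_seconds / (double)ref_commands;
}
```

With 194, 188, 4 and 77,444,412 reads, total_transfer_bytes() gives
11,298,210,377,856 bytes, i.e. 10522 GiB. Moving that in 20 minutes would
need roughly 8.8 GiB/s. Extrapolating from 45 seconds for 6,440,627
commands, estimate_drain_seconds() gives about 541 seconds for 77,444,412
commands, in line with the "at least 10 minutes" guess.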

Hello Michel, Timothy, Marek

Yes, I think it should limit the number of transfers to a million, and
also use a fence to measure the PBO transfers.
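
A minimal sketch of that idea, with the GL side replaced by stand-ins so it
compiles on its own. The cap of 8 and all names (struct transfer_limiter,
issue_transfer(), wait_oldest(), drain()) are assumptions for illustration,
not what the pbo demo does. In a real application the stand-ins would
presumably map to glFenceSync after each read-back and glClientWaitSync
plus glDeleteSync on the oldest fence.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_IN_FLIGHT = 8 };   /* assumed cap, chosen for illustration */

/* Must be zero-initialized before use. */
struct transfer_limiter {
    size_t in_flight;       /* issued but not yet waited on */
    size_t completed;       /* transfers known to have finished */
    size_t peak_in_flight;  /* highest in_flight value seen */
};

/* Stand-in for waiting on the oldest fence. */
static void wait_oldest(struct transfer_limiter *l)
{
    assert(l->in_flight > 0);
    l->in_flight--;
    l->completed++;
}

/* Stand-in for issuing one asynchronous read-back plus its fence.
 * Blocks on the oldest transfer once the cap is reached. */
static void issue_transfer(struct transfer_limiter *l)
{
    if (l->in_flight == MAX_IN_FLIGHT)
        wait_oldest(l);
    l->in_flight++;
    if (l->in_flight > l->peak_in_flight)
        l->peak_in_flight = l->in_flight;
}

/* Wait for everything that is still outstanding. */
static void drain(struct transfer_limiter *l)
{
    while (l->in_flight > 0)
        wait_oldest(l);
}
```

The point of the cap is that the application thread waits a little on every
transfer once the limit is hit, instead of queuing millions of commands and
then blocking for minutes in one final sync.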


However, I have found other crashes in PCSX2 with these patches. They
seem related to a synchronization issue with GLX/DRI/X11. This series
removes most of the GL syncs for PCSX2, so any missing sync will trigger
a crash. Or I have a non-obvious bug in my patches.


Please find below a backtrace of a crash during a draw. I managed to get a
similar backtrace (i.e. the same assertion failure in
_XReply/dequeue_pending_request) when I call XGetGeometry.


#4  0xf61ec777 in __GI___assert_fail (assertion=0xf6122099 "!xcb_xlib_unknown_req_in_deq", file=0xf6122067 "../../src/xcb_io.c", line=179, function=0xf612248d <__PRETTY_FUNCTION__.14063> "dequeue_pending_request") at assert.c:101
#5  0xf60abbcd in dequeue_pending_request (dpy=, req=) at ../../src/xcb_io.c:185
#6  0xf60aca17 in _XReply (dpy=0xe8fdde80, rep=0xcd46b910, extra=6, discard=0) at ../../src/xcb_io.c:639
#7  0xf3bba8df in DRI2GetBuffersWithFormat (dpy=0xe8fdde80, drawable=83886261, width=0xd8ba11e8, height=0xd8ba11ec, attachments=0xcd46ba38, count=1, outCount=0xcd46ba24) at dri2.c:485
#8  0xf3bbac45 in

Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-16 Thread Michel Dänzer

On 15/04/17 05:08 PM, gregory hainaut wrote:
> On Sat, 15 Apr 2017 00:50:15 +0200
> Dieter Nützel  wrote:
> 
>> Am 14.04.2017 07:53, schrieb gregory hainaut:
>>> On Fri, 14 Apr 2017 05:20:38 +0200
>>> Dieter Nützel  wrote:
>>>
>>>> Am 14.04.2017 02:06, schrieb Dieter Nützel:
>>>>> Hello Gregory,
>>>>>
>>>>> have you tested this with Mesa-demos/tests/pbo 'b' (benchmark)?
>>>>> It result in crazy numbers and do not 'return' (one core stays @ 100%).
>>>>
>>>> This is related to 'mesa_glthread=true'.
>>>> If I disable (unset) it, all is fine after 'b' benchmark and 'pbo' exit
>>>> with ESC as expeted.
>>>> Crazy numbers stay, maybe counter overrun due to BIG numbers? ;-)
>>>>
>>>> Hope that helps.
>>>>
>>>> Dieter
>>>
>>> Hello Dieter,
>>>
>>> I tested the demo. There is a pseudo unrelated bug on the exit of the
>>> application.
>>>
>>> Mesa 17.1.0-devel implementation error: In _mesa_DeleteHashTable,
>>> found non-freed data
>>>
>>> I will add a call to a _mesa_HashDeleteAll to fix it.
>>> i.e. _mesa_HashDeleteAll(shared->ShadowBufferObjects, dummy_cb, ctx);
>>>
>>> Now let's go back to the test behavior. The benchmarks will send 4s of
>>> asynchronous PBO transfer commands. And then will sync gl_thread which
>>> mean the application thread will be blocked until all PBO transfers are
>>> done. Gl_thread is faster to dispatch command so you will need to wait
>>> more before the thread goes back to real life.
>>>
>>> On my side, I need to wait around 45 seconds for 6 millions of 
>>> commands.
>>> Result:  6,440,627 reads (gl thread on + PBO patches)
>>> Result:274,960 reads (gl thread off)
>>>
>>> In your case, "Result:  77,444,412 reads", I hope you're patient.
>>> I think you must wait at least 10 minutes.
>>
>> Now, I was patient...
>> Tried 2 times but after ~20 minutes I've killed it at first and attached 
>> gdb at it during second run.
>>
>> 0x7fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from 
>> /lib64/libpthread.so.0
>> (gdb) bt
>> #0  0x7fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from 
>> /lib64/libpthread.so.0
>> #1  0x7fbda5359453 in ?? () from /usr/local/lib/dri/r600_dri.so
>> #2  0x7fbda53661f4 in ?? () from /usr/local/lib/dri/r600_dri.so
>> #3  0x00401e18 in ?? ()
>> #4  0x004028c7 in ?? ()
>> #5  0x7fbda9925781 in fghRedrawWindow () from 
>> /usr/lib64/libglut.so.3
>> #6  0x7fbda9925c08 in ?? () from /usr/lib64/libglut.so.3
>> #7  0x7fbda9926cf9 in fgEnumWindows () from /usr/lib64/libglut.so.3
>> #8  0x7fbda9925ce4 in glutMainLoopEvent () from 
>> /usr/lib64/libglut.so.3
>> #9  0x7fbda9925d85 in glutMainLoop () from /usr/lib64/libglut.so.3
>> #10 0x004019fc in ?? ()
>> #11 0x7fbda957e541 in __libc_start_main () from /lib64/libc.so.6
>> #12 0x00401afa in ?? ()
>>
>> Should I do more or not worth it?
>>
>> Dieter
> 
> Hello Dieter,
> 
> To be honest, I don't konw how much time you need to wait. 77 millions of
> PBO transfer is quite huge. It depends on CPU/Memory/PCIe/VRAM/GPU speed.
> 
> Hum based on the image size (194*188*4), you need to approximately transfer
> 10522 GB of data from your GPU... Which is likely around 20 minutes if
> PCIe run at full speed. Honestly I will let the application in background
> for a couple of hours.

Basically, the application needs to be fixed not to emit an unlimited
number of PBO transfers without doing anything which requires
synchronizing to the transfers.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-15 Thread gregory hainaut

On Sat, 15 Apr 2017 00:50:15 +0200
Dieter Nützel  wrote:

> Am 14.04.2017 07:53, schrieb gregory hainaut:
> > On Fri, 14 Apr 2017 05:20:38 +0200
> > Dieter Nützel  wrote:
> > 
> >> Am 14.04.2017 02:06, schrieb Dieter Nützel:
> >> > Hello Gregory,
> >> >
> >> > have you tested this with Mesa-demos/tests/pbo 'b' (benchmark)?
> >> > It result in crazy numbers and do not 'return' (one core stays @ 100%).
> >> 
> >> This is related to 'mesa_glthread=true'.
> >> If I disable (unset) it, all is fine after 'b' benchmark and 'pbo' 
> >> exit
> >> with ESC as expeted.
> >> Crazy numbers stay, maybe counter overrun due to BIG numbers? ;-)
> >> 
> >> Hope that helps.
> >> 
> >> Dieter
> > 
> > Hello Dieter,
> > 
> > I tested the demo. There is a pseudo unrelated bug on the exit of the
> > application.
> > 
> > Mesa 17.1.0-devel implementation error: In _mesa_DeleteHashTable,
> > found non-freed data
> > 
> > I will add a call to a _mesa_HashDeleteAll to fix it.
> > i.e. _mesa_HashDeleteAll(shared->ShadowBufferObjects, dummy_cb, ctx);
> > 
> > Now let's go back to the test behavior. The benchmarks will send 4s of
> > asynchronous PBO transfer commands. And then will sync gl_thread which
> > mean the application thread will be blocked until all PBO transfers are
> > done. Gl_thread is faster to dispatch command so you will need to wait
> > more before the thread goes back to real life.
> > 
> > On my side, I need to wait around 45 seconds for 6 millions of 
> > commands.
> > Result:  6,440,627 reads (gl thread on + PBO patches)
> > Result:274,960 reads (gl thread off)
> > 
> > In your case, "Result:  77,444,412 reads", I hope you're patient.
> > I think you must wait at least 10 minutes.
> 
> Now, I was patient...
> Tried 2 times but after ~20 minutes I've killed it at first and attached 
> gdb at it during second run.
> 
> 0x7fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from 
> /lib64/libpthread.so.0
> (gdb) bt
> #0  0x7fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from 
> /lib64/libpthread.so.0
> #1  0x7fbda5359453 in ?? () from /usr/local/lib/dri/r600_dri.so
> #2  0x7fbda53661f4 in ?? () from /usr/local/lib/dri/r600_dri.so
> #3  0x00401e18 in ?? ()
> #4  0x004028c7 in ?? ()
> #5  0x7fbda9925781 in fghRedrawWindow () from 
> /usr/lib64/libglut.so.3
> #6  0x7fbda9925c08 in ?? () from /usr/lib64/libglut.so.3
> #7  0x7fbda9926cf9 in fgEnumWindows () from /usr/lib64/libglut.so.3
> #8  0x7fbda9925ce4 in glutMainLoopEvent () from 
> /usr/lib64/libglut.so.3
> #9  0x7fbda9925d85 in glutMainLoop () from /usr/lib64/libglut.so.3
> #10 0x004019fc in ?? ()
> #11 0x7fbda957e541 in __libc_start_main () from /lib64/libc.so.6
> #12 0x00401afa in ?? ()
> 
> Should I do more or not worth it?
> 
> Dieter

Hello Dieter,

To be honest, I don't know how long you will need to wait. 77 million
PBO transfers is a huge number, and the time depends on CPU/memory/PCIe/VRAM/GPU speed.

Based on the image size (194*188*4 bytes), you need to transfer approximately
10522 GiB of data from your GPU, which likely takes around 20 minutes if
PCIe runs at full speed. Honestly, I would leave the application running in the
background for a couple of hours.
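
The estimate above can be reproduced with a few lines of C. This is only a sketch: the function names are made up for illustration, and the 8 GB/s bandwidth used in the usage note is an assumption (roughly the theoretical PCIe 2.0 x16 rate), since the thread does not say what the link actually sustains.

```c
#include <stdint.h>

/* Bytes moved by `reads` read-backs of a width x height image with
 * `bytes_per_pixel` bytes per pixel. 64-bit math avoids overflow. */
uint64_t total_transfer_bytes(uint64_t reads, uint64_t width,
                              uint64_t height, uint64_t bytes_per_pixel)
{
    return reads * width * height * bytes_per_pixel;
}

/* Lower bound on the transfer time for a given sustained bandwidth
 * (in bytes per second). Real transfers will be slower. */
double min_transfer_seconds(uint64_t bytes, double bytes_per_second)
{
    return (double)bytes / bytes_per_second;
}
```

For 77,444,412 reads of a 194 by 188 RGBA image, total_transfer_bytes returns 11,298,210,377,856 bytes, which is about 10522 GiB. At an assumed 8e9 bytes per second, min_transfer_seconds gives roughly 1412 seconds, i.e. a bit over 23 minutes, the same order of magnitude as the 20 minutes quoted above.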

A backtrace without symbols is hard to read, but I'm pretty sure it is
waiting on the glError call.

Cheers,
Gregory

 
> >> > mesa-demos/tests> ./pbo
> >> > ATTENTION: default value of option mesa_glthread overridden by
> >> > environment.
> >> > GL_VERSION = 4.1 Mesa 17.1.0-devel (git-7c8fe31e1c)
> >> > GL_RENDERER = Gallium 0.4 on AMD TURKS (DRM 2.49.0 /
> >> > 4.11.0-rc6-1.g5a51416-default, LLVM 5.0.0)
> >> > Loaded 194 by 188 image
> >> > Converting RGB image to RGBA
> >> > Benchmarking...
> >> > Result:  7712 reads in 4.00 seconds = -383971576.00
> >> > pixels/sec
> >> >
> >> > top - 02:04:42 up 10:05,  4 users,  load average: 1,03, 0,77, 0,71
> >> > Tasks: 265 total,   1 running, 264 sleeping,   0 stopped,   0 zombie
> >> > %Cpu0  :  1,3 us,  0,3 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,
> >> >  0,0 st
> >> > %Cpu1  :  1,3 us,  0,3 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,
> >> >  0,0 st
> >> > %Cpu2  :  1,7 us,  0,0 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,
> >> >  0,0 st
> >> > %Cpu3  :  2,3 us,  0,3 sy,  0,0 ni, 97,3 id,  0,0 wa,  0,0 hi,  0,0 si,
> >> >  0,0 st
> >> > %Cpu4  :  1,7 us,  0,3 sy,  0,0 ni, 98,0 id,  0,0 wa,  0,0 hi,  0,0 si,
> >> >  0,0 st
> >> > %Cpu5  : 98,3 us,  1,7 sy,  0,0 ni,  0,0 id,  0,0 wa,  0,0 hi,  0,0 si,
> >> >  0,0 st
> >> > %Cpu6  :  2,0 us,  0,3 sy,  0,0 ni, 97,7 id,  0,0 wa,  0,0 hi,  0,0 si,
> >> >  0,0 st
> >> > %Cpu7  :  1,7 us,  0,0 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,
> >> >  0,0 st
> >> > KiB Mem : 24680300 total,  8155356 free,  5751864 used, 10773080
> >> > buff/cache
> >> > KiB Swap:0 total,0 free,0 used. 18437888 avail
> >> > Mem
> >> >
> >> >   PID USER  PR  NIVIRTRESSHR S

Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-14 Thread Dieter Nützel


On 14.04.2017 07:53, gregory hainaut wrote:
> On Fri, 14 Apr 2017 05:20:38 +0200
> Dieter Nützel wrote:
>
>> On 14.04.2017 02:06, Dieter Nützel wrote:
>> > Hello Gregory,
>> >
>> > have you tested this with Mesa-demos/tests/pbo 'b' (benchmark)?
>> > It result in crazy numbers and do not 'return' (one core stays @ 100%).
>>
>> This is related to 'mesa_glthread=true'.
>> If I disable (unset) it, all is fine after 'b' benchmark and 'pbo' exit
>> with ESC as expeted.
>> Crazy numbers stay, maybe counter overrun due to BIG numbers? ;-)
>>
>> Hope that helps.
>>
>> Dieter
>
> Hello Dieter,
>
> I tested the demo. There is a pseudo unrelated bug on the exit of the
> application.
>
> Mesa 17.1.0-devel implementation error: In _mesa_DeleteHashTable,
> found non-freed data
>
> I will add a call to a _mesa_HashDeleteAll to fix it.
> i.e. _mesa_HashDeleteAll(shared->ShadowBufferObjects, dummy_cb, ctx);
>
> Now let's go back to the test behavior. The benchmarks will send 4s of
> asynchronous PBO transfer commands. And then will sync gl_thread which
> mean the application thread will be blocked until all PBO transfers are
> done. Gl_thread is faster to dispatch command so you will need to wait
> more before the thread goes back to real life.
>
> On my side, I need to wait around 45 seconds for 6 millions of commands.
> Result:  6,440,627 reads (gl thread on + PBO patches)
> Result:    274,960 reads (gl thread off)
>
> In your case, "Result:  77,444,412 reads", I hope you're patient.
> I think you must wait at least 10 minutes.

Now, I was patient...
I tried twice: the first time I killed it after ~20 minutes, and during the
second run I attached gdb to it.

0x7fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) bt
#0  0x7fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x7fbda5359453 in ?? () from /usr/local/lib/dri/r600_dri.so
#2  0x7fbda53661f4 in ?? () from /usr/local/lib/dri/r600_dri.so
#3  0x00401e18 in ?? ()
#4  0x004028c7 in ?? ()
#5  0x7fbda9925781 in fghRedrawWindow () from /usr/lib64/libglut.so.3
#6  0x7fbda9925c08 in ?? () from /usr/lib64/libglut.so.3
#7  0x7fbda9926cf9 in fgEnumWindows () from /usr/lib64/libglut.so.3
#8  0x7fbda9925ce4 in glutMainLoopEvent () from /usr/lib64/libglut.so.3
#9  0x7fbda9925d85 in glutMainLoop () from /usr/lib64/libglut.so.3
#10 0x004019fc in ?? ()
#11 0x7fbda957e541 in __libc_start_main () from /lib64/libc.so.6
#12 0x00401afa in ?? ()

Should I dig further, or is it not worth it?

Dieter


> mesa-demos/tests> ./pbo
> ATTENTION: default value of option mesa_glthread overridden by
> environment.
> GL_VERSION = 4.1 Mesa 17.1.0-devel (git-7c8fe31e1c)
> GL_RENDERER = Gallium 0.4 on AMD TURKS (DRM 2.49.0 /
> 4.11.0-rc6-1.g5a51416-default, LLVM 5.0.0)
> Loaded 194 by 188 image
> Converting RGB image to RGBA
> Benchmarking...
> Result:  7712 reads in 4.00 seconds = -383971576.00
> pixels/sec
>
> top - 02:04:42 up 10:05,  4 users,  load average: 1,03, 0,77, 0,71
> Tasks: 265 total,   1 running, 264 sleeping,   0 stopped,   0 zombie
> %Cpu0  :  1,3 us,  0,3 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,
>  0,0 st
> %Cpu1  :  1,3 us,  0,3 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,
>  0,0 st
> %Cpu2  :  1,7 us,  0,0 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,
>  0,0 st
> %Cpu3  :  2,3 us,  0,3 sy,  0,0 ni, 97,3 id,  0,0 wa,  0,0 hi,  0,0 si,
>  0,0 st
> %Cpu4  :  1,7 us,  0,3 sy,  0,0 ni, 98,0 id,  0,0 wa,  0,0 hi,  0,0 si,
>  0,0 st
> %Cpu5  : 98,3 us,  1,7 sy,  0,0 ni,  0,0 id,  0,0 wa,  0,0 hi,  0,0 si,
>  0,0 st
> %Cpu6  :  2,0 us,  0,3 sy,  0,0 ni, 97,7 id,  0,0 wa,  0,0 hi,  0,0 si,
>  0,0 st
> %Cpu7  :  1,7 us,  0,0 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,
>  0,0 st
> KiB Mem : 24680300 total,  8155356 free,  5751864 used, 10773080
> buff/cache
> KiB Swap:0 total,0 free,0 used. 18437888 avail
> Mem
>
>   PID USER  PR  NIVIRTRESSHR S  %CPU  %MEM TIME+
> COMMAND
> 19380 dieter20   0 3259764 2,911g  22472 S 100,3 12,37   2:28.48
> pbo
> 27937 dieter20   0 4029572 570236 166116 S 5,980 2,310   9:45.53
> konqueror
> 13432 dieter20   0 1922820 269892 129152 S 5,648 1,094   4:33.80
> Web Content
>
> Other than that:
>
> For the series:
>
> Tested-by: Dieter Nützel 
> r600g, Turks XT (6670)
>
> Dieter
>
> Am 13.04.2017 19:32, schrieb Gregory Hainaut:
>> Hello,
>>
>> Please find a new version to handle invalid buffer handles.
>>
>> Allow to handle this kind of case:
>>genBuffer();
>>BindBuffer(pbo)
>>DeleteBuffer(pbo);
>>BindBuffer(rand_pbo)
>>TexSubImage2D(user_memory_pointer); // Data transfer will be
>> synchronous
>>
>> There are various subtely to handle multi threaded shared context. In
>> order to
>> keep the code sane, I've considered a buffer invalid when it is
>> deleted by a
>> context even it is still bound to others contexts. It will force a
>> synchronous
>> transfer which

Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-14 Thread gregory hainaut

On Fri, 14 Apr 2017 07:53:06 +0200
gregory hainaut  wrote:

> On Fri, 14 Apr 2017 05:20:38 +0200
> Dieter Nützel  wrote:
> 
> > Am 14.04.2017 02:06, schrieb Dieter Nützel:
> > > Hello Gregory,
> > > 
> > > have you tested this with Mesa-demos/tests/pbo 'b' (benchmark)?
> > > It result in crazy numbers and do not 'return' (one core stays @ 100%).
> > 
> > This is related to 'mesa_glthread=true'.
> > If I disable (unset) it, all is fine after 'b' benchmark and 'pbo' exit 
> > with ESC as expeted.
> > Crazy numbers stay, maybe counter overrun due to BIG numbers? ;-)
> > 
> > Hope that helps.
> > 
> > Dieter
> 
> Hello Dieter,
> 
> I tested the demo. There is a pseudo unrelated bug on the exit of the 
> application.
> 
> Mesa 17.1.0-devel implementation error: In _mesa_DeleteHashTable, found 
> non-freed data
> 
> I will add a call to a _mesa_HashDeleteAll to fix it.
> i.e. _mesa_HashDeleteAll(shared->ShadowBufferObjects, dummy_cb, ctx);
> 
> Now let's go back to the test behavior. The benchmarks will send 4s of
> asynchronous PBO transfer commands. And then will sync gl_thread which
> mean the application thread will be blocked until all PBO transfers are
> done. Gl_thread is faster to dispatch command so you will need to wait
> more before the thread goes back to real life.
> 
> On my side, I need to wait around 45 seconds for 6 millions of commands.
> Result:  6,440,627 reads (gl thread on + PBO patches)
> Result:274,960 reads (gl thread off)
> 
> In your case, "Result:  77,444,412 reads", I hope you're patient.
> I think you must wait at least 10 minutes.
> 
> Best regards,
> Gregory

And to complete my answer on the crazy numbers: yes, there is an integer
overflow when reads is too big. It would be better to promote it to double.

  pixelsPerSecond = reads * ImgWidth * ImgHeight / seconds;

That being said, pixelsPerSecond doesn't have any physical meaning here.
reads * ImgWidth * ImgHeight is the number of pixels that *will* be transferred,
not the number of pixels that *were* transferred. As you have noticed,
the transfer takes longer than the value of seconds (which is 4).
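
To make the overflow concrete, here is a small sketch. It assumes that reads, ImgWidth and ImgHeight are 32-bit ints in the demo (the thread does not show their declarations), and the function names are hypothetical. The wrap-around is emulated with unsigned arithmetic because signed overflow is undefined behaviour in C.

```c
#include <stdint.h>

/* Value a 32-bit int product ends up holding on a two's-complement
 * machine. Computed with unsigned math, which wraps modulo 2^32. */
int32_t wrapped_pixel_count(int32_t reads, int32_t width, int32_t height)
{
    uint32_t u = (uint32_t)reads * (uint32_t)width * (uint32_t)height;
    if (u > (uint32_t)INT32_MAX)
        return (int32_t)((int64_t)u - 4294967296LL);  /* subtract 2^32 */
    return (int32_t)u;
}

/* Same formula with every factor promoted to double before the
 * multiplication, so the product cannot wrap. */
double pixels_per_second(int32_t reads, int32_t width, int32_t height,
                         double seconds)
{
    return (double)reads * (double)width * (double)height / seconds;
}
```

With reads = 77444412, a 194 by 188 image and seconds = 4.0, the wrapped product is -1535886304, and dividing that by 4.0 gives -383971576.00, the value in the benchmark output. The promoted version gives 706138148616 instead, although as noted above that figure still counts queued pixels rather than transferred ones.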

  
> > > mesa-demos/tests> ./pbo
> > > ATTENTION: default value of option mesa_glthread overridden by 
> > > environment.
> > > GL_VERSION = 4.1 Mesa 17.1.0-devel (git-7c8fe31e1c)
> > > GL_RENDERER = Gallium 0.4 on AMD TURKS (DRM 2.49.0 /
> > > 4.11.0-rc6-1.g5a51416-default, LLVM 5.0.0)
> > > Loaded 194 by 188 image
> > > Converting RGB image to RGBA
> > > Benchmarking...
> > > Result:  7712 reads in 4.00 seconds = -383971576.00 
> > > pixels/sec
> > > 
> > > top - 02:04:42 up 10:05,  4 users,  load average: 1,03, 0,77, 0,71
> > > Tasks: 265 total,   1 running, 264 sleeping,   0 stopped,   0 zombie
> > > %Cpu0  :  1,3 us,  0,3 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si, 
> > >  0,0 st
> > > %Cpu1  :  1,3 us,  0,3 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si, 
> > >  0,0 st
> > > %Cpu2  :  1,7 us,  0,0 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si, 
> > >  0,0 st
> > > %Cpu3  :  2,3 us,  0,3 sy,  0,0 ni, 97,3 id,  0,0 wa,  0,0 hi,  0,0 si, 
> > >  0,0 st
> > > %Cpu4  :  1,7 us,  0,3 sy,  0,0 ni, 98,0 id,  0,0 wa,  0,0 hi,  0,0 si, 
> > >  0,0 st
> > > %Cpu5  : 98,3 us,  1,7 sy,  0,0 ni,  0,0 id,  0,0 wa,  0,0 hi,  0,0 si, 
> > >  0,0 st
> > > %Cpu6  :  2,0 us,  0,3 sy,  0,0 ni, 97,7 id,  0,0 wa,  0,0 hi,  0,0 si, 
> > >  0,0 st
> > > %Cpu7  :  1,7 us,  0,0 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si, 
> > >  0,0 st
> > > KiB Mem : 24680300 total,  8155356 free,  5751864 used, 10773080 
> > > buff/cache
> > > KiB Swap:0 total,0 free,0 used. 18437888 avail 
> > > Mem
> > > 
> > >   PID USER  PR  NIVIRTRESSHR S  %CPU  %MEM TIME+ 
> > > COMMAND
> > > 19380 dieter20   0 3259764 2,911g  22472 S 100,3 12,37   2:28.48 
> > > pbo
> > > 27937 dieter20   0 4029572 570236 166116 S 5,980 2,310   9:45.53 
> > > konqueror
> > > 13432 dieter20   0 1922820 269892 129152 S 5,648 1,094   4:33.80 
> > > Web Content
> > > 
> > > Other than that:
> > > 
> > > For the series:
> > > 
> > > Tested-by: Dieter Nützel 
> > > r600g, Turks XT (6670)
> > > 
> > > Dieter
> > > 
> > > Am 13.04.2017 19:32, schrieb Gregory Hainaut:
> > >> Hello,
> > >> 
> > >> Please find a new version to handle invalid buffer handles.
> > >> 
> > >> Allow to handle this kind of case:
> > >>genBuffer();
> > >>BindBuffer(pbo)
> > >>DeleteBuffer(pbo);
> > >>BindBuffer(rand_pbo)
> > >>TexSubImage2D(user_memory_pointer); // Data transfer will be 
> > >> synchronous
> > >> 
> > >> There are various subtely to handle multi threaded shared context. In 
> > >> order to
> > >> keep the code sane, I've considered a buffer invalid when it is 
> > >> deleted by a
> > >> context even it is still bound to others contexts. It will force a 
> > >> synchronous
> > >> transfer which is always safe.
> > >> 
> > >> An example could be
> >

Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-13 Thread gregory hainaut

On Fri, 14 Apr 2017 05:20:38 +0200
Dieter Nützel  wrote:

> Am 14.04.2017 02:06, schrieb Dieter Nützel:
> > Hello Gregory,
> > 
> > have you tested this with Mesa-demos/tests/pbo 'b' (benchmark)?
> > It result in crazy numbers and do not 'return' (one core stays @ 100%).
> 
> This is related to 'mesa_glthread=true'.
> If I disable (unset) it, all is fine after 'b' benchmark and 'pbo' exit 
> with ESC as expeted.
> Crazy numbers stay, maybe counter overrun due to BIG numbers? ;-)
> 
> Hope that helps.
> 
> Dieter

Hello Dieter,

I tested the demo. There is a mostly unrelated bug on exit of the
application.

Mesa 17.1.0-devel implementation error: In _mesa_DeleteHashTable, found 
non-freed data

I will add a call to _mesa_HashDeleteAll to fix it,
i.e. _mesa_HashDeleteAll(shared->ShadowBufferObjects, dummy_cb, ctx);

Now let's go back to the test behavior. The benchmark sends 4 seconds' worth
of asynchronous PBO transfer commands and then syncs gl_thread, which
means the application thread is blocked until all PBO transfers are
done. With gl_thread, commands are dispatched faster, so many more of them get
queued and you will need to wait longer before the thread comes back to life.

On my side, I need to wait around 45 seconds for 6 million commands.
Result:  6,440,627 reads (gl thread on + PBO patches)
Result:    274,960 reads (gl thread off)

In your case, "Result:  77,444,412 reads", I hope you're patient.
I think you must wait at least 10 minutes.
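
The "at least 10 minutes" figure can be sanity-checked by scaling the numbers above. This is only a back-of-the-envelope sketch: it assumes the second machine drains the queue at the same rate as the first one (about 6.4 million commands in 45 seconds), which may well not hold, and the function name is made up for illustration.

```c
#include <stdint.h>

/* Time needed to drain `queued` commands, assuming the same throughput
 * as a reference run that drained `ref_commands` in `ref_seconds`. */
double estimated_drain_seconds(uint64_t queued, uint64_t ref_commands,
                               double ref_seconds)
{
    double commands_per_second = (double)ref_commands / ref_seconds;
    return (double)queued / commands_per_second;
}
```

Calling estimated_drain_seconds(77444412, 6440627, 45.0) gives about 541 seconds, i.e. roughly 9 minutes, provided the hardware is comparable.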

Best regards,
Gregory
 
> > mesa-demos/tests> ./pbo
> > ATTENTION: default value of option mesa_glthread overridden by 
> > environment.
> > GL_VERSION = 4.1 Mesa 17.1.0-devel (git-7c8fe31e1c)
> > GL_RENDERER = Gallium 0.4 on AMD TURKS (DRM 2.49.0 /
> > 4.11.0-rc6-1.g5a51416-default, LLVM 5.0.0)
> > Loaded 194 by 188 image
> > Converting RGB image to RGBA
> > Benchmarking...
> > Result:  7712 reads in 4.00 seconds = -383971576.00 
> > pixels/sec
> > 
> > top - 02:04:42 up 10:05,  4 users,  load average: 1,03, 0,77, 0,71
> > Tasks: 265 total,   1 running, 264 sleeping,   0 stopped,   0 zombie
> > %Cpu0  :  1,3 us,  0,3 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si, 
> >  0,0 st
> > %Cpu1  :  1,3 us,  0,3 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si, 
> >  0,0 st
> > %Cpu2  :  1,7 us,  0,0 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si, 
> >  0,0 st
> > %Cpu3  :  2,3 us,  0,3 sy,  0,0 ni, 97,3 id,  0,0 wa,  0,0 hi,  0,0 si, 
> >  0,0 st
> > %Cpu4  :  1,7 us,  0,3 sy,  0,0 ni, 98,0 id,  0,0 wa,  0,0 hi,  0,0 si, 
> >  0,0 st
> > %Cpu5  : 98,3 us,  1,7 sy,  0,0 ni,  0,0 id,  0,0 wa,  0,0 hi,  0,0 si, 
> >  0,0 st
> > %Cpu6  :  2,0 us,  0,3 sy,  0,0 ni, 97,7 id,  0,0 wa,  0,0 hi,  0,0 si, 
> >  0,0 st
> > %Cpu7  :  1,7 us,  0,0 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si, 
> >  0,0 st
> > KiB Mem : 24680300 total,  8155356 free,  5751864 used, 10773080 
> > buff/cache
> > KiB Swap:0 total,0 free,0 used. 18437888 avail 
> > Mem
> > 
> >   PID USER  PR  NIVIRTRESSHR S  %CPU  %MEM TIME+ 
> > COMMAND
> > 19380 dieter20   0 3259764 2,911g  22472 S 100,3 12,37   2:28.48 
> > pbo
> > 27937 dieter20   0 4029572 570236 166116 S 5,980 2,310   9:45.53 
> > konqueror
> > 13432 dieter20   0 1922820 269892 129152 S 5,648 1,094   4:33.80 
> > Web Content
> > 
> > Other than that:
> > 
> > For the series:
> > 
> > Tested-by: Dieter Nützel 
> > r600g, Turks XT (6670)
> > 
> > Dieter
> > 
> > Am 13.04.2017 19:32, schrieb Gregory Hainaut:
> >> Hello,
> >> 
> >> Please find a new version to handle invalid buffer handles.
> >> 
> >> Allow to handle this kind of case:
> >>genBuffer();
> >>BindBuffer(pbo)
> >>DeleteBuffer(pbo);
> >>BindBuffer(rand_pbo)
> >>TexSubImage2D(user_memory_pointer); // Data transfer will be 
> >> synchronous
> >> 
> >> There are various subtely to handle multi threaded shared context. In 
> >> order to
> >> keep the code sane, I've considered a buffer invalid when it is 
> >> deleted by a
> >> context even it is still bound to others contexts. It will force a 
> >> synchronous
> >> transfer which is always safe.
> >> 
> >> An example could be
> >>Ctx A: glGenBuffers(1, );
> >>Ctx A: glBindBuffer(PIXEL_UNPACK_BUFFER, pbo);
> >>Ctx B: glDeleteBuffers(1, );
> >>Ctx A: glTexSubImage2D(...); // will be synchronous, even though it
> >>_could_ be asynchronous (because the PBO that was generated first 
> >> is
> >>still bound!)
> >> 
> >> V3: I mixed up the number so I jumped right away to v4...
> >> V4: improve commments based on Nicolai feedback
> >> 
> >> Best regards,
> >> 
> >> Gregory Hainaut (3):
> >>   mesa/glthread: track buffer creation/destruction
> >>   mesa/glthread: add tracking of PBO binding
> >>   mapi/glthread: generate asynchronous code for PBO transfer
> >> 
> >>  src/mapi/glapi/gen/ARB_direct_state_access.xml |  18 +--
> >>  src/mapi/glapi/gen/ARB_robustness.xml  |   2 +-
> >>  src/mapi/glapi/gen/gl_API.dtd

Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-13 Thread Dieter Nützel


On 14.04.2017 02:06, Dieter Nützel wrote:
> Hello Gregory,
>
> have you tested this with Mesa-demos/tests/pbo 'b' (benchmark)?
> It result in crazy numbers and do not 'return' (one core stays @ 100%).

This is related to 'mesa_glthread=true'.
If I disable (unset) it, all is fine after the 'b' benchmark, and 'pbo' exits
with ESC as expected.
The crazy numbers stay, maybe a counter overrun due to BIG numbers? ;-)

Hope that helps.

Dieter



Re: [Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-13 Thread Dieter Nützel


Hello Gregory,

have you tested this with Mesa-demos/tests/pbo 'b' (benchmark)?
It results in crazy numbers and does not 'return' (one core stays at 100%).

mesa-demos/tests> ./pbo
ATTENTION: default value of option mesa_glthread overridden by environment.
GL_VERSION = 4.1 Mesa 17.1.0-devel (git-7c8fe31e1c)
GL_RENDERER = Gallium 0.4 on AMD TURKS (DRM 2.49.0 / 4.11.0-rc6-1.g5a51416-default, LLVM 5.0.0)
Loaded 194 by 188 image
Converting RGB image to RGBA
Benchmarking...
Result:  77,444,412 reads in 4.00 seconds = -383971576.00 pixels/sec


top - 02:04:42 up 10:05,  4 users,  load average: 1,03, 0,77, 0,71
Tasks: 265 total,   1 running, 264 sleeping,   0 stopped,   0 zombie
%Cpu0  :  1,3 us,  0,3 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu1  :  1,3 us,  0,3 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu2  :  1,7 us,  0,0 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu3  :  2,3 us,  0,3 sy,  0,0 ni, 97,3 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu4  :  1,7 us,  0,3 sy,  0,0 ni, 98,0 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu5  : 98,3 us,  1,7 sy,  0,0 ni,  0,0 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu6  :  2,0 us,  0,3 sy,  0,0 ni, 97,7 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu7  :  1,7 us,  0,0 sy,  0,0 ni, 98,3 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
KiB Mem : 24680300 total,  8155356 free,  5751864 used, 10773080 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 18437888 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
19380 dieter    20   0 3259764 2,911g  22472 S 100,3 12,37   2:28.48 pbo
27937 dieter    20   0 4029572 570236 166116 S 5,980 2,310   9:45.53 konqueror
13432 dieter    20   0 1922820 269892 129152 S 5,648 1,094   4:33.80 Web Content


Other than that:

For the series:

Tested-by: Dieter Nützel 
r600g, Turks XT (6670)

Dieter


[Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

2017-04-13 Thread Gregory Hainaut

Hello,

Please find a new version to handle invalid buffer handles.

It handles this kind of case:
   GenBuffer(&pbo);
   BindBuffer(pbo);
   DeleteBuffer(pbo);
   BindBuffer(rand_pbo);
   TexSubImage2D(user_memory_pointer); // Data transfer will be synchronous

There are various subtleties in handling multithreaded shared contexts. To
keep the code sane, I consider a buffer invalid as soon as any context deletes
it, even if it is still bound in other contexts. This forces a synchronous
transfer, which is always safe.

An example could be:
   Ctx A: glGenBuffers(1, &pbo);
   Ctx A: glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
   Ctx B: glDeleteBuffers(1, &pbo);
   Ctx A: glTexSubImage2D(...); // will be synchronous, even though it
                                // _could_ be asynchronous (because the PBO
                                // that was generated first is still bound!)
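
The rule above can be illustrated with a small standalone sketch. This is not
the code from the patches: every name here (shared_state, context, gen_buffer,
delete_buffer, bind_unpack_pbo, transfer_can_be_async) is hypothetical, buffer
names are handed out sequentially, and the locking a real shared context would
need is left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BUFFERS 64 /* arbitrary limit for this sketch */

/* Hypothetical state shared by every context in a share group. */
struct shared_state {
    bool valid[MAX_BUFFERS]; /* true between creation and deletion */
    unsigned next_id;        /* next name to hand out; 0 is reserved */
};

/* Hypothetical per-context state tracked on the application thread. */
struct context {
    struct shared_state *shared;
    unsigned bound_unpack_pbo; /* 0 means no PBO is bound */
};

static void shared_init(struct shared_state *s)
{
    assert(s != NULL);
    for (unsigned i = 0; i < MAX_BUFFERS; i++)
        s->valid[i] = false;
    s->next_id = 1;
}

/* Track buffer creation: the new name becomes valid for all contexts. */
static unsigned gen_buffer(struct context *ctx)
{
    struct shared_state *s = ctx->shared;
    if (s->next_id >= MAX_BUFFERS)
        return 0; /* out of names in this sketch */
    unsigned id = s->next_id++;
    s->valid[id] = true;
    return id;
}

/* Track buffer destruction: the name becomes invalid for every context,
 * even for contexts that still have it bound. */
static void delete_buffer(struct context *ctx, unsigned id)
{
    if (id != 0 && id < MAX_BUFFERS)
        ctx->shared->valid[id] = false;
}

/* Track the unpack PBO binding; the name is not checked at bind time. */
static void bind_unpack_pbo(struct context *ctx, unsigned id)
{
    ctx->bound_unpack_pbo = id;
}

/* A transfer may be queued asynchronously only if a PBO is bound and its
 * name is still valid; anything else falls back to a synchronous call. */
static bool transfer_can_be_async(const struct context *ctx)
{
    unsigned id = ctx->bound_unpack_pbo;
    return id != 0 && id < MAX_BUFFERS && ctx->shared->valid[id];
}
```

With two contexts A and B pointing at the same shared_state, binding a freshly
generated buffer in A makes transfer_can_be_async(&A) return true; after
delete_buffer(&B, pbo) it returns false, which corresponds to the conservative
synchronous fallback in the example. Binding a name that was never generated
also gives false.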

V3: I mixed up the numbering, so I jumped straight to v4...
V4: improve comments based on Nicolai's feedback

Best regards,

Gregory Hainaut (3):
  mesa/glthread: track buffer creation/destruction
  mesa/glthread: add tracking of PBO binding
  mapi/glthread: generate asynchronous code for PBO transfer

 src/mapi/glapi/gen/ARB_direct_state_access.xml |  18 +--
 src/mapi/glapi/gen/ARB_robustness.xml          |   2 +-
 src/mapi/glapi/gen/gl_API.dtd                  |  10 +-
 src/mapi/glapi/gen/gl_API.xml                  |  32 +++---
 src/mapi/glapi/gen/gl_marshal.py               |  23 +++-
 src/mapi/glapi/gen/marshal_XML.py              |  21 +++-
 src/mesa/main/glthread.h                       |  10 ++
 src/mesa/main/marshal.c                        | 149 -
 src/mesa/main/marshal.h                        |  24
 src/mesa/main/mtypes.h                         |   5 +
 src/mesa/main/shared.c                         |   4 +
 11 files changed, 259 insertions(+), 39 deletions(-)

-- 
2.1.4

