Modify __down_read_trylock() to make it generate slightly better code
(smaller and maybe a tiny bit faster).
Before this patch, down_read_trylock:
0x <+0>: callq 0x5
0x0005 <+5>: jmp 0x18
0x0007 <+7>: lea 0x1(%rdx),%rcx
0x00
On Mon, Feb 11, 2019 at 02:31:26PM -0500, Waiman Long wrote:
> Modify __down_read_trylock() to make it generate slightly better code
> (smaller and maybe a tiny bit faster).
>
> Before this patch, down_read_trylock:
>
> 0x <+0>: callq 0x5
> 0x0005 <+5>:
On Tue, Feb 12, 2019 at 02:24:04PM +0100, Peter Zijlstra wrote:
> On Mon, Feb 11, 2019 at 02:31:26PM -0500, Waiman Long wrote:
> > Modify __down_read_trylock() to make it generate slightly better code
> > (smaller and maybe a tiny bit faster).
> >
> > Before this patch, down_read_trylock:
> >
> >
On 02/12/2019 08:25 AM, Peter Zijlstra wrote:
> On Tue, Feb 12, 2019 at 02:24:04PM +0100, Peter Zijlstra wrote:
>> On Mon, Feb 11, 2019 at 02:31:26PM -0500, Waiman Long wrote:
>>> Modify __down_read_trylock() to make it generate slightly better code
>>> (smaller and maybe a tiny bit faster).
>>>
>>
On 02/12/2019 01:36 PM, Waiman Long wrote:
> On 02/12/2019 08:25 AM, Peter Zijlstra wrote:
>> On Tue, Feb 12, 2019 at 02:24:04PM +0100, Peter Zijlstra wrote:
>>> On Mon, Feb 11, 2019 at 02:31:26PM -0500, Waiman Long wrote:
>>>> Modify __down_read_trylock() to make it generate slightly better code
>
On Mon, Feb 11, 2019 at 11:31 AM Waiman Long wrote:
>
> Modify __down_read_trylock() to make it generate slightly better code
> (smaller and maybe a tiny bit faster).
This looks good, but I would ask you to try one slightly different approach.
Instead of this:
> long tmp = atomic_long_re
On 02/12/2019 02:58 PM, Linus Torvalds wrote:
> On Mon, Feb 11, 2019 at 11:31 AM Waiman Long wrote:
>> Modify __down_read_trylock() to make it generate slightly better code
>> (smaller and maybe a tiny bit faster).
> This looks good, but I would ask you to try one slightly different approach.
>
>
* Waiman Long wrote:
> I looked at the assembly code in arch/x86/include/asm/rwsem.h. For both
> trylocks (read & write), the count is read first before attempting to
> lock it. We did the same for all trylock functions in other locks.
> Depending on how the trylock is used and how contended th
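
For readers following along, the "read the count first, then attempt to lock" pattern described in the quoted text can be sketched in portable C11 roughly as below. This is only an illustration under stated assumptions, not the kernel's code: struct sketch_rwsem, SKETCH_READER_BIAS, sketch_read_trylock() and sketch_demo() are hypothetical names, and the convention that a negative count means "not available to readers" is assumed for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reader bias; not the kernel's constant. */
#define SKETCH_READER_BIAS 1L

/* Hypothetical stand-in for a reader/writer semaphore.
 * Assumed convention: count >= 0 is the number of active readers,
 * count < 0 means readers may not take the lock. */
struct sketch_rwsem {
	atomic_long count;
};

/* Read the count first; only attempt the atomic update while the
 * value just observed says a reader may enter. */
bool sketch_read_trylock(struct sketch_rwsem *sem)
{
	long tmp = atomic_load_explicit(&sem->count, memory_order_relaxed);

	while (tmp >= 0) {
		/* On failure the compare-exchange stores the current count
		 * into tmp, so the loop needs no separate re-read. */
		if (atomic_compare_exchange_strong_explicit(&sem->count, &tmp,
				tmp + SKETCH_READER_BIAS,
				memory_order_acquire, memory_order_relaxed))
			return true;
	}
	return false;
}

/* Small usage example: one successful reader, then a blocked attempt. */
void sketch_demo(void)
{
	struct sketch_rwsem sem;

	atomic_init(&sem.count, 0);
	assert(sketch_read_trylock(&sem));	/* free: reader gets in */
	atomic_store(&sem.count, -1);		/* simulate "unavailable" */
	assert(!sketch_read_trylock(&sem));	/* initial read fails the check */
}
```

In this sketch the initial load lets a trylock on an unavailable lock return without issuing any atomic read-modify-write. Whether that saves anything in practice depends on how the trylock is used and how contended the lock is.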
On 02/13/2019 02:45 AM, Ingo Molnar wrote:
> * Waiman Long wrote:
>
>> I looked at the assembly code in arch/x86/include/asm/rwsem.h. For both
>> trylocks (read & write), the count is read first before attempting to
>> lock it. We did the same for all trylock functions in other locks.
>> Depending