On Mon, 22 May 2023, Sui Jingfeng <15330273...@189.cn> wrote:
> Hi,
>
> On 2023/5/22 19:29, Jani Nikula wrote:
>> On Thu, 18 May 2023, Sui Jingfeng <15330273...@189.cn> wrote:
>>> On 2023/5/17 18:59, David Laight wrote:
>>>> From: 15330273...@189.cn
>>>>> Sent: 16 May 2023 18:30
>>>>>
>>>>> From: Sui Jingfeng <suijingf...@loongson.cn>
>>>>>
>>>>> Both mode->crtc_htotal and mode->crtc_vtotal are of type u16, so
>>>>> mode->crtc_htotal * mode->crtc_vtotal will result in an unsigned type.
>>>> Nope, u16 gets promoted to 'signed int' and the result of the
>>>> multiply is also signed.
>>> I believe that signed or unsigned depends on the declaration.
>>>
>>> I am talking about the math, while you are talking about the compiler.
>>>
>>> I admit that it is true that u16 gets promoted to 'signed int', but this
>>> is irrelevant;
>>>
>>> the point is how to interpret the returned value.
>>>
>>>
>>> How the compiler generates the code is one thing; how we interpret the
>>> result is another.
>>>
>>> How the compiler generates the code is NOT determined by us, while how
>>> we interpret the result is determined by us.
>>>
>>>
>>> I believe that using a u32 type to interpret the result (u16 * u16) is
>>> always correct; it is correct from the perspective of *math*.
>>>
>>> Integer promotion is a detail of the C programming language. If the
>>> result of the multiply is signed, then there is a risk that
>>>
>>> the result is negative. What is the benefit of presenting this risk to
>>> the programmer?
>>>
>>> What is the benefit of telling me (and others) that u16 * u16 yields a
>>> signed value that can be negative?
>>>
>>> Using int as the return type raises concerns for the programmer and
>>> the user of the function,
>>>
>>> even though a negative result is not possible in practice.
>> In general, do not use unsigned types in arithmetic to avoid negative
>> values, because most people will be tripped up by integer promotion
>> rules, and you'll get negative values anyway.
>>
>> I'll bet most people will be surprised to see what this prints:
>>
>> #include <stdio.h>
>> #include <stdint.h>
>>
>> int main(void)
>> {
>>      uint16_t x = 0xffff;
>>      uint16_t y = 0xffff;
>>      uint64_t z = x * y;
>>
>>      printf("0x%016lx\n", z);
>>      printf("%ld\n", z);
>
> Here, please replace "%ld\n" with "%lu\n"; then you will see the
> difference.
>
> You are treating the variable 'z' as a signed value: "%d" prints a
> signed value, and "%u" prints an unsigned value.
>
>
> Your simple code shows exactly why you are still confused,

Am I?

Take a look at the values, and explain the math.


BR,
Jani.

>
> that is, u16 * u16 can yield a negative value if you use int as the
> return type, because it overflows.
>
>>      printf("%d\n", x * y);
>> }
>>
>> And it's not that different from what you have below. Your patch doesn't
>> change anything, and doesn't make it any less confusing.
>>
>> BR,
>> Jani.
>>
>>
>>>>> A u32 is enough to store the result, but since the result is cast
>>>>> to u64 soon afterwards, use a u64 type directly. Then there is no
>>>>> need to cast it to a signed type and back.
>>>> ....
>>>>> -         int frame_size = mode->crtc_htotal * mode->crtc_vtotal;
>>>>> +         u64 frame_size = mode->crtc_htotal * mode->crtc_vtotal;
>>>> ...
>>>>> -         framedur_ns = div_u64((u64) frame_size * 1000000, dotclock);
>>>>> +         framedur_ns = div_u64(frame_size * 1000000, dotclock);
>>>> The (u64) cast is there to extend the value to 64bits, not
>>>> because the original type is signed.
>>> Sorry about my wording; I don't think my sentence said anything
>>> about 'because the original type is signed'.
>>>
>>> On the contrary, my patch removes a concern for the reviewer. It
>>> says that the result of the multiply can't be negative.
>>>
>>> My intent is to tell the compiler that we want an unsigned return type,
>>> but GCC emits an 'imul' instruction for the multiply......
>>>
>>> I'm using u64 as the return type because the div_u64() function accepts
>>> a u64 value as its first argument.
>>>
>>>> The compiler will detect that the old code is a 32x32 multiply
>>>> where a 64bit result is needed, that may not be true for the
>>>> changed code (it would need to track back as far as the u16s).
>>> I don't believe my code could be wrong.
>>>
>>> When you use the word 'may', you are saying that it could be wrong after
>>> applying my patch.
>>>
>>> Then you have to find at least one test case to prove your point, in
>>> which my code generates wrong results.
>>>
>>> Again, I don't believe you could find one.
>>>
>>>> It is not uncommon to force a 64bit result from a multiply
>>>> by making the constant 64bit. As in:
>>>>    div_u64(frame_size * 1000000ULL, dotclock);
>>> In fact, after applying this patch, the generated assembly is the same
>>> as before.
>>>
>>> This may be because GCC is smart enough to generate optimized code in
>>> either case;
>>>
>>> I think it could be different at a different optimization level.
>>>
>>> I have tested this patch on three different architectures, and I still
>>> cannot find an error.
>>>
>>> Below is the assembly extract on x86-64. Because GCC generates the same
>>> code in either case,
>>>
>>> I have pasted only one copy here.
>>>
>>>
>>> 0000000000000530 <drm_calc_timestamping_constants>:
>>>        530:    f3 0f 1e fa              endbr64
>>>        534:    e8 00 00 00 00           callq  539 <drm_calc_timestamping_constants+0x9>
>>>        539:    55                       push   %rbp
>>>        53a:    48 89 e5                 mov    %rsp,%rbp
>>>        53d:    41 57                    push   %r15
>>>        53f:    41 56                    push   %r14
>>>        541:    41 55                    push   %r13
>>>        543:    41 54                    push   %r12
>>>        545:    53                       push   %rbx
>>>        546:    48 83 ec 18              sub    $0x18,%rsp
>>>        54a:    4c 8b 3f                 mov    (%rdi),%r15
>>>        54d:    41 8b 87 6c 01 00 00     mov    0x16c(%r15),%eax
>>>        554:    85 c0                    test   %eax,%eax
>>>        556:    0f 84 ec 00 00 00        je     648 <drm_calc_timestamping_constants+0x118>
>>>        55c:    44 8b 87 90 00 00 00     mov    0x90(%rdi),%r8d
>>>        563:    49 89 fc                 mov    %rdi,%r12
>>>        566:    44 39 c0                 cmp    %r8d,%eax
>>>        569:    0f 86 40 01 00 00        jbe    6af <drm_calc_timestamping_constants+0x17f>
>>>        56f:    44 8b 76 1c              mov    0x1c(%rsi),%r14d
>>>        573:    49 8b 8f 40 01 00 00     mov    0x140(%r15),%rcx
>>>        57a:    48 89 f3                 mov    %rsi,%rbx
>>>        57d:    45 85 f6                 test   %r14d,%r14d
>>>        580:    0f 8e d5 00 00 00        jle    65b <drm_calc_timestamping_constants+0x12b>
>>>        586:    0f b7 43 2a              movzwl 0x2a(%rbx),%eax
>>>        58a:    49 63 f6                 movslq %r14d,%rsi
>>>        58d:    31 d2                    xor    %edx,%edx
>>>        58f:    48 89 c7                 mov    %rax,%rdi
>>>        592:    48 69 c0 40 42 0f 00     imul   $0xf4240,%rax,%rax
>>>        599:    48 f7 f6                 div    %rsi
>>>        59c:    31 d2                    xor    %edx,%edx
>>>        59e:    48 89 45 d0              mov    %rax,-0x30(%rbp)
>>>        5a2:    0f b7 43 38              movzwl 0x38(%rbx),%eax
>>>        5a6:    0f af c7                 imul   %edi,%eax
>>>        5a9:    48 98                    cltq
>>>        5ab:    48 69 c0 40 42 0f 00     imul   $0xf4240,%rax,%rax
>>>        5b2:    48 f7 f6                 div    %rsi
>>>        5b5:    41 89 c5                 mov    %eax,%r13d
>>>        5b8:    f6 43 18 10              testb  $0x10,0x18(%rbx)
>>>        5bc:    74 0a                    je     5c8 <drm_calc_timestamping_constants+0x98>
>>>        5be:    41 c1 ed 1f              shr    $0x1f,%r13d
>>>        5c2:    41 01 c5                 add    %eax,%r13d
>>>        5c5:    41 d1 fd                 sar    %r13d
>>>        5c8:    4b 8d 04 c0              lea    (%r8,%r8,8),%rax
>>>        5cc:    48 89 de                 mov    %rbx,%rsi
>>>        5cf:    49 8d 3c 40              lea    (%r8,%rax,2),%rdi
>>>        5d3:    8b 45 d0                 mov    -0x30(%rbp),%eax
>>>        5d6:    48 c1 e7 04              shl    $0x4,%rdi
>>>        5da:    48 01 cf                 add    %rcx,%rdi
>>>        5dd:    89 47 78                 mov    %eax,0x78(%rdi)
>>>        5e0:    48 83 ef 80              sub    $0xffffffffffffff80,%rdi
>>>        5e4:    44 89 6f f4              mov    %r13d,-0xc(%rdi)
>>>        5e8:    e8 00 00 00 00           callq  5ed <drm_calc_timestamping_constants+0xbd>
>>>        5ed:    0f b7 53 2e              movzwl 0x2e(%rbx),%edx
>>>        5f1:    0f b7 43 38              movzwl 0x38(%rbx),%eax
>>>        5f5:    44 0f b7 4b 2a           movzwl 0x2a(%rbx),%r9d
>>>        5fa:    45 8b 44 24 60           mov    0x60(%r12),%r8d
>>>        5ff:    4d 85 ff                 test   %r15,%r15
>>>        602:    0f 84 87 00 00 00        je     68f <drm_calc_timestamping_constants+0x15f>
>>>        608:    49 8b 77 08              mov    0x8(%r15),%rsi
>>>        60c:    52                       push   %rdx
>>>        60d:    31 ff                    xor    %edi,%edi
>>>        60f:    48 c7 c1 00 00 00 00     mov    $0x0,%rcx
>>>        616:    50                       push   %rax
>>>        617:    31 d2                    xor    %edx,%edx
>>>        619:    e8 00 00 00 00           callq  61e <drm_calc_timestamping_constants+0xee>
>>>        61e:    45 8b 44 24 60           mov    0x60(%r12),%r8d
>>>        623:    4d 8b 7f 08              mov    0x8(%r15),%r15
>>>        627:    5f                       pop    %rdi
>>>        628:    41 59                    pop    %r9
>>>        62a:    8b 45 d0                 mov    -0x30(%rbp),%eax
>>>        62d:    48 c7 c1 00 00 00 00     mov    $0x0,%rcx
>>>        634:    4c 89 fe                 mov    %r15,%rsi
>>>        637:    45 89 f1                 mov    %r14d,%r9d
>>>        63a:    31 d2                    xor    %edx,%edx
>>>        63c:    31 ff                    xor    %edi,%edi
>>>        63e:    50                       push   %rax
>>>        63f:    41 55                    push   %r13
>>>        641:    e8 00 00 00 00           callq  646 <drm_calc_timestamping_constants+0x116>
>>>        646:    59                       pop    %rcx
>>>        647:    5e                       pop    %rsi
>>>        648:    48 8d 65 d8              lea    -0x28(%rbp),%rsp
>>>        64c:    5b                       pop    %rbx
>>>        64d:    41 5c                    pop    %r12
>>>        64f:    41 5d                    pop    %r13
>>>        651:    41 5e                    pop    %r14
>>>        653:    41 5f                    pop    %r15
>>>        655:    5d                       pop    %rbp
>>>        656:    e9 00 00 00 00           jmpq   65b <drm_calc_timestamping_constants+0x12b>
>>>        65b:    41 8b 54 24 60           mov    0x60(%r12),%edx
>>>        660:    49 8b 7f 08              mov    0x8(%r15),%rdi
>>>        664:    44 89 45 c4              mov    %r8d,-0x3c(%rbp)
>>>        668:    45 31 ed                 xor    %r13d,%r13d
>>>        66b:    48 c7 c6 00 00 00 00     mov    $0x0,%rsi
>>>        672:    48 89 4d c8              mov    %rcx,-0x38(%rbp)
>>>        676:    e8 00 00 00 00           callq  67b <drm_calc_timestamping_constants+0x14b>
>>>        67b:    c7 45 d0 00 00 00 00     movl   $0x0,-0x30(%rbp)
>>>        682:    44 8b 45 c4              mov    -0x3c(%rbp),%r8d
>>>        686:    48 8b 4d c8              mov    -0x38(%rbp),%rcx
>>>        68a:    e9 39 ff ff ff           jmpq   5c8 <drm_calc_timestamping_constants+0x98>
>>>        68f:    52                       push   %rdx
>>>        690:    48 c7 c1 00 00 00 00     mov    $0x0,%rcx
>>>        697:    31 d2                    xor    %edx,%edx
>>>        699:    31 f6                    xor    %esi,%esi
>>>        69b:    50                       push   %rax
>>>        69c:    31 ff                    xor    %edi,%edi
>>>        69e:    e8 00 00 00 00           callq  6a3 <drm_calc_timestamping_constants+0x173>
>>>        6a3:    45 8b 44 24 60           mov    0x60(%r12),%r8d
>>>        6a8:    58                       pop    %rax
>>>        6a9:    5a                       pop    %rdx
>>>        6aa:    e9 7b ff ff ff           jmpq   62a <drm_calc_timestamping_constants+0xfa>
>>>        6af:    49 8b 7f 08              mov    0x8(%r15),%rdi
>>>        6b3:    4c 8b 67 50              mov    0x50(%rdi),%r12
>>>        6b7:    4d 85 e4                 test   %r12,%r12
>>>        6ba:    74 25                    je     6e1 <drm_calc_timestamping_constants+0x1b1>
>>>        6bc:    e8 00 00 00 00           callq  6c1 <drm_calc_timestamping_constants+0x191>
>>>        6c1:    48 c7 c1 00 00 00 00     mov    $0x0,%rcx
>>>        6c8:    4c 89 e2                 mov    %r12,%rdx
>>>        6cb:    48 c7 c7 00 00 00 00     mov    $0x0,%rdi
>>>        6d2:    48 89 c6                 mov    %rax,%rsi
>>>        6d5:    e8 00 00 00 00           callq  6da <drm_calc_timestamping_constants+0x1aa>
>>>        6da:    0f 0b                    ud2
>>>        6dc:    e9 67 ff ff ff           jmpq   648 <drm_calc_timestamping_constants+0x118>
>>>        6e1:    4c 8b 27                 mov    (%rdi),%r12
>>>        6e4:    eb d6                    jmp    6bc <drm_calc_timestamping_constants+0x18c>
>>>        6e6:    66 2e 0f 1f 84 00 00     nopw   %cs:0x0(%rax,%rax,1)
>>>        6ed:    00 00 00
>>>        6f0:    90                       nop
>>>        6f1:    90                       nop
>>>        6f2:    90                       nop
>>>        6f3:    90                       nop
>>>        6f4:    90                       nop
>>>        6f5:    90                       nop
>>>        6f6:    90                       nop
>>>        6f7:    90                       nop
>>>        6f8:    90                       nop
>>>        6f9:    90                       nop
>>>        6fa:    90                       nop
>>>        6fb:    90                       nop
>>>        6fc:    90                       nop
>>>        6fd:    90                       nop
>>>        6fe:    90                       nop
>>>        6ff:    90                       nop
>>>
>>>
>>>>    David
>>>>

-- 
Jani Nikula, Intel Open Source Graphics Center