It uses AVX2, so... thanks for the extra seconds. :D

Cheers.
Elias.

2018-06-09 0:43 GMT-03:00 Philip Guenther <[email protected]>:
> On Fri, Jun 8, 2018 at 10:13 AM Elias M. Mariani <[email protected]>
> wrote:
>>
>> I usually run long computations on OpenBSD-current, and in the last few
>> days I have seen an improvement in the performance of the process (in this
>> case I have 6 threads running very optimized assembler code).
>> Each iteration of the code took about 14 sec. and now takes around 13 sec.
>> Never mind the computation itself; the question is whether something
>> was changed (and if so, what) to affect the performance so much. The
>> compilation is the same as before, so the code did not change, and the
>> software does not use anything from ports, only the standard C
>> library.
>>
>> A more accurate description would be that the threads are now more
>> homogeneous in their iteration times. Before, one thread might
>> suddenly give an 11 sec. result and another 15, probably related to
>> thread reallocations; now I have a steady time in all the threads, and
>> on average the numbers are better.
>> Anyway... better is better. Just curious to know why it is working better.
>> :D
>
>
> If the "optimized assembler" that the threads are running uses AVX or
> similar "extended CPU state" extensions, then the improvement is almost
> certainly from my switch of amd64 from "lazy FPU switching" to what I'll call
> "semi-eager switching", where the current thread's registers are always
> ensured to be loaded before returning to userspace, eliminating the need for
> extra userspace->kernel->userspace transitions and IPIs to load the
> registers on the current CPU.
>
> Glad to hear it's such a large improvement for your processing.  Enjoy, and
> remember to tip your OS vendor!  <wink>
>
>
> Philip Guenther
>
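
For anyone reading along, here is a rough toy model of the difference
Philip describes. It is only a sketch under stated assumptions, not
OpenBSD's actual amd64 code: every name in it (simulate, struct slot,
FPU_LAZY, FPU_EAGER, and so on) is hypothetical, and it assumes each
scheduled slot touches the extended registers. Under the lazy policy a
thread whose registers are not loaded pays one extra trap, plus an IPI
if its state is still held on another CPU; under the eager policy the
registers are loaded before returning to userspace, so neither occurs.

```c
#include <assert.h>
#include <stddef.h>

#define NCPU    4   /* toy limits, chosen arbitrarily */
#define NTHREAD 8

enum fpu_policy { FPU_LAZY, FPU_EAGER };

/* One scheduling decision: thread `thread` runs on CPU `cpu`. */
struct slot { int cpu; int thread; };

struct fpu_stats {
    unsigned traps; /* extra userspace->kernel->userspace round trips */
    unsigned ipis;  /* requests to a CPU still holding the thread's state */
    unsigned loads; /* register restores onto the current CPU */
};

/* Replay a schedule and count the costs under the given policy. */
struct fpu_stats
simulate(enum fpu_policy policy, const struct slot *sched, size_t n)
{
    int owner[NCPU];    /* thread whose registers are live on each CPU */
    int home[NTHREAD];  /* CPU holding each thread's live registers */
    struct fpu_stats st = { 0, 0, 0 };
    size_t i;

    for (i = 0; i < NCPU; i++)
        owner[i] = -1;
    for (i = 0; i < NTHREAD; i++)
        home[i] = -1;

    for (i = 0; i < n; i++) {
        int c = sched[i].cpu, t = sched[i].thread;

        assert(c >= 0 && c < NCPU && t >= 0 && t < NTHREAD);

        if (owner[c] == t)
            continue;           /* registers already loaded here */

        if (policy == FPU_LAZY) {
            st.traps++;         /* first FPU instruction faults */
            if (home[t] != -1)
                st.ipis++;      /* state is live on another CPU */
        }
        if (home[t] != -1)
            owner[home[t]] = -1;    /* old CPU no longer holds t */
        if (owner[c] != -1)
            home[owner[c]] = -1;    /* previous owner saved to memory */

        owner[c] = t;
        home[t] = c;
        st.loads++;
    }
    return st;
}
```

Calling simulate() twice on the same schedule, once with FPU_LAZY and
once with FPU_EAGER, gives the same number of register loads in both
cases; only the lazy run accumulates traps and IPIs, which is the
overhead the change removes.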
