Hi,
It's quite odd that both sqrt_i and result were zeroed out at the same
time. Does the problem also appear in FS mode on other ISAs, e.g. x86? Could
you also show the objdump output for the loop?
Regards,
Hoa Nguyen
On Thu, Oct 6, 2022, 04:06 Νικόλαος Ταμπουρατζής
wrote:
> Dear Jason, all,
>
> I am
Dear Jason & Boddy,
Unfortunately, I have tried my simple example without the sqrt
function and the problem remains. Specifically, I have the following
simple code:
#include
#include
int main(){
int dim = 1024;
double result;
for (int iter = 0; iter < 2; iter++){
On 10/7/2022 1:30 PM, Aritra Bagchi wrote:
Hi Eliot,
Thanks for the response. The unrolled loop, despite having the same
dependency across "j", can send multiple loads simultaneously. So the
limitation might not be due to that dependency across "j" of different
iterations. But in the non-unrolled loop, the control dependency is there,
whic
On 10/7/2022 1:13 PM, Eliot Moss wrote:
On 10/7/2022 1:03 PM, Aritra Bagchi wrote:
Hi all,
Any suggestions on this are most helpful.
Thanks and regards,
Aritra
My guess is that it is because the non-unrolled loop
has a test of i against 1000 before each access to A[i].
That test guards the load, so must be completed before
the load
Hi all,
Any suggestions on this are most helpful.
Thanks and regards,
Aritra
On Thu, Oct 6, 2022 at 6:01 PM Aritra Bagchi
wrote:
> Hi all,
>
> *for (i = 0; i < 1000; i++) {*
> * j = j + A[ i ]*
> *}*
>
> Suppose such a loop program is executed on gem5 (single-core execution,
> with O3 CU
I have an idea...
Have you put a breakpoint in the implementation of the fsqrt_d function? I
would like to know whether SE mode and FS mode are using the same rounding
mode. My hypothesis is that in FS mode the rounding mode is set differently.
Cheers,
Jason
On Fri, Oct
Dear Boddy,
Thanks a lot for the effort! I looked into this in detail and observed that
the problem occurs only with float and double variables (with int it
works properly in FS mode). Specifically, in the case
of float the variables are set to "nan", while in the case of double