On Mon, 2 Jun 2003 at 16:01 +0400, Dmitry <[email protected]> wrote:

> it depends on a loop body code.
> So, if the loop body is not too big, then 32X and more is Ok (somehow).
> That's the idea.
> The problem here is that the loop body should be:
>       asm("; %0" "+r"(i));
>
> I thought this is in 'tips&tricks'

I tried a test case using the
above asm statement and got an error:

unroll.c: In function `unroll':
unroll.c:10: called object is not a function
unroll.c:10: argument of `asm' is not a constant string
where line 10 uses the above syntax.

The error went away once I added a colon between the template and the
output parameter (I still have much to learn about the inline assembler...)
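
For the record, here is how I now understand the extended asm form. Without
the colon, the two adjacent string literals are concatenated and (i) is then
parsed as a function call on the resulting string, which accounts for both
error messages. The sketch below assumes GCC's layout of a template followed
by colon-separated output, input and clobber lists. The name count_to is
made up for illustration, and I use an empty template so that it assembles
on any target, not just the msp430.

```c
/* Sketch of GCC extended asm: asm(template : outputs : inputs : clobbers).
 * count_to() is a made-up name for illustration only. */
int count_to(int limit)
{
    int i;
    int n = 0;

    for (i = 0; i < limit; i++) {
        /* Empty template: no instruction is emitted. The colon starts the
         * output list; "+r" declares i as a read-write register operand,
         * so the compiler must assume the asm may have changed i. */
        __asm__ __volatile__("" : "+r" (i));
        n++;
    }
    return n;  /* number of iterations actually executed */
}
```

Since the asm does not really touch i, count_to(10) still runs the body ten
times; the operand only hides the loop counter from the optimizer.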

'tips&tricks'
   http://mspgcc.sourceforge.net/doc_appendixE.html
has the following example
  for(i=0;i<1234;i++) __asm__ __volatile__("; loop");
but Appendix D -  Inline assembly
  http://mspgcc.sourceforge.net/doc_appendixD.html

Using your new example causes no loop unrolling:
    asm("; %0" : "+r" (i));
but __without__ the output parameter the loop is unrolled. The generated
code seems accurate, but the number of times the loop is expanded is
'somewhat unpredictable'.

Also, sometimes the generated code uses "sub" to subtract a value from the
index, but most of the time it uses "add" to add a negative value.

LIM  - requested number of iterations
xNN  - number of times the loop was expanded

34  x02       53  x04       72  x18       91  x07       110 x10
35  x07       54  x27       73  x04       92  x04       111 x03
36  x18       55  x05       74  x02       93  x03       112 x28
37  x04       56  x28       75  x25       94  x02       113 x04
38  x02       57  x03       76  x04       95  x05       114 x06
39  x03       58  x02       77  x07       96  x24       115 x05
40  x20       59  x04       78  x06       97  x04       116 x04
41  x04       60  x30       79  x04       98  x14       117 x09
42  x21       61  x04       80  x20       99  x09       118 x02
43  x04       62  x02       81  x27       100 x25       119 x07
44  x04       63  x21       82  x02       101 x04       120 x30
45  x15       64  x32       83  x04       102 x06       121 x04
46  x02       65  x05       84  x21       103 x04       122 x02
47  x04       66  x06       85  x05       104 x08       123 x03
48  x24       67  x04       86  x02       105 x21       124 x04
49  x07       68  x04       87  x03       106 x02       125 x25
50  x25       69  x03       88  x08       107 x04       126 x21
51  x03       70  x14       89  x04       108 x27       127 x04
52  x04       71  x04       90  x30       109 x04       128 x32

It is interesting to see that the loop expansion is limited to
32 times, but that I couldn't find a time when it was expanded to
11, 13, 17, 19, 22, 23, 26, 29, or 31 times.
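
The numbers look less random if one assumes the expansion count is built
only from the prime factors 2, 3, 5 and 7 of LIM, taking the largest primes
first, never exceeding 32, and falling back to 4 when LIM has none of those
factors. This is only a model fitted to the table above, not something taken
from the compiler sources, and model_unroll_factor is a made-up name.

```c
/* Model fitted to the table above; NOT taken from the compiler sources.
 * model_unroll_factor() is a made-up name. Intended for LIM > 0. */
int model_unroll_factor(int lim)
{
    static const int primes[] = { 7, 5, 3, 2 };  /* largest first */
    const int max_factor = 32;  /* largest expansion seen in the table */
    int factor = 1;
    int k;

    for (k = 0; k < 4; k++) {
        int p = primes[k];
        int rest = lim;

        /* Use up each occurrence of p in lim while staying within the cap;
         * once p no longer fits, move on to the next smaller prime. */
        while (rest % p == 0) {
            rest /= p;
            if (factor * p > max_factor)
                break;
            factor *= p;
        }
    }

    /* No usable small prime factor: the table shows x04 in that case. */
    return factor == 1 ? 4 : factor;
}
```

For the LIM values 34 to 128 this gives the same counts as the table, for
example 2 for LIM=34, 18 for LIM=72 and 14 for LIM=98. It would also explain
the missing values: 11, 13, 17, 19, 22, 23, 26, 29 and 31 all contain a prime
factor larger than 7, so the model can never produce them.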




---
/* unroll.c */
#if !defined LIM
#define LIM 1234
#endif

void unroll(void)
{
    int i;
#if 1
    for(i=0;i<LIM;i++) asm("; %0" : "+r" (i));
#else
    for(i=0;i<LIM;i++) __asm__ __volatile__("; loop");
#endif
}
