> Has anyone faced a similar problem before? Are there targets for which 
> both VLIW and DBR are enabled? Perhaps ia64?

I did something similar a few months ago.

The problem is that the Haifa scheduler and the delayed branch
scheduling pass don't really fit together. Delayed branch scheduling
happily undoes all the Haifa decisions.

The question is how much you gain by delayed branch scheduling. I don't
have numbers, but it wasn't much in my case. And since your company name
is picochip, you presumably value size more than speed?

I pursued two approaches. The first one was to insert "stop bit" pseudo
insns into the RTL stream in machdep reorg, so I didn't have to rely on
TImode insn flags during output. But then delayed branch scheduling just
took one insn out of an insn group and put it into the delay slot,
meaning there was usually no cycle gain at all, just larger code size
(due to insn duplication).
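
A toy model of why this gained nothing, as a rough sketch only. This is
not GCC code: the function names and the cost model (one cycle per
non-empty issue group, with the delay slot counted as its own group) are
assumptions made up for illustration.

```python
def path_cycles(groups):
    """Count cycles on one executed path: one per non-empty issue group."""
    return sum(1 for group in groups if group)


def fill_delay_slot(groups, slot_index, donor_index):
    """Return a copy of `groups` where the first insn of the donor group
    replaces the contents of the delay slot (hypothetical helper)."""
    new_groups = [list(group) for group in groups]
    new_groups[slot_index] = [new_groups[donor_index].pop(0)]
    return new_groups


# Branch group, delay slot holding a nop, then a three-insn group.
wide_before = [["cmp", "br"], ["nop"], ["add", "ld", "mul"]]
wide_after = fill_delay_slot(wide_before, slot_index=1, donor_index=2)

# Same layout, but the donor group holds a single insn.
single_before = [["cmp", "br"], ["nop"], ["add"]]
single_after = fill_delay_slot(single_before, slot_index=1, donor_index=2)
```

In this model the wide case costs the same number of cycles before and
after, because the donor group still holds insns and still needs its own
cycle. Only the single-insn case saves a cycle, since the donor group
becomes empty.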

The second approach was having lots of parallel insns (using
match_parallel and a custom predicate). machdep reorg then converts each
insn bundle into a single parallel insn. Delayed branch scheduling then
does the right thing, since it sees each bundle as one insn. This
approach works fairly well for me, but there are a few complications. My
output code is pretty hackish, as I didn't want to duplicate the code
for outputting a single insn and for outputting the same insn as a
component of a parallel insn group.

Tom

