On 08/29/2013 11:31 AM, Joe Perches wrote:
> On Thu, 2013-08-29 at 05:55 +0000, Vineet Gupta wrote:
>
>> The intent of writing the original code was to generate only 1 MPYHU insn
>> (32*32 = high-part-64) for the whole math, at any optimization level
>> whatsoever. If the first MPY is overflowing, you are likely spinning for
>> > 10,000 usec (10ms), which is 1 scheduling tick on ARC - not good -
>> presumably for hardware debug. It would be better to use a tight loop there
>> and throw it out later.
> It's a delay loop.  Does it matter whether
> or not a multiply or division is used?

I know what you mean here. Your suggestion from a different mail,

> I think the whole thing is odd and it should simply be
>
>       loops = loops_per_jiffy * usecs_to_jiffies(usecs)


This adds an additional MPYHU (ignoring the large limms and check for max
jiffies). FWIW, most arches do optimize this routine a bit - so ARC not using a
standard kernel API is not that big a sin ;-)

On the topic of multiply vs. divide (which is probably not relevant to the topic
at hand though), since ARCompact doesn't have native divide, we end up emulating
it using libgcc routines. That makes it slightly non-deterministic (not a big
deal) and also adds to boot time (with those delays sprinkled all over the place
in crazy device probes and such). Seriously, we got hammered by a customer for
that once.
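
To make the multiply vs. divide trade-off concrete, here is a minimal
standalone sketch, not the actual arch/arc code. The helper names, the scale
factor 4295 (roughly 2^32 / 10^6) and HZ = 100 (taken from the 10ms tick
mentioned above) are assumptions for illustration only.

```c
#include <stdint.h>

#define HZ 100u   /* assumption: 10 ms scheduling tick */

/*
 * Multiply-high variant (hypothetical helper): scale usecs by
 * 4295 ~= 2^32 / 10^6, then keep only the high 32 bits of the
 * 32x32 product. No divide is needed. The 32-bit pre-multiply
 * wraps once usecs * 4295 * HZ reaches 2^32, i.e. at about
 * 10,000 usec with HZ = 100.
 */
static uint32_t loops_mulhi(uint32_t usecs, uint32_t lpj)
{
	uint32_t scaled = usecs * 4295u * HZ;	/* 32-bit, may wrap */

	return (uint32_t)(((uint64_t)scaled * lpj) >> 32);
}

/*
 * Divide variant (hypothetical helper): same math done with a
 * 64-bit divide, which a core without a native divider would
 * have to emulate in software.
 */
static uint32_t loops_div(uint32_t usecs, uint32_t lpj)
{
	return (uint32_t)(((uint64_t)usecs * lpj * HZ) / 1000000u);
}
```

For delays below the overflow point the two variants agree closely; once the
first multiply wraps, the multiply-high result comes out far too small.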

-Vineet