I will test it with a table derived mask and see what happens. I would
like the table to be large enough to encompass 19x19 GO too, but I'll
test it with a small table first.
- Don
On Fri, 2006-12-08 at 00:39 +0100, Antoine de Maricourt wrote:
> Last time I checked MT was under 20cc per call (I assume it's
> inlined). On a P4, the shift operator is supposed to have a 4 cc
> latency. The dependency chain to get the mask yields to 25 cc latency
> before the mask can be used (4 per shift + 1 for OR) * 5. So it's
> quite possible that this sequence dominates the call to rand(). Just
> give a try... unless you use another processor with a better latency
> for shift operations. But even with those processors it might be
> benefic to replace 15 instructions by only one. Of course if you make
> a specific implementation for small n (below 256), you can remove the
> 2 last shift as well.
>
> Antoine
> > I'm pretty sure the time of this function is dominated by the call to
> > rand(), but it never occurred to do a table lookup for the mask,
> > interesting idea.
> >
> >
> > - Don
> >
> >
> > On Thu, 2006-12-07 at 22:36 +0100, Antoine de Maricourt wrote:
> >
> > > If this randint routine is critical, you can save some calls to rand()
> > > when you know that n is always below some value (see my previous post
> > > about bitmap go).
> > > For instance if n < 128 (probably true for 9x9 go), you could try:
> > >
> > > while (true) {
> > > r = rand();
> > >
> > > if ((r & v) < n) return (r & v);
> > > r >>= 7;
> > > if ((r & v) < n) return (r & v);
> > > r >>= 7;
> > > ... // 2 more times
> > > }
> > >
> > > Also it could be better to read the mask (v) from a table (provided
> > > the upper limit of n is small enough). At least on a P4, it will avoid
> > > a long data dependency chain and might give a (probably small)
> > > improvment. It might even be benefical on other processors.
> > >
> > > Antoine
> > >
> > >
> > > > On Thu, 2006-12-07 at 16:05 +0100, Łukasz Lew wrote:
> > > >
> > > >
> > > > > ii = pm::rand () % empty_v_cnt; // TODO improve speed "%"
> > > > >
> > > > >
> > > > Try this, I think it could be faster, not sure, but has the advantage
> > > > that it's slightly more correct.
> > > >
> > > > // returns an integer between 0 and n-1 inclusive
> > > > //
> > > > unsigned long randint(unsigned long n)
> > > > {
> > > > unsigned long v = n;
> > > > unsigned long r;
> > > >
> > > > v--;
> > > > v |= v >> 1;
> > > > v |= v >> 2;
> > > > v |= v >> 4;
> > > > v |= v >> 8;
> > > > v |= v >> 16;
> > > >
> > > > do { r = rand(); } while ( (r & v) >= n );
> > > >
> > > > return( r & v );
> > > > }
> > > >
> > > >
> > > >
> > > > - Don
> > > >
> > > >
> > > > _______________________________________________
> > > > computer-go mailing list
> > > > [email protected]
> > > > http://www.computer-go.org/mailman/listinfo/computer-go/
> > > >
> > > >
> > > >
> > > >
> >
> >
> >
> >
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/