On Tuesday, 3 July 2012 at 17:25:18 UTC, bearophile wrote:
ixid:
In any case with large values of k the branch prediction will
be right almost all of the time, explaining why this form is
faster than modulo as modulo is fairly slow while this is a
correctly predicted branch doing an addition
ixid:
In any case with large values of k the branch prediction will
be right almost all of the time, explaining why this form is
faster than modulo as modulo is fairly slow while this is a
correctly predicted branch doing an addition if it doesn't make
it branchless.
That seems the explanat
Oops! I have a bad habit of thinking of the power operator as a
part of the value rather than as an operator.
ixid:
I take your point but I think most people know that the equals
operators have the lowest associativity.
Sorry I meant:
nums[iter_next] = total % (10 ^^ 8);
Instead of:
nums[iter_next] = total % 10^^8;
But I presume lot of people know that powers are higher
precedence :-)
Bye,
bearop
In any case with large values of k the branch prediction will be
right almost all of the time, explaining why this form is faster
than modulo as modulo is fairly slow while this is a correctly
predicted branch doing an addition if it doesn't make it
branchless. The branchless version gives the
I tested that, modulus is slower. The compiler is surely
converting it to something branchless like:
uint iter_next = (iter + 1) * !(iter + 1 > k);
I take your point but I think most people know that the equals
operators have the lowest associativity.
This is becoming a "fixed size circular queue". But maybe a
modulus is faster than a branch here.
I have tried, and it's slower both in D and C. I don't know why.
#include
#include
#include
uint64_t hFib(const uint32_t k, const uint32_t n) {
uint64_t *nums = calloc(k + 1, sizeof(uint64
ixid:
int iter_next = iter + 1 > k? 0 : iter + 1;
This is becoming a "fixed size circular queue". But maybe a
modulus is faster than a branch here. (It's better when k is
always a power of two, you don't need a modulus. And even better
if your size is a multiple of the page size
Thank you, that was faster. I decided to try the obvious method
of reusing the same array giving similar speed to your suggestion
while using less memory:
ulong f2(int k, int n) {
auto nums = new ulong[k + 1];
nums[$ - 1] = 1;
int iter = k;
ulong total = 0;
On Tuesday, July 03, 2012 15:29:51 bearophile wrote:
> ixid:
> > Given popFront is advancing a range does this mean the
> > underlying array is not being deleted? How would one delete the
> > information you don't need any more if so?
>
> I think the GC can't deallocate part of an array. It's a si
ixid:
It's a pity that the D standard library seems to lack rather a
lot of these things.
I've taken a better look at your code, for this program a deque
wasn't so needed.
This is quick:
import std.stdio;
ulong hFib(in uint k, in uint n) pure nothrow {
auto nums = new ulong[k + n + 1];
It's a pity that the D standard library seems to lack rather a
lot of these things. Permutation functions are also an annoying
one not to have. =/
ixid:
Given popFront is advancing a range does this mean the
underlying array is not being deleted? How would one delete the
information you don't need any more if so?
I think the GC can't deallocate part of an array. It's a single
memory zone, and it has to go all at once.
Maybe you need
That fixed the memory behaviour and is faster. Not using popFront
at all uses the same memory and is faster still. Given popFront
is advancing a range does this mean the underlying array is not
being deleted? How would one delete the information you don't
need any more if so?
On Tuesday, 3 July 2012 at 02:20:23 UTC, ixid wrote:
I'm not sure what's going on here. Using this code to solve a
reddit programming challenge to return the Nth term of a
Fibonacci-like sequence of arbitrary length:
module main;
import std.stdio, std.array, std.datetime;
ulong f(
15 matches
Mail list logo