I'm working on implementing a binary search using Inline::C.
Eventually I want to be able to call it like this:
my $found_ix = binary_search { $a <=> $b } @ordered_numbers;
or
my $found_ix = binary_search { $a cmp $b } @ordered_strings;
Currently I've got it to this point:
my $found = binary_search sub{ $a <=> $b }, \@ordered_numbers;
...in other words, I haven't worked on prototypes yet. But at the
moment the reason I'm writing is to ask about a couple of issues.
First, I'm leaking memory. I'm pretty sure it's not in the comparator
sub, as I tested it in isolation with valgrind and came out ok after
millions of iterations.
But when I invoke binary_search and run 'top' in another process I see
the memory consumption growing.
Second, I'm clobbering $a and $b. I thought that
SV* save_scalar(GV *gv)
would do the trick, but it expects a glob, and I'm grabbing $a and $b
as scalars.
Aside from a leak, and clobbering $a and $b, everything else works!
Here's the code:
use strict;
use warnings;
use Test::More;
use Inline C => 'DATA';
my @tests = (
[ 'Odd number of elements', [ 2, 3, 5, 7, 11, 13, 17, 19, 23, 29,
31, 37, 41, 43, 53, 59, 61 ] ],
[ 'Even number of elements', [ 2, 3, 5, 7, 11, 13, 17, 19, 23, 29,
31, 37, 41, 43, 53, 59 ] ],
[ 'Single element', [ 2
] ],
[ 'Two elements', [ 2, 3
] ],
[ 'Empty list', [
] ],
);
( $a, $b ) = ( "Hello", "world" );
for( 1 .. 1 ) {
foreach my $test ( @tests ) {
my( $name, $list ) = @{$test};
test_list( $name, $list );
}
}
is( "$a $b!", "Hello world!", "\$a and \$b are not clobbered." );
done_testing();
sub test_list {
my( $name, $aref ) = @_;
foreach my $needle ( 0 .. $#{$aref} + 2 ) {
my( $known_index ) = grep { $needle == $aref->[$_] } 0 .. $#{$aref};
my $found_index = binary_search( sub{ $a <=> $b }, $needle,
$aref );
my $found = defined($found_index) ? "Found." : "Not found.";
is( $found_index, $known_index, "$name. Needle:$needle. $found" );
}
}
__DATA__
__C__
int comparator( SV*, SV*, SV* );
SV* binary_search( SV* block, SV* needle, SV* aref_haystack ) {
I32 min = 0;
I32 max = 0;
if(
! SvROK( aref_haystack ) // We have a reference.
|| SvTYPE(SvRV(aref_haystack)) != SVt_PVAV // We have an array ref.
) {
croak( "Argument must be an array ref.\n" );
}
max = av_top_index( (AV*)SvRV(aref_haystack) );
while( max > min ) {
I32 mid = ( min + max ) / 2;
SV* hay = *av_fetch( (AV*)SvRV(aref_haystack), mid, 0 ); // Fetch
value at aref_haystack->[mid]
if( comparator(block,needle,hay) == 1 ) { // If needle > hay.
min = mid + 1;
}
else {
max = mid;
}
}
if( max == min && comparator(block, needle, *av_fetch(
(AV*)SvRV(aref_haystack),min,0) ) == 0 ) {
return newSViv(min);
}
return newSV(0); // undef: Not found.
}
int comparator ( SV* block, SV* needle, SV* hay ) {
int ret;
{
dSP;
int count;
ENTER;
SAVETMPS;
PUSHMARK(SP);
// We're not passing params.
PUTBACK;
sv_setsv(get_sv("a",0),needle);
sv_setsv(get_sv("b",0),hay);
count = call_sv(block, G_SCALAR);
SPAGAIN;
if (count != 1)
croak("Callback had improper return value count.\n");
ret = POPi;
PUTBACK;
FREETMPS;
LEAVE;
}
return ret;
}
So, if anyone knows what I ought to try in eliminating the leak, or to
avoid clobbering $a and $b, I'd appreciate hearing.