Hi,

I'm working on some code that needs to be able to seek in an LDIF file based
on DN. The idea is to iterate through the file, finding each entry in turn
and using tell($ldif->handle) to get the position in the file. That provides
an index to allow something like random read access to the file. I knocked
up this test case to make sure it was feasible:

# ldif fpos test
use strict;
use warnings;
use Net::LDAP::LDIF;

my $ldif = Net::LDAP::LDIF->new("dump.ldif","r",onerror=>'die');

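my $entry = $ldif->read_entry;
# Remember the position after the first read; the intent is for this to be
# the start of the second entry.
my $fpos = tell($ldif->handle);
print "fpos = $fpos\n";
$entry = $ldif->read_entry;

print "DN = ".$entry->dn."\nfpos = ".tell($ldif->handle)."\n\n";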
for (1..100)
{
        $entry = $ldif->read_entry;
}
print "Current position = ".tell($ldif->handle)."\n";
print "DN way down in file = ".$entry->dn."\n";

# Seek back to the remembered position and read the entry found there.
seek($ldif->handle,$fpos,0);
print "fpos = ".tell($ldif->handle)."\n";
$entry = $ldif->read_entry;
print "Sought dn = ".$entry->dn."\n";
print "fpos = ".tell($ldif->handle)."\n";
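
For comparison, here is a minimal sketch of the DN-to-offset index described at the top, built with a plain filehandle in paragraph mode rather than through Net::LDAP::LDIF. It is an illustration under assumptions, not the test case above: the sample data and the helper names build_index and fetch_record are hypothetical, entries are assumed to be separated by blank lines, and base64-encoded DNs (dn::) are simply skipped.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small sample LDIF file (hypothetical data) so the sketch runs on its own.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} <<'LDIF';
dn: cn=alice,dc=example,dc=com
cn: alice

# a comment between entries

dn: cn=bob,dc=example,dc=com
cn: bob

dn: cn=carol,dc=example,dc=com
cn: carol
LDIF
close $tmp;

# Build a DN => byte offset index by reading the file in paragraph mode.
sub build_index {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    local $/ = "";    # paragraph mode: one blank-line-separated record per read
    my %index;
    while (1) {
        my $pos = tell($fh);       # offset before this record is read
        my $rec = <$fh>;
        last unless defined $rec;
        $rec =~ s/\n //g;          # unfold continuation lines
        $rec =~ s/^#.*\n?//mg;     # drop comment lines
        # Plain "dn:" only; "dn::" (base64) records are not indexed here.
        $index{$1} = $pos if $rec =~ /^dn:(?!:) *(.*)$/m;
    }
    close $fh;
    return \%index;
}

# Seek to a stored offset and return the raw text of the record found there.
sub fetch_record {
    my ($file, $pos) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    seek($fh, $pos, 0) or die "Cannot seek in $file: $!";
    local $/ = "";
    my $rec = <$fh>;
    close $fh;
    return $rec;
}

my $index = build_index($path);
my $rec   = fetch_record($path, $index->{'cn=bob,dc=example,dc=com'});
print $rec;
```

Nothing reads ahead on this handle, so each offset is taken just before its record is read. The text returned by fetch_record could then be parsed separately. As far as I know, Net::LDAP::LDIF->new accepts an already open filehandle in place of a filename, so an in-memory filehandle on the record text may be one way to do that. The test case above goes through $ldif->handle instead.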


Unfortunately, although seeking the filehandle back to the chosen DN works
to the extent that the filehandle pointer is in the right place,
$ldif->read_entry returns the entry after the last one read in the 1..100
loop, i.e. as if the seek hadn't happened. Digging around, I found that
Net::LDAP::LDIF appears to cache the next entry in the file as it reads each
entry, in Net::LDAP::LDIF::_read_lines. Each time _read_lines is called, it
returns the cached next entry if one exists; otherwise it reads the file.
If the second do{}until block in this sub is commented out, as follows, the
test case works:

sub _read_lines {
  my $self = shift;
  my @ldif;
  {
    local $/ = "";
    my $fh = $self->{'fh'};
    my $ln;
    do {        # allow comments separated by blank lines
      $ln = $self->{_next_lines} || scalar <$fh>;
      unless ($ln) {
         $self->{_next_lines} = '';
         $self->{_current_lines} = '';
         $self->eof(1);
         return;
      }
      $ln =~ s/\n //sg;
      $ln =~ s/^#.*\n//mg;
      chomp($ln);
      $self->{_current_lines} = $ln;
    } until ($self->{_current_lines} || $self->eof());
    chomp(@ldif = split(/^/, $ln));
    #do {
    #  $ln = scalar <$fh> || '';
    #  $self->eof(1) unless $ln;
    #  $ln =~ s/\n //sg;
    #  $ln =~ s/^#.*\n//mg;
    #  chomp($ln);
    #  $self->{_next_lines} = $ln;
    #} until ($self->{_next_lines} || $self->eof());

  }

  @ldif;
}


but I'm not sure why the code caches the next entry, or what commenting out
this block might break; plus, of course, I don't really want to run a
customised version of the module on my installation. The alternatives would
be to delete the private attribute _next_lines directly, or to recreate the
object before each seek, neither of which strikes me as a good idea. Anyone
care to comment?
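
To make the behaviour concrete, here is a toy reader that caches one record ahead, in the way _read_lines is described above, together with a seek method that drops the cache. This is only a sketch of the pattern: LookaheadReader, read_record and seek_to are hypothetical names, not part of Net::LDAP::LDIF, and the data is made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a reader that caches the next record.
# This is NOT the Net::LDAP::LDIF API.
package LookaheadReader;

sub new {
    my ($class, $fh) = @_;
    return bless { fh => $fh, next => undef }, $class;
}

# Return the cached record if there is one, otherwise read from the file;
# then read one record ahead and cache it for the following call.
sub read_record {
    my ($self) = @_;
    my $fh = $self->{fh};
    local $/ = "";    # paragraph mode
    my $rec = defined $self->{next} ? $self->{next} : scalar <$fh>;
    $self->{next} = scalar <$fh>;
    return $rec;
}

# Seek the underlying handle and discard the now stale lookahead.
sub seek_to {
    my ($self, $pos) = @_;
    seek($self->{fh}, $pos, 0) or die "seek failed: $!";
    $self->{next} = undef;
    return;
}

package main;

my $data = "dn: cn=a\n\ndn: cn=b\n\ndn: cn=c\n\ndn: cn=d\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $reader = LookaheadReader->new($fh);
$reader->read_record;                # cn=a (cn=b is now cached)
$reader->read_record;                # cn=b (cn=c is now cached)

seek($fh, 0, 0);                     # bare seek: the cache is untouched
my $stale = $reader->read_record;    # still cn=c, as if no seek happened

$reader->seek_to(0);                 # seek and clear the cache
my $fresh = $reader->read_record;    # cn=a again

print "after bare seek:  $stale";
print "after seek_to(0): $fresh";
```

Applying the same idea to the real module would mean clearing _next_lines (and presumably resetting eof) after each seek, which relies on private internals that could change between releases.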

cheers,

Charles Colbourn.


