David,

Thanks for the detailed response. It seems there is nothing I can do in my own
code, because in these cases the ‘new’ method returns only an empty object
reference or NULL, which gives me nothing on which to invoke the ‘close’ method.

I think a fix within the IO::HDF::SD method might work as you describe, but
this should probably be handled by steadier hands than mine. Let’s have a look.
In HDF/SD/SD.pd, we find this:

    $self->{SDID} = PDL::IO::HDF::SD::_SDstart( $self->{FILE_NAME}, $self->{ACCESS_MODE} );
    die "_ERR::SDstart\n" if( $self->{SDID} == -1 );

This is the part where it exits with ‘die’ when it cannot start the interface, 
requiring me to use ‘eval{}’ in my invocation. I consider this a bug, BUT it is 
possible it was done this way because the HDF library interface does not reset 
itself properly, meaning a clean exit won’t help.

If we then look at ‘_SDstart’, we should find where it is bailing out without
cleaning up properly. But I think that is actually a direct call into the HDF
libraries in C, so it is pretty much a dead end for PDL. Someone who knows what
they are doing should have a look to confirm.

Is there some magic I can use to call ‘eval{}’ that will force it to do the
cleanup it’s not currently doing? Any ideas?

Thanks again,

--Edward H.

From: David Mertens [mailto:[email protected]] 
Sent: Monday, May 11, 2015 11:54 AM
To: Hyer, Dr. Edward
Cc: [email protected]
Subject: Re: [Pdl-general] PDL::IO::HDF::SD housekeeping 'Too many open files'

Hello Edward,

I'm not an expert with this module, but I can sniff out the problem. The SD
file handle gets closed using the close
<https://metacpan.org/source/CHM/PDL-2.007/IO/HDF/SD/SD.pd#L1578> method,
which is usually called by DESTROY
<https://metacpan.org/source/CHM/PDL-2.007/IO/HDF/SD/SD.pd#L1586>. However, if
the new <https://metacpan.org/source/CHM/PDL-2.007/IO/HDF/SD/SD.pd#L405>
method fails before blessing the reference
<https://metacpan.org/source/CHM/PDL-2.007/IO/HDF/SD/SD.pd#L579>, then the
DESTROY method is never called. Quite a bit happens before $self is blessed,
and that is probably the thing creating the leak. This clearly seems to be the
case for you.

I think the easiest solution is to bless the reference as soon as it's created, 
and to add logic to the DESTROY and/or close method to not attempt to close the 
SDID if there is none. Based on the comment just before the close method, I'm 
not 100% sure that'll solve the problem, though. Do you know how to implement 
this idea and see if it fixes the problem, or would you like some guidance?
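
A minimal sketch of the idea, using a made-up class rather than the real
module: My::Handle, _start, _end, the ID field and the 'corrupt' option are all
hypothetical stand-ins, and nothing here is taken from SD.pd. It only shows that
blessing early lets DESTROY release the handle when the constructor dies later.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a class that owns a low-level handle.
# None of these names come from PDL::IO::HDF::SD itself.
package My::Handle;

our $open_count = 0;    # how many fake handles are currently open

sub _start {            # stand-in for a C-level "start" call
    my ($file) = @_;
    $open_count++;      # the resource is acquired here
    return $open_count;
}

sub _end {              # stand-in for the matching "end" call
    $open_count--;
    return 1;
}

sub new {
    my ($class, $file, %opt) = @_;
    # Bless immediately so DESTROY runs even if a later step dies.
    my $self = bless { FILE_NAME => $file, ID => undef }, $class;
    $self->{ID} = _start($file);
    # Simulate a failure after the handle was acquired.
    die "cannot read $file\n" if $opt{corrupt};
    return $self;
}

sub close {
    my ($self) = @_;
    return unless defined $self->{ID};   # no handle, nothing to close
    _end($self->{ID});
    $self->{ID} = undef;                 # calling close twice is harmless
    return 1;
}

sub DESTROY { $_[0]->close }

package main;

for my $i (1 .. 3) {
    my $obj = eval { My::Handle->new("file$i.hdf", corrupt => 1) };
    warn "skipping file$i.hdf: $@" unless $obj;
}
print "handles still open: $My::Handle::open_count\n";
```

Whether the real close behaves this well when called from a half-built object
is exactly the part I'm unsure about, so treat this as a pattern to try, not a
patch.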

David

 

On Mon, May 11, 2015 at 2:04 PM, Hyer, Dr. Edward <[email protected]> 
wrote:

Hello PDL wizards,

My code churns through hundreds of thousands of HDF files, and some fraction
of them are corrupted (bit-rot, as far as I can tell).

Because PDL::IO::HDF::SD does not fail gracefully, I have to catch these
files like this:
    # Open HDF file for input
    my $hdfopen_status = eval { $hdfobj = PDL::IO::HDF::SD->new("$hdf_file"); };
    unless ($hdfobj) {
        warn "$routine: Cannot open $hdf_file : $@\n";
        print "$routine: Cannot open $hdf_file : $@\n";
        # my $hdfclose_status = $hdfobj->close(); # can't do this because $hdfobj doesn't exist!
        return -2;
    }

This accomplishes my goals of putting information into the log about bad
files (the parent routine then deals with the files), and moving on.

However, lately I have been running a long reprocessing job, and at a
certain point it stops with the "Too many open files" error, always right
after encountering a corrupted file. My suspicion is that
PDL::IO::HDF::SD->new() is leaving an open filehandle when it exits
uncleanly.

I don't know the precise number of corrupted files that accumulates to
generate this error, but it's on the order of 1000.
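
One way to check that suspicion would be to count the process's open file
descriptors before and after a failed open. The sketch below assumes a
Linux-style /proc filesystem; count_open_fds is a hypothetical helper of my own,
and the /dev/null open merely stands in for whatever call is suspected of
leaking.

```perl
use strict;
use warnings;

# Count the open file descriptors of the current process.
# Assumes /proc/self/fd exists (Linux); returns undef where it does not.
sub count_open_fds {
    opendir(my $dh, "/proc/self/fd") or return;
    my @fds = grep { /^\d+$/ } readdir($dh);   # keep numeric entries only
    closedir($dh);
    return scalar @fds;
}

my $before = count_open_fds();

# Deliberately hold one extra handle to show the counter moving.
open(my $fh, '<', '/dev/null') or die "cannot open /dev/null: $!";

my $after = count_open_fds();

if (defined $before) {
    printf "open descriptors: %d before, %d after\n", $before, $after;
}
else {
    print "no /proc/self/fd here, cannot count descriptors\n";
}
```

Both counts include the directory handle used for the listing itself, so only
the difference between them is meaningful.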

Any ideas?

Thanks,

--Edward H.

------------------------------------------------------------------------------
One dashboard for servers and applications across Physical-Virtual-Cloud
Widest out-of-the-box monitoring support with 50+ applications
Performance metrics, stats and reports that give you Actionable Insights
Deep dive visibility with transaction tracing using APM Insight.
http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
_______________________________________________
pdl-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pdl-general




-- 

 "Debugging is twice as hard as writing the code in the first place.
  Therefore, if you write the code as cleverly as possible, you are,
  by definition, not smart enough to debug it." -- Brian Kernighan

