Hi Greg,

I know what you mean.  :)  I had tried that before, but executing an
rdmol.delete() at the end of the loop didn't help.  And, I just re-tried
that to no avail.

I remember having a similar issue with the SDMolSupplier before, where just
reading the file consumed a ton of memory.  This was patched, and all of
the rest of my code runs well.  But if I want to sample from the
SDMolSupplier stream, things go weird.  I had hoped to copy each rdmol
to a new object (reducing the leak) if I wanted to hold it for a time, but
that didn't help either.  I am deleting every molecule that I hold, but
there appears to be no impact on memory consumption.  I think that the JVM
is asleep on the job of cleaning up these objects, and forcing it to do so
(well, as much as one can) doesn't fix things.
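
For reference, the explicit-delete pattern discussed in this thread can be sketched as below. This is only an illustration: NativeMol is a hypothetical stand-in for a SWIG proxy such as ROMol (it just counts live instances so the sketch runs without RDKit), and the loop body stands in for suppl.next(). The point is that both the molecule read in each iteration and the held copy get an explicit delete(). Whether this cures the memory growth described here is not established in the thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class DeleteEachIteration {

    /** Hypothetical stand-in for a SWIG proxy (e.g. ROMol) that owns native memory. */
    static final class NativeMol {
        static final AtomicInteger LIVE = new AtomicInteger(); // undeleted instances
        private boolean freed = false;
        final String name;

        NativeMol(String name) {
            this.name = name;
            LIVE.incrementAndGet();
        }

        /** Copy constructor, mirroring the new ROMol(rdmol) call in the original code. */
        NativeMol(NativeMol other) {
            this(other.name);
        }

        /** Releases the (simulated) native object; safe to call more than once. */
        void delete() {
            if (!freed) {
                freed = true;
                LIVE.decrementAndGet();
            }
        }
    }

    /** Runs n iterations and returns the peak number of live objects observed. */
    static int run(int n) {
        NativeMol held = null;
        int peak = 0;
        for (int i = 0; i < n; i++) {
            NativeMol current = new NativeMol("mol" + i); // stands in for suppl.next()
            try {
                if (held != null) {
                    held.delete();            // free the previously held copy
                }
                held = new NativeMol(current); // keep our own copy
                peak = Math.max(peak, NativeMol.LIVE.get());
            } finally {
                current.delete();             // free the per-iteration object every time
            }
        }
        if (held != null) {
            held.delete();                    // free the last held copy after the loop
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("peak live objects: " + run(1000));
        System.out.println("live after loop: " + NativeMol.LIVE.get());
    }
}
```

The try/finally is a design choice so that an early continue or an exception inside the loop body cannot skip the delete() on the per-iteration object.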

I may just have to write this in Python, where I am pretty certain the
memory issues are non-existent.  :)  I was hopeful that someone else may
have encountered this issue, and had a path around it.

Thanks for taking a look Greg!
Matt


On Wed, Jul 15, 2015 at 1:57 AM, Greg Landrum <greg.land...@gmail.com>
wrote:

> Hi,
>
> It's not easy (for me) to read through the Java code and figure out what
> is going on, but it looks to me like you are "leaking" rdmol in each
> iteration of your loop.
>
> The problem that the RDKit Java wrappers (really any Java wrapper created
> with SWIG) have here is that the JVM doesn't know how big the underlying C++
> object is, so it's not aggressive enough while cleaning up memory. I think
> calling rdmol.delete() at the end of each iteration (this frees the
> underlying C++ object) should help.
>
> -greg
>
>
> On Tuesday, July 14, 2015, Matthew Lardy <mla...@gmail.com> wrote:
>
>> Hi all,
>>
>> I have had a strange issue that I can't seem to find a way around.  The
>> following code block consumes a ton of memory, which is strange as just
>> using the SD file reader I have no memory issues.  I think that the issue
>> is related to the Java garbage collector not kicking in, even though
>> I have attempted to force that (without success).
>>
>> All the following block does is iterate through an SD file and look for
>> the highest (or lowest) scoring entry for each molecule.  The assumption
>> is that all entries for the same molecule will be next to each other in the
>> file (which is not my problem).  Running this on an SD file of around 400K
>> molecules consumes around 23GB of memory, so if anyone has an idea I will
>> be most appreciative!
>>
>>    public static void main(String argv[]) throws IOException, InterruptedException
>>    {
>>       CommandLineParser cParser;
>>       String[] modes    = {};
>>       String[] parms    = {"-in", "-filterTag", "-direction", "-out"};
>>       String[] reqParms = {"-in", "-filterTag", "-direction", "-out"};
>>
>>       String rdkitSO = System.getenv("RDKIT_SO");
>>       System.load(rdkitSO);
>>
>>
>>       String currentDir   = System.getProperty("user.dir");
>>       File dir = new File(currentDir);
>>
>>       cParser = new CommandLineParser(EXPLAIN,0,0,argv,modes,parms,reqParms);
>>
>>       ROMol rdmol  = null;
>>       ROMol rdmol2 = null;
>>
>>       SDMolSupplier suppl = new SDMolSupplier(cParser.getValue("-in"));
>>       SDWriter writer = new SDWriter(cParser.getValue("-out"));
>>       int count = 0;
>>
>>       while (!suppl.atEnd())
>>       {
>>           count++;
>>           if (count % 1000 == 0)
>>           {
>>              System.out.println(count);
>>           }
>>           rdmol = suppl.next();
>>           if (rdmol2 == null)
>>           {
>> //             rdmol2.delete();
>>              rdmol2 = new ROMol(rdmol);
>>              continue;
>>           }
>>           if (rdmol.MolToSmiles().equals(rdmol2.MolToSmiles()))
>>           {
>>               if ( cParser.getValue("-direction").equals("highest") )
>>               {
>>                  double value1 = Double.parseDouble(rdmol.getProp(cParser.getValue("-filterTag")));
>>                  double value2 = Double.parseDouble(rdmol2.getProp(cParser.getValue("-filterTag")));
>>                  //System.out.println("Val1 " + value1 + " Val2 " + value2);
>>                  if (value1 > value2)
>>                  {
>>                      rdmol2.delete();
>>                      rdmol2 = new ROMol(rdmol);
>>                  }
>>               }
>>               else
>>               {
>>                  if (Double.parseDouble(rdmol.getProp(cParser.getValue("-filterTag"))) <
>>                      Double.parseDouble(rdmol2.getProp(cParser.getValue("-filterTag"))))
>>                  {
>>                      rdmol2.delete();
>>                      rdmol2 = new ROMol(rdmol);
>>                  }
>>               }
>>           } else {
>>               writer.write(rdmol2);
>>               rdmol2.delete();
>>               rdmol2 = new ROMol(rdmol);
>>           }
>>       }
>>    }
>>
>>
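
As a side note on the selection logic itself (keep the best-scoring entry from each run of consecutive identical molecules), here is a small RDKit-free sketch. Entry, its key and score fields, and bestPerGroup are hypothetical names introduced for illustration; the key stands in for the SMILES string and the score for the parsed filter tag. One difference from the loop quoted above: the final group is flushed after the loop ends, whereas the quoted code appears never to write the last held molecule.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BestPerGroup {

    /** Hypothetical stand-in for one SD record: a grouping key and a score. */
    static final class Entry {
        final String key;   // e.g. the SMILES used to decide "same molecule"
        final double score; // e.g. the parsed value of the filter tag

        Entry(String key, double score) {
            this.key = key;
            this.score = score;
        }
    }

    /**
     * Returns the best entry from each run of consecutive entries sharing a key.
     * Assumes entries with the same key are adjacent, as in the original post.
     */
    static List<Entry> bestPerGroup(List<Entry> entries, boolean highest) {
        List<Entry> out = new ArrayList<>();
        Entry best = null;
        for (Entry e : entries) {
            if (best == null) {
                best = e;                       // first entry overall
            } else if (e.key.equals(best.key)) {
                boolean better = highest ? e.score > best.score
                                         : e.score < best.score;
                if (better) {
                    best = e;                   // better entry in the same group
                }
            } else {
                out.add(best);                  // group ended: emit its best entry
                best = e;                       // start the next group
            }
        }
        if (best != null) {
            out.add(best);                      // flush the final group
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry> input = Arrays.asList(
                new Entry("CCO", 1.0),
                new Entry("CCO", 3.5),
                new Entry("c1ccccc1", 2.0),
                new Entry("c1ccccc1", 0.5));
        for (Entry e : bestPerGroup(input, true)) {
            System.out.println(e.key + " " + e.score);
        }
    }
}
```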
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
