Hi Greg,

Thanks for your swift response! I tried what you suggested (calling Chem.GetDistanceMatrix(mol,0) repeatedly) and did not observe any increase in memory consumption. I investigated further and eventually identified an issue with numpy (in Graphs.CharacteristicPolynomial) as the main cause of the memory problem. Updating numpy to a recent version solved it.
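For anyone who runs into the same thing, a check along these lines can help separate the two candidates. This is a minimal sketch, not the script used in this thread: the helper names peak_rss_growth and run_rdkit_checks are hypothetical, it assumes Python 3 on a Unix-like system (the resource module is not available on Windows), and it assumes that Ipc passes the adjacency matrix to Graphs.CharacteristicPolynomial.

```python
import resource


def peak_rss_growth(func, iterations=10000):
    """Call func() repeatedly and return the growth in peak resident memory.

    The value is in the units of ru_maxrss: kilobytes on Linux, bytes on macOS.
    """
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    for _ in range(iterations):
        func()
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return after - before


def run_rdkit_checks(iterations=10000):
    """Compare the two suspected calls (hypothetical helper; requires RDKit)."""
    # Imported here so that peak_rss_growth stays usable without RDKit.
    import numpy
    from rdkit import Chem
    from rdkit.Chem import Graphs

    mol = Chem.MolFromSmiles('CC(C)Cc1ccc(cc1)C(C)C(=O)O')
    adj = Chem.GetAdjacencyMatrix(mol)

    print('numpy version:', numpy.__version__)
    # Greg's suggested test: the distance matrix on its own.
    print('GetDistanceMatrix growth:',
          peak_rss_growth(lambda: Chem.GetDistanceMatrix(mol, 0), iterations))
    # The numpy-based step that turned out to be the culprit here.
    print('CharacteristicPolynomial growth:',
          peak_rss_growth(lambda: Graphs.CharacteristicPolynomial(mol, adj),
                          iterations))
```

Calling run_rdkit_checks() once before and once after updating numpy should show whether the growth disappears. Because ru_maxrss is a high-water mark, a value near zero suggests no leak, while a value that keeps rising with the iteration count suggests one.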
Best,
Michael

On Thu, Oct 15, 2015 at 6:36 AM, Greg Landrum <greg.land...@gmail.com> wrote:

> Hi Michael,
>
> On Wed, Oct 14, 2015 at 7:06 PM, Michael Reutlinger <rd...@mulchi.de>
> wrote:
>
>>
>> I observed a memory leak while using the RDKit to calculate descriptors
>> for a large library of compounds.
>>
>> I tracked it down to the Ipc descriptor and it is reproducible with this
>> small script:
>>
>> from rdkit.ML.Descriptors import MoleculeDescriptors
>> from rdkit import Chem
>>
>> calculator = MoleculeDescriptors.MolecularDescriptorCalculator(['Ipc'])
>> for n in range(100000):
>>     mol = Chem.MolFromSmiles('CC(C)Cc1ccc(cc1)C(C)C(=O)O')
>>     x = calculator.CalcDescriptors(mol)
>>     if not n % 100: print n
>>
>> I tested it on my Linux workstation (Redhat 6). The process memory
>> consumption increases to several hundred mb. Interestingly, I can't
>> reproduce it on my Mac running the latest os.
>>
>
> I can't reproduce it on my Mac either. I'm on vacation and don't have
> access to my linux box, but I will see if I can reproduce it when I'm back
> next week. Which version(s) of python are you using on the machines?
>
>> My guess is that the leak is caused by getDistanceMatrix in MolOps.cpp.
>> Specifically, a missing delete for the distMat pointer (in the getDistanceMat
>> documentation is a note that the pointer should be deleted by the
>> caller). However, I am not a c++ programmer myself and this analysis might
>> not be the true cause.
>>
>
> The docs actually say that the pointer should *not* be deleted by the
> caller, but that's not relevant here anyway. The C++ object is copied into
> a new python numpy array object before being returned to the user.
>
>
>> I hope it is reproducible on other systems and easy to fix :-) If you
>> need additional information please let me know.
>>
>
> The simplest possible test would be to see if you get the same leak when
> you just call Chem.GetDistanceMatrix(mol,0) repeatedly.
>
> Best,
> -greg
>
>
------------------------------------------------------------------------------
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss