Hi Greg,

Thanks for your swift response! I tried what you suggested and I did not
observe any increased memory consumption. I investigated further and
eventually identified an issue with numpy (in
Graphs.CharacteristicPolynomial) as the main cause of the memory problem.
Updating numpy to a recent version solved it.
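
For anyone who wants to repeat the check, here is a minimal sketch of the suggested test (calling Chem.GetDistanceMatrix repeatedly while watching memory). The helper names peak_rss_mb and measure_growth are hypothetical, the iteration counts are arbitrary, and the readout relies on the Unix-only resource module, so treat it as an illustration rather than the exact script that was run.

```python
import resource
import sys


def peak_rss_mb():
    """Return the peak resident set size of this process in MB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    if sys.platform == "darwin":
        return peak / (1024.0 * 1024.0)
    return peak / 1024.0


def measure_growth(func, iterations=100000, report_every=10000):
    """Call func repeatedly; return (start, end) peak RSS in MB."""
    start = peak_rss_mb()
    for n in range(iterations):
        func()
        if n % report_every == 0:
            print(n, round(peak_rss_mb(), 1))
    return start, peak_rss_mb()


if __name__ == "__main__":
    try:
        from rdkit import Chem
    except ImportError:
        # Assumption: RDKit may not be installed where this is run.
        Chem = None
        print("RDKit not available; skipping the distance-matrix test")

    if Chem is not None:
        mol = Chem.MolFromSmiles('CC(C)Cc1ccc(cc1)C(C)C(=O)O')
        start, end = measure_growth(lambda: Chem.GetDistanceMatrix(mol, 0))
        print("peak RSS grew by %.1f MB" % (end - start))
```

A peak that keeps climbing across the progress reports would point at the call being tested; a flat peak suggests the leak is elsewhere. Swapping the lambda for the Ipc descriptor calculation gives the same comparison for the original script.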

Best,
Michael

On Thu, Oct 15, 2015 at 6:36 AM, Greg Landrum <greg.land...@gmail.com>
wrote:

> Hi Michael,
>
> On Wed, Oct 14, 2015 at 7:06 PM, Michael Reutlinger <rd...@mulchi.de>
> wrote:
>
>>
>> I observed a memory leak while using the RDKit to calculate descriptors
>> for a large library of compounds.
>>
>> I tracked it down to the Ipc descriptor and it is reproducible with this
>> small script:
>>
>> from rdkit.ML.Descriptors import MoleculeDescriptors
>> from rdkit import Chem
>>
>> calculator = MoleculeDescriptors.MolecularDescriptorCalculator(['Ipc'])
>> for n in range(100000):
>>     mol = Chem.MolFromSmiles('CC(C)Cc1ccc(cc1)C(C)C(=O)O')
>>     x = calculator.CalcDescriptors(mol)
>>     if not n % 100: print(n)
>>
>> I tested it on my Linux workstation (Red Hat 6). The memory consumption
>> of the process increases to several hundred MB. Interestingly, I can't
>> reproduce it on my Mac running the latest OS.
>>
>
> I can't reproduce it on my Mac either. I'm on vacation and don't have
> access to my Linux box, but I will see if I can reproduce it when I'm back
> next week. Which version(s) of Python are you using on the machines?
>
>> My guess is that the leak is caused by getDistanceMatrix in MolOps.cpp.
>> Specifically, a missing delete for the distMat pointer (the getDistanceMat
>> documentation notes that the pointer should be deleted by the caller).
>> However, I am not a C++ programmer myself and this analysis might not
>> identify the true cause.
>>
>
> The docs actually say that the pointer should *not* be deleted by the
> caller, but that's not relevant here anyway. The C++ object is copied into
> a new python numpy array object before being returned to the user.
>
>
>> I hope it is reproducible on other systems and easy to fix :-) If you
>> need additional information please let me know.
>>
>
> The simplest possible test would be to see if you get the same leak when
> you just call Chem.GetDistanceMatrix(mol,0) repeatedly.
>
> Best,
> -greg
>
>
------------------------------------------------------------------------------
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
