Hi Greg,
Thanks for the pointer. I’ll take a look. If it could go in the next patch
release that would be really useful.
Dave


On Sat, 22 Oct 2022 at 10:52, Greg Landrum <greg.land...@gmail.com> wrote:

>
> Hi Dave,
>
> We have multiple examples of this in the code, here’s one:
>
> https://github.com/rdkit/rdkit/blob/b208da471f8edc88e07c77ed7d7868649ac75100/Code/GraphMol/ForceFieldHelpers/Wrap/rdForceFields.cpp#L40
>
> I’m not sure how this would interact with the call to Python::extract
> that’s in the bulk functions, though.
>
> It might be better to handle the multithreading on the C++ side by adding
> an optional nThreads argument to the bulk similarity functions. (Though
> this would have to wait for the next release since it’s a feature addition;
> we can declare releasing the GIL as a bug fix.)
>
> -greg
>
>
> On Sat, 22 Oct 2022 at 09:48, David Cosgrove <davidacosgrov...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I'm doing a lot of Tanimoto similarity calculations on large datasets
>> using BulkTanimotoSimilarity.  It is an obvious candidate for
>> parallelisation, so I am using concurrent.futures to do so.  If I use
>> ProcessPoolExecutor, I get good speed-up but each process needs a copy of
>> the fingerprint set and for the sizes I'm dealing with that uses too much
>> memory.  With ThreadPoolExecutor I only need 1 copy of the fingerprints,
>> but the GIL means it only runs on 1 thread at a time so there's no gain.
>> Would it be possible to amend the C++ BulkTanimotoSimilarity to free the
>> GIL whilst it's doing the calculation, and recapture it afterwards?  I
>> understand things like numpy do this for some of their functions.  I'm
>> happy to attempt it myself if someone who knows about these things can
>> advise that it could be done, it would help, and they could provide a few
>> pointers.
>>
>> Thanks,
>> Dave
>>
>>
>> --
>> David Cosgrove
>> Freelance computational chemistry and chemoinformatics developer
>> http://cozchemix.co.uk
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
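
The threading pattern described in the original message can be sketched roughly as follows. This is only an illustration under stated assumptions: `tanimoto`, `bulk_tanimoto` and `similarity_rows` are hypothetical stand-ins written in pure Python so the example runs without RDKit, and fingerprints are modelled as plain Python ints used as bit sets. With RDKit one would presumably call `DataStructs.BulkTanimotoSimilarity` in place of `bulk_tanimoto`. Because the stand-in is pure Python it holds the GIL the whole time, so it shows the single shared copy of the fingerprints but not any speed-up; the threads could only run in parallel if the underlying C++ call released the GIL, as discussed in the thread.

```python
from concurrent.futures import ThreadPoolExecutor


def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity of two fingerprints stored as Python ints (bit sets)."""
    common = bin(a & b).count("1")   # bits set in both fingerprints
    total = bin(a | b).count("1")    # bits set in either fingerprint
    return common / total if total else 0.0


def bulk_tanimoto(query: int, fps: list) -> list:
    """Hypothetical stand-in for a bulk call: one query against many fingerprints."""
    return [tanimoto(query, fp) for fp in fps]


def similarity_rows(queries: list, fps: list, max_workers: int = 4) -> list:
    """Score every query against the shared fingerprint list using a thread pool.

    All threads read the same `fps` list, so only one copy is held in memory,
    unlike ProcessPoolExecutor where each worker process needs its own copy.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as `queries`.
        return list(pool.map(lambda q: bulk_tanimoto(q, fps), queries))


fingerprints = [0b1011, 0b0110, 0b1111, 0b0000]
rows = similarity_rows(fingerprints[:2], fingerprints)
```

Each entry of `rows` is the list of similarities for one query, in the order of `fingerprints`.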
