Re: [Rdkit-discuss] GPU Implementation of shape-based 3D overlap on rdkit?

2020-11-08 Thread Francois Berenger

On 04/11/2020 04:26, Lewis Martin wrote:

> I've had an initial go at something like this using JAX. I chose JAX
> since it has a shallow learning curve, essentially being numpy on a
> GPU. This is great for vectorized calculations, but less so for
> applications that involve a lot of control flow (i.e. if/else
> statements), which, as I understand it, most point cloud registration
> algorithms use, such as iterative closest point or anything available
> in open3d.
>
> No guarantee I'll make any progress of course, but would someone mind
> recommending a paper explaining a nice subshape alignment algorithm?


Grant, J.A.; Gallardo, M.A.; Pickup, B.T. (1996) ‘A fast method of 
molecular shape comparison: a simple application of a Gaussian 
description of molecular shape’, J. Comp. Chem. 17, 1653-1666 
[wiley/19961115]


From the abstract:
"A Gaussian description of molecular shape is used to compare the shapes 
of two molecules by analytically optimizing their volume intersection."
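
For context, the optimization can be analytic because the overlap of two spherical Gaussians has a closed form, and so does its gradient. A sketch in commonly used notation follows; the symbols p (amplitude), alpha (exponent) and R (atom centre) are assumptions for illustration and are not quoted from the abstract.

```latex
\rho_i(\mathbf{r}) = p\,\exp\!\left(-\alpha_i\,\lvert\mathbf{r}-\mathbf{R}_i\rvert^{2}\right)

V_{ij} = \int \rho_i(\mathbf{r})\,\rho_j(\mathbf{r})\,d\mathbf{r}
       = p^{2}\left(\frac{\pi}{\alpha_i+\alpha_j}\right)^{3/2}
         \exp\!\left(-\frac{\alpha_i\alpha_j}{\alpha_i+\alpha_j}\,
         \lvert\mathbf{R}_i-\mathbf{R}_j\rvert^{2}\right)

\frac{\partial V_{ij}}{\partial \mathbf{R}_i}
       = -\,\frac{2\,\alpha_i\alpha_j}{\alpha_i+\alpha_j}\,
         (\mathbf{R}_i-\mathbf{R}_j)\,V_{ij}
```

Because both expressions are smooth in the atom positions, a rigid-body optimizer can use exact gradients rather than finite differences.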


The open-source Shape-it program may also have some relevant code.

Regards,
F.


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] GPU Implementation of shape-based 3D overlap on rdkit?

2020-11-03 Thread Greg Landrum
Mark gave a nice overview of the literature for alignment based on Gaussian
overlap (thanks Mark!).

The algorithm that's currently implemented in the RDKit is from some former
colleagues and is described here:
http://pubs.acs.org/doi/abs/10.1021/ci0256384

-greg

On Tue, Nov 3, 2020 at 8:28 PM Lewis Martin 
wrote:

> I've had an initial go at something like this using JAX. I chose JAX since
> it has a shallow learning curve, essentially being numpy on a GPU. This is
> great for vectorized calculations, but less so for applications that
> involve a lot of control flow (i.e. if/else statements), which, as I
> understand it, most point cloud registration algorithms use, such as
> iterative closest point or anything available in open3d.
>
> No guarantee I'll make any progress of course, but would someone mind
> recommending a paper explaining a nice subshape alignment algorithm?
>
> Thanks :)
> Lewis


Re: [Rdkit-discuss] GPU Implementation of shape-based 3D overlap on rdkit?

2020-11-03 Thread Mark Mackey
Hi Lewis,

The standard shape alignment algorithm that everyone uses is from Grant & 
Pickup 1996 
(https://onlinelibrary.wiley.com/doi/abs/10.1002/%28SICI%291096-987X%2819961115%2917%3A14%3C1653%3A%3AAID-JCC7%3E3.0.CO%3B2-K).

It’s a Taylor-series-like expansion using spherical Gaussians as stand-ins for 
hard spheres - you take the atomic volumes, subtract off the pairwise overlaps, 
add back in the three-way overlaps, subtract off the four-way overlaps, and so 
on. I did a fair few tests some years back and you really need to go to 6 terms 
to get decent accuracy. However, all of the commercial algorithms (ROCS, Phase 
Shape, etc.) seem to truncate at 2, so go figure. OTOH the “high throughput” 
versions all seem to be run with a ludicrously low number of conformations, 
so the error from incomplete coverage of conformer space dwarfs the 5% noise that 
you get from truncating at 2 terms rather than 6.
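
As a rough illustration of the cheap end of that series, here is a minimal NumPy sketch that sums only the two-atom Gaussian overlaps between molecules A and B and turns them into a shape Tanimoto. It is one reading of "truncating at 2", not the code of any of the programs named above. The amplitude `P = 2.7`, the single radius of 1.7 Å for every atom, and all function names are assumptions made for the example.

```python
import numpy as np

P = 2.7  # Gaussian amplitude; an assumed value, not taken from this thread


def alpha_from_radius(radius, p=P):
    """Exponent chosen so one Gaussian integrates to the hard-sphere volume."""
    return np.pi * (3.0 * p / (4.0 * np.pi)) ** (2.0 / 3.0) / radius ** 2


def pairwise_overlap_sum(xyz_a, xyz_b, radius=1.7, p=P):
    """Sum of two-centre Gaussian overlaps over every (atom of A, atom of B) pair."""
    a = alpha_from_radius(radius, p)
    # Squared distance between every atom of A and every atom of B.
    d2 = ((xyz_a[:, None, :] - xyz_b[None, :, :]) ** 2).sum(axis=-1)
    # Product of two Gaussians with equal exponent a integrates to
    # p^2 * (pi / (2a))^(3/2) * exp(-a * d^2 / 2).
    v = p * p * (np.pi / (2.0 * a)) ** 1.5 * np.exp(-0.5 * a * d2)
    return v.sum()


def shape_tanimoto(xyz_a, xyz_b):
    """Tanimoto score built from the lowest-order overlap terms only."""
    o_ab = pairwise_overlap_sum(xyz_a, xyz_b)
    o_aa = pairwise_overlap_sum(xyz_a, xyz_a)
    o_bb = pairwise_overlap_sum(xyz_b, xyz_b)
    return o_ab / (o_aa + o_bb - o_ab)
```

Higher-order terms (three-way overlaps and beyond) would be added with alternating signs; they are left out here, which is exactly the approximation being discussed.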

If you want something slightly more accurate at the same computational cost, 
look at WEGA (https://onlinelibrary.wiley.com/doi/abs/10.1002/jcc.23603 and 
references therein) which heuristically corrects for some flaws in the 
truncated Grant & Pickup calculations.

If you want a fast GPU-accelerated version, then forget about actually applying 
the algorithm directly[*]. Instead, to compare a reference molecule A to a 
database molecule B, precompute a grid over A that stores, at each grid point, 
the pairwise overlap with A of an atom placed at that point. You can then 
compute the shape overlap for a given orientation of B by a simple 3D texture 
lookup rather than faffing around trying to compute exponential functions. 
This is simplified by assuming that all atoms have the same atomic radius and 
neglecting hydrogens (we’re going for speed over accuracy here, remember?). You 
can get a similar lookup texture for gradients, I think. One thing GPUs are 
really good at is texture lookups and interpolation. They’re less good at 
evaluating exponential functions. Your GPU algorithm is then a massively 
parallel conjugate-gradient (CG) or Newton-Raphson (NR) optimiser with the 
objective function computing shape overlap values for as many molecules as you 
can cram into GPU memory all in parallel.
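
The grid idea can be sketched on the CPU with NumPy, using trilinear interpolation as a stand-in for a GPU texture fetch. This is only an illustration under stated assumptions: one Gaussian exponent `alpha` for all heavy atoms, the constant prefactor of the overlap dropped, and a grid spacing and padding picked arbitrarily. The function names are hypothetical.

```python
import numpy as np


def build_overlap_grid(xyz_a, alpha=0.8, spacing=0.5, pad=4.0):
    """Tabulate, at each grid point, the overlap of a probe atom there with all of A."""
    lo = xyz_a.min(axis=0) - pad
    hi = xyz_a.max(axis=0) + pad
    shape = np.ceil((hi - lo) / spacing).astype(int) + 1
    axes = [lo[k] + spacing * np.arange(shape[k]) for k in range(3)]
    pts = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)  # (nx, ny, nz, 3)
    d2 = ((pts[..., None, :] - xyz_a) ** 2).sum(axis=-1)        # (nx, ny, nz, nA)
    # Equal exponents: overlap is proportional to exp(-alpha * d^2 / 2).
    grid = np.exp(-0.5 * alpha * d2).sum(axis=-1)
    return grid, lo, spacing


def lookup_overlap(grid, lo, spacing, xyz_b):
    """Sum of trilinearly interpolated grid values at B's atoms; off-grid atoms add 0."""
    g = (xyz_b - lo) / spacing                     # fractional grid coordinates
    upper = np.array(grid.shape) - 1
    inside = np.all((g >= 0) & (g <= upper), axis=1)
    g = g[inside]
    i0 = np.minimum(np.floor(g).astype(int), upper - 1)  # lower corner of each cell
    f = g - i0                                           # position inside the cell
    total = 0.0
    for dx in (0, 1):
        wx = f[:, 0] if dx else 1.0 - f[:, 0]
        for dy in (0, 1):
            wy = f[:, 1] if dy else 1.0 - f[:, 1]
            for dz in (0, 1):
                wz = f[:, 2] if dz else 1.0 - f[:, 2]
                corner = grid[i0[:, 0] + dx, i0[:, 1] + dy, i0[:, 2] + dz]
                total += (wx * wy * wz * corner).sum()
    return total
```

The grid is built once per reference molecule; each trial orientation of B then costs eight reads and a few multiplies per atom, with no exponentials. The interpolation error shrinks as the spacing is reduced, at the price of a larger table.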

[*] gWEGA (I believe) is a GPU-accelerated version of the standard WEGA 
algorithm and, based on the published timings, is an order of magnitude or more 
slower than FastROCS.

Having said all of that, our GPU-accelerated shape similarity function just 
brute forces through the overlap series to sixth order, as (a) my happy place 
is on the accuracy side of the speed/accuracy tradeoff, and (b) our 
electrostatic similarity calculations are sufficiently complex that making the 
shape function faster wouldn’t be that much of a net win. As a result, take all 
of the above with a grain of salt 😊.

Regards,
Mark

--
Mark Mackey
Chief Scientific Officer
Cresset
New Cambridge House, Bassingbourn Road, Litlington, Cambridgeshire, SG8 0SS, UK
tel: +44 (0)1223 858890
mobile: +44 (0)7595 099165
fax: +44 (0)1223 853667
email: m...@cresset-group.com
web: www.cresset-group.com
skype: mark_cresset



From: Lewis Martin 
Sent: 03 November 2020 19:27
To: RDKit Discuss 
Subject: Re: [Rdkit-discuss] GPU Implementation of shape-based 3D overlap on 
rdkit?

I've had an initial go at something like this using JAX. I chose JAX since it 
has a shallow learning curve, essentially being numpy on a GPU. This is great 
for vectorized calculations, but less so for applications that involve a lot of 
control flow (i.e. if/else statements), which, as I understand it, most point 
cloud registration algorithms use, such as iterative closest point or anything 
available in open3d.

No guarantee I'll make any progress of course, but would someone mind 
recommending a paper explaining a nice subshape alignment algorithm?

Thanks :)
Lewis


Re: [Rdkit-discuss] GPU Implementation of shape-based 3D overlap on rdkit?

2020-11-03 Thread Lewis Martin
I've had an initial go at something like this using JAX. I chose JAX since
it has a shallow learning curve, essentially being numpy on a GPU. This is
great for vectorized calculations, but less so for applications that
involve a lot of control flow (i.e. if/else statements), which, as I
understand it, most point cloud registration algorithms use, such as
iterative closest point or anything available in open3d.
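
To make the vectorized-versus-branching distinction concrete, here is a small sketch of the correspondence step of iterative closest point written without per-point if/else. It uses plain NumPy, whose array API `jax.numpy` largely mirrors. The function names and the distance cutoff are hypothetical, and this is not the code referred to in the message.

```python
import numpy as np


def closest_points(src, dst):
    """For each source point, the index of and distance to its nearest target point."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)  # (n_src, n_dst)
    idx = d2.argmin(axis=1)
    return idx, np.sqrt(d2[np.arange(len(src)), idx])


def clipped_error(dist, cutoff=2.0):
    """Branch-free version of 'if dist > cutoff: ignore this pair'."""
    return np.where(dist <= cutoff, dist ** 2, 0.0).sum()
```

Expressing the rejection test as a masked array operation rather than a Python branch is the kind of rewrite that keeps such a step friendly to array-based GPU libraries.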

No guarantee I'll make any progress of course, but would someone mind
recommending a paper explaining a nice subshape alignment algorithm?

Thanks :)
Lewis

On Wed, 4 Nov 2020 at 3:52 am, Andy Jennings 
wrote:

> Hi Greg,
>
> Thanks for the response and background. Here's hoping someone is smart
> enough to code this up and generous enough to donate it back to the
> community.
>
> Best,
> Andy


Re: [Rdkit-discuss] GPU Implementation of shape-based 3D overlap on rdkit?

2020-11-03 Thread Andy Jennings
Hi Greg,

Thanks for the response and background. Here's hoping someone is smart
enough to code this up and generous enough to donate it back to the
community.

Best,
Andy

On Mon, Nov 2, 2020 at 8:52 PM Greg Landrum  wrote:

> Hi Andy,
>
> At the moment the RDKit doesn't have either high-quality shape-based
> alignment code[1] or GPU support.
>
> I think having good shape-based alignment available would be a really
> useful complement to the Open3DAlign code that's already there, but it's
> certainly not a small project.
>
> -greg
> [1] The python implementation of the subshape alignment algorithm is
> essentially just a proof-of-concept and not performant enough for real
> usage.


Re: [Rdkit-discuss] GPU Implementation of shape-based 3D overlap on rdkit?

2020-11-02 Thread Greg Landrum
Hi Andy,

At the moment the RDKit doesn't have either high-quality shape-based
alignment code[1] or GPU support.

I think having good shape-based alignment available would be a really
useful complement to the Open3DAlign code that's already there, but it's
certainly not a small project.

-greg
[1] The python implementation of the subshape alignment algorithm is
essentially just a proof-of-concept and not performant enough for real
usage.

On Mon, Nov 2, 2020 at 7:16 PM Andy Jennings 
wrote:

> Hi,
>
> I see that back in 2014 there was some discussion of using CUDA inside of
> RDKit and how it may be possible to produce a FastROCS-like open source
> alternative. I was curious if anyone had made such a breakthrough. Since
> GPU availability is now so common, and datasets are becoming so large, I
> figured that more and more people would be thinking RDKit + GPU = :-)
>
> Thanks in advance.
> Andy


[Rdkit-discuss] GPU Implementation of shape-based 3D overlap on rdkit?

2020-11-02 Thread Andy Jennings
Hi,

I see that back in 2014 there was some discussion of using CUDA inside of
RDKit and how it may be possible to produce a FastROCS-like open source
alternative. I was curious if anyone had made such a breakthrough. Since
GPU availability is now so common, and datasets are becoming so large, I
figured that more and more people would be thinking RDKit + GPU = :-)

Thanks in advance.
Andy