Re: [Rdkit-discuss] Possible rotatable bonds replacement

2014-01-31 Thread Gerebtzoff, Gregori
Hi,

I would also go for the second option (i.e. replace the current SMART): I
also see it as a bug fix. What you could do is to highlight in the release
notes or somewhere else the call one would have to do to mimic the behavior
of previous releases:
Lipinski.RotatableBondSmarts =
Chem.MolFromSmarts('[!$(*#*)&!D1]-&!@[!$(*#*)&!D1]')
Lipinski.NumRotatableBonds(m)

Grégori

Date: Fri, 31 Jan 2014 05:05:56 +0100
> From: Greg Landrum 
> Subject: [Rdkit-discuss] Possible rotatable bonds replacement
> To: RDKit Discuss ,Toby
> Wright
> 
> Message-ID:
>  xpfjmyue6x0_p...@mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Dear all,
>
> A question for the community:
>
> Toby Wright submitted a pull request this week that introduces a new,
> stricter, rotatable bond definition:
> https://github.com/rdkit/rdkit/pull/211/files
> The new SMARTS, re-formatted to be somewhat more readable, is:
> [!$(*#*)&\
>  !D1&\
>  !$(C(F)(F)F)&\
>  !$(C(Cl)(Cl)Cl)&\
>  !$(C(Br)(Br)Br)&\
>  !$(C([CH3])([CH3])[CH3])&\
>  !$([CD3](=[N,O,S])-!@[#7,O,S!D1])&\
>  !$([#7,O,S!D1]-!@[CD3]=[N,O,S])&\
>  !$([CD3](=[N+])-!@[#7!D1])&\
>  !$([#7!D1]-!@[CD3]=[N+])]\
> -!@\
> [!$(*#*)&\
>  !D1&\
>  !$(C(F)(F)F)&\
>  !$(C(Cl)(Cl)Cl)&\
>  !$(C(Br)(Br)Br)&\
>  !$(C([CH3])([CH3])[CH3])]
>
> Toby was quite careful and added a new descriptor -
> NumStrictRotatableBonds() - that uses this SMARTS.
>
> I see a few options to deal with this:
>
> I could add the new descriptor as Toby provided it. People are then free to
> pick between NumRotatableBonds() and NumStrictRotatableBonds(). This has
> the advantage of maintaining strict backwards compatibility, but I could
> imagine it being confusing/irritating to people using the code to have to
> choose between them (or, worse, using both).
>
> Another option is to just replace the current NumRotatableBonds() SMARTS
> with the new one.
> This loses backwards compatibility, but replaces NumRotableBonds() with
> something more correct.
>
> Finally, I could take a hybrid approach: replace the default
> NumRotatableBonds() with the new one, but add an extra argument that allows
> the old one to be used.
>
> I'm leaning towards the second option. I'd normally go with the third, but
> I almost view this as a bug fix for the rotatable bonds definition.
>
> Comments? suggestions? Other options?
> -greg
>
--
WatchGuard Dimension instantly turns raw network data into actionable 
security intelligence. It gives you real-time visual feedback on key
security issues and trends.  Skip the complicated setup - simply import
a virtual appliance and go from zero to informed in seconds.
http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clktrk___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Possible rotatable bonds replacement

2014-01-31 Thread Toby Wright
Hi,

I favour option 1 but not strongly over option 3. Option 2 is cleanest but
I think the cost to users that expect the existing behaviour is too high. I
don't see much difference in the confusion levels between:
numRotatableBonds() vs numStrictRotatableBonds()
and
numRotatableBonds() vs numRotatableBonds(strict=true)
as neither is truly clean if the user thinks that the two definitions are
interchangable.
The invariant numRotatableBonds(X)>=numStrictRotatableBonds(X) holds which
is why I was thinking that one is a strict version of the other, but I'd
welcome a better name for the new function/variable.

Yours,

Toby Wright

--
InhibOx Ltd
Oxford



On 31 January 2014 11:05, JP  wrote:

> My 2p worth:
>
> I am not a big fan of outright replacing the NumRotatableBonds
> implementation (option 2).  This is quite a popular descriptor which is
> used in many ways (e.g. QSAR models, conformer generation, property
> calculation, etc.).  IF we are lucky (or skilful, or have had enough time),
> we have tests written out for everything which will break as soon as soon
> as we get different rotatable bonds count, and different results.  We can
> then revalidate our protocols using the new (strict) rotatable counts.
>  Perhaps we get better correlations/enrichments/AUCs etc ! Yeah!
>
> On the other hand option (1), having two methods NumRotatableBonds() and
> NumStrictRotatableBonds() will lead to some confusion.  Greg has a point
> about different people and/or libraries intermixing between the two.
>
> Like Paul, I prefer option (3) - with the default behaviour giving the old
> rotatable counts (not strict).  This does not come for free either, as the
> API becomes slightly less clean (and what to do in the future when, for
> example, someone finds a non-SMARTS based way to do this -- add another
> parameter?).  Still I think this is the less of all evils.
>
> Thanks Toby & Greg!
> JP
>
>
> On 31 January 2014 06:54,  wrote:
>
>> > I could add the new descriptor as Toby provided it. People are then
>> > free to pick between NumRotatableBonds() and NumStrictRotatableBonds
>> > (). This has the advantage of maintaining strict backwards
>> > compatibility, but I could imagine it being confusing/irritating to
>> > people using the code to have to choose between them (or, worse, using
>> both).
>> >
>> > Another option is to just replace the current NumRotatableBonds()
>> > SMARTS with the new one.
>> > This loses backwards compatibility, but replaces NumRotableBonds()
>> > with something more correct.
>> >
>> > Finally, I could take a hybrid approach: replace the default
>> > NumRotatableBonds() with the new one, but add an extra argument that
>> > allows the old one to be used.
>>
>> >
>> > I'm leaning towards the second option. I'd normally go with the
>> > third, but I almost view this as a bug fix for the rotatable bonds
>> definition.
>> >
>> > Comments? suggestions? Other options?
>>
>> I like your idea of your hybrid approach which would mean backwards
>> compatibility.
>>
>>
>> paul
>>
>>
>>
>> This message and any attachment are confidential and may be privileged or
>> otherwise protected from disclosure. If you are not the intended recipient,
>> you must not copy this message or attachment or disclose the contents to
>> any other person. If you have received this transmission in error, please
>> notify the sender immediately and delete the message and any attachment
>> from your system. Merck KGaA, Darmstadt, Germany and any of its
>> subsidiaries do not accept liability for any omissions or errors in this
>> message which may arise as a result of E-Mail-transmission or for damages
>> resulting from any unauthorized changes of the content of this message and
>> any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its
>> subsidiaries do not guarantee that this message is free of viruses and does
>> not accept liability for any damages caused by any virus transmitted
>> therewith.
>>
>> Click http://www.merckgroup.com/disclaimer to access the German, French,
>> Spanish and Portuguese versions of this disclaimer.
>>
>>
>> --
>> WatchGuard Dimension instantly turns raw network data into actionable
>> security intelligence. It gives you real-time visual feedback on key
>> security issues and trends.  Skip the complicated setup - simply import
>> a virtual appliance and go from zero to informed in seconds.
>>
>> http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clktrk
>> ___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>
>
>
> --
> WatchGuard Dimension instantly turns raw network data into actionable
> security intelligence. It gives you real-time visual feedback on key
> security issues and trends.  S

Re: [Rdkit-discuss] Possible rotatable bonds replacement

2014-01-31 Thread JP
My 2p worth:

I am not a big fan of outright replacing the NumRotatableBonds
implementation (option 2).  This is quite a popular descriptor which is
used in many ways (e.g. QSAR models, conformer generation, property
calculation, etc.).  IF we are lucky (or skilful, or have had enough time),
we have tests written out for everything which will break as soon as soon
as we get different rotatable bonds count, and different results.  We can
then revalidate our protocols using the new (strict) rotatable counts.
 Perhaps we get better correlations/enrichments/AUCs etc ! Yeah!

On the other hand option (1), having two methods NumRotatableBonds() and
NumStrictRotatableBonds() will lead to some confusion.  Greg has a point
about different people and/or libraries intermixing between the two.

Like Paul, I prefer option (3) - with the default behaviour giving the old
rotatable counts (not strict).  This does not come for free either, as the
API becomes slightly less clean (and what to do in the future when, for
example, someone finds a non-SMARTS based way to do this -- add another
parameter?).  Still I think this is the less of all evils.

Thanks Toby & Greg!
JP


On 31 January 2014 06:54,  wrote:

> > I could add the new descriptor as Toby provided it. People are then
> > free to pick between NumRotatableBonds() and NumStrictRotatableBonds
> > (). This has the advantage of maintaining strict backwards
> > compatibility, but I could imagine it being confusing/irritating to
> > people using the code to have to choose between them (or, worse, using
> both).
> >
> > Another option is to just replace the current NumRotatableBonds()
> > SMARTS with the new one.
> > This loses backwards compatibility, but replaces NumRotableBonds()
> > with something more correct.
> >
> > Finally, I could take a hybrid approach: replace the default
> > NumRotatableBonds() with the new one, but add an extra argument that
> > allows the old one to be used.
>
> >
> > I'm leaning towards the second option. I'd normally go with the
> > third, but I almost view this as a bug fix for the rotatable bonds
> definition.
> >
> > Comments? suggestions? Other options?
>
> I like your idea of your hybrid approach which would mean backwards
> compatibility.
>
>
> paul
>
>
>
> This message and any attachment are confidential and may be privileged or
> otherwise protected from disclosure. If you are not the intended recipient,
> you must not copy this message or attachment or disclose the contents to
> any other person. If you have received this transmission in error, please
> notify the sender immediately and delete the message and any attachment
> from your system. Merck KGaA, Darmstadt, Germany and any of its
> subsidiaries do not accept liability for any omissions or errors in this
> message which may arise as a result of E-Mail-transmission or for damages
> resulting from any unauthorized changes of the content of this message and
> any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its
> subsidiaries do not guarantee that this message is free of viruses and does
> not accept liability for any damages caused by any virus transmitted
> therewith.
>
> Click http://www.merckgroup.com/disclaimer to access the German, French,
> Spanish and Portuguese versions of this disclaimer.
>
>
> --
> WatchGuard Dimension instantly turns raw network data into actionable
> security intelligence. It gives you real-time visual feedback on key
> security issues and trends.  Skip the complicated setup - simply import
> a virtual appliance and go from zero to informed in seconds.
>
> http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clktrk
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
--
WatchGuard Dimension instantly turns raw network data into actionable 
security intelligence. It gives you real-time visual feedback on key
security issues and trends.  Skip the complicated setup - simply import
a virtual appliance and go from zero to informed in seconds.
http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clktrk___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Possible rotatable bonds replacement

2014-01-30 Thread Paul . Czodrowski
> I could add the new descriptor as Toby provided it. People are then 
> free to pick between NumRotatableBonds() and NumStrictRotatableBonds
> (). This has the advantage of maintaining strict backwards 
> compatibility, but I could imagine it being confusing/irritating to 
> people using the code to have to choose between them (or, worse, using 
both).
> 
> Another option is to just replace the current NumRotatableBonds() 
> SMARTS with the new one.
> This loses backwards compatibility, but replaces NumRotableBonds() 
> with something more correct.
> 
> Finally, I could take a hybrid approach: replace the default 
> NumRotatableBonds() with the new one, but add an extra argument that
> allows the old one to be used.

> 
> I'm leaning towards the second option. I'd normally go with the 
> third, but I almost view this as a bug fix for the rotatable bonds 
definition.
> 
> Comments? suggestions? Other options?

I like your idea of your hybrid approach which would mean backwards 
compatibility.


paul



This message and any attachment are confidential and may be privileged or 
otherwise protected from disclosure. If you are not the intended recipient, you 
must not copy this message or attachment or disclose the contents to any other 
person. If you have received this transmission in error, please notify the 
sender immediately and delete the message and any attachment from your system. 
Merck KGaA, Darmstadt, Germany and any of its subsidiaries do not accept 
liability for any omissions or errors in this message which may arise as a 
result of E-Mail-transmission or for damages resulting from any unauthorized 
changes of the content of this message and any attachment thereto. Merck KGaA, 
Darmstadt, Germany and any of its subsidiaries do not guarantee that this 
message is free of viruses and does not accept liability for any damages caused 
by any virus transmitted therewith.

Click http://www.merckgroup.com/disclaimer to access the German, French, 
Spanish and Portuguese versions of this disclaimer.

--
WatchGuard Dimension instantly turns raw network data into actionable 
security intelligence. It gives you real-time visual feedback on key
security issues and trends.  Skip the complicated setup - simply import
a virtual appliance and go from zero to informed in seconds.
http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clktrk
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] Possible rotatable bonds replacement

2014-01-30 Thread Greg Landrum
Dear all,

A question for the community:

Toby Wright submitted a pull request this week that introduces a new,
stricter, rotatable bond definition:
https://github.com/rdkit/rdkit/pull/211/files
The new SMARTS, re-formatted to be somewhat more readable, is:
[!$(*#*)&\
 !D1&\
 !$(C(F)(F)F)&\
 !$(C(Cl)(Cl)Cl)&\
 !$(C(Br)(Br)Br)&\
 !$(C([CH3])([CH3])[CH3])&\
 !$([CD3](=[N,O,S])-!@[#7,O,S!D1])&\
 !$([#7,O,S!D1]-!@[CD3]=[N,O,S])&\
 !$([CD3](=[N+])-!@[#7!D1])&\
 !$([#7!D1]-!@[CD3]=[N+])]\
-!@\
[!$(*#*)&\
 !D1&\
 !$(C(F)(F)F)&\
 !$(C(Cl)(Cl)Cl)&\
 !$(C(Br)(Br)Br)&\
 !$(C([CH3])([CH3])[CH3])]

Toby was quite careful and added a new descriptor -
NumStrictRotatableBonds() - that uses this SMARTS.

I see a few options to deal with this:

I could add the new descriptor as Toby provided it. People are then free to
pick between NumRotatableBonds() and NumStrictRotatableBonds(). This has
the advantage of maintaining strict backwards compatibility, but I could
imagine it being confusing/irritating to people using the code to have to
choose between them (or, worse, using both).

Another option is to just replace the current NumRotatableBonds() SMARTS
with the new one.
This loses backwards compatibility, but replaces NumRotableBonds() with
something more correct.

Finally, I could take a hybrid approach: replace the default
NumRotatableBonds() with the new one, but add an extra argument that allows
the old one to be used.

I'm leaning towards the second option. I'd normally go with the third, but
I almost view this as a bug fix for the rotatable bonds definition.

Comments? suggestions? Other options?
-greg
--
WatchGuard Dimension instantly turns raw network data into actionable 
security intelligence. It gives you real-time visual feedback on key
security issues and trends.  Skip the complicated setup - simply import
a virtual appliance and go from zero to informed in seconds.
http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clktrk___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss