Thanks Peter and Greg!  I had a three atom query to restrict were I was
putting F's, otherwise I would have done as Peter had suggested.  Granted
my path to flush out the duplicates by pushing this out into Java (using
the RDKit Swig bindings) was way more involved than this!  Thanks for the
walkthrough Greg!  It was very helpful!

Thanks again!
Matthew



On Sat, Jan 31, 2015 at 1:58 AM, Greg Landrum <greg.land...@gmail.com>
wrote:

> For anyone interested in this topic, I just did an RDKit blog post that
> has a somewhat expanded version of this answer:
> http://rdkit.blogspot.com/2015/01/chemical-reaction-notes-i.html
>
> Best,
> -greg
>
> On Sat, Jan 31, 2015 at 7:59 AM, Greg Landrum <greg.land...@gmail.com>
> wrote:
>
>> Hi Matthew,
>>
>> On Fri, Jan 30, 2015 at 11:06 PM, Matthew Lardy <mla...@gmail.com> wrote:
>>
>>>
>>> I am having an issue using the Smarts based Reaction transformations in
>>> RDKit.  This is a weird transformation, but I wanted to replace any or all
>>> of the protons on an aromatic ring with an F.
>>>
>>> The original transformation that I tried was:
>>> ccc>>cc(F)c
>>>
>>> But that didn't work.  So then I tried a couple of other transformations:
>>>
>>> [c:1][c:2][c:3]>>[c:1][c:2]([F])[c:3]
>>>
>>> That failed (as these things generally were failing):
>>> >>> ps = rxn.RunReactants(mol1)
>>> Traceback (most recent call last):
>>>   File "<stdin>", line 1, in <module>
>>> Boost.Python.ArgumentError: Python argument types in
>>>     ChemicalReaction.RunReactants(ChemicalReaction, Mol)
>>> did not match C++ signature:
>>>     RunReactants(class RDKit::ChemicalReaction *, class
>>> boost::python::list)
>>>     RunReactants(class RDKit::ChemicalReaction *, class
>>> boost::python::tuple)
>>>
>>>
>> The hint to what is going on is in the error message: you called the
>> RunReactants method with a Mol (the ChemicalReaction in the argument list
>> is the "self" argument) and it was expecting either a list or a tuple.
>> Here's a version that works:
>>
>> In [8]: rxn =
>> AllChem.ReactionFromSmarts('[c:1][c:2][c:3]>>[c:1][c:2]([F])[c:3]')
>> In [9]: m = Chem.MolFromSmiles('c1ccccc1')
>> In [10]: ps = rxn.RunReactants((m,))
>> In [11]: len(ps)
>> Out[11]: 12
>> In [12]: Chem.MolToSmiles(ps[0][0])
>> Out[12]: 'Fc1ccccc1'
>>
>> Note that this still doesn't really do what you want, because it's
>> encoded to add an F to an aromatic carbon. Here's an example that shows
>> that:
>>
>> In [15]: m = Chem.MolFromSmiles('c1ccc(C)cc1')
>> In [16]: ps = rxn.RunReactants((m,))
>> In [17]: len(ps)
>> Out[17]: 12
>> In [18]: set([Chem.MolToSmiles(x[0],True) for x in ps])
>> Out[18]: {'Cc1(F)ccccc1', 'Cc1ccc(F)cc1', 'Cc1cccc(F)c1', 'Cc1ccccc1F'}
>>
>> Note the first product: the F was also added to the carbon with the
>> methyl group.
>>
>> We can fix that by specifying that the reacting carbon must have an H
>> attached:
>>
>> In [22]: rxn =
>> AllChem.ReactionFromSmarts('[c:1][cH:2][c:3]>>[c:1][c:2]([F])[c:3]')
>> In [23]: ps = rxn.RunReactants((m,))
>> In [24]: len(ps)
>> Out[24]: 10
>> In [25]: set([Chem.MolToSmiles(x[0],True) for x in ps])
>> Out[25]: {'Cc1ccc(F)cc1', 'Cc1cccc(F)c1', 'Cc1ccccc1F'}
>>
>> There's still the question of why so many products are being produced.
>> Look at Out[24], why do we get 10 different products?
>>
>> The answer is the symmetry in the query describing the reactant.
>> Everywhere this query can match, it matches twice - frontwards and
>> backwards. So instead of five products, three of which are unique, we get
>> ten.
>>
>> This can be handled by recognizing that [c:1] and [c:3] are not actually
>> involved in the reaction, they are just there to define the environment of
>> [c:2]. We can do the same thing with a recursive SMARTS:
>>
>> In [30]: rxn = AllChem.ReactionFromSmarts('[cH&$(c(c)c):2]>>[c:2][F]')
>> In [31]: ps = rxn.RunReactants((m,))
>> In [32]: len(ps)
>> Out[32]: 5
>> In [33]: set([Chem.MolToSmiles(x[0],True) for x in ps])
>> Out[33]: {'Cc1ccc(F)cc1', 'Cc1cccc(F)c1', 'Cc1ccccc1F'}
>>
>> Hope this helps,
>> -greg
>>
>>
>>
>>
>>> Then I got desperate:
>>>
>>> [#6:1][#6:2]([#1])[#6:3].[H][#9:4]>>[#6:1][#6:2]([#9:4])[#6:3]
>>>
>>> Any mention of an explicit H caused issues, so then I dropped it and
>>> re-ran things again.
>>>
>>> No luck.  I should mention that I am using the pre-built python RDKit
>>> wrappers for windows, and if I use the java wrappers on linux I get
>>> different errors but the same outcome.
>>>
>>> I should add, that the molecule that I read (and the molecule for HF)
>>> were both loaded without issue.
>>>
>>> Anyone else try to do something like this?
>>>
>>> Matthew
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Dive into the World of Parallel Programming. The Go Parallel Website,
>>> sponsored by Intel and developed in partnership with Slashdot Media, is
>>> your
>>> hub for all things parallel software development, from weekly thought
>>> leadership blogs to news, videos, case studies, tutorials and more. Take
>>> a
>>> look and join the conversation now. http://goparallel.sourceforge.net/
>>> _______________________________________________
>>> Rdkit-discuss mailing list
>>> Rdkit-discuss@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>
>>>
>>
>
>
> ------------------------------------------------------------------------------
> Dive into the World of Parallel Programming. The Go Parallel Website,
> sponsored by Intel and developed in partnership with Slashdot Media, is
> your
> hub for all things parallel software development, from weekly thought
> leadership blogs to news, videos, case studies, tutorials and more. Take a
> look and join the conversation now. http://goparallel.sourceforge.net/
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>
------------------------------------------------------------------------------
Dive into the World of Parallel Programming. The Go Parallel Website,
sponsored by Intel and developed in partnership with Slashdot Media, is your
hub for all things parallel software development, from weekly thought
leadership blogs to news, videos, case studies, tutorials and more. Take a
look and join the conversation now. http://goparallel.sourceforge.net/
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to