Re: [Rdkit-discuss] SMARTS pattern

2022-06-07 Thread Greg Landrum
Hi Eduardo,

If I'm understanding what you want to do correctly, then you could try
extending your SMARTS pattern to include a ring bond to a neighbor from
each atom in the ring:
*@*~1~*(@*)~*(@*)~*(@*)~*(@*)~*~1@*

If you only want the indices of the ring atoms, you can then just pick
those out of the match results you get back.

-greg
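
A minimal sketch of picking the ring atoms out of the matches. With the pattern above, the ring atoms sit at positions 1, 2, 4, 6, 8 and 10 of each match tuple (the other positions are the exocyclic ring neighbors). The helper name ring_atom_indices and the triphenylene test molecule are assumptions made for illustration, not part of the original message.

```python
from rdkit import Chem

# Pattern from the message above: a six-membered ring in which every ring
# atom also has a ring bond (@) to a neighbor outside the matched ring.
patt = Chem.MolFromSmarts("*@*~1~*(@*)~*(@*)~*(@*)~*(@*)~*~1@*")

# Positions of the six ring atoms within each match tuple; positions
# 0, 3, 5, 7, 9 and 11 are the exocyclic ring neighbors.
RING_POSITIONS = (1, 2, 4, 6, 8, 10)


def ring_atom_indices(mol):
    """Return the ring-atom indices of every match as a set of sorted tuples."""
    found = set()
    for match in mol.GetSubstructMatches(patt):
        found.add(tuple(sorted(match[i] for i in RING_POSITIONS)))
    return found


# Triphenylene is an assumed test molecule, not one from the thread.
triphenylene = Chem.MolFromSmiles("c1ccc2c(c1)c1ccccc1c1ccccc21")
central_rings = ring_atom_indices(triphenylene)
print(central_rings)
```

Collecting the results in a set means that several matches differing only in the exocyclic atoms collapse to one ring.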


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] SMARTS pattern

2022-06-07 Thread Wim Dehaen
The solution with !r4 suggested by Geoff doesn't work because, for SSSR
reasons, these atoms are considered to be in a 4-membered ring even when
the 4-membered ring is "exo" to the central 6-membered one. AFAIK there is
no good way to do a general ring-size filter in an atom definition using
SMARTS. Below is a quite ugly, but working, solution:

from rdkit import Chem

def GetSubstructMatches_filtered(mol, pattern):
    # keep only matches whose fragment SMILES has no second ring closure
    matches = mol.GetSubstructMatches(pattern)
    filtered_matches = []
    for match in matches:
        if Chem.MolFragmentToSmiles(mol, atomsToUse=match).count("2") == 0:
            filtered_matches.append(match)
    return tuple(filtered_matches)

m1 = Chem.MolFromSmiles("c1ccc2cc3c(ccc4c5c5c5cc6c7cc8c(cc7c6cc5c34)c3cccnc38)cc2c1")
m2 = Chem.MolFromSmiles("b12c1c1c(c3ccc4ccc4c3c3c4c5cc[nH]c5c4c13)c1ncc3c3c21")
m3 = Chem.MolFromSmiles("b1ccbc2c1c1ccoc1c1c2c2ccsc2c2[nH]c3ncc4c(c3c21)=c1n1=4")

p = Chem.MolFromSmarts("[*;R2]~1~[*;R2]~[*;R2]~[*;R2]~[*;R2]~[*;R2]~1")
for m, expected_value in zip([m1, m2, m3], [1, 2, 2]):
    print(len(GetSubstructMatches_filtered(m, p)) == expected_value)



How does it work? The function GetSubstructMatches_filtered checks whether
there is more than one ring in the matched substructure (by converting the
substructure to SMILES using the atom indices from the GetSubstructMatches
result and searching for "2" in the string) and rejects the match if so.
wim
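
A variation on the same idea that avoids going through a SMILES string: count the bonds between the matched atoms and keep a match only if there are exactly as many bonds as atoms (a single ring and nothing cutting across it). This is only a sketch; the function name matches_without_extra_bonds and the two toy molecules are assumptions for illustration and do not come from the thread.

```python
from rdkit import Chem


def matches_without_extra_bonds(mol, pattern):
    """Keep matches whose atoms are joined by exactly as many bonds as atoms.

    For a matched ring of n atoms this means the n ring bonds and nothing
    else, i.e. no bond cutting across the matched ring.
    """
    kept = []
    for match in mol.GetSubstructMatches(pattern):
        atoms = set(match)
        n_bonds = sum(
            1
            for bond in mol.GetBonds()
            if bond.GetBeginAtomIdx() in atoms and bond.GetEndAtomIdx() in atoms
        )
        if n_bonds == len(atoms):
            kept.append(match)
    return tuple(kept)


# Assumed toy inputs (not the molecules from the thread).
six_ring = Chem.MolFromSmarts("*~1~*~*~*~*~*~1")
cyclohexane = Chem.MolFromSmiles("C1CCCCC1")
bicyclohexane = Chem.MolFromSmiles("C1CC2CCC12")  # two fused 4-membered rings

n_plain = len(matches_without_extra_bonds(cyclohexane, six_ring))
n_fused = len(matches_without_extra_bonds(bicyclohexane, six_ring))
print(n_plain, n_fused)
```

In the fused example the six atoms do form a 6-cycle, so the plain SMARTS matches, but the shared bond between the two 4-membered rings gives seven bonds for six atoms and the match is dropped.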






Re: [Rdkit-discuss] SMARTS pattern

2022-06-07 Thread Geoffrey Hutchison
> Nevermind, x3 won't exclude the fused 4-atom rings from your first example. 
> I'll let you know if I think of some other way. :-)

I think you'd want something like this, perhaps - to exclude atoms in ring size 
4?

[*;R2!r4]~1~[*;R2!r4]~[*;R2!r4]~[*;R2!r4]~[*;R2!r4]~[*;R2!r4]~1

I also don't know if you're trying to ensure that each of the atoms is
aromatic, in which case you'd want something like:

[a;R2!r4]~1~[a;R2!r4]~[a;R2!r4]~[a;R2!r4]~[a;R2!r4]~[a;R2!r4]~1

Hope that helps,
-Geoff


Re: [Rdkit-discuss] SMARTS pattern

2022-06-07 Thread Ivan Tubert-Brohman
On Tue, Jun 7, 2022 at 1:39 PM Ivan Tubert-Brohman <
ivan.tubert-broh...@schrodinger.com> wrote:

> Perhaps using x3 instead (means "number of ring bonds") would work for
> your purposes?
>

Nevermind, x3 won't exclude the fused 4-atom rings from your first example.
I'll let you know if I think of some other way. :-)


Re: [Rdkit-discuss] SMARTS pattern

2022-06-07 Thread Ivan Tubert-Brohman
Hi Eduardo,

I believe the problem is that r6 means "in *smallest* SSSR ring of size
6", where "smallest" in this context means that, for example, for an atom
at the ring fusion between a 5-membered ring and a 6-membered ring, r5 would
match that atom but r6 wouldn't.

Perhaps using x3 instead (means "number of ring bonds") would work for your
purposes?

Hope this helps,
Ivan
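
A small check of the r5/r6 behavior described above. Indane is an assumed example (it is not one of the molecules from the thread): its two ring-fusion atoms belong to both a 5-membered and a 6-membered ring.

```python
from rdkit import Chem

# Indane: a 5-membered ring fused to a benzene ring (assumed example).
indane = Chem.MolFromSmiles("C1Cc2ccccc2C1")

# Atoms that are in two SSSR rings and whose smallest ring has size 5 or 6.
fusion_r5 = indane.GetSubstructMatches(Chem.MolFromSmarts("[R2;r5]"))
fusion_r6 = indane.GetSubstructMatches(Chem.MolFromSmarts("[R2;r6]"))
print(fusion_r5)  # the two ring-fusion atoms
print(fusion_r6)  # empty: their smallest ring has size 5, not 6

# The ring info still reports that the same atoms are in a 6-membered ring.
ring_info = indane.GetRingInfo()
in_six = [ring_info.IsAtomInRingOfSize(idx, 6) for (idx,) in fusion_r5]
print(in_six)
```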




[Rdkit-discuss] SMARTS pattern

2022-06-07 Thread Eduardo Mayo
Greetings!!

I hope this email finds you well.

I need a SMARTS pattern that matches this molecule fragment
[image: image.png]
The first pattern I used was:
[*;R2]~1~[*;R2]~[*;R2]~[*;R2]~[*;R2]~[*;R2]~1

However, it also matches this fragment. This is not the expected behavior,
but it agrees with the pattern, so I tried adding a ring-size constraint.
[image: image.png]
Now the pattern I am using is this:
[*;R2r6]~1~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~1

It worked quite well, but now it fails to find matches in this molecule
[image: image.png]

Does anyone know what I am doing wrong??

Code:
---

from rdkit import Chem

m1 = Chem.MolFromSmiles(
    "c1ccc2cc3c(ccc4c5c5c5cc6c7cc8c(cc7c6cc5c34)c3cccnc38)cc2c1")
m2 = Chem.MolFromSmiles(
    "b12c1c1c(c3ccc4ccc4c3c3c4c5cc[nH]c5c4c13)c1ncc3c3c21")
m3 = Chem.MolFromSmiles(
    "b1ccbc2c1c1ccoc1c1c2c2ccsc2c2[nH]c3ncc4c(c3c21)=c1n1=4")

# first attempt: every ring atom is in two rings
p = Chem.MolFromSmarts("[*;R2]~1~[*;R2]~[*;R2]~[*;R2]~[*;R2]~[*;R2]~1")
for m, expected_value in zip([m1, m2, m3], [1, 2, 2]):
    print(len(m.GetSubstructMatches(p)) == expected_value)

# second attempt: additionally require ring size 6
p = Chem.MolFromSmarts(
    "[*;R2r6]~1~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~[*;R2r6]~1")
for m, expected_value in zip([m1, m2, m3], [1, 2, 2]):
    print(len(m.GetSubstructMatches(p)) == expected_value)

All the best,
Eduardo


Re: [Rdkit-discuss] SMARTS pattern replacement inside a ring; without breaking the ring open...

2021-01-12 Thread Francois Berenger

On 12/01/2021 15:10, Fiorella Ruggiu wrote:

> Hi Francois,
>
> not sure if you have solved this yet. I believe it won't be possible
> to use AllChem.ReplaceSubstructs without breaking the rings or
> enumerating them. You can however use reactions for this problem.
> Here's an example based on yours:
>
> mol = Chem.MolFromSmiles('O=c1[nH]1')
> rxn = AllChem.ReactionFromSmarts('[c:1](=[O:2])[nH:3]>>[c:1]([O:2])[nH0:3]')
> ps = rxn.RunReactants([Chem.MolFromSmiles('O=c1[nH]1')])
> Chem.MolToSmiles(ps[0][0])
>
> 'Oc1n1'

I was fighting other fires. Thanks a lot for this example!

> Hope this helps!
>
> Best,
>
> Fio



Re: [Rdkit-discuss] SMARTS pattern replacement inside a ring; without breaking the ring open...

2021-01-11 Thread Fiorella Ruggiu
Hi Francois,

not sure if you have solved this yet. I believe it won't be possible to use
AllChem.ReplaceSubstructs without breaking the rings or enumerating them.
You can however use reactions for this problem. Here's an example based on
yours:

from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles('O=c1[nH]1')
rxn = AllChem.ReactionFromSmarts('[c:1](=[O:2])[nH:3]>>[c:1]([O:2])[nH0:3]')
ps = rxn.RunReactants([mol])
Chem.MolToSmiles(ps[0][0])

'Oc1n1'


Hope this helps!

Best,

Fio
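
A self-contained sketch of the reaction approach described above. The SMILES shown in the archived message looks truncated, so this sketch assumes 2-pyridone as the input; that choice, and the extra sanitization step, are assumptions for illustration rather than part of the original message.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Assumed input: 2-pyridone, written out in full for this sketch.
mol = Chem.MolFromSmiles("O=c1cccc[nH]1")

# Only the three mapped atoms are edited; their other ring bonds are kept,
# so the ring is not opened.
rxn = AllChem.ReactionFromSmarts("[c:1](=[O:2])[nH:3]>>[c:1]([O:2])[nH0:3]")

products = set()
for product_set in rxn.RunReactants((mol,)):
    for product in product_set:
        Chem.SanitizeMol(product)  # reaction products come back unsanitized
        products.add(Chem.MolToSmiles(product))
print(products)
```

Collecting canonical SMILES in a set also removes duplicate products when the reactant pattern matches the same site more than once.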





[Rdkit-discuss] SMARTS pattern replacement inside a ring; without breaking the ring open...

2021-01-07 Thread Francois Berenger

Dear list,

I have been trying to replace this SMARTS pattern in a ring:

'c(=O)[nH]'

By this SMILES fragment:

'c(O)n'

My attempts using a single SMARTS search-and-replace break the ring
open, which is not what I want.

My non-working trial code:
---
from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles('O=c1[nH]1')
pat = Chem.MolFromSmarts('c(=O)[nH]')
rep = Chem.MolFromSmarts('c(O)n')
res = AllChem.ReplaceSubstructs(mol,pat,rep)
Chem.MolToSmiles(res[0])
'c(n)O'
---

The example molecule is just an example; the ring might be smaller
and/or have more heteroatoms.

Should I use a chemical reaction for this?

Am I forced to describe full rings in both SMARTS patterns?!
I don't want to have to enumerate all the possibilities...

I can make it ~work~ using two replacements:
first 'c(=O)' to 'c(O)'
then
'[nH]' to 'n'
But this is less precise than what I really want
and I believe it will change molecules or places I don't want to change.

Thanks a lot and happy new year!
F.




Re: [Rdkit-discuss] SMARTS Pattern and scaffold

2018-02-05 Thread Paolo Tosco

Dear Colin,

you might specify the number of implicit Hs that you want on the carbons 
of the indazole nucleus, e.g.:


'[#7]1:[#6&h1]:[#6]2:[#6&h1]:[#6&h1]:[#6&h1]:[#6&h1]:[#6]:2:[#7]:1-[#6]-[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1'

This would rule out substituted indazoles.

HTH, cheers
p.
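
A short sketch of the effect of the h1 constraint. The two test molecules are assumptions written out for this example (1-benzylindazoles, one substituted on the phenyl and one on the indazole), since the SMILES in the archived question appear truncated.

```python
from rdkit import Chem

# Scaffold query from the thread, without and with the implicit-H
# constraint (h1) on the indazole CH carbons.
open_core = Chem.MolFromSmarts(
    "[#7]1:[#6]:[#6]2:[#6]:[#6]:[#6]:[#6]:[#6]:2:[#7]:1"
    "-[#6]-[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1"
)
blocked_core = Chem.MolFromSmarts(
    "[#7]1:[#6&h1]:[#6]2:[#6&h1]:[#6&h1]:[#6&h1]:[#6&h1]:[#6]:2:[#7]:1"
    "-[#6]-[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1"
)

# Assumed test molecules, not copied from the thread.
phenyl_substituted = Chem.MolFromSmiles("n1cc2ccccc2n1Cc1ccccc1CC")
indazole_substituted = Chem.MolFromSmiles("n1cc2c(CC)cccc2n1Cc1ccccc1")

results = {}
for name, mol in [("phenyl", phenyl_substituted),
                  ("indazole", indazole_substituted)]:
    results[name] = (mol.HasSubstructMatch(open_core),
                     mol.HasSubstructMatch(blocked_core))
    print(name, results[name])
```

The phenyl carbons carry no hydrogen constraint, so substituents there are still allowed.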




[Rdkit-discuss] SMARTS Pattern and scaffold

2018-02-05 Thread Colin Bournez

Hello everyone,

I am having trouble finding what I want using a SMARTS pattern.
Let's say I have, for example, these molecules:

from rdkit import Chem
from rdkit.Chem import Draw

smis = ('n2cc1c1n2Cc1c1CC', 'n2cc1c1n2Cc1c(CC)cc(CCl)cc1',
        'n2cc1c(CC)1n2Cc1c1CC', 'n2cc1cc(CF)ccc1n2Cc1cc(CC)ccc1CC')

ms = [Chem.MolFromSmiles(x) for x in smis]
Draw.MolsToGridImage(ms)

So, I have this SMARTS pattern:

patt = Chem.MolFromSmarts('[#7]1:[#6]:[#6]2:[#6]:[#6]:[#6]:[#6]:[#6]:2:[#7]:1-[#6]-[#6]1:[#6]:[#6]:[#6]:[#6]:[#6]:1')

When I run:

for smi, m in zip(smis, ms):
    print(smi, m.HasSubstructMatch(patt))

I logically get:

n2cc1c1n2Cc1c1CC True
n2cc1c1n2Cc1c(CC)cc(CCl)cc1 True
n2cc1c(CC)1n2Cc1c1CC True
n2cc1cc(CF)ccc1n2Cc1cc(CC)ccc1CC True

My goal is to have:

n2cc1c1n2Cc1c1CC True
n2cc1c1n2Cc1c(CC)cc(CCl)cc1 True
n2cc1c(CC)1n2Cc1c1CC False
n2cc1cc(CF)ccc1n2Cc1cc(CC)ccc1CC False

More precisely, I want to "block" the indazole from any substitution and
retrieve only molecules with changes on the phenyl.

Thanks in advance.

Colin Bournez

-- *Colin Bournez* PhD Student, Structural Bioinformatics & 
Chemoinformatics Institut de Chimie Organique et Analytique (ICOA), UMR 
CNRS-Université d'Orléans 7311 Rue de Chartres, 45067 Orléans, France T. 
+33 238 494 577


Re: [Rdkit-discuss] SMARTS pattern matching of canonical forms of aromatic molecules

2017-09-08 Thread Jason Biggs
Start with your benzene molecule

m = Chem.MolFromSmiles('c1ccccc1')


make a pattern using Peter's example, with three aromatic atoms connected
by three aromatic bonds

patt = Chem.MolFromSmarts('a:a:a')


and it's a match:

m.HasSubstructMatch(patt)

>True


Kekulize your mol, and the pattern doesn't match

Chem.rdmolops.Kekulize(m)
m.HasSubstructMatch(patt)
>False


but if you change the smarts pattern to match aromatic atoms connected by
kekulized bonds, it matches

patt2 = Chem.MolFromSmarts('[a]=[a]-[a]')
m.HasSubstructMatch(patt2)
>True

Your original SMARTS query doesn't match because C in a SMARTS string is
specifically an aliphatic carbon. Change it to c and it will match. The
original query would also work if you had cleared the aromatic flags when
kekulizing:


m = Chem.MolFromSmiles('c1ccccc1')
Chem.rdmolops.Kekulize(m, clearAromaticFlags = True)
patt = Chem.MolFromSmarts('[C]=[C]-[C]')
m.HasSubstructMatch(patt)
>True



So when you kekulize without the clearAromaticFlags option, aromatic atoms
will still only match 'a', not 'A', while the bonds will match '=' or '-'
but not ':' (they will also match '@' or '~', but that's beside the point
here).
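
The summary above can be checked in one go with a sketch like the following. The helper name hits and the particular list of test patterns are assumptions chosen for illustration.

```python
from rdkit import Chem

benzene = Chem.MolFromSmiles("c1ccccc1")
Chem.Kekulize(benzene)  # clearAromaticFlags defaults to False


def hits(smarts):
    """True if the kekulized benzene matches the given SMARTS."""
    return benzene.HasSubstructMatch(Chem.MolFromSmarts(smarts))


# Atom primitives first, then two-atom patterns that probe the bond types.
results = {s: hits(s) for s in ("a", "A", "a:a", "a=a", "a-a", "a@a", "a~a")}
for smarts, matched in results.items():
    print(f"{smarts:5s} {matched}")
```

Only 'A' and 'a:a' should come back False here: the atoms keep their aromatic flags, while the bonds are stored as alternating single and double bonds.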

As Peter mentions, by default if you read in a kekulized SMILES string, the
mol you create will not be kekulized, but it sounds like you are
intentionally kekulizing before doing substructure matching.



Jason Biggs




Re: [Rdkit-discuss] SMARTS pattern matching of canonical forms of aromatic molecules

2017-09-08 Thread Peter S. Shenkin
Hi,

In SMARTS, 'a' matches an aromatic atom. So you would match your molecule
with the pattern 'aaa', or if you wanted to restrict yourself to carbons,
'ccc'.

This would match whether you created the molecule from a Kekulized or an
aromatic SMILES. Remember that it's the molecular recognition code, not the
form of the input SMILES, that determines whether a molecule is aromatic.

-P.
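
A quick sketch of that point, assuming RDKit's default SMILES parsing (which perceives aromaticity on input): benzene written in Kekule form and in aromatic form ends up as the same molecule, and both match 'ccc'. The variable names are only for illustration.

```python
from rdkit import Chem

kekule_input = Chem.MolFromSmiles("C1=CC=CC=C1")
aromatic_input = Chem.MolFromSmiles("c1ccccc1")
patt = Chem.MolFromSmarts("ccc")

# Aromaticity is perceived on parsing, so both inputs give the same molecule.
same_smiles = Chem.MolToSmiles(kekule_input) == Chem.MolToSmiles(aromatic_input)
both_match = (kekule_input.HasSubstructMatch(patt)
              and aromatic_input.HasSubstructMatch(patt))
print(same_smiles, both_match)
```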



[Rdkit-discuss] SMARTS pattern matching of canonical forms of aromatic molecules

2017-09-08 Thread James T. Metz via Rdkit-discuss
Hello,

Suppose I read in the SMILES of an aromatic molecule, e.g., for
benzene

c1ccccc1

I then want to convert the molecule to a Kekule representation and
then perform various SMARTS pattern recognition, e.g.

[C]=[C]-[C]

I have tried various Kekule commands in RDKit, but I cannot figure
out how to (or if it is possible to) recognize a SMARTS pattern for
a portion of a molecule which is aromatic, but is currently being
stored as a Kekule structure.

Also, is it possible to generate and store more than one Kekule
form in RDKit?

Thank you.

Regards,
Jim Metz




