Hi Andrew,

First off, for the SMARTS matcher you can turn off the "prepare" step, or use
the lower-level APIs, so that matching works on the input aromaticity as given.

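import org.openscience.cdk.graph.Cycles;
import org.openscience.cdk.interfaces.IAtomContainer;
import org.openscience.cdk.interfaces.IChemObjectBuilder;
import org.openscience.cdk.silent.SilentChemObjectBuilder;
import org.openscience.cdk.smarts.SmartsPattern;
import org.openscience.cdk.smiles.SmilesParser;

IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();

SmartsPattern pat = SmartsPattern.create("C=CC=N");
pat.setPrepare(false); // turn off auto ring+arom perception

IAtomContainer mol = new SmilesParser(bldr).parseSmiles(
    "OCCO[P+]1(OCCO)n2c3ccc2/C(c2ccccc2)=C2/C=CC(=N2)/C(c2ccccc2)=c2/cc/c(n21)=C(\\c1ccccc1)C1=NC(=C3c2ccccc2)C=C1 CHEMBL2369103");
// we need to do this manually because prepare is turned off
Cycles.markRingAtomsAndBonds(mol);
System.err.println(pat.matchAll(mol).count());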

I'm not sure how you got that output:

Aromaticity.apply(Aromaticity.Model.Daylight, mol);
System.err.println(
    new SmilesGenerator(SmiFlavor.Default + SmiFlavor.UseAromaticSymbols).create(mol));

Gives me:

OCCO[P+]1(OCCO)n2c3ccc2c(-c4ccccc4)c5C=Cc(n5)c(-c6ccccc6)c7ccc(n71)c(-c8ccccc8)c9nc(c3-c%10ccccc%10)C=C9
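
To check a result like this without eyeballing the string, here is a minimal
sketch that counts nitrogen symbols in a SMILES. It is plain Java and not part
of CDK: the class name NitrogenSymbols and its count method are made up for
illustration, and it assumes a well-formed SMILES in which a bracket atom's
symbol directly follows any isotope digits.

```java
// Hypothetical helper (not CDK API): counts nitrogen atom symbols in a SMILES.
final class NitrogenSymbols {

    /** Returns {aromatic, aliphatic} nitrogen counts, i.e. "n" vs "N" atoms. */
    static int[] count(String smiles) {
        int aromatic = 0;
        int aliphatic = 0;
        for (int i = 0; i < smiles.length(); i++) {
            char c = smiles.charAt(i);
            if (Character.isWhitespace(c)) {
                break; // anything after the first whitespace is the title
            }
            if (c == '[') {
                // Bracket atom: skip the optional isotope, then read the symbol.
                int j = i + 1;
                while (j < smiles.length() && Character.isDigit(smiles.charAt(j))) {
                    j++;
                }
                if (j < smiles.length()) {
                    char sym = smiles.charAt(j);
                    char next = j + 1 < smiles.length() ? smiles.charAt(j + 1) : ']';
                    if (sym == 'n') {
                        aromatic++;
                    } else if (sym == 'N' && !Character.isLowerCase(next)) {
                        aliphatic++; // excludes two-letter symbols such as Na, Ni
                    }
                }
                int close = smiles.indexOf(']', j);
                if (close < 0) {
                    break; // unterminated bracket, stop scanning
                }
                i = close; // resume after the closing bracket
            } else if (c == 'n') {
                aromatic++; // outside brackets a lowercase n is always aromatic N
            } else if (c == 'N') {
                aliphatic++; // outside brackets an uppercase N is always nitrogen
            }
        }
        return new int[] {aromatic, aliphatic};
    }
}
```

For the SMILES above this gives 4 aromatic and 0 aliphatic nitrogens; for the
UseAromaticSymbols SMILES in the quoted message it gives 2 and 2.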

On Tue, 24 Jun 2025 at 13:09, Andrew Dalke <da...@dalkescientific.com>
wrote:

> Hi all,
>
>   Given a molecule, how do I generate a SMILES which reflects the internal
> aromaticity used?
>
> I'm cross-comparing some work using RDKit with CDK. The differences appear
> to be due to differences in aromaticity perception, as expected.
>
> I'm trying to figure out how to verify these differences. Consider the
> following input SMILES:
>
> OCCO[P+]1(OCCO)n2c3ccc2/C(c2ccccc2)=C2/C=CC(=N2)/C(c2ccccc2)=c2/cc/c(n21)=C(\c1ccccc1)C1=NC(=C3c2ccccc2)C=C1
> CHEMBL2369103
>
> and SMARTS:
>
> C=CC=N
>
> While the SMARTS seems like it would match the "C=CC(=N2)" in the SMILES,
> toolkits of course can perceive their own aromaticity.
> Testing with CDK Depict shows CDK perceives all four nitrogens as aromatic.
>
> A SMARTS which does match is C=C-c:n and using "a" for the SMARTS verifies
> that all nitrogens are aromatic.
>
> I wanted to verify this by visual inspection of the SMILES. When I
> generate the SMILES with the default flavor I get, as I should have
> expected, a Kekule form:
>
>
> C1=CC=C(C=C1)/C/2=C/3\\C=CC(=N3)C(=C4C=CC5=C(C6=CC=CC=C6)C7=NC(=C(C8=CC=CC=C8)C9=CC=C2N9[P+](N45)(OCCO)OCCO)C=C7)C%10=CC=CC=C%10
>
> When I remembered to add UseAromaticSymbols to the flavor I get:
>
>
> c1ccc(cc1)/C/2=C/3\C=CC(=N3)C(=c4ccc5=C(c6ccccc6)C7=NC(=C(c8ccccc8)c9ccc2n9[P+](n45)(OCCO)OCCO)C=C7)c%10ccccc%10
>
> This shows two aromatic nitrogens and two aliphatic nitrogens, whereas I
> expected four "n" terms.
>
> This SMILES contains "C=CC(=N3)" which I would expect to match the SMARTS
> "C=CC=N", so I can't use this approach for manual verification.
>
> I didn't see any other relevant flavors to add. Is there something else I
> should do?
>
> Cheers,
>
>                                 Andrew
>                                 da...@dalkescientific.com
>
>
>
>
>
> _______________________________________________
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>