Hi Andrew,

If I traced this correctly through the CDK Depict and CDK code, the web
application always applies the CDK Daylight aromaticity model to the input
structure to prepare it for SMARTS matching, and in doing so overrides all
existing aromaticity flags. See here:
https://github.com/cdk/depict/blob/21169bbe14668a947331164d0cd17a72f46c620e/cdkdepict-lib/src/main/java/org/openscience/cdk/app/DepictController.java#L1212
and here:
https://github.com/cdk/cdk/blob/ffa903da9e44ea03e4c29fe1831eaeda3be8e9ac/tool/smarts/src/main/java/org/openscience/cdk/smarts/SmartsPattern.java#L112

The web app calls SmartsPattern.matchAll(), which internally calls
SmartsPattern.prepare(), which applies the aromaticity perception. This can
be turned off, but not in the web app, as far as I can see.

If I do this explicitly in CDK code, I also get the four aromatic n:

SmilesParser smiPar = new SmilesParser(SilentChemObjectBuilder.getInstance());
IAtomContainer mol = smiPar.parseSmiles("OCCO[P+]1(OCCO)n2c3ccc2/C(c2ccccc2)=C2/C=CC(=N2)/C(c2ccccc2)=c2/cc/c(n21)=C(\\c1ccccc1)C1=NC(=C3c2ccccc2)C=C1");
Cycles.markRingAtomsAndBonds(mol);
Aromaticity.apply(Aromaticity.Model.Daylight, mol);
SmilesGenerator smiGen = new SmilesGenerator(SmiFlavor.Canonical | SmiFlavor.UseAromaticSymbols);
System.out.println(smiGen.create(mol));

Output:
OCCO[P+]1(OCCO)n2c3ccc2c(c4nc(C=C4)c(-c5ccccc5)c6ccc(c(c7nc(C=C7)c3-c8ccccc8)-c9ccccc9)n61)-c%10ccccc%10
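
A quick way to double-check the number of aromatic nitrogens without reading
the string by eye is a small string-based helper. This is only a sketch in
plain Java: NitrogenCounter and count() are hypothetical names, not part of
CDK, and it only looks at atom symbols (lowercase n versus uppercase N),
assuming the input is a valid SMILES string.

```java
// Hypothetical helper, not part of CDK: counts nitrogen atom symbols in a SMILES string.
final class NitrogenCounter {

    /** Returns {aromatic, aliphatic}: the number of 'n' and 'N' atom symbols. */
    static int[] count(String line) {
        // Assumption: anything after the first whitespace is a molecule name, not SMILES.
        int end = 0;
        while (end < line.length() && !Character.isWhitespace(line.charAt(end))) {
            end++;
        }
        String smiles = line.substring(0, end);

        int aromatic = 0;
        int aliphatic = 0;
        int i = 0;
        while (i < smiles.length()) {
            char ch = smiles.charAt(i);
            if (ch == '[') {
                int close = smiles.indexOf(']', i);
                if (close < 0) {
                    throw new IllegalArgumentException("Unclosed bracket atom in: " + smiles);
                }
                // Skip an optional isotope number to reach the element symbol.
                int j = i + 1;
                while (j < close && Character.isDigit(smiles.charAt(j))) {
                    j++;
                }
                if (j < close) {
                    char symbol = smiles.charAt(j);
                    // An uppercase letter followed by a lowercase one is a
                    // two-letter element such as Na or Ni, not nitrogen.
                    boolean twoLetter = j + 1 < close
                            && Character.isLowerCase(smiles.charAt(j + 1));
                    if (symbol == 'n') {
                        aromatic++;
                    } else if (symbol == 'N' && !twoLetter) {
                        aliphatic++;
                    }
                }
                i = close + 1;
            } else {
                // Outside brackets only the organic subset occurs, so 'n' and 'N'
                // can only be nitrogen.
                if (ch == 'n') {
                    aromatic++;
                } else if (ch == 'N') {
                    aliphatic++;
                }
                i++;
            }
        }
        return new int[] {aromatic, aliphatic};
    }

    public static void main(String[] args) {
        String perceived = "OCCO[P+]1(OCCO)n2c3ccc2c(c4nc(C=C4)c(-c5ccccc5)c6ccc(c(c7nc(C=C7)c3-c8ccccc8)-c9ccccc9)n61)-c%10ccccc%10";
        String asWritten = "OCCO[P+]1(OCCO)n2c3ccc2/C(c2ccccc2)=C2/C=CC(=N2)/C(c2ccccc2)=c2/cc/c(n21)=C(\\c1ccccc1)C1=NC(=C3c2ccccc2)C=C1";
        int[] a = count(perceived);
        int[] b = count(asWritten);
        System.out.println("after Daylight perception: n = " + a[0] + ", N = " + a[1]);
        System.out.println("input as written:          n = " + b[0] + ", N = " + b[1]);
    }
}
```

For the output above this reports four n and no N; for the input SMILES as
written it reports two n and two N. It does not perceive anything itself, it
only counts what is already in the string.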

> Given a molecule, how do I generate a SMILES which reflects the internal
> aromaticity used?

I am not sure I understand this correctly, but what you do towards the end,
parsing and re-generating the SMILES in CDK code with
SmiFlavor.UseAromaticSymbols (without aromaticity perception, I assume, so
basically without line 4 of my example, the Aromaticity.apply() call),
reproduces exactly the aromaticity information given in the input SMILES.
I depicted your input and output with aromaticity display turned on and
compared them; they appear to be the same.

When it comes to SMARTS matching, you can turn the aromaticity perception
off in CDK code, e.g.:

SmilesParser smiPar = new SmilesParser(SilentChemObjectBuilder.getInstance());
IAtomContainer mol = smiPar.parseSmiles("OCCO[P+]1(OCCO)n2c3ccc2/C(c2ccccc2)=C2/C=CC(=N2)/C(c2ccccc2)=c2/cc/c(n21)=C(\\c1ccccc1)C1=NC(=C3c2ccccc2)C=C1");
// no cycle and aromaticity perception
SmilesGenerator smiGen = new SmilesGenerator(SmiFlavor.Canonical | SmiFlavor.UseAromaticSymbols);
SmartsPattern pattern = SmartsPattern.create("C=CC=N");
// prevent the SMARTS pattern from perceiving aromaticity
pattern.setPrepare(false);
System.out.println(pattern.matchAll(mol).count());

Output: 2

Does this help you? I guess the bad news is that you cannot do this in the
CDK Depict web app, but the good news is that it is possible via code.

Kind regards,
Jonas
________________
Dr Jonas Schaub
jonas.sch...@uni-jena.de
http://orcid.org/0000-0003-1554-6666
https://github.com/JonasSchaub
https://www.researchgate.net/profile/Jonas-Schaub

Postdoctoral Researcher 
Steinbeck Research Group 
Friedrich Schiller University Jena, Germany
http://cheminf.uni-jena.de

Institute for Inorganic and Analytical Chemistry
Lessingstr. 8
07743 Jena

-----Original Message-----
From: Andrew Dalke <da...@dalkescientific.com>
Sent: Tuesday, 24 June 2025 13:51
To: CDK users list <cdk-user@lists.sourceforge.net>
Subject: [Cdk-user] preserve aromaticity on SMILES output

Hi all,

  Given a molecule, how do I generate a SMILES which reflects the internal
aromaticity used?

I'm cross-comparing some work using RDKit with CDK. The differences appear
to be due to differences in aromaticity perception, as expected.

I'm trying to figure out how to verify these differences. Consider the
following input SMILES:

OCCO[P+]1(OCCO)n2c3ccc2/C(c2ccccc2)=C2/C=CC(=N2)/C(c2ccccc2)=c2/cc/c(n21)=C(\c1ccccc1)C1=NC(=C3c2ccccc2)C=C1 CHEMBL2369103

and SMARTS:

C=CC=N

While the SMARTS seems like it would match the "C=CC(=N2)" in the SMILES,
toolkits of course can perceive their own aromaticity. 
Testing with CDK Depict shows CDK perceives all four nitrogens as aromatic.

A SMARTS which does match is C=C-c:n and using "a" for the SMARTS verifies
that all nitrogens are aromatic.

I wanted to verify this by visual inspection of the SMILES. When I generate
the SMILES with the default flavor I get, as I should have expected, a
Kekule form:

C1=CC=C(C=C1)/C/2=C/3\\C=CC(=N3)C(=C4C=CC5=C(C6=CC=CC=C6)C7=NC(=C(C8=CC=CC=C8)C9=CC=C2N9[P+](N45)(OCCO)OCCO)C=C7)C%10=CC=CC=C%10

When I remembered to add UseAromaticSymbols to the flavor I get:

c1ccc(cc1)/C/2=C/3\C=CC(=N3)C(=c4ccc5=C(c6ccccc6)C7=NC(=C(c8ccccc8)c9ccc2n9[P+](n45)(OCCO)OCCO)C=C7)c%10ccccc%10

This shows two aromatic nitrogens and two aliphatic nitrogens, whereas I
expected four "n" terms.

This SMILES contains "C=CC(=N3)" which I would expect to match the SMARTS
"C=CC=N", so I can't use this approach for manual verification.

I didn't see any other relevant flavors to add. Is there something else I
should do?

Cheers,

                                Andrew
                                da...@dalkescientific.com





_______________________________________________
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user


