FYI
Some time ago I repeated this with the PubChemFingerprinter and the issue
was the same. Today, however, I had some time to investigate and found this:
http://sourceforge.net/tracker/index.php?func=detail&aid=3305550&group_id=20024&atid=120024
I used CDK 1.4.5, and the issue is still present in that version. I
have not tried it yet, but I am fairly sure this is the cause of the odd
behavior, since in my case there is more than one PubChemFingerprinter instance.
Once I have time I will redo the test.
Best Regards,
Joos
On 20.12.2011 14:09, Joos Kiener wrote:
I have now adjusted the test and I get the same number of hits when using
ExtendedFingerprinter and UIT, or UIT only.
Query SMILES              CDK      ChemFinder
CCC(C)C(C)C(C)C           427      427
O(C)C(C)C(C)C(C)C         77       77
CCCCCC(C)CC               1521     1825
N(C)(C)CC(C)C             10487    11412
O(CC)C(N(C)C)C            64       109
CC(C)C(C)C(C(C)C)C(C)C    0        0
While the differences between CDK and ChemFinder are not nice, I do see
that they can be due to settings (ChemFinder was set to ignore
stereochemistry in this case).
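The reason the hit counts should be identical with and without a fingerprint prescreen can be sketched in plain Java. This is only a minimal illustration using java.util.BitSet, not CDK code; the class name FingerprintScreen, the method mayContain, and the bit positions are made up for the example. A prescreen may only discard targets whose fingerprint lacks a bit that the query sets, so it should never remove a true substructure hit, provided the fingerprints themselves are generated correctly.

```java
import java.util.BitSet;

public class FingerprintScreen {

    // Hypothetical helper: true if every bit set in the query fingerprint
    // is also set in the target fingerprint (query bits are a subset).
    static boolean mayContain(BitSet target, BitSet query) {
        BitSet missing = (BitSet) query.clone();
        missing.andNot(target); // bits the query has but the target lacks
        return missing.isEmpty();
    }

    public static void main(String[] args) {
        // Made-up 8-bit fingerprints, for illustration only.
        BitSet query = new BitSet(8);
        query.set(1);
        query.set(4);

        BitSet candidate = new BitSet(8);
        candidate.set(1);
        candidate.set(4);
        candidate.set(6);

        BitSet rejected = new BitSet(8);
        rejected.set(1);
        rejected.set(6);

        // Only candidates that pass the screen need the expensive subgraph test.
        System.out.println(mayContain(candidate, query)); // true
        System.out.println(mayContain(rejected, query));  // false
    }
}
```

If a fingerprinter produced wrong bits for some molecules, this screen would drop real hits, which would show up as fewer hits with the prescreen than with UIT alone.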
For the PubchemFingerprinter I first want to regenerate the
fingerprints. I will get back to you once I have the results.
Regards,
Joos
2011/12/19 Joos Kiener <[email protected]>
Hi All,
OK, I found the issue. In the code I used, I parse the SMILES into an
IAtomContainer and then create an IQueryAtomContainer from it
(which my search method accepts as a parameter).
If I pass the IAtomContainer parsed from the SMILES directly to UIT,
the results are as expected.
Hence I assume the method
QueryAtomContainerCreator.createBasicQueryContainer(atomContainer)
has a bug, or I misunderstand the documentation:
"Creates a QueryAtomContainer with SymbolQueryAtom's,
AromaticQueryBond's and OrderQueryBond's. If a IBond of the input
container is flagged aromatic, then it disregards bond order
information and only match against an aromatic target atom instead."
As far as I understand this, the method takes aromaticity into
account by requiring both atoms of the bond to be aromatic for a
match. But obviously none of the atoms in my query are aromatic.
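One way such a query could still match an aromatic ring can be illustrated with a toy model. The sketch below is not taken from the CDK source; the class BondMatchSketch, its Bond type, and both matching rules are hypothetical, and whether CDK's query bonds behave like the first rule is an assumption. It only shows that a rule comparing bond orders alone lets an aliphatic single bond match a single bond inside a Kekulé-form aromatic ring, while a rule that also compares the aromaticity flags does not.

```java
public class BondMatchSketch {

    // Hypothetical, simplified bond: Kekule bond order plus an aromaticity flag.
    static final class Bond {
        final int order;
        final boolean aromatic;

        Bond(int order, boolean aromatic) {
            this.order = order;
            this.aromatic = aromatic;
        }
    }

    // Order-only rule (assumed for illustration): equal orders match,
    // and two aromatic bonds match regardless of order.
    static boolean matchesOrderOnly(Bond query, Bond target) {
        return query.order == target.order || (query.aromatic && target.aromatic);
    }

    // Stricter rule: the aromaticity flags must agree; non-aromatic bonds
    // additionally need equal orders.
    static boolean matchesStrict(Bond query, Bond target) {
        if (query.aromatic != target.aromatic) {
            return false;
        }
        return query.aromatic || query.order == target.order;
    }

    public static void main(String[] args) {
        Bond aliphaticSingle = new Bond(1, false); // C-C bond in the alkane query
        Bond ringSingle = new Bond(1, true);       // single bond of a Kekule benzene ring

        System.out.println(matchesOrderOnly(aliphaticSingle, ringSingle)); // true
        System.out.println(matchesStrict(aliphaticSingle, ringSingle));    // false
    }
}
```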
I just created the following "test":
    import org.openscience.cdk.smiles.SmilesParser;
    import org.openscience.cdk.interfaces.IMolecule;
    import org.openscience.cdk.DefaultChemObjectBuilder;
    import org.openscience.cdk.tools.manipulator.AtomContainerManipulator;
    import org.openscience.cdk.aromaticity.CDKHueckelAromaticityDetector;
    import org.openscience.cdk.isomorphism.UniversalIsomorphismTester;
    import org.openscience.cdk.isomorphism.matchers.IQueryAtomContainer;
    import org.openscience.cdk.isomorphism.matchers.QueryAtomContainerCreator;

    public class UitTest {

        public static void main(String[] args) throws Exception {
            String querySmiles = "CCC(C)C(C)C(C)C";
            // first molecule in the SD file I previously sent
            String structureOneFromSdFile =
                "O=C2NC=1C=CC=CC=1C26(N(C)CC(C=3C=CC=CC=3(F))C46(C(=O)C=5C=CC=CC=5(OC4)))";

            SmilesParser smilesParser =
                new SmilesParser(DefaultChemObjectBuilder.getInstance());

            // Query as a plain molecule, with atom types and aromaticity perceived.
            IMolecule queryMol = smilesParser.parseSmiles(querySmiles);
            AtomContainerManipulator.percieveAtomTypesAndConfigureUnsetProperties(queryMol);
            CDKHueckelAromaticityDetector.detectAromaticity(queryMol);

            // The same query converted to a query container.
            IQueryAtomContainer query =
                QueryAtomContainerCreator.createBasicQueryContainer(queryMol);
            // Uncommenting either of the next two lines has no impact on the result.
            // AtomContainerManipulator.percieveAtomTypesAndConfigureUnsetProperties(query);
            // CDKHueckelAromaticityDetector.detectAromaticity(query);

            IMolecule target = smilesParser.parseSmiles(structureOneFromSdFile);
            AtomContainerManipulator.percieveAtomTypesAndConfigureUnsetProperties(target);
            CDKHueckelAromaticityDetector.detectAromaticity(target);

            // System.out.println(UniversalIsomorphismTester.isSubgraph(target, queryMol)); // false
            System.out.println(UniversalIsomorphismTester.isSubgraph(target, query)); // true
        }
    }
So UIT is working correctly.
Best Regards,
Joos
On 19.12.2011 17:30, Egon Willighagen wrote:
Joos,
On Mon, Dec 19, 2011 at 5:16 PM, Nina Jeliazkova
<[email protected]> wrote:
For example "CCC(C)C(C)C(C)C" does not (and should not) match aromatic
carbons, as in http://tinyurl.com/7h7havf (no highlighted structures in
the CDK depiction at the top right).
The above SMILES/SMARTS should indeed not match aromatic systems. That
leaves the question why the UIT does match them... Joos, what code are
you using for that experiment?
Egon
_______________________________________________
Cdk-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/cdk-user