Hi all,

I'm trying to use CDK to detect pharmacophores. It's easy enough to get SMARTSQuery to find patterns, but a major problem is that the returned atoms (indices) do not respect the ordering of atoms in the original SMARTS string. For example, the query string "[O;X2]C" will, in one of my examples, return two atoms in this order:
#<Atom Atom(745381933, S:C, 3D:[(-4.8833, 0.8605, 0.0308)], AtomType(745381933, N:C.sp3, FC:0, H:SP3, NC:4, EV:4, Isotope(745381933, Element(745381933, S:C, ID:C1))))>
#<Atom Atom(1362034980, S:O, 3D:[(-3.6751, 1.6658, 0.029)], AtomType(1362034980, N:O.sp3, FC:0, H:SP3, NC:2, EV:2, Isotope(1362034980, Element(1362034980, S:O, ID:O1))))>

That is, the carbon is returned before the oxygen. I'm only interested in the oxygen, as that is the defining point of this feature, but I cannot assume that it will be the first atom returned (or the last, for that matter). The order in which SMARTSQuery returns matched atoms is arbitrary, rendering it useless as a feature extraction tool.

I have been looking through the source code of SMARTSQuery, SMARTSParser and UniversalIsomorphismTester, to no avail. As far as I can tell from the source, it is impossible to extract the atoms in the correct order without completely rewriting the SMARTS matching functionality, including the isomorphism tester.

Does anybody know if I am mistaken? It would make life much easier for me if it were possible to extract the atoms in the order in which they appear in the SMARTS pattern.

Best regards,
Thomas

_______________________________________________
Cdk-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/cdk-user

