On Jan 26, 2011, at 11:54 AM, Chris Morley wrote:
> As may have been discussed here earlier, maybe this option should output
> explicit hydrogen as [H] rather than a hydrogen count on another atom.
> SMARTS [CH2] matches a C with exactly two Hs; SMARTS [H]C[H] will match a
> carbon with at least two Hs and is more versatile in substructure searches.
> Would there be any objections to me changing it in the development code?

None from me. Personally, if I have explicit hydrogens in the structure,
then I want hydrogens in the SMILES output.

I will point out that Pascal asked for one of

> [H]OC([H])[H] or [OX2H1][CX4H2]

as output. Your fix gives him the first. My code does not give him the
second. For that he needs something more like:

Step 1, encode the connectivity into the isotope

  >>> import pybel
  >>> mol = pybel.readstring("smi", "OCN")
  >>> for atom in mol.atoms:
  ...     atom.OBAtom.SetIsotope(100 + atom.implicitvalence)
  ...
  >>> mol.write("can")
  '[103NH2][104CH2][102OH]\t\n'

Step 2, remove the atom(s)

  >>> mol.OBMol.DeleteAtom(mol.atoms[-1].OBAtom)
  True

Step 3, create the SMARTS based on syntactical transformation of the SMILES

  >>> import re
  >>> re.sub(r"\[10(\d)", r"[X\1", mol.write("can"))
  '[X4CH3][X2OH]\t\n'

If you (like me) think that SMARTS looks ugly, then

  >>> re.sub(r"\[10(\d)([^]]+)\]", r"[\2X\1]", mol.write("can"))
  '[CH3X4][OHX2]\t\n'

Not exactly matching Pascal's second option, but it's an equivalent SMARTS.

				Andrew
				da...@dalkescientific.com

_______________________________________________
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
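
For anyone who wants to try the Step 3 rewriting without Open Babel installed, here is a minimal sketch that applies the same two regular expressions to a plain string. The function name isotope_tags_to_smarts and its symbol_first flag are hypothetical and not part of pybel. The input string is an assumption inferred from the outputs shown in the message, and the patterns only cover isotope tags 100 through 109, as in the original.

```python
import re


def isotope_tags_to_smarts(tagged_smiles, symbol_first=False):
    """Rewrite isotope tags 100..109 as SMARTS connectivity (X) terms.

    Hypothetical helper for illustration; not part of Open Babel or pybel.
    """
    # pybel's write() output ends in "\t\n", so strip surrounding whitespace.
    text = tagged_smiles.strip()
    if symbol_first:
        # "[104CH3]" -> "[CH3X4]": move the connectivity after the atom text.
        return re.sub(r"\[10(\d)([^]]+)\]", r"[\2X\1]", text)
    # "[104CH3]" -> "[X4CH3]": replace the isotope tag in place.
    return re.sub(r"\[10(\d)", r"[X\1", text)


# Assumed input: the isotope-tagged SMILES left after Step 2.
tagged = "[104CH3][102OH]\t\n"
print(isotope_tags_to_smarts(tagged))
print(isotope_tags_to_smarts(tagged, symbol_first=True))
```

With that input, the two calls should reproduce the two SMARTS strings shown in the message, minus the trailing tab and newline.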