Hi Nina, don't you still use CDK 1.4 in ambit2-smarts? I ran into this issue in the first place because I am migrating to 1.5. :-)
Martin

--
Dipl-Inf. Martin Gütlein
Phone:
+49 (0)6131 39 23336 (office)
+49 (0)177 623 9499 (mobile)
Email:
[email protected]

On 02.12.2014 20:59, Nina Jeliazkova wrote:
> Martin,
>
> Following John's comments, I've realized you might consider writing
> SMARTS via the SmartsHelper.toSmarts() method from the ambit2-smarts
> package (takes a QueryAtomContainer)
>
> http://ambit.sourceforge.net/AMBIT2-LIBS/ambit2-smarts/apidocs/ambit2/smarts/SmartsHelper.html#toSmarts(org.openscience.cdk.isomorphism.matchers.QueryAtomContainer)
> [4]
>
> Regards,
> Nina
>
> On 2 December 2014 at 21:47, John May <[email protected]>
> wrote:
>
>> Just to clarify: you can write SMILES in CDK; you’re writing SMILES
>> and then interpreting this as SMARTS. CDK doesn’t have the ability
>> to write a SMARTS. As well as hydrogens you may also have trouble
>> with aromaticity, charges, and isotopes.
>>
>> c(c[cH])c[cH]
>
> Is probably better as [#6]([#6][#6])[#6][#6].
>
> The reason you’re having trouble in CDK 1.5 is that SMILES IO now
> correctly handles the valence.
>
> Anyway, there are a couple of solutions
>
> 1) reset the hydrogen counts to default (i.e. atom typing); this will
> work for your examples but will also mean you would lose aromaticity
> flags (i.e. the example above isn’t a ring) and this wouldn’t fix
> nitrogens which also have H displayed when aromatic. I would not
> recommend this.
> 2) set all hydrogen counts to 0 (not null!) before generating the
> SMILES; you may also want to do charge and mass. Simply loop over the
> MCS and set the implicitH count to 0. removeHydrogens has no effect
> because they’re not explicit -
> http://nextmovesoftware.com/blog/2013/02/27/explicit-and-implicit-hydrogens-taking-liberties-with-valence/
> [5].
> 3) after parsing the SMILES as SMARTS, traverse the expression tree of
> each atom and replace the And(<OtherSmartsAtom>, HydrogenCount) with
> <OtherSmartsAtom>.
> 4) load the SMILES as a SMILES and do a normal subgraph match as
> opposed to SMARTS.
>
> Also
> - make sure you use the new SMSD (not part of CDK); the CDK packages
> are quite old
> - avoid using the DefaultChemObjectBuilder and use
> SilentChemObjectBuilder (the naming is the wrong way round but
> actually Silent is better as it doesn’t fire off events).
> - you’re generating canonical SMILES when this isn’t needed; use
> SmilesGenerator.generic().aromatic() when creating the
> SmilesGenerator.
>
> J
>
>> On Dec 2, 2014, at 11:04 AM, Martin Gütlein <[email protected]>
>> wrote:
>> Hi,
>>
>> any help with this issue would be very much appreciated,
>>
>> Kind regards,
>> Martin
>>
>> -------- Original Message --------
>> Subject: Re: how to print SMARTS pattern without hydrogens
>> Date: 02.12.2014 12:00
>> On 30 September 2014 at 09:30, Martin Guetlein
>> <[email protected]> wrote:
>> Hi,
>>
>> I am currently migrating from CDK 1.4 to 1.5. I am mining the maximum
>> common subgraph of two compounds and then print the resulting
>> fragment as SMARTS. This is working in 1.4, however in 1.5 the
>> SmilesGenerator is adding unwanted hydrogens. How can I get rid of
>> the hydrogens? See example below.
>> See also
>>
> https://www.mail-archive.com/[email protected]/msg02597.html
>> [1]
>>
>> Thanks and kind regards,
>> Martin
>>
>> The following code prints "mcs: c(c[cH])c[cH]" instead of "mcs:
>> ccccc"
>> [[
>> SmilesParser sp = new SmilesParser(DefaultChemObjectBuilder.getInstance());
>> IAtomContainer mol1 = sp.parseSmiles("c1ccccc1NC");
>> IAtomContainer mol2 = sp.parseSmiles("c1cccnc1");
>> org.openscience.cdk.smsd.Isomorphism mcsFinder = new org.openscience.cdk.smsd.Isomorphism(
>>     org.openscience.cdk.smsd.interfaces.Algorithm.DEFAULT, true);
>> mcsFinder.init(mol1, mol2, true, true);
>> mcsFinder.setChemFilters(true, true, true);
>>
>> mol1 = mcsFinder.getReactantMolecule();
>> IAtomContainer mcsmolecule = DefaultChemObjectBuilder.getInstance()
>>     .newInstance(IAtomContainer.class, mol1);
>> List<IAtom> atomsToBeRemoved = new ArrayList<IAtom>();
>> for (IAtom atom : mcsmolecule.atoms())
>> {
>>     int index = mcsmolecule.getAtomNumber(atom);
>>     if (!mcsFinder.getFirstMapping().containsKey(index))
>>         atomsToBeRemoved.add(atom);
>> }
>> for (IAtom atom : atomsToBeRemoved)
>>     mcsmolecule.removeAtomAndConnectedElectronContainers(atom);
>>
>> // has no effect
>> // mcsmolecule = AtomContainerManipulator.removeHydrogens(mcsmolecule);
>>
>> SmilesGenerator g = new SmilesGenerator().aromatic();
>> System.out.println("mcs: " + g.create(mcsmolecule));
>> ]]
>>
>> --
>> Dipl-Inf. Martin Gütlein
>> Phone:
>> +49 (0)761 203 8442 (office)
>> +49 (0)177 623 9499 (mobile)
>> Email:
>> [email protected]
>>
>> _______________________________________________
>> Cdk-user mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/cdk-user [3]
>
> Links:
> ------
> [1] https://www.mail-archive.com/[email protected]/msg02597.html
> [3] https://lists.sourceforge.net/lists/listinfo/cdk-user
> [4] http://ambit.sourceforge.net/AMBIT2-LIBS/ambit2-smarts/apidocs/ambit2/smarts/SmartsHelper.html#toSmarts(org.openscience.cdk.isomorphism.matchers.QueryAtomContainer)
> [5] http://nextmovesoftware.com/blog/2013/02/27/explicit-and-implicit-hydrogens-taking-liberties-with-valence/
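
Option 3 in John's reply (walk each atom's expression tree and replace And(<OtherSmartsAtom>, HydrogenCount) with <OtherSmartsAtom>) is a small recursive tree rewrite. The sketch below only illustrates that idea on a toy tree: Expr, Primitive, HydrogenCount, And and strip are hypothetical names made up for this example and are not CDK's SMARTS query classes, so the real traversal would have to be adapted to whatever node types the CDK version in use exposes.

```java
/** Toy SMARTS atom-expression tree; all types here are illustrative, not CDK classes. */
class StripHydrogenCount {

    /** A node in the atom expression tree. */
    interface Expr {}

    /** Leaf: any atom primitive other than a hydrogen count, e.g. "c" or "+1". */
    static final class Primitive implements Expr {
        final String text;
        Primitive(String text) { this.text = text; }
        @Override public String toString() { return text; }
    }

    /** Leaf: a hydrogen-count primitive, rendered as e.g. "H1". */
    static final class HydrogenCount implements Expr {
        final int count;
        HydrogenCount(int count) { this.count = count; }
        @Override public String toString() { return "H" + count; }
    }

    /** Inner node: logical AND of two sub-expressions. */
    static final class And implements Expr {
        final Expr left;
        final Expr right;
        And(Expr left, Expr right) { this.left = left; this.right = right; }
        @Override public String toString() { return left + "&" + right; }
    }

    /**
     * Returns a tree in which every And with a HydrogenCount on one side
     * is replaced by its other side. Leaves are returned unchanged.
     */
    static Expr strip(Expr e) {
        if (!(e instanceof And)) {
            return e;
        }
        And and = (And) e;
        Expr left = strip(and.left);
        Expr right = strip(and.right);
        if (right instanceof HydrogenCount) {
            return left;
        }
        if (left instanceof HydrogenCount) {
            return right;
        }
        return new And(left, right);
    }

    public static void main(String[] args) {
        Expr atom = new And(new Primitive("c"), new HydrogenCount(1));
        // prints "[c&H1] -> [c]"
        System.out.println("[" + atom + "] -> [" + strip(atom) + "]");
    }
}
```

Rewriting the parsed query this way leaves the molecule and its hydrogen counts untouched, which is the main difference from option 2, where the counts are zeroed before the SMILES is generated.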

