Hi John,

Thanks a lot for your detailed answer; it helped me understand the CDK a
bit better. For now I'll try my luck with option 2.
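
A minimal sketch of what option 2 could look like, assuming CDK 1.5.x with IAtom.setImplicitHydrogenCount and the SmilesGenerator.generic().aromatic() call mentioned below. The class and method names (ZeroHydrogenSmiles, zeroImplicitHydrogens, toPatternSmiles) are made up for illustration, and the benzene input is only a stand-in for the real MCS fragment.

```java
import org.openscience.cdk.exception.CDKException;
import org.openscience.cdk.interfaces.IAtom;
import org.openscience.cdk.interfaces.IAtomContainer;
import org.openscience.cdk.silent.SilentChemObjectBuilder;
import org.openscience.cdk.smiles.SmilesGenerator;
import org.openscience.cdk.smiles.SmilesParser;

public class ZeroHydrogenSmiles {

    // Set every implicit hydrogen count to 0 (not null), discarding the
    // counts the fragment atoms inherited from the parent molecule.
    static void zeroImplicitHydrogens(IAtomContainer fragment) {
        for (IAtom atom : fragment.atoms()) {
            atom.setImplicitHydrogenCount(0);
        }
    }

    // Zero the hydrogen counts, then write a non-canonical aromatic SMILES
    // that is meant to be read back as a SMARTS pattern.
    static String toPatternSmiles(IAtomContainer fragment) throws CDKException {
        zeroImplicitHydrogens(fragment);
        return SmilesGenerator.generic().aromatic().create(fragment);
    }

    public static void main(String[] args) throws CDKException {
        SmilesParser sp = new SmilesParser(SilentChemObjectBuilder.getInstance());
        // Stand-in for the MCS fragment (assumption: the real fragment comes
        // from the SMSD mapping in the original code).
        IAtomContainer fragment = sp.parseSmiles("c1ccccc1");
        System.out.println("mcs: " + toPatternSmiles(fragment));
    }
}
```

Charge and mass could be handled in the same loop via setFormalCharge and setMassNumber if they turn out to matter for the pattern.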

Best regards,
Martin



-- 
Dipl-Inf. Martin Gütlein
Phone:
+49 (0)6131 39 23336 (office)
+49 (0)177 623 9499 (mobile)
Email:
[email protected]

On 02.12.2014 at 20:47, John May wrote:
> Just to clarify: CDK can write SMILES, so what you are doing is writing
> SMILES and then interpreting it as SMARTS. CDK doesn’t have the ability
> to write SMARTS. As well as hydrogens, you may also have trouble with
> aromaticity, charges, and isotopes.
> 
>>> c(c[cH])c[cH]
> 
> Is probably better as [#6]([#6][#6])[#6][#6].
> 
> The reason you’re having trouble in CDK 1.5 is that the SMILES IO now
> handles valence correctly.
> 
> Anyway, there are a few solutions:
> 
> 1) Reset the hydrogen counts to their defaults (i.e. by atom typing).
> This will work for your examples, but you would also lose the
> aromaticity flags (i.e. the example above isn’t a ring), and it wouldn’t
> fix nitrogens, which also have the H displayed when aromatic. I would
> not recommend this.
> 2) Set all hydrogen counts to 0 (not null!) before generating the
> SMILES; you may also want to clear charge and mass. Simply loop over the
> atoms of the MCS and set the implicit hydrogen count to 0.
> removeHydrogens has no effect because the hydrogens are not explicit, see
> http://nextmovesoftware.com/blog/2013/02/27/explicit-and-implicit-hydrogens-taking-liberties-with-valence/
> [3].
> 3) After parsing the SMILES as SMARTS, traverse the expression tree of
> each atom and replace every And(<OtherSmartsAtom>, HydrogenCount) with
> <OtherSmartsAtom>.
> 4) Load the SMILES as SMILES and do a normal subgraph match, as opposed
> to a SMARTS match.
> 
> Also:
>  - Make sure you use the new SMSD (not part of CDK); the SMSD packages
> shipped with CDK are quite old.
>  - Avoid the DefaultChemObjectBuilder and use the
> SilentChemObjectBuilder instead (the naming is the wrong way round, but
> Silent is actually better as it doesn’t fire off events).
>  - You’re generating canonical SMILES when this isn’t needed; use
> SmilesGenerator.generic().aromatic() when creating the
> SmilesGenerator.
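The last two points could be combined roughly as follows. This is a sketch assuming CDK 1.5.x; the class and method names (GenericSmilesDemo, writeGenericAromatic) are invented for the example, and wrapping the checked CDKException in an unchecked exception is a choice made here only to keep the call site short.

```java
import org.openscience.cdk.exception.CDKException;
import org.openscience.cdk.interfaces.IAtomContainer;
import org.openscience.cdk.silent.SilentChemObjectBuilder;
import org.openscience.cdk.smiles.SmilesGenerator;
import org.openscience.cdk.smiles.SmilesParser;

public class GenericSmilesDemo {

    // Parse with the silent builder (no change events are fired) and write
    // a non-canonical SMILES that keeps the aromatic (lower-case) atoms.
    static String writeGenericAromatic(String smiles) {
        try {
            SmilesParser parser =
                new SmilesParser(SilentChemObjectBuilder.getInstance());
            IAtomContainer mol = parser.parseSmiles(smiles);
            return SmilesGenerator.generic().aromatic().create(mol);
        } catch (CDKException e) {
            // InvalidSmilesException is a CDKException, so this covers
            // both parsing and generation failures.
            throw new IllegalArgumentException("could not process " + smiles, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeGenericAromatic("c1ccccc1NC"));
    }
}
```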
> 
> J
> 
> On Dec 2, 2014, at 11:04 AM, Martin Gütlein <[email protected]>
> wrote:
> 
>> Hi,
>> 
>> any help with this issue would be very much appreciated,
>> 
>> Kind regards,
>> Martin
>> 
>> -------- Original message --------
>> Subject: Re: how to print SMARTS pattern without hydrogens
>> Date: 02.12.2014 12:00
>> On 30 September 2014 at 09:30, Martin Guetlein
>> <[email protected]> wrote:
>> 
>>> Hi,
>>> 
>>> I am currently migrating from CDK 1.4 to 1.5. I am mining the maximum
>>> common subgraph of two compounds and then printing the resulting
>>> fragment as SMARTS. This works in 1.4; however, in 1.5 the
>>> SmilesGenerator adds unwanted hydrogens. How can I get rid of the
>>> hydrogens? See the example below.
>>> See also
>>> https://www.mail-archive.com/[email protected]/msg02597.html [1]
>>> 
>>> Thanks and kind regards,
>>> Martin
>>> 
>>> The following code prints "mcs: c(c[cH])c[cH]" instead of
>>> "mcs: ccccc":
>>> [[
>>> SmilesParser sp = new SmilesParser(DefaultChemObjectBuilder.getInstance());
>>> IAtomContainer mol1 = sp.parseSmiles("c1ccccc1NC");
>>> IAtomContainer mol2 = sp.parseSmiles("c1cccnc1");
>>> // find the maximum common subgraph with SMSD
>>> org.openscience.cdk.smsd.Isomorphism mcsFinder =
>>>     new org.openscience.cdk.smsd.Isomorphism(
>>>         org.openscience.cdk.smsd.interfaces.Algorithm.DEFAULT, true);
>>> mcsFinder.init(mol1, mol2, true, true);
>>> mcsFinder.setChemFilters(true, true, true);
>>> 
>>> // copy the first molecule and collect all atoms that are not mapped
>>> mol1 = mcsFinder.getReactantMolecule();
>>> IAtomContainer mcsmolecule = DefaultChemObjectBuilder.getInstance()
>>>     .newInstance(IAtomContainer.class, mol1);
>>> List<IAtom> atomsToBeRemoved = new ArrayList<IAtom>();
>>> for (IAtom atom : mcsmolecule.atoms())
>>> {
>>>     int index = mcsmolecule.getAtomNumber(atom);
>>>     if (!mcsFinder.getFirstMapping().containsKey(index))
>>>         atomsToBeRemoved.add(atom);
>>> }
>>> // remove the unmapped atoms so that only the MCS remains
>>> for (IAtom atom : atomsToBeRemoved)
>>>     mcsmolecule.removeAtomAndConnectedElectronContainers(atom);
>>> 
>>> // has no effect
>>> // mcsmolecule = AtomContainerManipulator.removeHydrogens(mcsmolecule);
>>> 
>>> SmilesGenerator g = new SmilesGenerator().aromatic();
>>> System.out.println("mcs: " + g.create(mcsmolecule));
>>> ]]
>>> 
>>> --
>>> Dipl-Inf. Martin Gütlein
>>> Phone:
>>> +49 (0)761 203 8442 (office)
>>> +49 (0)177 623 9499 (mobile)
>>> Email:
>>> [email protected]
>> 
>> 
>> _______________________________________________
>> Cdk-user mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/cdk-user
> 
> 
> 
> Links:
> ------
> [1] 
> https://www.mail-archive.com/[email protected]/msg02597.html
> [3]
> http://nextmovesoftware.com/blog/2013/02/27/explicit-and-implicit-hydrogens-taking-liberties-with-valence/
