Thanks Greg,

What about using fingerprints for this?
Regards,
Evgueni

2009/7/16 Greg Landrum <[email protected]>

> HI Evgueni,
>
> On Thu, Jul 16, 2009 at 11:23 AM, Evgueni Kolossov<[email protected]> wrote:
> > Hi Greg,
> >
> > What's the best way to check for duplicate in this smart pointers vector of
> > fragments when adding a new fragment to the vector?
>
> The easy answer would be to use the canonical smiles for the fragment.
> This *might* work, and it would be easy, but I'm not sure I'd trust
> it.
> Here's a simple example where it did work:
> [6] >>> m1 = Chem.MolFromSmiles('Occcc',False)
>
> [7] >>> m2 = Chem.MolFromSmiles('ccccO',False)
>
> [8] >>> m1.UpdatePropertyCache()
>
> [9] >>> m2.UpdatePropertyCache()
>
> [10] >>> Chem.MolToSmiles(m1)
> Out[10]: 'ccccO'
>
> [11] >>> Chem.MolToSmiles(m2)
> Out[11]: 'ccccO'
>
> An answer that's more likely to be correct, but perhaps more difficult
> to implement, is the use of subgraph invariants. This is what the
> existing RDKit fragment catalog code does. Take a look at the
> getDiscrims() method of FragCatalogEntry
> ($RDBASE/Code/GraphMol/FragCatalog/FragCatalogEntry.cpp); it shows an
> approach that I have more confidence in than the "canonical smiles for
> pieces of molecules" method.
>
> -greg

--
Dr. Evgueni Kolossov (PhD)
[email protected]
Tel. +44(0)1628 627168
Mob. +44(0)7812070446
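For reference, the "easy answer" from the quoted message could be sketched roughly as below: keep an index of keys already seen and only append a fragment when its key is new. The names UniqueFragments and canonical_smiles_key are hypothetical, not RDKit API. The key function assumes RDKit is installed and simply repeats the calls from the quoted session, so it carries the same caveat Greg gives about trusting canonical SMILES for pieces of molecules.

```python
from typing import Callable, Dict, Generic, Hashable, List, TypeVar

T = TypeVar("T")


class UniqueFragments(Generic[T]):
    """Hypothetical container: keeps fragments in insertion order and
    skips any fragment whose key has already been seen."""

    def __init__(self, key: Callable[[T], Hashable]) -> None:
        self._key = key
        self._index: Dict[Hashable, int] = {}  # key -> position in self.items
        self.items: List[T] = []

    def add(self, frag: T) -> bool:
        """Append frag if its key is new; return True if it was added."""
        k = self._key(frag)
        if k in self._index:
            return False
        self._index[k] = len(self.items)
        self.items.append(frag)
        return True


def canonical_smiles_key(mol) -> str:
    """Key based on canonical SMILES (assumption: RDKit is available).

    Mirrors the quoted session: the fragment was built without
    sanitization, so the property cache is updated before writing SMILES.
    """
    from rdkit import Chem  # imported lazily so the container works without RDKit

    mol.UpdatePropertyCache()
    return Chem.MolToSmiles(mol)


# Demo with a stand-in key: the strings play the role of canonical SMILES
# that were already computed for each fragment.
store = UniqueFragments(key=lambda smiles: smiles)
for smi in ["ccccO", "ccccO", "CCO"]:
    store.add(smi)
```

With real fragments the container would be created as UniqueFragments(key=canonical_smiles_key). Because the key function is passed in, a different discriminator (for example one based on subgraph invariants, as in FragCatalogEntry) could be swapped in without changing the container.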

