On Jan 24, 2011, at 4:42 AM, Noel O'Boyle wrote:

> Hi Dan,
>
> Regarding (1), the relevant section in the docs is at
> http://openbabel.org/docs/2.3.0/Fingerprints/fingerprints.html. I
> think that the section on Similarity Searching answers this question.
>
> Question (2) is about searching for exact matches. Currently the only
> way to do this is matching by canonical SMILES or by InChI, e.g. see
> the section on the InChI descriptor at
> http://openbabel.org/docs/2.3.0/Command-line_tools/babel.html#inchi-descriptor.
>
> If you are doing multiple searches, I would use the substructure
> search described at
> http://openbabel.org/docs/2.3.0/Fingerprints/fingerprints.html to
> extract a small set of potential exact matches and then search those
> using the InChI descriptor.
>
> I hope this answers your questions. I'm ccing to the openbabel-discuss
> list where someone else might have a better idea.

Thanks for the info. I think I need to put together a few detailed examples, but unfortunately I'll be busy for a few days. Anyway, I think part of it is related to what we talked about in Cambridge: how do we "clean up" structure representations to make sure we understand how the tools will behave?

To give the list the context: at DTP we accept molecules for testing in the Human Tumor Cell Line Screen (NCI-60); see http://dtp.nci.nih.gov/docs/misc/common_files/submit_compounds.html. Our biggest criterion for acceptance is structure novelty. As it stands now, the only way you can find out if a structure you are submitting is novel to our program is to submit it and wait to hear back from us. Our open compounds are freely available, so you could run a search through those, but about half our compounds are confidential, so that doesn't completely solve the problem.

I want to set up a web service that will accept a structure and return the highest Tanimoto coefficient against all of our structures. That way submitters can check their structures before submitting (or maybe even synthesizing) and not waste time on a 2-3 day submission response turnaround.

Any suggestions regarding parameters (like which fingerprint (fp) type) to use would be appreciated.

DanZ

/********************************************
 * Daniel Zaharevitz
 * Chief, Information Technology Branch
 * Developmental Therapeutics Program
 * National Cancer Institute
 * zahar...@mail.nih.gov
 *
 ********************************************/
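
For readers following the thread, the "highest Tanimoto" lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DTP service or the Open Babel API: fingerprints are modelled as sets of "on" bit positions, and the names `tanimoto`, `highest_tanimoto`, the compound IDs, and the bit values are all hypothetical. In practice the bit sets would come from a toolkit such as Open Babel, using whichever fingerprint type is chosen.

```python
from typing import Dict, FrozenSet

# Assumption: a fingerprint is represented as the set of bit positions
# that are switched on. Real fingerprints would come from a cheminformatics
# toolkit; the values used below are made up for illustration.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared on-bits divided by on-bits in either."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def highest_tanimoto(query: Fingerprint,
                     library: Dict[str, Fingerprint]) -> float:
    """Return only the best score, never the identity of the matching entry."""
    if not library:
        raise ValueError("library is empty")
    return max(tanimoto(query, fp) for fp in library.values())


# Hypothetical library entries and query fingerprint.
library = {
    "cmpd-1": frozenset({1, 4, 9, 16}),
    "cmpd-2": frozenset({2, 4, 8, 16, 32}),
}
query = frozenset({1, 4, 9, 25})

best = highest_tanimoto(query, library)
print(f"highest Tanimoto: {best:.3f}")
```

Returning only the score, and not the identifier of the closest compound, matches the stated goal of letting submitters check novelty without exposing the confidential half of the collection. A linear scan like this one is the simplest option; whether it is fast enough would depend on the size of the collection and the fingerprint length.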