Dear Jules,

On Fri, Jun 11, 2010 at 12:56 PM, Jules Kerssemakers <[email protected]> wrote:
> Hi, I'm Jules Kerssemakers, a recently started bioinformatics PhD
> student at the group of Gert Vriend (CMBI, netherlands).

I will email you and Gert shortly to try to set up a meeting at the end
of July, if that suits all our agendas.

> I work on the BioMeta database (http://biometa.cmbi.ru.nl) for
> metabolites and metabolism, which I took over from my predecessor,
> Martin Ott.

Cool!

> Primary item on the agenda (before I start the actual science-y work)
> is an update of the information contained in the database (it's mostly
> based on the 2005 version of the KEGG database)

Sounds like a good way to start.

> In an effort to automate this update procedure, I discovered the CDK.
> I can see it's a very powerful toolkit, but I'm having some trouble
> navigating the feature-set.
> To compare the molecules-to-update, I'm interested in the amount of
> defined/undefined stereocenters and defined/undefined double bond
> configurations.

That depends on the input. I have indeed recently been working on
tetrahedral stereochemistry, but accurate identification of
stereocenters is non-trivial. At least if the input is right, though,
the CDK can now assign absolute stereochemistry, e.g. as R,S using the
CIP rules.

> Does the CDK have a way to calculate these properties?
>
> I already found EgonW's blog about the CIPTool,
> (http://chem-bla-ics.blogspot.com/2010/04/cip-rules-for-stereochemistry.html),
> but I haven't been able to find it in CDK v1.2.5 nor in v1.3.5.

The code to calculate the R,S stereochemistry is not yet in 1.3.5, but
the foundation is. Some final testing of the CIP code remains, after
which I will prepare a patch against the 1.3 series.

> I'm also unsure if the CIPtool would let me detect undefined stereocenters.

The CIPTool defines the stereochemistry of a stereocenter.

> So, summing up:
> -Can the CDK count defined/undefined stereocenters

That depends on the exact context. What is your input? What exactly do
you mean by defined/undefined?
That is, can you put this in the context of, for example, this blog
post:

http://cactus.nci.nih.gov/blog/?p=679&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+chemicalStructureBlog+%28%2Fchemical%2Fstructure+Blog%29&utm_content=FriendFeed+Bot

> and/or double bonds,

The CDK has an algorithm to define stereochemistry from 2D coordinates,
so if your input has that information... Regarding that, it should also
have code to define stereochemistry based on wedge bond information...

> or do I need to do some (heavy) programming myself?

I do not know enough about your particular use case to decide how much
programming would be involved, and which of your needs are already
covered...

Egon

--
Post-doc @ Uppsala University
Proteochemometrics / Bioclipse Group of Prof. Jarl Wikberg
Homepage: http://egonw.github.com/
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers

_______________________________________________
Cdk-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/cdk-user

