Dear Jules,

On Fri, Jun 11, 2010 at 12:56 PM, Jules Kerssemakers
<[email protected]> wrote:
> Hi, I'm Jules Kerssemakers, a recently started bioinformatics PhD
> student at the group of Gert Vriend (CMBI, netherlands).

I will email you and Gert shortly to try to set up a meeting at the
end of July, if that suits all our agendas.

> I work on the BioMeta database (http://biometa.cmbi.ru.nl) for
> metabolites and metabolism, which I took over from my predecessor,
> Martin Ott.

Cool!

> Primary item on the agenda (before I start the actual science-y work)
> is an update of the information contained in the database (it's mostly
> based on the 2005 version of the KEGG database)

Sounds like a good way to start.

> In an effort to automate this update procedure, I discovered the CDK.
> I can see it's a very powerful toolkit, but I'm having some trouble
> navigating the feature-set.
> To compare the molecules-to-update, I'm interested in the amount of
> defined/undefined stereocenters and defined/undefined double bond
> configurations.

That depends on the input. I have indeed recently been working on
tetrahedral stereochemistry, but accurate identification of stereo
centers is non-trivial. At least if the input is right, though, the CDK
can now assign absolute stereochemistry, e.g. as R,S using the CIP
rules.

> Does the CDK have a way to calculate these properties?
>
> I already found EgonW's blog about the CIPTool,
> (http://chem-bla-ics.blogspot.com/2010/04/cip-rules-for-stereochemistry.html),
> but I haven't been able to find it in CDK v1.2.5 nor in v1.3.5.

The code to calculate the R,S stereochemistry is not yet in 1.3.5, but
the foundation is. There is some final testing to be done regarding
the CIP code, after which I will prepare a patch against the 1.3
series.

> I'm also unsure if the CIPtool would let me detect undefined stereocenters.

The CIPTool defines the stereochemistry of a stereocenter.

> So, summing up:
> -Can the CDK count defined/undefined stereocenters

That depends on the exact context. What is your input? What exactly do
you mean by defined/undefined? That is, can you put this in the
context of, for example, this blog post:

http://cactus.nci.nih.gov/blog/?p=679&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+chemicalStructureBlog+%28%2Fchemical%2Fstructure+Blog%29&utm_content=FriendFeed+Bot

> and/or double bonds,

The CDK has an algorithm to define stereochemistry from 2D
coordinates, so that will work if your input has that information.
Along the same lines, it should also have code to define
stereochemistry based on wedge bond information.
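
To illustrate what "from 2D coordinates" means for a double bond, here
is a minimal sketch in Python. It is not the CDK's code or API: the
function names, the tolerance, and the "cis"/"trans"/"undefined" labels
are my own assumptions for illustration. It only checks whether one
chosen substituent on each end of the bond lies on the same side of the
bond axis; turning that into E/Z would additionally need CIP priorities
to pick the substituents.

```python
def side(p, q, r, eps=1e-6):
    """Return +1 if r lies left of the line p->q, -1 if right, 0 if (nearly) on it."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    if abs(cross) < eps:
        return 0
    return 1 if cross > 0 else -1


def double_bond_configuration(sub_a, a, b, sub_b):
    """Classify sub_a-a=b-sub_b from 2D coordinates given as (x, y) tuples.

    Hypothetical helper: returns "cis" if both substituents are on the same
    side of the a=b axis, "trans" if on opposite sides, and "undefined" if
    either substituent is collinear with the bond (no usable 2D information).
    """
    s1 = side(a, b, sub_a)
    s2 = side(a, b, sub_b)
    if s1 == 0 or s2 == 0:
        return "undefined"
    return "cis" if s1 == s2 else "trans"


if __name__ == "__main__":
    a, b = (0.0, 0.0), (1.0, 0.0)
    print(double_bond_configuration((-0.5, 0.87), a, b, (1.5, 0.87)))
    print(double_bond_configuration((-0.5, 0.87), a, b, (1.5, -0.87)))
```

Counting how often such a check returns "undefined" over all double
bonds would be one possible reading of "undefined double bond
configurations", but that depends on the definition you have in mind.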

> or do I need to do some (heavy) programming myself?

I do not know enough about your particular use case to judge how much
programming would be involved, or which of your needs are already
covered...

Egon

-- 
Post-doc @ Uppsala University
Proteochemometrics / Bioclipse Group of Prof. Jarl Wikberg
Homepage: http://egonw.github.com/
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
