Hi Claus,
thanks for your interest in our work. I will try to make the approach used in 
NMRShiftDB a bit clearer and point to the relevant classes. If you need 
detailed explanations of the code, please ask.
- The database: NMRShiftDB has a database which splits a molecule into tables 
for molecule, atom, bond and their connections. The atom and bond tables serve 
pretty much the same purpose as the atom and bond arrays in the CDK molecule 
object. You can find an ER diagram here: 
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/nmrshiftdb/nmrshiftdb/doc/nmrshiftdbhelp.pdf?rev=HEAD&content-type=application/pdf
- NMRShiftDB uses an OR mapper called Torque. It provides an object for every 
table. In NMRShiftDB their names start with DB, so there is a DBMolecule 
object representing the molecule table.
- The save and load process: There is code (not in DBMolecule but in 
SubmitingData, which is bad design and could be changed) that takes a CDK 
molecule. It checks for duplicates via SMILES and either returns the existing 
DBMolecule or saves the data to the molecule, atom, bond etc. tables and 
returns the new DBMolecule. When loading, you obtain a DBMolecule in some way 
and call getAsCDKMolecule on it, which reads atoms, bonds etc. into a CDK 
molecule and returns it.
- Searches: the exact/similarity/substructure search is done in 
GeneralUtils.executeSearch. This method does all searches, so it looks 
complicated. It uses SMILES and simple SQL for exact searches, fingerprints 
and a UDF for similarity search, and performs an isomorphism check (via CDK 
objects/methods) for exact substructure search.
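
To illustrate the duplicate check on save and the fingerprint-based similarity
search described above, here is a minimal sketch. It is not the NMRShiftDB
code: the class StructureStoreSketch and its methods are hypothetical, the
maps stand in for the molecule table, and the canonical SMILES and
fingerprints are assumed to be computed elsewhere (for example with CDK). The
Tanimoto coefficient is used as the similarity measure, which is an assumption
about what a similarity UDF might compute.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the molecule table; not NMRShiftDB code.
class StructureStoreSketch {
    // canonical SMILES -> molecule id (plays the role of a unique column)
    private final Map<String, Integer> idBySmiles = new LinkedHashMap<>();
    // molecule id -> fingerprint (assumed to be computed by the caller)
    private final Map<Integer, BitSet> fingerprintById = new LinkedHashMap<>();
    private int nextId = 1;

    /** Returns the existing id for a duplicate, otherwise stores a new entry. */
    int saveOrGet(String canonicalSmiles, BitSet fingerprint) {
        Integer existing = idBySmiles.get(canonicalSmiles);
        if (existing != null) {
            return existing; // duplicate: nothing is written
        }
        int id = nextId++;
        idBySmiles.put(canonicalSmiles, id);
        fingerprintById.put(id, (BitSet) fingerprint.clone());
        return id;
    }

    /** Tanimoto coefficient: bits set in both divided by bits set in either. */
    static double tanimoto(BitSet a, BitSet b) {
        BitSet both = (BitSet) a.clone();
        both.and(b);
        BitSet either = (BitSet) a.clone();
        either.or(b);
        int union = either.cardinality();
        return union == 0 ? 0.0 : (double) both.cardinality() / union;
    }

    /** Ids of all stored molecules whose similarity to the query is >= threshold. */
    List<Integer> similar(BitSet query, double threshold) {
        List<Integer> hits = new ArrayList<>();
        for (Map.Entry<Integer, BitSet> entry : fingerprintById.entrySet()) {
            if (tanimoto(query, entry.getValue()) >= threshold) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    /** Small helper to build a fingerprint from the given bit positions. */
    static BitSet bits(int... positions) {
        BitSet set = new BitSet();
        for (int p : positions) {
            set.set(p);
        }
        return set;
    }
}
```

Calling saveOrGet twice with the same canonical SMILES returns the same id, 
which is the duplicate check; in a real database the map lookup would be a 
SELECT on the SMILES column and the loop in similar would run inside the 
database.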
I hope this makes things clear. NMRShiftDB really does contain code for 
maintaining a structure database. The code is also used on 
http://www.chemistry-development-kit.org/ for the database, but no searches 
are implemented there. That site uses a Torque library a bit newer than the 
one on nmrshiftdb.org and does not have the spectrum part. It could easily be 
extended with some search code from nmrshiftdb to form a structure database 
library.
If you think this code could be helpful, please let me know and we can 
start to extend the cdkweb code with searches.
Stefan

On Thursday 29 December 2005 11:28, Claus Stie Kallesøe wrote:
> Hi Christoph,
>
>  and thank you for the answer. happy to hear that you would like to help us
> out. My idea actually was to reuse some code from NMRshiftdb
>
>  Here is the story:
>
>  Today we receive the user input structure as a molfile from Marvin. We
> then pass that on to a JChemSearch object. This object (together with an
> updatehandler) pretty much takes care of everything related to substructure
> searching, duplicate check etc. JChem therefore of course has requirements
> for the structure table.
>
>  We would now like to try to port to first CDK and then JChemPaint to make
> the chemicalinventory true opensource. I read the journals about the design
> of CDK as well as NMRshiftdb and my plan was to get inspiration for the
> structure tables as well as code for searching and storing.
>
>  I have found the sql statements for the tables but I really do have a hard
> time finding the code. So if you could point me in the right direction that
> would be a great help.
>
>  I do understand that there is going to be more coding from our hands in
> order to perform searches and inserts using CDK.
>
>  All we basically want to do is a duplicate check during insert of new
> structures (to keep the structure table unique (in 2D)) and be able to
> perform exact match, similarity and substructure searches depending on user
> choices.
>
>  Also I think I now (after reading through most of the API yesterday)
> understand the concept of ChemObjects, AtomContainer, Bond and Atoms. But I
> still don't see how I then break up a drawn molecule (molfile) in order to
> store the fingerprints, smiles, bonds and atoms etc.
>
>  Again If you are able to point me to the code so I can see examples in
> order to fully understand I think we can handle the hard work.
>
>  Another option is of course if any of you would like to join us. I have
> made a new CVS module (chemicalinventory-cdk), where I will start to edit
> the current code to use CDK. Just let me know and I will register you.
>
>  Thank you
>
>  claus
>
>
> Christoph Steinbeck <[EMAIL PROTECTED]> wrote:
> Those classes are indeed "prehistoric material" by CDK standards. I guess
> they have not been used or maintained for years.
> Claus, if you are interested in writing structures to and from databases,
> we should discuss the issues.
> As you will know, we have written quite some code for doing this within the
> NMRShiftDB project. The code is part of the NMRShiftDB code base, which
> uses a lot of CDK code.
>
> Cheers,
>
> Chris
>
> Egon Willighagen wrote:
> > On Wednesday 28 December 2005 16:22, Claus Stie Kallesøe wrote:
> >> I have been reading through the cdk library here:
> >>http://cdk.sourceforge.net/api/
> >>
> >> in order to find out how I can use the library.
> >>
> >> As we use a MySQL database to store the structures I wanted to use the
> >>DBReader.class as the description tells me: " Reader that can read from a
> >>relational database that can be  accessed through JDBC."
> >>
> >> But I can't find the class in the cdk-20050826.jar file I have?
> >
> > The DBReader, DBWriter and DBAdmin classes are in the CDK module
> > 'orphaned', meaning that no one was maintaining them, and they did not
> > seem to be used.
> >
> > The sources, however, can be found in CVS at:
> >
> > http://cvs.sourceforge.net/viewcvs.py/cdk/cdk/src/org/openscience/cdk/database/
> >
> >> Under org.openscience.cdk.database I only see the XindiceReader.
> >>
> >> Can you tell me where I can find the DBReader?
> >
> > Note that the DBReader and DBWriter are closely tied together, and store
> > the molecule's CML string in the 'molecules' table:
> >
> > ps = con.prepareStatement("INSERT INTO molecules VALUES('', ?)");
> >
> > This likely does not match the setup you are working with.
> >
> > Nevertheless, the source should give you some insight into how I used
> > MySQL in the past.
> >
> > Egon

-- 
Stefan Kuhn M. A.
Cologne University BioInformatics Center (http://www.cubic.uni-koeln.de)
Zülpicher Str. 47, 50674 Cologne
Tel: +49(0)221-470-7428   Fax: +49 (0) 221-470-7786
My public PGP key is available at http://pgp.mit.edu


_______________________________________________
Cdk-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/cdk-user