On 25 November 2010 10:18, Thomas Strunz <[email protected]> wrote:
> Here are some additional comments:
>
> I have now changed the "reading code" to:
>
>
> MDLV2000Reader molReader = new MDLV2000Reader(stream);
> Molecule mol = (Molecule) molReader.read((ChemObject) new Molecule());
> AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(mol);
> CDKHueckelAromaticityDetector.detectAromaticity(mol);
>
> Now I get the same number of results with either Fingerprinter or
> ExtendedFingerprinter for my arbitrary query.
>
> But there is still a difference when using just benzene as the query, so
> the normal Fingerprinter does not seem to be usable if your dataset
> contains aromatic compounds
> (18k hits compared to 38k with ExtendedFingerprinter; commercial software
> gets about 100 more than ExtendedFingerprinter).
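
For context, a fingerprint screen is normally a bit-subset test: a target can
only contain the query as a substructure if every bit set in the query
fingerprint is also set in the target fingerprint. Below is a minimal sketch
of that test using plain java.util.BitSet rather than CDK classes. The class
name, method name and bit positions are made up for illustration. It shows
that a single query bit missing from the target is enough to reject it, so if
the query and the stored molecules are fingerprinted inconsistently (for
example with different aromaticity perception), true hits could be dropped.
That may be one explanation for the lower count, not a confirmed diagnosis.

```java
import java.util.BitSet;

public class FingerprintScreen {

    // Returns true if every bit set in the query is also set in the target.
    public static boolean passesScreen(BitSet query, BitSet target) {
        BitSet missing = (BitSet) query.clone();
        missing.andNot(target); // keeps only bits present in query but absent in target
        return missing.isEmpty();
    }

    public static void main(String[] args) {
        // Hypothetical bit positions, not real CDK fingerprint bits.
        BitSet query = new BitSet(1024);
        query.set(3);
        query.set(17);

        BitSet target = new BitSet(1024);
        target.set(3);
        target.set(17);
        target.set(250);

        BitSet mismatched = new BitSet(1024);
        mismatched.set(3);
        mismatched.set(250);

        System.out.println(passesScreen(query, target));     // prints "true"
        System.out.println(passesScreen(query, mismatched)); // prints "false"
    }
}
```

Anything that passes such a screen is only a candidate and still needs the
subgraph isomorphism check.
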
>
>
> > The correct number of results is obtained by doing a subgraph
> > isomorphism directly without any intervening fingerprint screen
>
> Will try it out to see what I get then.
>
>
> > Do you have any profiling results? Keeping lots of IAtomContainer
> > objects in memory can lead to high memory consumption - these objects
> > are pretty heavyweight
>
> I try to limit it as much as possible. For example, when creating
> fingerprints I read the molecules in smaller batches rather than all at
> once (because certain JDBC drivers, such as the one for hsqldb, return all
> results at once and ignore fetchSize). But certainly there can be several
> thousand in memory at once.
>
>
>
Just my two cents.
Besides prescreening, keeping the number of IAtomContainer objects in memory
to a minimum is the key to performance. Since less than one object doesn't
make sense :), one IAtomContainer at a time is best. Fingerprints can be
pre-calculated and need not be loaded into memory at all; let SQL do the
prescreening.
We've been doing similar things (CDK, relational database, no cartridges)
in ambit (ambit.sourceforge.net) for quite a few years already. There is a
downloadable standalone application and a servlet container application war
file (to run your own service), as well as running OpenTox REST services
for substructure searching, e.g.
https://ambit.uni-plovdiv.bg:8443/ambit2/query/smarts?search=c1ccccc1[Cl,Br,F]
http://apps.ideaconsult.net:8080/ambit2/query/smarts?search=c1ccccc1[Cl,Br,F,I]
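
To illustrate what "let SQL do the prescreening" could look like: one option
is to store each pre-calculated fingerprint as a few 64-bit integer columns
and do the bit-subset test in the WHERE clause. The sketch below only builds
the parameterized query text and the parameter values. The table name
"fingerprints" and the columns "structure_id", "fp0", "fp1", ... are
hypothetical and are not meant to describe the ambit schema. The "&" operator
is also an assumption: some databases use it, others provide a function such
as BITAND() instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

public class SqlPrescreen {

    // Builds a query selecting rows whose stored fingerprint words
    // contain all bits of the corresponding query words.
    public static String buildQuery(int words) {
        List<String> conditions = new ArrayList<>();
        for (int i = 0; i < words; i++) {
            conditions.add("(fp" + i + " & ?) = ?");
        }
        return "SELECT structure_id FROM fingerprints WHERE "
                + String.join(" AND ", conditions);
    }

    // Returns the parameter values in placeholder order: each query word twice.
    public static long[] parameters(BitSet query, int words) {
        long[] set = query.toLongArray(); // trailing all-zero words are omitted
        if (set.length > words) {
            throw new IllegalArgumentException("fingerprint does not fit in " + words + " words");
        }
        long[] params = new long[2 * words];
        for (int i = 0; i < words; i++) {
            long word = i < set.length ? set[i] : 0L;
            params[2 * i] = word;     // value for "fpN & ?"
            params[2 * i + 1] = word; // value for "= ?"
        }
        return params;
    }

    public static void main(String[] args) {
        BitSet query = new BitSet();
        query.set(0);  // lowest bit of word 0
        query.set(64); // lowest bit of word 1
        System.out.println(buildQuery(2));
        System.out.println(Arrays.toString(parameters(query, 2))); // prints "[1, 1, 1, 1]"
    }
}
```

The values would be bound to a java.sql.PreparedStatement with setLong(), and
each returned structure would then be loaded and checked by subgraph
isomorphism one IAtomContainer at a time.
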
Regards,
Nina
> Regards,
>
> Thomas
>
>
> ------------------------------------------------------------------------------
> Increase Visibility of Your 3D Game App & Earn a Chance To Win $500!
> Tap into the largest installed PC base & get more eyes on your game by
> optimizing for Intel(R) Graphics Technology. Get started today with the
> Intel(R) Software Partner Program. Five $500 cash prizes are up for grabs.
> http://p.sf.net/sfu/intelisp-dev2dev
> _______________________________________________
> Cdk-user mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>
>