It all depends on the actual use case. Yes, creating an AtomContainer is
more costly than loading molfiles, but if you often load thousands of
molfiles then that performance becomes relevant too. This OrChem format
solves both issues and will drastically increase performance for any use
case. So I think it's a good idea to use it.
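
To make the idea concrete, here is a minimal sketch of a compact binary
format that stores precomputed per-atom configuration next to the connection
table, so nothing has to be re-perceived on load. Note that this is NOT the
actual OrChem format and not a CDK class: CompactMolecule, its fields and the
byte layout are all hypothetical and only illustrate the principle.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical structure holder; not a CDK class and not OrChem's layout. */
final class CompactMolecule {
    final byte[] atomicNumbers;      // one entry per atom
    final byte[] formalCharges;      // stored so it need not be recomputed
    final byte[] implicitHydrogens;  // stored so it need not be recomputed
    final short[][] bonds;           // rows of {beginIndex, endIndex, order}

    CompactMolecule(byte[] atomicNumbers, byte[] formalCharges,
                    byte[] implicitHydrogens, short[][] bonds) {
        this.atomicNumbers = atomicNumbers;
        this.formalCharges = formalCharges;
        this.implicitHydrogens = implicitHydrogens;
        this.bonds = bonds;
    }

    /** Writes 3 bytes per atom and 5 bytes per bond, plus two 2-byte counts. */
    byte[] toBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeShort(atomicNumbers.length);
            for (int i = 0; i < atomicNumbers.length; i++) {
                out.writeByte(atomicNumbers[i]);
                out.writeByte(formalCharges[i]);
                out.writeByte(implicitHydrogens[i]);
            }
            out.writeShort(bonds.length);
            for (short[] bond : bonds) {
                out.writeShort(bond[0]);
                out.writeShort(bond[1]);
                out.writeByte(bond[2]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reads back exactly what toBytes() wrote. */
    static CompactMolecule fromBytes(byte[] data) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(data))) {
            int atomCount = in.readUnsignedShort();
            byte[] numbers = new byte[atomCount];
            byte[] charges = new byte[atomCount];
            byte[] hydrogens = new byte[atomCount];
            for (int i = 0; i < atomCount; i++) {
                numbers[i] = in.readByte();
                charges[i] = in.readByte();
                hydrogens[i] = in.readByte();
            }
            int bondCount = in.readUnsignedShort();
            short[][] bonds = new short[bondCount][3];
            for (int i = 0; i < bondCount; i++) {
                bonds[i][0] = in.readShort();
                bonds[i][1] = in.readShort();
                bonds[i][2] = in.readByte();
            }
            return new CompactMolecule(numbers, charges, hydrogens, bonds);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this layout the heavy-atom skeleton of ethanol (3 atoms, 2 bonds) takes
23 bytes, and the resulting byte[] could be stored in a BLOB column. A real
format needs more than this (coordinates, aromaticity flags, stereo,
isotopes), so treat it as a starting point only.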

ORM versus JDBC, cartridge or not, is a whole other discussion. In the end
everything has its advantages and downsides, and the actual use case
decides what's best. As I see it, ambit targets a completely different use
case than my project, so there is no need to be hostile towards it (or
maybe I just misinterpreted your reply). In the end I was only trying to
help.

Best Regards,

Joos


2013/9/25 Nina Jeliazkova <[email protected]>

>
>
>
> On 25 September 2013 08:03, Joos Kiener <[email protected]> wrote:
>
>> Since I have played around with this for a fairly long time, here are
>> some of my observations:
>>
>> - loading lots (thousands) of molfiles from relational databases is quite
>> slow
>>
>
>
> The slow part is not reading the molfile from the database, but loading it
> into an IAtomContainer. Actually, it is rarely necessary to load a bunch of
> atom containers into memory, especially in a web-based interface where
> serialisation is to something web friendly, not Java objects.
>
> http://apps.ideaconsult.net:8080/ambit2/dataset?page=0&pagesize=100
>
> And of course check http://ambit.sf.net for a full-featured,
> structure-searchable MySQL database + properties (no cartridge, no
> memory-hog ORM, just JDBC) with a REST web service API (i.e. OpenTox API).
>
> Best regards,
> Nina
>
>
>> - converting and fully configuring AtomContainers from molfiles is a
>> very expensive operation
>> - AtomContainers use a lot of memory, so how many are held in memory must
>> be tightly controlled, and hence the previous point comes into play again
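
One simple way to control how many parsed structures stay in memory is a
small LRU cache. The sketch below uses only the JDK; BoundedStructureCache is
a hypothetical name, and the value type V stands in for whatever object you
keep per structure (for example an IAtomContainer).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** LRU cache that caps the number of parsed structures held in memory. */
final class BoundedStructureCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    BoundedStructureCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most
        // recently accessed entry, which is what LRU eviction needs.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after every insertion; returning true drops the least
        // recently used entry once the cap is exceeded.
        return size() > maxEntries;
    }
}
```

Every cache miss still pays the full parsing cost, which is why the
conversion cost above matters so much. The class is not thread-safe; wrap it
with Collections.synchronizedMap if it is shared between threads.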
>>
>> This problem was also observed by the creators of OrChem, the Oracle
>> cartridge based on CDK. Hence they created a custom serialization
>> method that takes less space than molfiles and stores the configuration
>> info of the AtomContainers:
>>
>>
>> http://orchem.cvs.sourceforge.net/viewvc/orchem/OrChem/src/uk/ac/ebi/orchem/search/OrchemMoleculeBuilder.java?view=markup
>>
>> This is way faster (at least 10 times) than using the molfiles.
>>
>>
>> Also, when talking about storing chemical structures in a database, I can
>> gladly refer you to the project of mine below:
>>
>> https://bitbucket.org/kienerj/moleculedatabaseframework
>>
>> Best Regards,
>>
>> Joos
>>
>>
>> 2013/9/24 lochana menikarachchi <[email protected]>
>>
>>> What is the recommended method for storing IAtomContainers in a
>>> database? Serialization? MDL strings?
>>> Is there any way to get the MDL V2000 representation as a String from
>>> an IAtomContainer?
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> October Webinars: Code for Performance
>>> Free Intel webinars can help you accelerate application performance.
>>> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most
>>> from
>>> the latest Intel processors and coprocessors. See abstracts and register
>>> >
>>>
>>> http://pubads.g.doubleclick.net/gampad/clk?id=60133471&iu=/4140/ostg.clktrk
>>> _______________________________________________
>>> Cdk-user mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/cdk-user
>>>
>>>
>>
>>
>>
>>
>
