Hi,

How do CDK descriptors handle molecules that contain multiple compounds?

I experimented a bit and found that it depends on the descriptor:
* most descriptors apparently just add up the values of the single
compounds (for XLogP, for example, that makes no sense, does it?)
* some fail for multi-compound molecules
* some compute something else

My application is building QSAR models. I am not a chemist, but my
feeling is that the clean solution would be to have 'set-valued
features' (a set of values instead of a single value) for
multi-compound molecules. That is pretty complicated, though, and most
of my molecules have only one compound. As a simpler alternative, I
think the average value of the single compounds should be preferred
over the sum for descriptors like molecular weight or logP.
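
To make the three options concrete (sum, average, set of values), here
is a minimal sketch in plain Java. It does not use CDK at all: the
class ComponentDescriptors, its methods, and the toy descriptor are
made-up names for illustration, and splitting a SMILES string on '.'
is a simplifying assumption (it ignores ring closures that span
components). With CDK itself, the splitting step would presumably be
done on the atom container, e.g. with
ConnectivityChecker.partitionIntoMolecules, and a real descriptor
would replace the toy function.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

public class ComponentDescriptors {

    /** Splits a SMILES string into its dot-separated components (assumption: no cross-component ring closures). */
    public static List<String> components(String smiles) {
        List<String> parts = new ArrayList<>();
        for (String part : smiles.split("\\.")) {
            if (!part.isEmpty()) {
                parts.add(part);
            }
        }
        return parts;
    }

    /** The 'set-valued feature': one descriptor value per component. */
    public static double[] perComponent(String smiles, ToDoubleFunction<String> descriptor) {
        return components(smiles).stream().mapToDouble(descriptor).toArray();
    }

    /** What most descriptors appear to do today: add up the component values. */
    public static double sum(double[] values) {
        return Arrays.stream(values).sum();
    }

    /** The proposed alternative: average over the components (NaN if there are none). */
    public static double mean(double[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a real descriptor: the length of the SMILES string.
        ToDoubleFunction<String> toyDescriptor = s -> s.length();

        double[] values = perComponent("CCO.[Na+].[Cl-]", toyDescriptor);
        System.out.println("per component: " + Arrays.toString(values));
        System.out.println("sum:           " + sum(values));
        System.out.println("mean:          " + mean(values));
    }
}
```

For a single-compound molecule all three variants give the same
number, so only the multi-compound entries would be affected by the
choice.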

Kind regards,
Martin

P.S.: Sorry if I missed existing discussions or documentation on this
issue. I had trouble finding a name for it (and therefore googling
it).

-- 
Dipl-Inf. Martin Gütlein
Phone:
+49 (0)761 203 8442 (office)
+49 (0)177 623 9499 (mobile)
Email:
[email protected]

_______________________________________________
Cdk-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/cdk-user
