Hi,

How do CDK descriptors handle molecules that consist of multiple compounds?
I experimented a bit and found that it depends on the descriptor:

* most descriptors apparently just add up the values of the single compounds (like XLogP, which makes no sense, does it?)
* some fail for multi-compound molecules
* some compute something else

My application is building QSAR models. I am not a chemist, but my feeling is that the clean but complicated solution would be to have 'set-valued features' (a set of values instead of a single value) for multi-compound molecules. That is pretty complicated, though, and most of my molecules have only one compound. Still, I think the average value of the single compounds should be preferred for descriptors like molecular weight or logP.

Kind regards,
Martin

P.S.: Sorry if I missed existing discussions or documentation on this issue. I had trouble naming (and therefore googling) it.

--
Dipl-Inf. Martin Gütlein
Phone: +49 (0)761 203 8442 (office)
       +49 (0)177 623 9499 (mobile)
Email: [email protected]
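To make the difference between summing and averaging concrete, here is a minimal sketch in plain Java. It is not CDK code: the class ComponentAggregation and its methods are hypothetical names, the components are obtained by splitting a SMILES string at "." (the SMILES separator for disconnected parts), and the descriptor is a toy placeholder that only counts letters. With CDK itself one would presumably partition the atom container into its connected components first (ConnectivityChecker.partitionIntoMolecules looks like the relevant call) and run the real descriptor on each part.

```java
import java.util.Arrays;
import java.util.function.ToDoubleFunction;

// Hypothetical helper, not part of CDK: aggregates a per-component
// descriptor value over a multi-compound SMILES string.
final class ComponentAggregation {

    // In SMILES, "." separates disconnected components.
    static String[] components(String smiles) {
        return smiles.split("\\.");
    }

    // Apply the descriptor to each component separately.
    // The result is the 'set-valued feature' (one value per component).
    static double[] perComponent(String smiles, ToDoubleFunction<String> descriptor) {
        return Arrays.stream(components(smiles)).mapToDouble(descriptor).toArray();
    }

    // What most descriptors appear to do today: add the values up.
    static double sum(double[] values) {
        return Arrays.stream(values).sum();
    }

    // The proposed alternative: average over the components.
    static double mean(double[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Toy placeholder descriptor (assumption): counts letters in the
        // SMILES fragment. It stands in for a real descriptor such as XLogP.
        ToDoubleFunction<String> toy = s -> s.chars().filter(Character::isLetter).count();

        double[] values = perComponent("CCO.CC(=O)O", toy);
        System.out.println("per component: " + Arrays.toString(values));
        System.out.println("sum:  " + sum(values));
        System.out.println("mean: " + mean(values));
    }
}
```

For a single-compound molecule the sum and the mean coincide, so switching to the mean would only change the feature values of the multi-compound cases.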

