Thank you very much! This is really helpful!

Ali

On Wed, Aug 29, 2018 at 7:52 AM Richard Cooper <
richardiancooper+rdkitdisc...@gmail.com> wrote:

> I think it depends on what you need the descriptor for. If it were for
> some kind of fingerprinting, the example implementation would be too noisy.
> We used it to estimate how many low energy conformations of a molecule
> might be present in a particular system - and that estimate turned out to
> correlate well with our classifications of the system.  The variability increases
> with RBC: for totally rigid systems RBC and nConf20 are zero. For more
> reproducible results you can increase the number of conformers generated;
> the cost is longer calculations, but if you only have 350 molecules this
> might be OK.
>
> In the paper there are two example molecules with RBC of 1 and 8
> respectively which both have only a single low energy conformation, and it
> was this discrimination beyond simple RBC that drove its development.
>
> Analysis of the spread of nConf20 showed that it was larger than the
> spread of RBC, which might give it slightly better properties as an input
> descriptor. However, if you are finding less variability in your particular
> data set, then it might not be such a good discriminator of whatever you're
> trying to discriminate. I wouldn't recommend adopting it as the 'main
> descriptor' until you test whether it's useful.
>
> Regards,
> Richard
>
>
>
>
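One way to act on the advice above is to measure how much the descriptor moves between random seeds before relying on it. The sketch below is illustrative only: `spread_over_seeds` and `toy_descriptor` are hypothetical names that do not appear in the thread, and `toy_descriptor` is a made-up stand-in rather than the nConf20 calculation. In practice you would pass a function that runs the real descriptor script with the given seed.

```python
import random
import statistics
from typing import Callable, Iterable


def spread_over_seeds(descriptor: Callable[[int], float], seeds: Iterable[int]) -> dict:
    """Evaluate a seed-dependent descriptor once per seed and summarise the spread."""
    values = [descriptor(seed) for seed in seeds]
    return {
        "values": values,
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        # Population standard deviation; 0.0 means every seed gave the same value.
        "stdev": statistics.pstdev(values),
    }


def toy_descriptor(seed: int) -> int:
    """Hypothetical stand-in for a seed-dependent descriptor; NOT nConf20."""
    rng = random.Random(seed)
    return 10 + rng.randint(-2, 2)


summary = spread_over_seeds(toy_descriptor, seeds=range(5))
print(summary)
```

If the spread is large relative to the differences between molecules, generating more conformers per molecule (at the cost of longer runs) is the remedy suggested in the message above.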
> On Wed, Aug 29, 2018 at 3:24 PM Ali Eftekhari <a.b.eftekh...@gmail.com>
> wrote:
>
>> Hi Dr. Cooper,
>>
>> Thanks for your response and the suggestions.  I added randomSeed=737 and
>> I now get a value of 14 for the nConf20 descriptor of the ZINC000290539224
>> molecule (it differs from the value of 10 in your paper, but it no longer
>> changes from run to run).  My concern now is the general usage of the
>> nConf20 descriptor.  For instance, is there a limitation on which molecules
>> it can be estimated for? Since the conformers are generated randomly, how
>> reliable is this descriptor as a replacement for Rotatable Bond Count (RBC)
>> in all machine learning models?
>>
>> In my application, the calculated values of RBC for 350 molecules range
>> from 0 to 7 (80% between 0-4 and 20% between 5-7).  The calculated
>> values of nConf20 are between 0-40, but with 95% between 0-3.  Since nConf20
>> for the majority of molecules is between 0-3, I am concerned about using
>> nConf20 as the main descriptor.  Could you please comment on that?
>>
>> Thanks,
>> Ali
>>
>> On Wed, Aug 29, 2018 at 6:32 AM Richard Cooper <
>> richardiancooper+rdkitdisc...@gmail.com> wrote:
>>
>>>
>>> Just to follow up with the details - here is the line in the script to
>>> change:
>>>
>>>    conformers = AllChem.EmbedMultipleConfs(molecule, numConfs, pruneRmsThresh=0.5, numThreads=3)
>>>
>>> to
>>>
>>>    conformers = AllChem.EmbedMultipleConfs(molecule, numConfs, pruneRmsThresh=0.5, numThreads=3, randomSeed=737)
>>>
>>> (where 737 is an integer constant of your choice, but not -1).
>>>
>>> Richard
>>>
>>>
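The change described above can be tried in isolation with a minimal sketch like the following. It assumes RDKit is installed and covers only the embedding step, not the rest of the nConf20 script. The helper name `embed_conformers` and the value of 50 conformers are assumptions made for illustration; the thread does not say what `numConfs` the original script uses.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def embed_conformers(smiles: str, num_confs: int = 50, seed: int = 737):
    """Embed conformers with a fixed random seed (hypothetical helper)."""
    # Add explicit hydrogens before 3D embedding.
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    conf_ids = AllChem.EmbedMultipleConfs(
        mol,
        numConfs=num_confs,   # assumed value, not taken from the original script
        pruneRmsThresh=0.5,
        numThreads=3,
        randomSeed=seed,      # any fixed integer other than -1
    )
    return mol, list(conf_ids)


# SMILES for ZINC000290539224, as quoted in the thread.
smiles = "CC(C)N2CC(NCc1cnc(C(C)O)s1)CC2=O"
mol_a, ids_a = embed_conformers(smiles)
mol_b, ids_b = embed_conformers(smiles)

# With the same seed, both runs should keep the same number of conformers
# after RMS pruning.
print(len(ids_a), len(ids_b))
```

A fixed seed makes a run repeatable; it does not by itself make the result match a published value obtained with a different seed or RDKit version.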
>>> On Tue, Aug 28, 2018 at 12:55 PM Richard Cooper <
>>> richardiancooper+rdkitdisc...@gmail.com> wrote:
>>> >
>>> > Hi Ali,
>>> >
>>> > Sorry I missed your email.
>>> >
>>> > The behaviour you describe is correct, due to a random seed in the
>>> > conformer generation step. The descriptor value usually doesn't vary by
>>> > too much.
>>> >
>>> > I think you can give the conformer generation a constant random seed
>>> > if you need a reproducible number for nConf20.
>>> >
>>> > Regards, Richard
>>> >
>>> >
>>> > On Tue, 28 Aug 2018, 00:25 Ali Eftekhari, <a.b.eftekh...@gmail.com>
>>> > wrote:
>>> >>
>>> >> Hello all,
>>> >>
>>> >> I am trying to calculate 3D descriptors following this publication:
>>> >> "Beyond Rotatable Bond Counts: Capturing 3D Conformational
>>> >> Flexibility in a Single Descriptor", Jerome G. P. Wicker and Richard I.
>>> >> Cooper.  J. Chem. Inf. Model. 2016, 56, 2347−2352
>>> >>
>>> >> I am essentially using the same script as in their supporting
>>> >> information, and I have attached it here as well.  In Table 2 of the above
>>> >> publication, the value of the descriptor (nConf20) for the ZINC000290539224
>>> >> molecule is listed as 10.  However, when I run exactly the code they
>>> >> used, I get a different value on each run.
>>> >>
>>> >> I have already contacted the authors but got no response.  I am
>>> >> wondering whether the code in the supporting information is wrong
>>> >> or the value listed in the table is wrong.
>>> >>
>>> >> The SMILES string for this particular molecule is:
>>> >> 'CC(C)N2CC(NCc1cnc(C(C)O)s1)CC2=O'
>>> >>
>>> >> Thanks in advance for your help!
>>> >>
>>>
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
