[Adding the mailing list back on]

On Mon, Dec 2, 2019 at 3:23 PM Thomas Evangelidis <teva...@gmail.com> wrote:

>
> Thank you for the corrections and the explanation! As stated in my
> original email, I want to add extra atomic properties, like the partial
> charges, as atom invariants and assess whether they are beneficial in terms
> of performance.
>

But if you add partial charges (a floating point number) then essentially
every atom is going to end up with its own invariant. That's unlikely to
end well.


> Therefore, I wanted first to re-implement the standard invariants in
> Python in order to have a reference point for comparison later.
>

Ok, that makes sense.


> Your corrections improved the overall agreement, but as you pointed out,
> it is not complete. Although with the two example molecules the
> substructures and hence the number of 'on' bits are the same, on a large
> scale, namely generating fingerprint to train an ML model, the number of
> invariant bits is 2166 using the original ECFP atom invariants, while with
> user-defined invariants are 2171. In terms of performance, the original is
> marginally better.
>

I'm going to guess, and without info on which molecules are generating
different numbers of bits that's all I can do, that this is a result of the
different hashing schemes for the atom invariants. If you really want to
track down what's going on, you'll have to figure out which molecules are
different and share those.


> Btw, why did you use
>
> if(ring_info.NumAtomRings(i)):
>     descriptors.append(1)
>
> and not
>
> descriptors.append(a.IsInRing())
>
> ? Your 2 lines do not add any value to the 'descriptors' list if the atom
> does not belong to a ring. Is this how it is in the original implementation?
>

Yeah, I just copied what's in the original implementation. The results
should be exactly the same.

-greg

>
>
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to