Re: [Rdkit-discuss] PMI API
Thanks Brian,

PBF = 0 <=> 2D and PBF > 0 <=> 3D. I forgot that point.

BR,
Dr. Guillaume GODIN
Re: [Rdkit-discuss] PMI API
In the inertial frame this is trivial. However, with the current RDKit, can't you just use the plane of best fit (PBF) to distinguish planar from 3D? For a linear molecule, you can use the PMI descriptors.

See PBF in RDKit: http://pubs.acs.org/doi/abs/10.1021/ci300293f

Cheers,
Brian
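A minimal sketch of the linear / planar / 3D distinction discussed above, for readers who want to see the idea in code. It is not RDKit's PBF or PMI implementation: it swaps in a plain rank test on the unweighted second-moment matrix of the coordinates, using only the standard library. The function name `classify_shape`, the tolerance `rel_tol` and the test coordinates are hypothetical choices made for illustration.

```python
def classify_shape(coords, rel_tol=1e-8):
    """Classify 3D points as 'point', 'linear', 'planar' or '3D'.

    Assumption: dimensionality is judged from the invariants of the
    unweighted second-moment matrix about the centroid. With eigenvalues
    l1, l2, l3 >= 0:
      trace            = l1 + l2 + l3
      sum of 2x2 minors = l1*l2 + l1*l3 + l2*l3  (zero if at most one l != 0)
      determinant      = l1*l2*l3                (zero if any l == 0)
    """
    n = len(coords)
    centroid = [sum(p[i] for p in coords) / n for i in range(3)]
    c = [[0.0] * 3 for _ in range(3)]
    for p in coords:
        d = [p[i] - centroid[i] for i in range(3)]
        for i in range(3):
            for j in range(3):
                c[i][j] += d[i] * d[j]

    trace = c[0][0] + c[1][1] + c[2][2]
    if trace <= 0.0:
        return "point"

    minors = (c[0][0] * c[1][1] - c[0][1] ** 2
              + c[0][0] * c[2][2] - c[0][2] ** 2
              + c[1][1] * c[2][2] - c[1][2] ** 2)
    det = (c[0][0] * (c[1][1] * c[2][2] - c[1][2] ** 2)
           - c[0][1] * (c[0][1] * c[2][2] - c[1][2] * c[0][2])
           + c[0][2] * (c[0][1] * c[1][2] - c[1][1] * c[0][2]))

    # Thresholds are relative to the overall size of the point cloud.
    if minors <= rel_tol * trace ** 2:
        return "linear"
    if det <= rel_tol * trace ** 3:
        return "planar"
    return "3D"


# Illustrative point sets (not real molecular geometries).
rod = [(0, 0, 0), (1, 1, 1), (2, 2, 2), (3.5, 3.5, 3.5)]        # along (1,1,1)
tilted_square = [(1, 0, 1), (-1, 0, -1), (0, 1, 0), (0, -1, 0)]  # in the plane z = x
tetrahedron = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]

for name, pts in [("rod", rod), ("tilted_square", tilted_square),
                  ("tetrahedron", tetrahedron)]:
    print(name, classify_shape(pts))
```

The relative tolerance is a judgment call: real conformers are rarely exactly planar, so a threshold on the PBF value (as suggested above) or a looser `rel_tol` would be needed in practice.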
Re: [Rdkit-discuss] PMI API
Great! I also noticed confusing usage of "moment of inertia" in those descriptors.

For example, in the WHIM case we need to know whether the molecule is linear, planar or 3D in order to compute the descriptors.

I have not found an easy way to determine this yet.

BR,
Dr. Guillaume GODIN
Re: [Rdkit-discuss] PMI API
I think we agree here. I was talking about the raw moment (M1z), not the moment of inertia (MI1); I should have made the distinction more explicit. Moments are not necessarily moments of inertia. The terminology gets confusing.

After a brief discussion with Greg: Moments.py does the correct calculation, which indirectly verifies MOE and the newer RDKit implementation.

Cheers,
Brian
Re: [Rdkit-discuss] PMI API
The dimensions along one of the axes of a planar molecule in its inertial frame will be zero, but the principal moments of inertia will all be non-zero. The moment of inertia about an axis can only be zero if all the atoms in the molecule are precisely aligned on that axis. That's only possible for linear molecules. There's no way to draw a straight-line axis through all the atoms in a non-linear molecule, which would be a requirement for the corresponding moment of inertia to be zero.

Chris
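The point above can be checked numerically. The sketch below builds the inertia tensor about the centre of mass and extracts its eigenvalues with the standard closed-form (trigonometric) solution for symmetric 3x3 matrices. It is a standard-library illustration, not the RDKit or Moments.py code; the function names and the idealised coordinates (a carbon-only hexagon of radius 1.4 angstrom, and an acetylene-like rod with approximate bond lengths) are assumptions made for this example.

```python
import math


def inertia_tensor(atoms):
    """Inertia tensor about the centre of mass; atoms = [(mass, x, y, z), ...]."""
    total = sum(a[0] for a in atoms)
    com = [sum(a[0] * a[k + 1] for a in atoms) / total for k in range(3)]
    t = [[0.0] * 3 for _ in range(3)]
    for m, x, y, z in atoms:
        r = (x - com[0], y - com[1], z - com[2])
        r2 = r[0] ** 2 + r[1] ** 2 + r[2] ** 2
        for i in range(3):
            for j in range(3):
                t[i][j] += m * ((r2 if i == j else 0.0) - r[i] * r[j])
    return t


def sym3_eigenvalues(a):
    """Eigenvalues (ascending) of a symmetric 3x3 matrix, closed-form solution."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # already diagonal
        return sorted(a[i][i] for i in range(3))
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # guard against round-off
    phi = math.acos(r) / 3.0
    e_max = q + 2.0 * p * math.cos(phi)
    e_min = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return [e_min, 3.0 * q - e_max - e_min, e_max]


# Idealised planar ring: six carbons (mass 12) on a circle of radius 1.4.
benzene_like = [(12.0, 1.4 * math.cos(k * math.pi / 3),
                 1.4 * math.sin(k * math.pi / 3), 0.0) for k in range(6)]
# Idealised linear molecule along z: H-C#C-H with approximate positions.
acetylene_like = [(1.008, 0.0, 0.0, -1.66), (12.0, 0.0, 0.0, -0.6),
                  (12.0, 0.0, 0.0, 0.6), (1.008, 0.0, 0.0, 1.66)]

pmi_planar = sym3_eigenvalues(inertia_tensor(benzene_like))
pmi_linear = sym3_eigenvalues(inertia_tensor(acetylene_like))
print("planar ring PMIs:", pmi_planar)
print("linear rod PMIs: ", pmi_linear)
```

For the ring all three principal moments are non-zero, with the largest about the axis perpendicular to the plane; only the rod has a vanishing smallest moment.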
Re: [Rdkit-discuss] PMI API
Looks like I'm late to the game. I don't know about the PMI descriptors per se, but if a planar molecule is in its inertial frame, one of the axes should be zero (whether it is x, y or z), which means that one of M1x, M1y or M1z should be zero.

We had some good experimentation with multipole expansion of moments (essentially based on the description of electrostatic multipoles) that might be nice to add to the PMI framework.

Greg, I'm assuming that the Moments.py we open-sourced a while back is similarly broken? I'm attaching it here for posterity, but it does appear to match the MOE PMIs.

Attachment: Moments.py
Re: [Rdkit-discuss] PMI API
The new version looks good to me as far as I can test it. PMI and NPR are still fine, the radius of gyration is right (for an extremely artificial test system) and the asphericity index also seems right (despite my best efforts to confuse things further - sorry about that!). This also highlights even more confusion in the Todeschini article - the approximate asphericity values for prolate and oblate molecules are reversed.

The only (very trivial) thing I've spotted is the comment in the inertialShapeFactor function. 'planar or no coordinates' should be 'linear or no coordinates' to avoid confusion.

Chris
Re: [Rdkit-discuss] PMI API
On Mon, Jan 16, 2017 at 10:22 AM, Chris Earnshaw wrote:
> Either way, it makes it rather hard to trust their derivations generally -
> especially as there appear to be other errors (e.g. the denominator in eq.
> 16 should be the square root of the given sum of squares, according to
> their reference).

Indeed. Given the problems encountered, I went back and checked some additional references to find definitions of the descriptors. The results are in this PR, which I'd love feedback on if you have time to take a look:
https://github.com/rdkit/rdkit/pull/1265

I didn't manage to find any information about "inertial shape factor" and don't have access to the references cited in the Todeschini paper, but I think the others are now reasonably reliable.

-greg
Re: [Rdkit-discuss] PMI API
Dear Guillaume,

Thanks - looks like we agree about reality (good!) and that Todeschini et al. are wrong in their discussion about planar molecules. Whether this is simply a mistaken assertion, or whether they've mixed up another quantity (e.g. the eigenvalues of the covariance matrix) with the PMIs, is impossible to say. Either way, it makes it rather hard to trust their derivations generally - especially as there appear to be other errors (e.g. the denominator in eq. 16 should be the square root of the given sum of squares, according to their reference).

Best regards,
Chris

Dr Chris Earnshaw
CGE Computational Chemistry
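One way to see how the eigenvalues of a covariance-type matrix could be confused with the PMIs is the identity I = trace(C) * identity - C, where C is the mass-weighted second-moment matrix about the centre of mass. The two matrices share principal axes, so each principal moment is trace(C) minus the corresponding eigenvalue of C. The sketch below is only an illustration of that identity: it uses the unnormalised matrix (a true covariance matrix would also divide by the total mass), the function names are hypothetical, and the water-like coordinates are approximate.

```python
def second_moment_matrix(atoms):
    """Mass-weighted second-moment matrix C = sum_i m_i r_i r_i^T about the
    centre of mass; atoms = [(mass, x, y, z), ...]."""
    total = sum(a[0] for a in atoms)
    com = [sum(a[0] * a[k + 1] for a in atoms) / total for k in range(3)]
    c = [[0.0] * 3 for _ in range(3)]
    for m, x, y, z in atoms:
        r = (x - com[0], y - com[1], z - com[2])
        for i in range(3):
            for j in range(3):
                c[i][j] += m * r[i] * r[j]
    return c


def inertia_from_second_moments(c):
    """Inertia tensor via the identity I = trace(C) * identity - C."""
    trace = c[0][0] + c[1][1] + c[2][2]
    return [[(trace if i == j else 0.0) - c[i][j] for j in range(3)]
            for i in range(3)]


# Water-like planar geometry lying in the xy-plane (illustrative values).
water_like = [
    (15.999, 0.0, 0.0, 0.0),
    (1.008, 0.757, 0.586, 0.0),
    (1.008, -0.757, 0.586, 0.0),
]

c = second_moment_matrix(water_like)
inertia = inertia_from_second_moments(c)

# z is the plane normal and a principal axis of both matrices here.
print("second moment along the plane normal:   ", c[2][2])
print("moment of inertia about the plane normal:", inertia[2][2])
```

For a planar molecule the second moment along the plane normal vanishes, while the moment of inertia about that same axis equals the sum of the two in-plane second moments and is therefore the largest of the three, not zero.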
Re: [Rdkit-discuss] PMI API
On Mon, Jan 16, 2017 at 9:36 AM, Chris Earnshaw wrote:
> Apologies - I appear to have opened a can of worms here...

No need whatsoever to apologize. You identified and pointed out a bug in the implementation of the new 3D descriptors, which is very much appreciated. The fact that I picked a seemingly unreliable source for the definitions of those descriptors, and that it's turning out to be more difficult than I might like to find reliable definitions for some of them, is just the way things are.

I'll have an updated version checked in (hopefully) in the next couple of hours. It would be great if you could take a look at it and let me know if it looks right.

-greg
Re: [Rdkit-discuss] PMI API
Dear Chris,

No problem, let me explain. I agree that for a monatomic species the centre of mass is the atom itself, so the moment of inertia is zero about every axis through it (Ix = 0 for all axes x).

Here I am considering only the mathematics, not the physics. I suspect that they (Todeschini) are not really computing the "real physical" PMI about the three axes, but have arbitrarily stated that for 2D molecules the PMI about the third axis is zero.

BR,
Dr. Guillaume GODIN
Re: [Rdkit-discuss] PMI API
On 16 January 2017 at 06:25, Guillaume GODIN wrote:
> Reading the Todeschini article carefully, they say that Ic, Ib, Ia are
> determined as the max and min values of I over all 3D axes passing through
> the center of mass!

I don't quite understand this comment. The inequality Ia <= Ib <= Ic is one of the errors in the Todeschini article pointed out by Greg yesterday. By definition, the Principal Moment of Inertia axes pass through the centre of mass.

> The "global PM" (sum of mi*ri*ri) is never zero, and the same holds for the
> PMi, even for a planar molecule.

The global Moment of Inertia is only zero for monatomics.

> But when you have a planar molecule, the matrix is no longer 3D but 2D! So
> it's normal to consider that the 3rd PM is zero.

I really don't understand this - it's simply wrong. The molecule may be 2D but the three principal moments of inertia are most definitely non-zero for a planar structure.

For a fully symmetrical molecule like benzene the largest PMI is around the axis perpendicular to the plane of the molecule and there are two equivalent, smaller PMIs around axes perpendicular to each other in the plane of the molecule. For a less symmetrical molecule like naphthalene, the largest PMI is again around the axis perpendicular to the plane, the intermediate PMI is around the axis along the fusion bond between the rings and the smallest PMI is around the long axis of the molecule.

There's no way it can be correct to consider the 3rd PMI as zero in any planar molecule - it's never equal to zero and is only degenerate with the 2nd PMI for fully symmetric molecules. Only in the special case of a completely linear molecule (e.g. acetylene, HCN) is the 3rd PMI (around the axis of the molecule) equal to zero.

Apologies - I appear to have opened a can of worms here...

Chris
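The planar-versus-linear argument in this message can be checked numerically. The following is a minimal sketch, not RDKit's implementation: it assumes idealised geometries (a regular hexagon of six carbon atoms and three collinear atoms), and the function name `principal_moments` and the chosen coordinates are hypothetical, introduced only for illustration.

```python
import numpy as np

def principal_moments(masses, coords):
    """Return the principal moments of inertia in ascending order."""
    m = np.asarray(masses, dtype=float)
    r = np.asarray(coords, dtype=float)
    # Shift the origin to the centre of mass.
    r = r - np.average(r, axis=0, weights=m)
    # Inertia tensor: I = sum_i m_i * (|r_i|^2 * E - r_i r_i^T)
    r2 = np.einsum("ij,ij->i", r, r)
    tensor = np.einsum("i,i->", m, r2) * np.eye(3) - np.einsum("i,ij,ik->jk", m, r, r)
    return np.linalg.eigvalsh(tensor)

# Idealised planar ring: six carbon atoms on a regular hexagon (radius 1.39 A).
angles = np.arange(6) * np.pi / 3.0
ring = np.column_stack([1.39 * np.cos(angles), 1.39 * np.sin(angles), np.zeros(6)])
ring_pmi = principal_moments([12.011] * 6, ring)

# Idealised linear molecule: three equal atoms on the x axis.
rod = np.array([[-1.2, 0.0, 0.0], [0.0, 0.0, 0.0], [1.2, 0.0, 0.0]])
rod_pmi = principal_moments([12.011] * 3, rod)

print("planar ring:", ring_pmi)  # three non-zero moments, the two smallest equal
print("linear rod: ", rod_pmi)   # smallest moment is zero, the other two equal
```

For the planar ring all three moments are non-zero and the largest equals the sum of the other two; only the collinear arrangement gives a zero moment, which is the behaviour described above.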
Re: [Rdkit-discuss] PMI API
No problem Greg,

Reading the Todeschini article carefully, they say that Ic, Ib, Ia are determined as the max and min values of I over all 3D axes passing through the center of mass!

The "global PM" (sum of mi*ri*ri) is never zero, and the same holds for the PMi, even for a planar molecule. But when you have a planar molecule, the matrix is no longer 3D but 2D! So it's normal to consider that the 3rd PM is zero.

BR,
Dr. Guillaume GODIN
Re: [Rdkit-discuss] PMI API
Thanks Guillaume!

On Sun, Jan 15, 2017 at 5:01 PM, Guillaume GODIN <guillaume.go...@firmenich.com> wrote:
> Here are the Dragon results for the 3 molecules: I've included both WHIM and 3D
> descriptors, but I don't have access to the PMIs!
>
> I found that the second document is in agreement with Peter's answer...
Re: [Rdkit-discuss] PMI API
On Sun, Jan 15, 2017 at 5:15 PM, Chris Earnshaw wrote:
> I've built a version of RDKit with fixes from
> https://github.com/greglandrum/rdkit/tree/fix/github1262 and can confirm
> that it gives exactly the same values of PMI and NPR that I got with the
> RDKit fork by 'hahnda6'. I can't say for certain that the PMI values are
> correct in absolute terms, but the NPR values are certainly what would be
> expected for those test molecules.

Glad to hear it.

> I'm worried about the Todeschini paper - I think there are errors in some
> of the equations and inconsistencies in the discussion, some of which may
> involve mixing up PMIs with eigenvalues of the covariance matrix.
> Unfortunately I don't have access to the original references so can't check
> in detail, but I'd be disinclined to take any of the equations at face
> value.

Ok. I'm going to have to see if I can track down some additional references and work from there.

-greg
Re: [Rdkit-discuss] PMI API
Thanks Greg,

I've built a version of RDKit with fixes from https://github.com/greglandrum/rdkit/tree/fix/github1262 and can confirm that it gives exactly the same values of PMI and NPR that I got with the RDKit fork by 'hahnda6'. I can't say for certain that the PMI values are correct in absolute terms, but the NPR values are certainly what would be expected for those test molecules.

I'm worried about the Todeschini paper - I think there are errors in some of the equations and inconsistencies in the discussion, some of which may involve mixing up PMIs with eigenvalues of the covariance matrix. Unfortunately I don't have access to the original references so can't check in detail, but I'd be disinclined to take any of the equations at face value.

Chris

On 15 January 2017 at 10:50, Greg Landrum wrote:
> I managed to make some time to look into this this weekend and I've found
> a bug and something I don't understand. Hopefully the community can help
> out here.
>
> On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw wrote:
>> 4) The big one! The returned results look very odd. They appear to relate
>> more to the dimensions of the molecule than the moments of inertia.
>>
>> For a rod-like molecule (dimethylacetylene) I'd expect two large and one
>> small PMI, e.g.
>>   PMI1: 6.61651  PMI2: 150.434  PMI3: 150.434  NPR1: 0.0439828  NPR2: 0.98
>> but actually get
>>   PMI1: 0.061647  PMI2: 0.061652  PMI3: 25.3699  NPR1: 0.002430  NPR2: 0.002430
>>
>> For disk-like (benzene) the result should be one large and two medium, e.g.
>>   PMI1: 89.1448  PMI2: 89.1495  PMI3: 178.294  NPR1: 0.499987  NPR2: 0.500013
>> but I get
>>   PMI1: 2.37457e-10  PMI2: 11.0844  PMI3: 11.0851  NPR1: 2.14213e-11  NPR2: 0.33
>>
>> Finally for a roughly spherical molecule (neopentane) the NPR values look
>> reasonable (no great surprise) but the absolute PMI values may be too small:
>>   old program - PMI1: 114.795  PMI2: 114.797  PMI3: 114.799  NPR1: 0.66  NPR2: 0.88
>>   new program - PMI1: 6.59466  PMI2: 6.59488  PMI3: 6.59531  NPR1: 0.02  NPR2: 0.35
>
> Your expectations are correct: the current RDKit implementation is wrong.
> The corresponding github entry is here:
> https://github.com/rdkit/rdkit/issues/1262
> This is due to a mistake in the way the principal moments are calculated
> (which is due to the fact that I don't spend a lot of time working
> with/thinking about 3D descriptors). Instead of using the
> eigenvectors/eigenvalues of the inertia matrix (the tensor of inertia) the
> RDKit is currently using the covariance matrix. There's some more on the
> relationship between these two here:
> http://number-none.com/blow/inertia/deriving_i.html
>
> The problem is easy to fix (and I have something working here:
> https://github.com/greglandrum/rdkit/tree/fix/github1262), but it screws
> up the values of the descriptors that are derived from here:
> Todeschini and Consonni, "Descriptors from Molecular Geometry", Handbook of
> Chemoinformatics, http://dx.doi.org/10.1002/9783527618279.ch37
> These include the radius of gyration, inertial shape factor, etc.
> Within that article they state that Ic = 0 for planar molecules. Ignoring
> the inequality on page 1010, which says that Ic is the largest moment and
> is contradicted by the rest of the text (particularly the inequalities on
> page 1011), Ic corresponds to the smallest principal moment: PMI1.
>
> So now I'm confused, but I'm hoping this is obvious to someone versed in
> the field: I'd like to reproduce the descriptors described in the
> Todeschini article, but I clearly can't do that using the actual moments of
> inertia. I could keep using the eigenvalues of the covariance matrix there,
> but that doesn't match what's described in the text.
>
> Two things that would be extremely helpful:
> 1) An explanation of the disconnect here from someone who knows this
>    stuff; I would guess that it's pretty simple.
> 2) The results of running the files github1262_1.mol, github1262_2.mol,
>    and github1262_3.mol from here:
>    https://github.com/greglandrum/rdkit/tree/fix/github1262/Code/GraphMol/MolTransforms/test_data
>    through Dragon and calculating the radius of gyration, inertial shape
>    factor, eccentricity, molecular asphericity, and spherocity index.
>
> Best,
> -greg
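The difference between the covariance matrix and the inertia tensor mentioned in Greg's quoted message can be made concrete. This is a minimal sketch under stated assumptions: it uses the unnormalised, mass-weighted second-moment matrix S = sum_i m_i r_i r_i^T as the "covariance" matrix (RDKit's actual normalisation may differ), an idealised hexagonal ring as the test geometry, and a hypothetical helper name `moment_matrices`. With that definition the inertia tensor is I = trace(S) * E - S, so each principal moment is the sum of the other two covariance eigenvalues.

```python
import numpy as np

def moment_matrices(masses, coords):
    """Return (second-moment matrix S, inertia tensor I) about the centre of mass."""
    m = np.asarray(masses, dtype=float)
    r = np.asarray(coords, dtype=float)
    r = r - np.average(r, axis=0, weights=m)
    # Mass-weighted second-moment ("covariance") matrix: S = sum_i m_i r_i r_i^T
    second = np.einsum("i,ij,ik->jk", m, r, r)
    # Inertia tensor: I = trace(S) * E - S
    inertia = np.trace(second) * np.eye(3) - second
    return second, inertia

# Idealised planar ring of six carbon atoms (radius 1.39 A) in the xy plane.
angles = np.arange(6) * np.pi / 3.0
ring = np.column_stack([1.39 * np.cos(angles), 1.39 * np.sin(angles), np.zeros(6)])
second, inertia = moment_matrices([12.011] * 6, ring)

cov_eig = np.linalg.eigvalsh(second)   # ascending order
pmi = np.linalg.eigvalsh(inertia)      # ascending order

print("covariance eigenvalues:", cov_eig)  # smallest is zero for a planar molecule
print("principal moments:     ", pmi)      # all non-zero
```

Under these assumptions the covariance matrix of a planar molecule does have a zero eigenvalue (there is no extent perpendicular to the plane), while the inertia tensor does not. That would be consistent with a statement such as "Ic = 0 for planar molecules" referring to covariance eigenvalues rather than to true moments of inertia, which is the mix-up suspected in this thread.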
Re: [Rdkit-discuss] PMI API
Here are the Dragon results for the 3 molecules: I've included both WHIM and 3D descriptors, but I don't have access to the PMIs!

I found that the second document is in agreement with Peter's answer...

BR,
Dr. Guillaume GODIN
Re: [Rdkit-discuss] PMI API
According to this:
https://en.wikipedia.org/wiki/List_of_moments_of_inertia
the moments of inertia of a disk (something like benzene) are:

Iz = mr^2/2
Ix = Iy = mr^2/4

None of them is zero. The smallest moment of inertia of a rod-like molecule (e.g. C#C) is zero.

Best,
Peter

On Sun, Jan 15, 2017 at 8:15 AM Greg Landrum wrote:
> Hi Guillaume,
>
> I think in this case it's something else. According to the Todeschini
> article the smallest moment of inertia of a planar molecule like benzene
> should be zero. The eigenvalues of the inertia matrix for benzene, however,
> are definitely not zero (and not close enough that it's likely to be
> round-off error).
> It would be very nice if you could run the three files I mention through
> Dragon and let me know what it calculates for those descriptors.
>
> -greg
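The disk values quoted by Peter (Iz = mr^2/2, Ix = Iy = mr^2/4) are an instance of the perpendicular-axis theorem, which holds for any planar mass distribution. A short derivation, assuming the molecule lies in the xy plane (z_i = 0) with the origin at the centre of mass; the symbols m_i, x_i and y_i are introduced here for illustration:

```latex
\begin{align}
I_x &= \sum_i m_i \,(y_i^2 + z_i^2) = \sum_i m_i\, y_i^2, \\
I_y &= \sum_i m_i \,(x_i^2 + z_i^2) = \sum_i m_i\, x_i^2, \\
I_z &= \sum_i m_i \,(x_i^2 + y_i^2) = I_x + I_y .
\end{align}
```

For the uniform disk this gives mr^2/4 + mr^2/4 = mr^2/2, as listed. With NPR1 = PMI1/PMI3 and NPR2 = PMI2/PMI3 it also gives NPR1 + NPR2 = 1 for any planar molecule, in line with the expected benzene values reported in this thread (0.499987 and 0.500013). An in-plane moment can vanish only if every atom lies on that axis, that is, if the molecule is linear.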
Re: [Rdkit-discuss] PMI API
Hi Guillaume,

I think in this case it's something else. According to the Todeschini article, the smallest moment of inertia of a planar molecule like benzene should be zero. The eigenvalues of the inertia matrix for benzene, however, are definitely not zero (and not close enough to zero for round-off error to be a likely cause).

It would be very nice if you could run the three files I mention through Dragon and let me know what it calculates for those descriptors.

-greg

From: Guillaume GODIN
Sent: Sunday, January 15, 2017 1:11 PM
Subject: RE: [Rdkit-discuss] PMI API
To: Greg Landrum, RDKit Discuss, Chris Earnshaw

Dear Greg,

I suspect that it's a precision error, or a difference in the eigenvalue algorithm between the RDKit C++ code and Dragon. To obtain good values, I suggest implementing a test on the eigenvalues like the one I used in the gateway.cpp implementation:

    JacobiSVD<MatrixXd> getSVD(MatrixXd A) {
      JacobiSVD<MatrixXd> mysvd(A, ComputeThinU | ComputeThinV);
      return mysvd;
    }

    // get the A^-1 (pseudo-inverse) matrix using the SVD
    MatrixXd GetPinv(MatrixXd A) {
      JacobiSVD<MatrixXd> svd = getSVD(A);
      double pinvtoler = 1.e-2;  // choose your tolerance wisely!
      VectorXd vs = svd.singularValues();
      VectorXd vsinv = svd.singularValues();
      for (unsigned int i = 0; i < vs.size(); ++i) {
        if (vs(i) > pinvtoler)
          vsinv(i) = 1.0 / vs(i);
        else
          vsinv(i) = 0.0;
      }
      MatrixXd S = vsinv.asDiagonal();
      MatrixXd Ap = svd.matrixV() * S * svd.matrixU().transpose();
      return Ap;
    }

If that doesn't solve the problem, I would like to test it in Matlab. Can you provide me with the three 3D xyz matrices of your examples, please? I also have Dragon 6.

Best regards,
Dr. Guillaume GODIN
Re: [Rdkit-discuss] PMI API
Dear Greg,

I suspect that it's a precision error, or a difference in the eigenvalue algorithm between the RDKit C++ code and Dragon. To obtain good values, I suggest implementing a test on the eigenvalues like the one I used in the gateway.cpp implementation:

    JacobiSVD<MatrixXd> getSVD(MatrixXd A) {
      JacobiSVD<MatrixXd> mysvd(A, ComputeThinU | ComputeThinV);
      return mysvd;
    }

    // get the A^-1 (pseudo-inverse) matrix using the SVD
    MatrixXd GetPinv(MatrixXd A) {
      JacobiSVD<MatrixXd> svd = getSVD(A);
      double pinvtoler = 1.e-2;  // choose your tolerance wisely!
      VectorXd vs = svd.singularValues();
      VectorXd vsinv = svd.singularValues();
      for (unsigned int i = 0; i < vs.size(); ++i) {
        if (vs(i) > pinvtoler)
          vsinv(i) = 1.0 / vs(i);
        else
          vsinv(i) = 0.0;
      }
      MatrixXd S = vsinv.asDiagonal();
      MatrixXd Ap = svd.matrixV() * S * svd.matrixU().transpose();
      return Ap;
    }

If that doesn't solve the problem, I would like to test it in Matlab. Can you provide me with the three 3D xyz matrices of your examples, please? I also have Dragon 6.

Best regards,
Dr. Guillaume GODIN

From: Greg Landrum
Sent: Sunday, January 15, 2017 11:50
To: Chris Earnshaw; RDKit Discuss
Subject: Re: [Rdkit-discuss] PMI API

I managed to make some time to look into this this weekend and I've found a bug and something I don't understand. Hopefully the community can help out here.
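For readers following the tolerance test in the GetPinv snippet above: the step that matters is taking the reciprocal of each singular value while mapping anything at or below the tolerance to zero. The sketch below isolates that step using only the standard library. `invertSingularValues` is a hypothetical helper name, not part of Eigen or the RDKit, and the singular values are assumed to have been computed elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an Eigen or RDKit function): returns 1/s for every
// singular value s strictly above `tol` and 0 for all others, which is the
// thresholding used when building a pseudo-inverse.
std::vector<double> invertSingularValues(const std::vector<double>& sv,
                                         double tol) {
  assert(tol >= 0.0);  // a negative tolerance could let a zero value through
  std::vector<double> inv(sv.size(), 0.0);
  for (std::size_t i = 0; i < sv.size(); ++i) {
    if (sv[i] > tol) {
      inv[i] = 1.0 / sv[i];
    }
  }
  return inv;
}
```

With the tolerance of 1.e-2 from the snippet, an input on the order of 1e-10 is mapped to 0 and never produces a huge reciprocal, which is why the choice of tolerance deserves the care the original comment asks for.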
Re: [Rdkit-discuss] PMI API
I managed to make some time to look into this this weekend and I've found a bug and something I don't understand. Hopefully the community can help out here.

On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw wrote:

> 4) The big one! The returned results look very odd. They appear to relate more to the dimensions of the molecule than the moments of inertia.
> For a rod-like molecule (dimethylacetylene) I'd expect two large and one small PMI (e.g. PMI1: 6.61651 PMI2: 150.434 PMI3: 150.434 NPR1: 0.0439828 NPR2: 0.999998) but actually get PMI1: 0.061647 PMI2: 0.061652 PMI3: 25.3699 NPR1: 0.002430 NPR2: 0.002430.
> For disk-like (benzene) the result should be one large and two medium (e.g. PMI1: 89.1448 PMI2: 89.1495 PMI3: 178.294 NPR1: 0.499987 NPR2: 0.500013) but get PMI1: 2.37457e-10 PMI2: 11.0844 PMI3: 11.0851 NPR1: 2.14213e-11 NPR2: 0.999933.
> Finally for a roughly spherical molecule (neopentane) the NPR values look reasonable (no great surprise) but the absolute PMI values may be too small: old program - PMI1: 114.795 PMI2: 114.797 PMI3: 114.799 NPR1: 0.999966 NPR2: 0.999988, new program - PMI1: 6.59466 PMI2: 6.59488 PMI3: 6.59531 NPR1: 0.999902 NPR2: 0.999935

Your expectations are correct: the current RDKit implementation is wrong. The corresponding github entry is here: https://github.com/rdkit/rdkit/issues/1262

This is due to a mistake in the way the principal moments are calculated (which is due to the fact that I don't spend a lot of time working with/thinking about 3D descriptors). Instead of using the eigenvectors/eigenvalues of the inertia matrix (the tensor of inertia), the RDKit is currently using the covariance matrix. There's some more on the relationship between these two here: http://number-none.com/blow/inertia/deriving_i.html

The problem is easy to fix (and I have something working here: https://github.com/greglandrum/rdkit/tree/fix/github1262), but it screws up the values of the descriptors that are derived from here:

Todeschini and Consonni, "Descriptors from Molecular Geometry", Handbook of Chemoinformatics
http://dx.doi.org/10.1002/9783527618279.ch37

These include the radius of gyration, inertial shape factor, etc. Within that article they state that Ic = 0 for planar molecules. Ignoring the inequality on page 1010, which says that Ic is the largest moment and is contradicted by the rest of the text (particularly the inequalities on page 1011), Ic corresponds to the smallest principal moment: PMI1.

So now I'm confused, but I'm hoping this is obvious to someone versed in the field: I'd like to reproduce the descriptors described in the Todeschini article, but I clearly can't do that using the actual moments of inertia. I could keep using the eigenvalues of the covariance matrix there, but that doesn't match what's described in the text.

Two things that would be extremely helpful:
1) An explanation of the disconnect here from someone who knows this stuff; I would guess that it's pretty simple.
2) The results of running the files github1262_1.mol, github1262_2.mol, and github1262_3.mol from here: https://github.com/greglandrum/rdkit/tree/fix/github1262/Code/GraphMol/MolTransforms/test_data through Dragon and calculating the radius of gyration, inertial shape factor, eccentricity, molecular asphericity, and spherocity index.

Best,
-greg
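The distinction Greg draws between the inertia tensor and the covariance matrix can be made concrete with a small sketch. It assumes point masses and uses an unnormalized, mass-weighted second-moment matrix S as a stand-in for the covariance matrix (how the RDKit code weights or normalizes its covariance matrix is not stated in this thread). `Atom`, `Vec3`, `Mat3`, `centredCoords`, `secondMomentMatrix` and `inertiaTensor` are names invented for the illustration. For coordinates taken relative to the centre of mass the two matrices satisfy I = trace(S) * identity - S, so they share eigenvectors and each principal moment equals trace(S) minus the corresponding eigenvalue of S.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative types only; none of these names come from the RDKit.
struct Atom {
  double mass, x, y, z;
};
using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;

// Coordinates of every atom relative to the centre of mass.
std::vector<Vec3> centredCoords(const std::vector<Atom>& atoms) {
  double total = 0.0;
  Vec3 com{{0.0, 0.0, 0.0}};
  for (const Atom& a : atoms) {
    total += a.mass;
    com[0] += a.mass * a.x;
    com[1] += a.mass * a.y;
    com[2] += a.mass * a.z;
  }
  assert(total > 0.0);  // at least one atom with positive mass is required
  std::vector<Vec3> out;
  out.reserve(atoms.size());
  for (const Atom& a : atoms) {
    out.push_back(Vec3{{a.x - com[0] / total, a.y - com[1] / total,
                        a.z - com[2] / total}});
  }
  return out;
}

// Mass-weighted second-moment matrix: S_ij = sum_k m_k * r_ki * r_kj.
Mat3 secondMomentMatrix(const std::vector<Atom>& atoms) {
  const std::vector<Vec3> r = centredCoords(atoms);
  Mat3 s{};  // value-initialized: all entries start at zero
  for (std::size_t k = 0; k < atoms.size(); ++k) {
    for (int i = 0; i < 3; ++i) {
      for (int j = 0; j < 3; ++j) {
        s[i][j] += atoms[k].mass * r[k][i] * r[k][j];
      }
    }
  }
  return s;
}

// Inertia tensor: I_ij = sum_k m_k * (|r_k|^2 * delta_ij - r_ki * r_kj).
Mat3 inertiaTensor(const std::vector<Atom>& atoms) {
  const std::vector<Vec3> r = centredCoords(atoms);
  Mat3 inertia{};
  for (std::size_t k = 0; k < atoms.size(); ++k) {
    const double r2 = r[k][0] * r[k][0] + r[k][1] * r[k][1] + r[k][2] * r[k][2];
    for (int i = 0; i < 3; ++i) {
      for (int j = 0; j < 3; ++j) {
        inertia[i][j] += atoms[k].mass * ((i == j ? r2 : 0.0) - r[k][i] * r[k][j]);
      }
    }
  }
  return inertia;
}
```

For four unit masses at (1, 0, 0), (-1, 0, 0), (0, 1, 0) and (0, -1, 0), S is diag(2, 2, 0) while the inertia tensor is diag(2, 2, 4). The planar system has a zero eigenvalue in S but no zero moment of inertia, and its largest moment is the sum of the other two, which is the same pattern as the benzene numbers quoted above.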
Re: [Rdkit-discuss] PMI API
I can confirm that removing the conditional compilation directive #ifdef RDK_BUILD_DESCRIPTORS3D (and the corresponding #endif) from PMI.h allows compilation without having to worry about the -DRDK_BUILD_DESCRIPTORS3D directive. I think this is a worthwhile change.

Any thoughts about the numerical PMI and NPR values produced by the current implementation? I still can't make any sense of them.

Chris

On 9 January 2017 at 05:18, Greg Landrum wrote:

> A more straightforward solution to this one, and what I probably should have done in the first place, would be to not include the conditional compilation directives in the PMI.h header file. It should be fine to have the declarations in the header even if there is no corresponding definition, and then client code wouldn't need to know about the extra options.
>
> PR coming.
Re: [Rdkit-discuss] PMI API
A more straightforward solution to this one, and what I probably should have done in the first place, would be to not include the conditional compilation directives in the PMI.h header file. It should be fine to have the declarations in the header even if there is no corresponding definition, and then client code wouldn't need to know about the extra options.

PR coming.

On Sun, Jan 8, 2017 at 7:17 PM, Brian Kelley wrote:

> I think the relevant issue is that if you are using an existing build, we don't yet have the capability for you to know what was built and what was not. I.e., you need to add the compiler flag to indicate that the 3D stuff was actually built.
>
> I had a PR to fix this a while ago that was postponed that we should probably resurrect. Basically it is an rdkit.h header file that has these flags built in so you won't have to include them yourself.
>
> Brian Kelley
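Greg's point that a declaration without a matching definition is harmless can be illustrated with a self-contained sketch. The namespace `demo` and both function names are invented for the example and do not correspond to anything in the RDKit headers. The file compiles and links because `demo::pmi1` is declared but never called; calling it would produce a link-time error about an undefined symbol, as opposed to the compile-time error caused by a declaration hidden behind an #ifdef.

```cpp
#include <cassert>

namespace demo {

// Declared unconditionally, as a header would do once the #ifdef is removed.
// No definition exists in this file, mimicking a build without the 3D code.
double pmi1(double x);

// Declared and defined, so it is always safe to call.
double ratio(double numerator, double denominator) {
  assert(denominator != 0.0);
  return numerator / denominator;
}

}  // namespace demo
```

The trade-off is that a mistake shows up at link time, where the message is usually less direct than a compiler error, but client code no longer has to know which options the library was built with.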
Re: [Rdkit-discuss] PMI API
Hi Brian & Greg,

Many thanks for the replies. I built RDKit with Descriptors3D enabled without any problems; what was troublesome was working out how to tell the compiler to process my own source code that uses the new functions. It would be very helpful if the need for the -DRDK_BUILD_DESCRIPTORS3D compiler directive was documented, e.g. with a comment near the top of PMI.h, at least until a better solution is in place.

Good to know that the expensive calculation is only done once.

Hope it won't be difficult to sort out the strange PMI & NPR values - please let me know if you need any more information from me.

Chris Earnshaw

On 8 Jan 2017 18:17, "Brian Kelley" wrote:

> I think the relevant issue is that if you are using an existing build, we don't yet have the capability for you to know what was built and what was not. I.e., you need to add the compiler flag to indicate that the 3D stuff was actually built.
>
> I had a PR to fix this a while ago that was postponed that we should probably resurrect. Basically it is an rdkit.h header file that has these flags built in so you won't have to include them yourself.
>
> Brian Kelley
Re: [Rdkit-discuss] PMI API
I think the relevant issue is that if you are using an existing build, we don't yet have the capability for you to know what was built and what was not. I.e., you need to add the compiler flag to indicate that the 3D stuff was actually built.

I had a PR to fix this a while ago that was postponed that we should probably resurrect. Basically it is an rdkit.h header file that has these flags built in so you won't have to include them yourself.

Brian Kelley

> On Jan 8, 2017, at 11:31 AM, Greg Landrum wrote:
>
> Hi Chris,
>
> The RDKit should automatically build with the new descriptors enabled if eigen3 can be found when cmake is run. When you run cmake you should see a message if/when the build is disabled.
>
> If you want to call the functions, the best documentation available is the standard C++ API documentation, but something seems to have gone wrong when I ran doxygen. I'll look into this. That documentation is generated from the header file, so you can just look there:
> https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/Descriptors/PMI.h
> not that there's a huge amount of documentation available.
>
> W.r.t. efficiency: you do need to call the functions individually, but the expensive calculation of the moments will only be done once, so it doesn't end up doing repeated work.
>
> And, finally, on the values themselves: I will have to take a look at that.
>
> -greg
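Brian's idea of a header that records which optional components were built could look roughly like the sketch below. The macro `DEMO_BUILD_DESCRIPTORS3D` and the function `hasDescriptors3D` are hypothetical names chosen for the illustration. In a real project the #define line would be written into a generated header by the build system when the optional dependency is found, so client code would not need to pass the flag on the compiler command line.

```cpp
// In a real build this line would live in a generated configuration header
// and would be present only when the optional 3D code was compiled.
#define DEMO_BUILD_DESCRIPTORS3D 1

// Lets client code ask, at compile time, whether the optional code was built.
constexpr bool hasDescriptors3D() {
#ifdef DEMO_BUILD_DESCRIPTORS3D
  return true;
#else
  return false;
#endif
}

// Client code can fail early with a readable message instead of a link error.
static_assert(hasDescriptors3D(), "this sketch assumes the 3D code was built");
```

Because the flag travels with the installed headers, it always matches the library that was actually built, which is the mismatch Chris ran into when compiling against an existing installation.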
Re: [Rdkit-discuss] PMI API
Hi Chris,

The RDKit should automatically build with the new descriptors enabled if eigen3 can be found when cmake is run. When you run cmake you should see a message if/when the build is disabled.

If you want to call the functions, the best documentation available is the standard C++ API documentation, but something seems to have gone wrong when I ran doxygen. I'll look into this. That documentation is generated from the header file, so you can just look there:
https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/Descriptors/PMI.h
Not that there's a huge amount of documentation available.

W.r.t. efficiency: you do need to call the functions individually, but the expensive calculation of the moments will only be done once, so it doesn't end up doing repeated work.

And, finally, on the values themselves: I will have to take a look at that.

-greg

On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw wrote:
> [...]
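As a small illustration of the "compute once, read several values" idea in this exchange, the sketch below derives NPR1 and NPR2 from three principal moments in a single step. ShapeSummary and summarizePMI are hypothetical names made up for this example; they are not RDKit types or functions. The only assumption about the descriptors is the usual definition NPR1 = PMI1/PMI3 and NPR2 = PMI2/PMI3, with the moments sorted in ascending order.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical container for illustration only; not part of RDKit.
struct ShapeSummary {
  double pmi1, pmi2, pmi3;  // principal moments, ascending
  double npr1, npr2;        // normalized ratios PMI1/PMI3 and PMI2/PMI3
};

// Sort the three moments once and derive both ratios from the sorted values.
ShapeSummary summarizePMI(std::array<double, 3> pmi) {
  std::sort(pmi.begin(), pmi.end());
  assert(pmi[2] > 0.0);  // the largest moment must be positive to normalize
  return ShapeSummary{pmi[0], pmi[1], pmi[2],
                      pmi[0] / pmi[2], pmi[1] / pmi[2]};
}
```

Fed with the reference moments quoted in the original post for dimethylacetylene (6.61651, 150.434, 150.434), this gives an NPR1 of about 0.04398, in line with the expected figure given there.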
Re: [Rdkit-discuss] PMI API
Hi David

Thanks for the rapid reply! Looks like a very useful document for people getting started with the RDKit C++ API. As you suspected, I'm slightly beyond that stage, having been an RDKit user for a number of years.

My queries are specifically to do with using the PMI functionality: most particularly why the numbers produced by the current implementation don't appear to match expected values for particular shapes of molecule, but also the lack of information about the PMI-related functions in the main RDKit C++ API documentation, and the apparently odd requirement for the -DRDK_BUILD_DESCRIPTORS3D flag (which looks more like a cmake directive) when compiling a program which uses GraphMol/Descriptors3D functions.

Cheers,
Chris

On 8 January 2017 at 12:13, David Cosgrove wrote:
> [...]
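One possible reason a -D flag could be needed at compile time is that the declarations in a header are wrapped in a preprocessor guard. That is an assumption here, not something confirmed in this thread. The sketch below shows the general mechanism with a made-up macro, DEMO_BUILD_DESCRIPTORS3D, and made-up functions in a demo namespace; none of these names come from RDKit.

```cpp
// Stand-in for passing -DDEMO_BUILD_DESCRIPTORS3D on the compiler command line.
// The macro name is hypothetical and chosen only for this example.
#define DEMO_BUILD_DESCRIPTORS3D

namespace demo {

#ifdef DEMO_BUILD_DESCRIPTORS3D
// These declarations exist only when the macro is defined. Without it, any
// call to pmi1Placeholder() would fail to compile with an undeclared-name error.
inline double pmi1Placeholder() { return 1.0; }
inline bool has3DDescriptors() { return true; }
#else
// Fallback seen by translation units compiled without the macro.
inline bool has3DDescriptors() { return false; }
#endif

}  // namespace demo
```

If a library header is guarded this way, user code has to be compiled with the same definition the library was built with, which a build system such as cmake would normally pass along automatically.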
Re: [Rdkit-discuss] PMI API
Hi Chris,

I can help a bit with the first point - I am currently 'porting' the Getting Started in Python part of the documentation to C++. There's a long way to go, but if you go to my fork of RDKit at https://github.com/DavidACosgrove and check out the GetStartedC++ branch, you can at least use what I've managed so far (https://github.com/DavidACosgrove/rdkit/blob/GetStartedC%2B%2B/Docs/Book/GettingStartedInC%2B%2B.md). It's pretty basic stuff that you may already be beyond, but there are some examples and a CMakeLists.txt file that builds them which might be helpful.

It's probably time I tidied it up (having just looked at it to get the link above, I see there's a typo in the first sentence, for example!) and sent in an interim Pull Request, as for people starting out it might already be of value.

Cheers,
Dave

On Sun, 8 Jan 2017 at 10:19, Chris Earnshaw wrote:
> [...]
[Rdkit-discuss] PMI API
Hi

A while ago I had a project which needed PMI descriptors (specifically NPR1 and NPR2) which were not available in the main branch of RDKit at the time. At the time I used the fork by 'hahnda6' which provided the calcPMIDescriptors() function, and this worked well. Now that PMI descriptors are available in the main RDKit distribution I thought I'd rewrite my code to use the official version.

Building the new RDKit was no problem, but things went downhill shortly after that. There's every chance that I've missed the relevant documentation (I hope someone can point me in the right direction if so) and done something stupid!

The issues are -

1) I can't find any documentation of the C++ API - the only reference to PMI in the online RDKit documentation appears to be to the PMI.h file.

2) Having written a program using the PMI[123] and/or NPR[12] functions, I couldn't get it to compile until I added the -DRDK_BUILD_DESCRIPTORS3D directive -

    g++ -o sdf_pmi_blob sdf_pmi.cpp -I/packages/rdkit/include/rdkit \
        -L/packages/rdkit/lib -lDescriptors -lGraphMol -lFileParsers \
        -Wno-deprecated -O2 -DRDK_BUILD_DESCRIPTORS3D

This seems a bit odd...

3) Is it necessary to make separate calls to the individual PMI() and/or NPR() functions? Surely this results in duplication of some of the heavier calculations? I can't find any equivalent of calcPMIDescriptors(), which returned a 'Moments' struct containing all the PMI and NPR values in one go.

4) The big one! The returned results look very odd. They appear to relate more to the dimensions of the molecule than the moments of inertia.

For a rod-like molecule (dimethylacetylene) I'd expect two large and one small PMI (e.g. PMI1: 6.61651 PMI2: 150.434 PMI3: 150.434 NPR1: 0.0439828 NPR2: 0.98) but actually get PMI1: 0.061647 PMI2: 0.061652 PMI3: 25.3699 NPR1: 0.002430 NPR2: 0.002430.

For disk-like (benzene) the result should be one large and two medium (e.g. PMI1: 89.1448 PMI2: 89.1495 PMI3: 178.294 NPR1: 0.499987 NPR2: 0.500013) but I get PMI1: 2.37457e-10 PMI2: 11.0844 PMI3: 11.0851 NPR1: 2.14213e-11 NPR2: 0.33.

Finally, for a roughly spherical molecule (neopentane) the NPR values look reasonable (no great surprise) but the absolute PMI values may be too small: old program - PMI1: 114.795 PMI2: 114.797 PMI3: 114.799 NPR1: 0.66 NPR2: 0.88, new program - PMI1: 6.59466 PMI2: 6.59488 PMI3: 6.59531 NPR1: 0.02 NPR2: 0.35

As I say, it's entirely likely that I'm doing something stupid here so any pointers will be gratefully received. FWIW, the core of my program is -

    mol = MolBlockToMol(ctab, true, false);
    double pmi1 = RDKit::Descriptors::PMI1(*mol);
    double pmi2 = RDKit::Descriptors::PMI2(*mol);
    double pmi3 = RDKit::Descriptors::PMI3(*mol);
    double npr1 = RDKit::Descriptors::NPR1(*mol);
    double npr2 = RDKit::Descriptors::NPR2(*mol);

Thanks for any help!
Chris
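The shape patterns expected in point 4 (rod: one small and two large moments; disk: two medium and one large) can be checked on idealized point masses. The sketch below is a standalone illustration, not the RDKit implementation: it builds the inertia tensor about the centre of mass and gets its eigenvalues from the closed-form trigonometric solution for a symmetric 3x3 matrix. PointMass, principalMoments, unitHexagon and diagonalRod are names invented for this example, and unit masses stand in for real atomic masses.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

struct PointMass { double x, y, z, m; };  // hypothetical type for this sketch

// Principal moments of inertia (ascending) about the centre of mass.
// Masses are assumed positive. Accuracy drops to roughly sqrt(machine epsilon)
// when two eigenvalues coincide, which is acceptable for an illustration.
std::array<double, 3> principalMoments(const std::vector<PointMass>& pts) {
  assert(!pts.empty());
  double M = 0, cx = 0, cy = 0, cz = 0;
  for (const auto& p : pts) { M += p.m; cx += p.m * p.x; cy += p.m * p.y; cz += p.m * p.z; }
  cx /= M; cy /= M; cz /= M;
  // Inertia tensor: I = sum m (|r|^2 * Identity - r r^T)
  double a00 = 0, a11 = 0, a22 = 0, a01 = 0, a02 = 0, a12 = 0;
  for (const auto& p : pts) {
    const double x = p.x - cx, y = p.y - cy, z = p.z - cz;
    a00 += p.m * (y * y + z * z);
    a11 += p.m * (x * x + z * z);
    a22 += p.m * (x * x + y * y);
    a01 -= p.m * x * y; a02 -= p.m * x * z; a12 -= p.m * y * z;
  }
  std::array<double, 3> e{a00, a11, a22};  // already the answer if diagonal
  const double p1 = a01 * a01 + a02 * a02 + a12 * a12;
  if (p1 > 0.0) {
    const double q = (a00 + a11 + a22) / 3.0;
    const double p2 = (a00 - q) * (a00 - q) + (a11 - q) * (a11 - q) +
                      (a22 - q) * (a22 - q) + 2.0 * p1;
    const double p = std::sqrt(p2 / 6.0);
    // B = (A - q * Identity) / p
    const double b00 = (a00 - q) / p, b11 = (a11 - q) / p, b22 = (a22 - q) / p;
    const double b01 = a01 / p, b02 = a02 / p, b12 = a12 / p;
    const double detB = b00 * (b11 * b22 - b12 * b12) -
                        b01 * (b01 * b22 - b12 * b02) +
                        b02 * (b01 * b12 - b11 * b02);
    const double r = std::max(-1.0, std::min(1.0, detB / 2.0));  // guard acos
    const double phi = std::acos(r) / 3.0;
    const double twoPiOver3 = 2.0 * std::acos(-1.0) / 3.0;
    e[2] = q + 2.0 * p * std::cos(phi);
    e[0] = q + 2.0 * p * std::cos(phi + twoPiOver3);
    e[1] = 3.0 * q - e[0] - e[2];
  }
  std::sort(e.begin(), e.end());
  return e;
}

// Disk-like test shape: six unit masses on a unit circle in the xy-plane.
std::vector<PointMass> unitHexagon() {
  std::vector<PointMass> pts;
  const double pi = std::acos(-1.0);
  for (int k = 0; k < 6; ++k)
    pts.push_back({std::cos(k * pi / 3.0), std::sin(k * pi / 3.0), 0.0, 1.0});
  return pts;
}

// Rod-like test shape: four unit masses along the (1,1,1) direction.
std::vector<PointMass> diagonalRod() {
  std::vector<PointMass> pts;
  for (double t : {-1.5, -0.5, 0.5, 1.5}) pts.push_back({t, t, t, 1.0});
  return pts;
}
```

For the hexagon the moments come out close to (3, 3, 6), so the largest equals the sum of the other two, as in the benzene reference values above. For the rod they come out close to (0, 15, 15): the moment about the rod's own axis vanishes and the other two are equal. A result with one near-zero value and two equal ones for a planar ring, like the new benzene numbers above, does not fit the pattern of a moment-of-inertia calculation.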