Re: [Rdkit-discuss] AdditionalOutput from FingerprintGenerator
Unfortunately it looks like the additional outputs for the Morgan and RDKit fingerprints are parts that weren't finished:
https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/Fingerprints/MorganGenerator.cpp#L143
https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/Fingerprints/RDKitFPGenerator.cpp#L99

I will take a look and see if it's possible to get these into the next release. In the meantime, if you want that info it looks like you'll need to use the older fingerprinting functions.

-greg

On Fri, Mar 13, 2020 at 11:10 PM Jason Biggs wrote:

> Thank you Greg.
>
> I am working in C++. I can poke around with this if I knew which members
> of the AdditionalOutput struct are used by which fingerprint generators. I
> just wanted to make sure there wasn't an explanation somewhere I missed.
>
> I can see that with the AtomPairs fingerprints I can do the following
>
>     // mol is an ROMol* and fpg is a FingerprintGenerator*
>     RDKit::AdditionalOutput ao;
>     std::vector<std::vector<std::uint64_t>> atomtobits(mol->getNumAtoms());
>     ao.atomToBits = &atomtobits;
>     auto res = fpg->getSparseCountFingerprint(*mol, nullptr, nullptr, -1, &ao);
>
> after which atomtobits contains a list of bits for each atom. From the
> comments I think the bitInfo member should be used by the
> RDKitFingerprintGenerator, but I don't see where it is used in the code.
> Is that the part that wasn't finished? Is it possible to get information
> about the atoms/environments that set particular bits in the Morgan or
> RDKit fingerprints using the new API?
>
> Jason Biggs

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] AdditionalOutput from FingerprintGenerator
Thank you Greg.

I am working in C++. I can poke around with this if I knew which members of the AdditionalOutput struct are used by which fingerprint generators. I just wanted to make sure there wasn't an explanation somewhere I missed.

I can see that with the AtomPairs fingerprints I can do the following

    // mol is an ROMol* and fpg is a FingerprintGenerator*
    RDKit::AdditionalOutput ao;
    std::vector<std::vector<std::uint64_t>> atomtobits(mol->getNumAtoms());
    ao.atomToBits = &atomtobits;
    auto res = fpg->getSparseCountFingerprint(*mol, nullptr, nullptr, -1, &ao);

after which atomtobits contains a list of bits for each atom. From the comments I think the bitInfo member should be used by the RDKitFingerprintGenerator, but I don't see where it is used in the code. Is that the part that wasn't finished? Is it possible to get information about the atoms/environments that set particular bits in the Morgan or RDKit fingerprints using the new API?

Jason Biggs

On Fri, Mar 13, 2020 at 10:20 AM Greg Landrum wrote:

> Hi Jason,
>
> At the moment there's nothing available here except what's in the C++
> tests. This part of the code didn't end up being completely finished before
> the GSoC project ended and it's never bubbled up on my priority list to
> finish it.
>
> I haven't spent much time with this code, but I can probably put together
> an example.
> Are you working from C++?
>
> -greg
Re: [Rdkit-discuss] Building RDKit from source under Ubuntu VM, ctest 87 tests failed out of 165. Any advice much appreciated.
Paolo,

Thank you for your rapid reply. It's a great suggestion, we are getting somewhere. I ran the first Python test with the "-V" (verbose?) option and, as you can see from the output below, Python is having a problem finding the rdkit module. Do you have any suggestions on how to fix this? Thank you so much.

Earl Higgins

$ RDBASE=~/conda-rdkit/rdkit ctest -I 2,2 -V
UpdateCTestConfiguration from :/home/deep/conda-rdkit/rdkit/build/DartConfiguration.tcl
Parse Config file:/home/deep/conda-rdkit/rdkit/build/DartConfiguration.tcl
Add coverage exclude regular expressions.
SetCTestConfiguration:CMakeCommand:/home/deep/anaconda3/bin/cmake
UpdateCTestConfiguration from :/home/deep/conda-rdkit/rdkit/build/DartConfiguration.tcl
Parse Config file:/home/deep/conda-rdkit/rdkit/build/DartConfiguration.tcl
Test project /home/deep/conda-rdkit/rdkit/build
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 2
    Start 2: pyCoordGen
2: Test command: /home/deep/anaconda3/bin/python "/home/deep/conda-rdkit/rdkit/External/CoordGen/Wrap/testCoordGen.py"
2: Test timeout computed to be: 1500
2: Traceback (most recent call last):
2:   File "/home/deep/conda-rdkit/rdkit/External/CoordGen/Wrap/testCoordGen.py", line 13, in <module>
2:     from rdkit.Chem import rdCoordGen, rdMolAlign
2: ModuleNotFoundError: No module named 'rdkit'
1/1 Test #2: pyCoordGen ...***Failed 0.07 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) = 0.09 sec

The following tests FAILED:
    2 - pyCoordGen (Failed)
Errors while running CTest

From: Paolo Tosco

> Dear Earl,
>
> given that all Python tests are failing my guess is that you might be
> running the tests with a Python interpreter different from the one you
> have built RDKit against. Re-run one of the failing tests with -V
>
>     ctest -I 2,2 -V
>
> to get some more information.
>
> Cheers,
> p.
>
> On 13/03/2020 15:56, Earl Higgins wrote:
>
>> I am new to RDKit, and my goal is to be able to build it from source ...
>> and I'm surprised at 87/165 test case failures right out of the box ...
>> Any guidance anyone could offer in this area would be most appreciated.
>> Thank you in advance...

This message and any attachment are confidential and may be privileged or otherwise protected from disclosure. If you are not the intended recipient, you must not copy this message or attachment or disclose the contents to any other person. If you have received this transmission in error, please notify the sender immediately and delete the message and any attachment from your system. Merck KGaA, Darmstadt, Germany and any of its subsidiaries do not accept liability for any omissions or errors in this message which may arise as a result of E-Mail-transmission or for damages resulting from any unauthorized changes of the content of this message and any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its subsidiaries do not guarantee that this message is free of viruses and does not accept liability for any damages caused by any virus transmitted therewith. Click http://www.merckgroup.com/disclaimer to access the German, French, Spanish and Portuguese versions of this disclaimer.
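The ModuleNotFoundError can also be checked outside of ctest. The snippet below is only a sketch: it asks the shell which Python it resolves and whether that interpreter can import rdkit. The PY and RDKIT_IMPORT variable names and the fallback from python to python3 are choices made for this illustration, not something from the thread.

```shell
# Find the interpreter the shell would run; PY stays empty if none is on PATH.
PY="$(command -v python || command -v python3 || true)"

if [ -n "$PY" ]; then
    echo "interpreter: $PY"
    # Try the import that the failing Python tests rely on.
    if "$PY" -c 'import rdkit' 2>/dev/null; then
        RDKIT_IMPORT=ok
    else
        RDKIT_IMPORT=missing
    fi
else
    RDKIT_IMPORT=no-python
fi
echo "rdkit import: $RDKIT_IMPORT"
```

If the interpreter printed here differs from the one on the "Test command:" line of the ctest -V output, the shell check says little about what ctest runs, and the import should be tried with the interpreter named there instead.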
Re: [Rdkit-discuss] Building RDKit from source under Ubuntu VM, ctest 87 tests failed out of 165. Any advice much appreciated.
Dear Earl,

it looks like you might only need to add $RDBASE to your PYTHONPATH.

p.

On 13/03/2020 16:38, Earl Higgins wrote:

> Paolo,
>
> Thank you for your rapid reply. It's a great suggestion, we are getting
> somewhere. I ran the first Python test with the "-V" (verbose?) option
> and, as you can see from the output below, Python is having a problem
> finding the rdkit module. Do you have any suggestions on how to fix this?
> Thank you so much.
>
> Earl Higgins
>
> $ RDBASE=~/conda-rdkit/rdkit ctest -I 2,2 -V
> UpdateCTestConfiguration from :/home/deep/conda-rdkit/rdkit/build/DartConfiguration.tcl
> Parse Config file:/home/deep/conda-rdkit/rdkit/build/DartConfiguration.tcl
> Add coverage exclude regular expressions.
> SetCTestConfiguration:CMakeCommand:/home/deep/anaconda3/bin/cmake
> UpdateCTestConfiguration from :/home/deep/conda-rdkit/rdkit/build/DartConfiguration.tcl
> Parse Config file:/home/deep/conda-rdkit/rdkit/build/DartConfiguration.tcl
> Test project /home/deep/conda-rdkit/rdkit/build
> Constructing a list of tests
> Done constructing a list of tests
> Updating test list for fixtures
> Added 0 tests to meet fixture requirements
> Checking test dependency graph...
> Checking test dependency graph end
> test 2
>     Start 2: pyCoordGen
> 2: Test command: /home/deep/anaconda3/bin/python "/home/deep/conda-rdkit/rdkit/External/CoordGen/Wrap/testCoordGen.py"
> 2: Test timeout computed to be: 1500
> 2: Traceback (most recent call last):
> 2:   File "/home/deep/conda-rdkit/rdkit/External/CoordGen/Wrap/testCoordGen.py", line 13, in <module>
> 2:     from rdkit.Chem import rdCoordGen, rdMolAlign
> 2: ModuleNotFoundError: No module named 'rdkit'
> 1/1 Test #2: pyCoordGen ...***Failed 0.07 sec
>
> 0% tests passed, 1 tests failed out of 1
>
> Total Test time (real) = 0.09 sec
>
> The following tests FAILED:
>     2 - pyCoordGen (Failed)
> Errors while running CTest
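In bash, Paolo's suggestion might look like the sketch below. It assumes the source tree lives at ~/conda-rdkit/rdkit, as in Earl's output, and that the tests are run from its build directory. The guard around ctest is added here only so that the snippet does nothing outside such a checkout.

```shell
# Point RDBASE at the RDKit source tree (path as in Earl's output;
# adjust it to your own checkout).
export RDBASE="$HOME/conda-rdkit/rdkit"

# Put $RDBASE first on PYTHONPATH so "import rdkit" resolves to this tree.
# Any existing PYTHONPATH entries are kept after it.
export PYTHONPATH="$RDBASE${PYTHONPATH:+:$PYTHONPATH}"

# Re-run the single failing Python test verbosely, only if the build tree exists.
if [ -d "$RDBASE/build" ]; then
    (cd "$RDBASE/build" && ctest -I 2,2 -V)
fi
```

Exporting the variables in the shell, rather than prefixing a single ctest command with them, keeps both set for later ctest runs in the same session.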
Re: [Rdkit-discuss] Building RDKit from source under Ubuntu VM, ctest 87 tests failed out of 165. Any advice much appreciated.
Dear Earl,

given that all Python tests are failing my guess is that you might be running the tests with a Python interpreter different from the one you have built RDKit against. Re-run one of the failing tests with -V

    ctest -I 2,2 -V

to get some more information.

Cheers,
p.
[Rdkit-discuss] Building RDKit from source under Ubuntu VM, ctest 87 tests failed out of 165. Any advice much appreciated.
I am new to RDKit, and my goal is to be able to build it from source under Ubuntu 18.04.4 LTS running in a VM using Oracle VirtualBox (host Windows 10). I need to be able to build from source because, as a developer, I am on a team which is looking at making some enhancements to the MOL file load support, possibly adding support for OpenSMILES as a distinct dialect of SMILES.

So, following the Linux/Python 3 instructions at https://www.rdkit.org/docs/Install.html#how-to-build-from-source-with-conda , I am able to download and install everything fine until I get to the step to run ctest. I get:

$ ctest
CMake Error at /home/deep/conda-rdkit/rdkit/build/CTestCustom.ctest:3 (MESSAGE):
  Please set your RDBASE env variable before running the tests.

Problem reading custom configuration: /home/deep/conda-rdkit/rdkit/build/CTestCustom.ctest
Test project /home/deep/conda-rdkit/rdkit/build
No tests were found!!!
$

So I run:

$ RDBASE=~/conda-rdkit/rdkit ctest

When I do that, I get:

$ RDBASE=~/conda-rdkit/rdkit ctest
Test project /home/deep/conda-rdkit/rdkit/build
        Start   1: testCoordGen
  1/165 Test   #1: testCoordGen ... Passed 0.37 sec
        Start   2: pyCoordGen
  2/165 Test   #2: pyCoordGen ...***Failed 0.05 sec
        Start   3: testDict
  3/165 Test   #3: testDict ... Passed 0.63 sec
        Start   4: testRDValue
  4/165 Test   #4: testRDValue ... Passed 0.00 sec
        Start   5: testDataStructs
  5/165 Test   #5: testDataStructs ... Passed 0.01 sec
        Start   6: testFPB
  6/165 Test   #6: testFPB ... Passed 0.01 sec
        Start   7: testMultiFPB
  7/165 Test   #7: testMultiFPB ... Passed 0.04 sec
        Start   8: pyBV
  8/165 Test   #8: pyBV ...***Failed 0.05 sec
        Start   9: pyDiscreteValueVect
  9/165 Test   #9: pyDiscreteValueVect ...***Failed 0.09 sec
        Start  10: pySparseIntVect
 10/165 Test  #10: pySparseIntVect ...***Failed 0.06 sec
        Start  11: pyFPB
 11/165 Test  #11: pyFPB ...***Failed 0.05 sec
        Start  12: testTransforms
 12/165 Test  #12: testTransforms ... Passed 0.01 sec
        Start  13: testGrid
 13/165 Test  #13: testGrid ... Passed 0.04 sec
        Start  14: geometryTestsCatch
(Lines omitted)
161/165 Test #161: pythonTestDirDbase ...***Failed 0.03 sec
        Start 162: pythonTestDirSimDivFilters
162/165 Test #162: pythonTestDirSimDivFilters ...***Failed 0.04 sec
        Start 163: pythonTestDirVLib
163/165 Test #163: pythonTestDirVLib ...***Failed 0.04 sec
        Start 164: pythonTestDirChem
164/165 Test #164: pythonTestDirChem ...***Failed 0.06 sec
        Start 165: pythonTestSping
165/165 Test #165: pythonTestSping ...***Failed 0.03 sec

47% tests passed, 87 tests failed out of 165

Total Test time (real) = 48.62 sec

The following tests FAILED:
      2 - pyCoordGen (Failed)
      8 - pyBV (Failed)
      9 - pyDiscreteValueVect (Failed)
     10 - pySparseIntVect (Failed)
     11 - pyFPB (Failed)
     15 - testPyGeometry (Failed)
     19 - pyAlignment (Failed)
     22 - testMMFFForceField (Child aborted)
     23 - pyForceFieldConstraints (Failed)
     25 - pyDistGeom (Failed)
     28 - graphmolqueryTest (Child aborted)
     29 - graphmolMolOpsTest (Child aborted)
     31 - graphmoltestPickler (Child aborted)
     34 - hanoiTest (Child aborted)
(Lines omitted)
    157 - pyFeatures (Failed)
    158 - pythonTestDbCLI (Failed)
    159 - pythonTestDirML (Failed)
    160 - pythonTestDirDataStructs (Failed)
    161 - pythonTestDirDbase (Failed)
    162 - pythonTestDirSimDivFilters (Failed)
    163 - pythonTestDirVLib (Failed)
    164 - pythonTestDirChem (Failed)
    165 - pythonTestSping (Failed)
Errors while running CTest
$

My impression is RDKit is quite robust, well tested and portable and I'm surprised at 87/165 test case failures right out of the box. I'm very comfortable with the UNIX environment, bash command line, makefiles, compiling and linking C and C++ software, so I don't think I'm making any silly newbie mistakes. Perhaps it's my unfamiliarity with Anaconda, cmake and specifically the ctest utility, but it is unclear to me how I can get further information about the cause of each of the 87 test case failures. Do I need to connect with the my.CDash.org server in order t
Re: [Rdkit-discuss] AdditionalOutput from FingerprintGenerator
Hi Jason,

At the moment there's nothing available here except what's in the C++ tests. This part of the code didn't end up being completely finished before the GSoC project ended and it's never bubbled up on my priority list to finish it.

I haven't spent much time with this code, but I can probably put together an example.
Are you working from C++?

-greg

On Thu, Mar 12, 2020 at 10:42 PM Jason Biggs wrote:

> I am taking a look at the FingerprintGenerator class and I really like
> this unified interface for these four types of fingerprints. I have very
> limited experience with the fingerprint code before the generator API was
> introduced.
>
> What I'm not sure about is how to get information about the
> atoms/environments that set the bits. I believe I need to use the
> AdditionalOutput struct,
> https://www.rdkit.org/docs/cppapi/structRDKit_1_1AdditionalOutput.html,
> but I'm not exactly sure how to do so. I normally would look at the C++
> test files to see how it is used, and from that I see the atomToBits member
> is used in the atom pairs fingerprints, but I'm not sure about the other
> members of this struct. For example there is a bitInfo member, is this
> where I would find information for the RDKit and Morgan fingerprints?
>
> Are there any examples somewhere that I could follow to find out more
> information?
>
> Thank you
>
> Jason