Re: [Rdkit-discuss] substructure of a fingerprint position
https://iwatobipen.wordpress.com/2017/01/08/get-bit-information-with-rdkit/

George.

Sent from my giPhone

> On 26 Jan 2017, at 11:02, Gonzalo Colmenarejo wrote:
>
> Hi,
>
> is there a way in RDKit to retrieve the substructure(s) corresponding to a
> (hashed or unhashed) Morgan fingerprint position?
>
> Thanks a lot in advance
>
> Gonzalo

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
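
For readers who cannot follow the link, here is a minimal sketch of one way to go from a Morgan bit back to the atom environments that set it. It is not taken from the linked post. It assumes the bitInfo argument of AllChem.GetMorganFingerprintAsBitVect (newer RDKit releases may emit a deprecation warning for this call), and the helper name morgan_bit_fragments is hypothetical. The test molecule is borrowed from another thread on this list. With a hashed fingerprint, several different environments can collide on the same bit, so each bit maps to a list.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def morgan_bit_fragments(mol, radius=2, n_bits=2048):
    """Map each set bit to the SMILES of the atom environments that set it.

    Hypothetical helper for illustration only.
    """
    bit_info = {}
    # bitInfo is filled as {bit: ((center_atom_idx, env_radius), ...)}
    AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits, bitInfo=bit_info)

    fragments = {}
    for bit, environments in bit_info.items():
        smiles = set()
        for atom_idx, env_radius in environments:
            if env_radius == 0:
                # A radius-0 environment is just the central atom.
                smiles.add(mol.GetAtomWithIdx(atom_idx).GetSymbol())
                continue
            # Bond indices within env_radius bonds of the central atom.
            bond_ids = Chem.FindAtomEnvironmentOfRadiusN(mol, env_radius, atom_idx)
            submol = Chem.PathToSubmol(mol, bond_ids)
            smiles.add(Chem.MolToSmiles(submol))
        fragments[bit] = sorted(smiles)
    return fragments


mol = Chem.MolFromSmiles("CC1=C(C(=O)C)C=CC=C1")
fragments = morgan_bit_fragments(mol)
for bit in sorted(fragments):
    print(bit, fragments[bit])
```

For the unhashed fingerprint, AllChem.GetMorganFingerprint accepts the same bitInfo argument, and the rest of the sketch should carry over unchanged.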
Re: [Rdkit-discuss] Rdkit atom indexing vs indexing in written pdb file
Thank you very much Andrew! Indeed, I did not spot the pattern - how silly of me!

From: Andrew Dalke [da...@dalkescientific.com]
Sent: 01 February 2017 16:49
To: Susan Leung
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Rdkit atom indexing vs indexing in written pdb file

Dear Susan,

If I understand what's going on correctly, you have run across the difference between 0-based and 1-based indexing. See https://en.wikipedia.org/wiki/Zero-based_numbering .

RDKit, like most programming libraries and languages, indexes based on an offset from the beginning, so 0 means the beginning, 1 means one after the beginning, etc.

This is somewhat like how some buildings use "1" as the first floor above the ground, while others regard "1" as the ground floor, which is confusing if you are not used to it. (My apartment number says it's on the second floor, while the elevator button says I live on floor 3.)

On Feb 1, 2017, at 5:15 PM, Susan Leung wrote:
> I am producing rdkit conformers and writing them to pdb files but am finding
> the atom indexing in rdkit is different from the written pdb.
  ...
> Here is my code and output (the C=O looks like it's atoms 3,4 in rdkit but
> 4,5 in the pdb file):
  ...
> In [3]: mol = Chem.MolFromSmiles("CC1=C(C(=O)C)C=CC=C1")
  ...
> In [4]: mol.GetSubstructMatch(Chem.MolFromSmiles('C(=O)'))
> Out[4]: (3, 4)
  ...
>    record_name  atom_number  blank_1  atom_name  alt_loc  residue_name  blank_2 \
> 0       HETATM            1                  C1                    UNL
> 1       HETATM            2                  C2                    UNL
> 2       HETATM            3                  C3                    UNL
> 3       HETATM            4                  C4                    UNL
> 4       HETATM            5                  O1                    UNL
> 5       HETATM            6                  C5                    UNL
> 6       HETATM            7                  C6                    UNL
> 7       HETATM            8                  C7                    UNL
> 8       HETATM            9                  C8                    UNL
> 9       HETATM           10                  C9                    UNL

If I understand you correctly, then the "(3, 4)" as RDKit atom indices is (3+1, 4+1) = (4,5) as PDB atom number, that is, the RDKit indices correspond to the left-most column of your table, rather than the atom_number column.

Cheers,

Andrew
da...@dalkescientific.com
Re: [Rdkit-discuss] Rdkit atom indexing vs indexing in written pdb file
Dear Susan,

If I understand what's going on correctly, you have run across the difference between 0-based and 1-based indexing. See https://en.wikipedia.org/wiki/Zero-based_numbering .

RDKit, like most programming libraries and languages, indexes based on an offset from the beginning, so 0 means the beginning, 1 means one after the beginning, etc.

This is somewhat like how some buildings use "1" as the first floor above the ground, while others regard "1" as the ground floor, which is confusing if you are not used to it. (My apartment number says it's on the second floor, while the elevator button says I live on floor 3.)

On Feb 1, 2017, at 5:15 PM, Susan Leung wrote:
> I am producing rdkit conformers and writing them to pdb files but am finding
> the atom indexing in rdkit is different from the written pdb.
  ...
> Here is my code and output (the C=O looks like it's atoms 3,4 in rdkit but
> 4,5 in the pdb file):
  ...
> In [3]: mol = Chem.MolFromSmiles("CC1=C(C(=O)C)C=CC=C1")
  ...
> In [4]: mol.GetSubstructMatch(Chem.MolFromSmiles('C(=O)'))
> Out[4]: (3, 4)
  ...
>    record_name  atom_number  blank_1  atom_name  alt_loc  residue_name  blank_2 \
> 0       HETATM            1                  C1                    UNL
> 1       HETATM            2                  C2                    UNL
> 2       HETATM            3                  C3                    UNL
> 3       HETATM            4                  C4                    UNL
> 4       HETATM            5                  O1                    UNL
> 5       HETATM            6                  C5                    UNL
> 6       HETATM            7                  C6                    UNL
> 7       HETATM            8                  C7                    UNL
> 8       HETATM            9                  C8                    UNL
> 9       HETATM           10                  C9                    UNL

If I understand you correctly, then the "(3, 4)" as RDKit atom indices is (3+1, 4+1) = (4,5) as PDB atom number, that is, the RDKit indices correspond to the left-most column of your table, rather than the atom_number column.

Cheers,

Andrew
da...@dalkescientific.com
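
The offset described in this reply can be written down as a small sketch. The two helper names below are hypothetical. The conversion assumes the PDB file lists the atoms in RDKit order with serial numbers starting at 1, which is what the table in the question shows. Files that have been renumbered, or that contain extra records between atoms, may not follow this rule.

```python
def rdkit_indices_to_pdb_serials(atom_indices):
    """Convert 0-based RDKit atom indices to 1-based PDB atom numbers.

    Assumes atoms appear in the PDB file in RDKit order, numbered 1, 2, 3, ...
    """
    return tuple(idx + 1 for idx in atom_indices)


def pdb_serials_to_rdkit_indices(serials):
    """Inverse mapping, under the same assumption."""
    return tuple(serial - 1 for serial in serials)


match = (3, 4)  # the GetSubstructMatch result quoted in the thread
print(rdkit_indices_to_pdb_serials(match))  # (4, 5)
```

Keeping the conversion in one named function makes it harder to apply the +1 twice, or not at all, when indices pass between RDKit and tools that read the PDB file.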
[Rdkit-discuss] Rdkit atom indexing vs indexing in written pdb file
Dear all,

I am producing rdkit conformers and writing them to pdb files but am finding the atom indexing in rdkit is different from the written pdb. I would like the two to agree, because I want to use a substructure search (in rdkit) to get a handle on these atoms in the pdb file.

Apologies if this has been discussed before.

Here is my code and output (the C=O looks like it's atoms 3,4 in rdkit but 4,5 in the pdb file):

Thanks,

Susan

In [1]: import rdkit

In [2]: from rdkit import Chem
   ...: from rdkit.Chem import AllChem
   ...: from rdkit.Chem.Draw import IPythonConsole
   ...:

In [3]: mol = Chem.MolFromSmiles("CC1=C(C(=O)C)C=CC=C1")
   ...: idx = AllChem.EmbedMultipleConfs(mol,numConfs=1,randomSeed=0xf00d,
   ...:     useExpTorsionAnglePrefs=True,useBasicKnowledge=True)
   ...:

In [4]: mol.GetSubstructMatch(Chem.MolFromSmiles('C(=O)'))
Out[4]: (3, 4)

In [5]: Chem.MolToPDBFile(mol,'./test.pdb')

In [6]: import biopandas
   ...: from biopandas.pdb import PandasPDB
   ...: ppdb = PandasPDB()
   ...: ppdb.read_pdb('./test.pdb')
   ...: ppdb.df['HETATM']
   ...:
Out[6]:
   record_name  atom_number  blank_1  atom_name  alt_loc  residue_name  blank_2 \
0       HETATM            1                  C1                    UNL
1       HETATM            2                  C2                    UNL
2       HETATM            3                  C3                    UNL
3       HETATM            4                  C4                    UNL
4       HETATM            5                  O1                    UNL
5       HETATM            6                  C5                    UNL
6       HETATM            7                  C6                    UNL
7       HETATM            8                  C7                    UNL
8       HETATM            9                  C8                    UNL
9       HETATM           10                  C9                    UNL

   chain_id  residue_number  insertion  ...  x_coord  y_coord  z_coord \
0                         1             ...    0.176    1.911    1.137
1                         1             ...   -0.513    0.759    0.511
2                         1             ...    0.272   -0.184   -0.139
3                         1             ...    1.717   -0.056   -0.210
4                         1             ...    2.406   -0.917   -0.801
5                         1             ...    2.344    1.118    0.435
6                         1             ...   -0.332   -1.286   -0.743
7                         1             ...   -1.696   -1.416   -0.682
8                         1             ...   -2.495   -0.504   -0.048
9                         1             ...   -1.879    0.575    0.540

   occupancy  b_factor  blank_4  segment_id  element_symbol  charge  line_idx
0        1.0       0.0                                    C     NaN         0
1        1.0       0.0                                    C     NaN         1
2        1.0       0.0                                    C     NaN         2
3        1.0       0.0                                    C     NaN         3
4        1.0       0.0                                    O     NaN         4
5        1.0       0.0                                    C     NaN         5
6        1.0       0.0                                    C     NaN         6
7        1.0       0.0                                    C     NaN         7
8        1.0       0.0                                    C     NaN         8
9        1.0       0.0                                    C     NaN         9

[10 rows x 21 columns]
Re: [Rdkit-discuss] ctest fails
Dear Paolo,

Thank you for your response. I built boost with the following commands.

$ ./bootstrap.sh --prefix=/usr/local/boost_1_62_0 --with-python=/usr/local/bin/python --with-python-root=/usr/local
$ ./b2 threading=multi

/usr/local/bin/python is my self-built version of Python 2.7.

The command you suggested gives the following information but I can not interpret it.

$ ctest -I 7,7 -V
UpdateCTestConfiguration from :/home/nmr/RDKit/build/DartConfiguration.tcl
Start processing tests
UpdateCTestConfiguration from :/home/nmr/RDKit/build/DartConfiguration.tcl
Test project /home/nmr/RDKit/build
Constructing a list of tests
Done constructing a list of tests
Changing directory into /home/nmr/RDKit/build/External/INCHI-API
Changing directory into /home/nmr/RDKit/build/Code/RDGeneral
Changing directory into /home/nmr/RDKit/build/Code/DataStructs
Changing directory into /home/nmr/RDKit/build/Code/DataStructs/Wrap
7/112 Testing pyBV
Test command: /usr/local/bin/python /home/nmr/RDKit/Code/DataStructs/Wrap/testBV.py
Test timeout computed to be: 9.99988e+06
..E...E..
======================================================================
ERROR: test3Bounds (__main__.TestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/nmr/RDKit/Code/DataStructs/Wrap/testBV.py", line 91, in test3Bounds
    bv1[11]
RuntimeError: IndexErrorException

======================================================================
ERROR: test7FPS (__main__.TestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/nmr/RDKit/Code/DataStructs/Wrap/testBV.py", line 172, in test7FPS
    self.assertRaises(ValueError, lambda: DataStructs.CreateFromFPSText("030082801"))
  File "/usr/local/lib/python2.7/unittest/case.py", line 475, in assertRaises
    callableObj(*args, **kwargs)
  File "/home/nmr/RDKit/Code/DataStructs/Wrap/testBV.py", line 172, in <lambda>
    self.assertRaises(ValueError, lambda: DataStructs.CreateFromFPSText("030082801"))
RuntimeError: ValueErrorException

----------------------------------------------------------------------
Ran 13 tests in 0.753s

FAILED (errors=2)
-- Process completed
***Failed
Changing directory into /home/nmr/RDKit/build/Code/Geometry
Changing directory into /home/nmr/RDKit/build/Code/Geometry/Wrap
Changing directory into /home/nmr/RDKit/build/Code/Numerics
Changing directory into /home/nmr/RDKit/build/Code/Numerics/Alignment
Changing directory into /home/nmr/RDKit/build/Code/Numerics/Alignment/Wrap
Changing directory into /home/nmr/RDKit/build/Code/Numerics/Optimizer
Changing directory into /home/nmr/RDKit/build/Code/ForceField/UFF
Changing directory into /home/nmr/RDKit/build/Code/ForceField/MMFF
Changing directory into /home/nmr/RDKit/build/Code/ForceField/Wrap
Changing directory into /home/nmr/RDKit/build/Code/DistGeom
Changing directory into /home/nmr/RDKit/build/Code/DistGeom/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/Depictor
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/Depictor/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/SmilesParse
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/FileParsers
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/Substruct
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/ChemReactions
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/ChemReactions/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/ChemTransforms
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/Subgraphs
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/FilterCatalog
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/FilterCatalog/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/FragCatalog
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/FragCatalog/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/Descriptors
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/Descriptors/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/Fingerprints
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/PartialCharges
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/PartialCharges/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/MolTransforms
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/MolTransforms/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/ForceFieldHelpers/MMFF
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/ForceFieldHelpers/UFF
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/ForceFieldHelpers/CrystalFF
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/ForceFieldHelpers/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/DistGeomHelpers
Changing directory into
[Rdkit-discuss] C++ MolPickler
Hi All,

I've got as far as 'Preserving Molecules' in the 'Getting Started with C++' document I'm writing, and it appears that the MolPickler doesn't write properties into the pickle. Is that right? If so, it means the molecule name goes missing, which is an issue when putting multiple molecules in the same file.

Cheers,
Dave

--
David Cosgrove
Freelance computational chemistry and chemoinformatics developer
http://cozchemix.co.uk
[Rdkit-discuss] ctest fails
Dear All,

I have been using RDKit 2015_03_1 with a self-built version of Python 2.7 and Boost 1.62 on CentOS 5.11. I have now tried to build and install 2016_09_3 in the same environment, and 13 tests fail when I run ctest.

88% tests passed, 13 tests failed out of 112

The following tests FAILED:
	  7 - pyBV (Failed)
	  8 - pyDiscreteValueVect (Failed)
	  9 - pySparseIntVect (Failed)
	 13 - testPyGeometry (Failed)
	 64 - pyMolTransforms (Failed)
	 70 - pyDistGeom (Failed)
	 72 - pyMolAlign (Failed)
	 74 - pyChemicalFeatures (Failed)
	 95 - pyGraphMolWrap (Failed)
	 97 - pyTestTrajectory (Failed)
	102 - pySimDivPickers (Failed)
	108 - pythonTestDirDataStructs (Failed)
	112 - pythonTestDirChem (Failed)
Errors while running CTest

I checked with the following commands and could not find any problems.

$ ldd $RDBASE/rdkit/Chem/rdchem.so
	linux-vdso.so.1 => (0x7fff5e3fd000)
	libSmilesParse.so.1 => /home/nmr/RDKit/lib/libSmilesParse.so.1 (0x2b7ccfaa)
	libChemTransforms.so.1 => /home/nmr/RDKit/lib/libChemTransforms.so.1 (0x2b7ccfcf6000)
	libSubstructMatch.so.1 => /home/nmr/RDKit/lib/libSubstructMatch.so.1 (0x2b7ccff39000)
	libGraphMol.so.1 => /home/nmr/RDKit/lib/libGraphMol.so.1 (0x2b7cd0167000)
	libRDGeometryLib.so.1 => /home/nmr/RDKit/lib/libRDGeometryLib.so.1 (0x2b7cd04c9000)
	libRDGeneral.so.1 => /home/nmr/RDKit/lib/libRDGeneral.so.1 (0x2b7cd06e4000)
	libRDBoost.so.1 => /home/nmr/RDKit/lib/libRDBoost.so.1 (0x2b7cd08ff000)
	libboost_python.so.1.62.0 => /usr/local/boost_1_62_0/lib/libboost_python.so.1.62.0 (0x2b7cd0ccb000)
	libboost_thread.so.1.62.0 => /usr/local/boost_1_62_0/lib/libboost_thread.so.1.62.0 (0x2b7cd0f1f000)
	libboost_system.so.1.62.0 => /usr/local/boost_1_62_0/lib/libboost_system.so.1.62.0 (0x2b7cd1145000)
	libboost_serialization.so.1.62.0 => /usr/local/boost_1_62_0/lib/libboost_serialization.so.1.62.0 (0x2b7cd1349000)
	libDataStructs.so.1 => /home/nmr/RDKit/lib/libDataStructs.so.1 (0x2b7cd1599000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x2b7cd18fd000)
	libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x2b7cd1b1a000)
	libm.so.6 => /lib64/libm.so.6 (0x2b7cd1e1a000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x2b7cd209d000)
	libc.so.6 => /lib64/libc.so.6 (0x2b7cd22ac000)
	libdl.so.2 => /lib64/libdl.so.2 (0x2b7cd2605000)
	libutil.so.1 => /lib64/libutil.so.1 (0x2b7cd280a000)
	librt.so.1 => /lib64/librt.so.1 (0x2b7cd2a0d000)
	/lib64/ld-linux-x86-64.so.2 (0x003ada20)

$ python -c 'from rdkit import rdBase; print rdBase.__file__'
/home/nmr/RDKit/rdkit/rdBase.so

Can you please suggest any solutions?

Thank you,
Rintarou

Suzuki, Rintarou
National Agriculture and Food Research Organization
Tsukuba, Japan
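
When scanning a long ldd listing like the one above, it can help to group the libraries by the directory they resolve from, so that anything loaded from an unexpected prefix stands out. The sketch below is a hypothetical helper, not part of RDKit or of ctest. The sample lines are copied from the listing in this message. The grouping by itself does not show that any of these libraries is the source of the failures.

```python
import posixpath
from collections import defaultdict

# A few lines copied from the ldd output in this message.
LDD_SAMPLE = """\
linux-vdso.so.1 => (0x7fff5e3fd000)
libGraphMol.so.1 => /home/nmr/RDKit/lib/libGraphMol.so.1 (0x2b7cd0167000)
libboost_python.so.1.62.0 => /usr/local/boost_1_62_0/lib/libboost_python.so.1.62.0 (0x2b7cd0ccb000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x2b7cd1b1a000)
libc.so.6 => /lib64/libc.so.6 (0x2b7cd22ac000)
"""


def group_by_directory(ldd_text):
    """Return {directory: [library names]} for every 'name => path' line."""
    groups = defaultdict(list)
    for line in ldd_text.splitlines():
        if "=>" not in line:
            continue
        name, _, target = line.partition("=>")
        parts = target.split()
        if not parts or not parts[0].startswith("/"):
            continue  # e.g. linux-vdso has no file path
        groups[posixpath.dirname(parts[0])].append(name.strip())
    return dict(groups)


groups = group_by_directory(LDD_SAMPLE)
for directory in sorted(groups):
    print(directory, groups[directory])
```

Feeding it the full output of ldd for each failing extension module would give one short summary per module instead of twenty-odd lines to compare by eye.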