Re: [Rdkit-discuss] substructure of a fingerprint position

2017-02-01 Thread George Papadatos
https://iwatobipen.wordpress.com/2017/01/08/get-bit-information-with-rdkit/
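
For readers who cannot follow the link, here is a minimal sketch of one way to do this, assuming (as the post title suggests) that the `bitInfo` argument of the Morgan fingerprint call is the mechanism used. The helper name `bit_substructures` and the example molecule are illustrative assumptions, not taken from the post.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Example molecule (an assumption; any valid SMILES would do).
mol = Chem.MolFromSmiles("CC1=C(C(=O)C)C=CC=C1")

# bitInfo is filled with {bit_id: ((center_atom_idx, radius), ...)}.
bit_info = {}
fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048, bitInfo=bit_info)


def bit_substructures(mol, bit_info, bit):
    """Return the fragment SMILES of every atom environment that set `bit`.

    Hypothetical helper: for a hashed fingerprint, several different
    environments can collide on the same bit, so a set is returned.
    """
    found = set()
    for atom_idx, radius in bit_info.get(bit, ()):
        bonds = []
        if radius > 0:
            # Bond indices within `radius` bonds of the center atom.
            bonds = list(Chem.FindAtomEnvironmentOfRadiusN(mol, radius, atom_idx))
        atoms = {atom_idx}
        for bond_idx in bonds:
            bond = mol.GetBondWithIdx(bond_idx)
            atoms.update((bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()))
        found.add(
            Chem.MolFragmentToSmiles(
                mol, atomsToUse=sorted(atoms), bondsToUse=bonds or None
            )
        )
    return found


for bit in sorted(bit_info):
    print(bit, sorted(bit_substructures(mol, bit_info, bit)))
```

The same `bitInfo` dictionary is available from the unhashed `GetMorganFingerprint` call, where the keys are the raw environment identifiers rather than folded bit positions.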

George. 

Sent from my giPhone

> On 26 Jan 2017, at 11:02, Gonzalo Colmenarejo  
> wrote:
> 
> Hi,
> 
> is there a way in RDKit to retrieve the substructure(s) corresponding to a 
> (hashed or unhashed) Morgan fingerprint position? 
> 
> Thanks a lot in advance
> 
> Gonzalo
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Rdkit atom indexing vs indexing in written pdb file

2017-02-01 Thread Susan Leung
Thank you very much Andrew!

Indeed, I did not spot the pattern - how silly of me!



Re: [Rdkit-discuss] Rdkit atom indexing vs indexing in written pdb file

2017-02-01 Thread Andrew Dalke
Dear Susan,

  If I understand what's going on correctly, you have run across the difference 
between 0-based and 1-based indexing. See 
https://en.wikipedia.org/wiki/Zero-based_numbering .

RDKit, like most programming libraries and languages, indexes by offset from 
the beginning, so 0 means the beginning, 1 means one after the beginning, 
etc.

This is somewhat like how some buildings use "1" as the first floor above the 
ground, while others regard "1" as the ground floor, which is confusing if you 
are not used to it. (My apartment number says it's on the second floor, while 
the elevator button says I live on floor 3.)

On Feb 1, 2017, at 5:15 PM, Susan Leung  wrote:
> I am producing rdkit conformers and writing them to pdb files but am finding 
> the atom indexing in rdkit is different from the written pdb.
  ...
> Here is my code and output (the C=O looks like it's atoms 3,4 in rdkit but 
> 4,5 in the pdb file):
  ...
> In [3]: mol = Chem.MolFromSmiles("CC1=C(C(=O)C)C=CC=C1")
  ...
> In [4]: mol.GetSubstructMatch(Chem.MolFromSmiles('C(=O)'))
> Out[4]: (3, 4)
  ...
>   record_name  atom_number blank_1 atom_name alt_loc residue_name blank_2  \
> 0      HETATM            1                C1                  UNL
> 1      HETATM            2                C2                  UNL
> 2      HETATM            3                C3                  UNL
> 3      HETATM            4                C4                  UNL
> 4      HETATM            5                O1                  UNL
> 5      HETATM            6                C5                  UNL
> 6      HETATM            7                C6                  UNL
> 7      HETATM            8                C7                  UNL
> 8      HETATM            9                C8                  UNL
> 9      HETATM           10                C9                  UNL


If I understand you correctly, then the "(3, 4)" as RDKit atom indices is (3+1, 
4+1) = (4,5) as PDB atom number, that is, the RDKit indices correspond to the 
left-most column of your table, rather than the atom_number column.
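
The arithmetic above can be sketched in a few lines of plain Python. This is only an illustration: the function names are made up for this example, and it assumes the PDB writer emits atoms in RDKit index order with serial numbers starting at 1, as the table above shows for this molecule.

```python
def rdkit_indices_to_pdb_serials(indices, first_serial=1):
    """Map 0-based RDKit atom indices to 1-based PDB atom serial numbers.

    Assumes atoms were written in RDKit order, numbered from `first_serial`.
    """
    return tuple(i + first_serial for i in indices)


def pdb_serials_to_rdkit_indices(serials, first_serial=1):
    """Inverse mapping: PDB atom serial numbers back to RDKit atom indices."""
    return tuple(s - first_serial for s in serials)


match = (3, 4)  # result of GetSubstructMatch in the quoted session
print(rdkit_indices_to_pdb_serials(match))   # (4, 5)
print(pdb_serials_to_rdkit_indices((4, 5)))  # (3, 4)
```

If the file is later edited or renumbered, the fixed offset no longer holds and the mapping would have to be read from the file itself.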

Cheers,

Andrew
da...@dalkescientific.com





[Rdkit-discuss] Rdkit atom indexing vs indexing in written pdb file

2017-02-01 Thread Susan Leung
Dear all,

I am producing rdkit conformers and writing them to pdb files but am finding 
the atom indexing in rdkit is different from the written pdb. I would like the 
two to match because I want to do a substructure search (using rdkit) to give 
me a handle on these atoms in the pdb file.

Apologies if this has been discussed before.

Here is my code and output (the C=O looks like it's atoms 3,4 in rdkit but 4,5 
in the pdb file):

Thanks,

Susan

*

In [1]: import rdkit

In [2]: from rdkit import Chem
   ...: from rdkit.Chem import AllChem
   ...: from rdkit.Chem.Draw import IPythonConsole
   ...:

In [3]: mol = Chem.MolFromSmiles("CC1=C(C(=O)C)C=CC=C1")
   ...: idx = AllChem.EmbedMultipleConfs(mol,numConfs=1,randomSeed=0xf00d,
   ...:                                  useExpTorsionAnglePrefs=True,useBasicKnowledge=True)
   ...:

In [4]: mol.GetSubstructMatch(Chem.MolFromSmiles('C(=O)'))
Out[4]: (3, 4)

In [5]: Chem.MolToPDBFile(mol,'./test.pdb')

In [6]: import biopandas
   ...: from biopandas.pdb import PandasPDB
   ...: ppdb = PandasPDB()
   ...: ppdb.read_pdb('./test.pdb')
   ...: ppdb.df['HETATM']
   ...:
Out[6]:
  record_name  atom_number blank_1 atom_name alt_loc residue_name blank_2  \
0      HETATM            1                C1                  UNL
1      HETATM            2                C2                  UNL
2      HETATM            3                C3                  UNL
3      HETATM            4                C4                  UNL
4      HETATM            5                O1                  UNL
5      HETATM            6                C5                  UNL
6      HETATM            7                C6                  UNL
7      HETATM            8                C7                  UNL
8      HETATM            9                C8                  UNL
9      HETATM           10                C9                  UNL

  chain_id  residue_number insertion  ...  x_coord  y_coord  z_coord  \
0                        1            ...    0.176    1.911    1.137
1                        1            ...   -0.513    0.759    0.511
2                        1            ...    0.272   -0.184   -0.139
3                        1            ...    1.717   -0.056   -0.210
4                        1            ...    2.406   -0.917   -0.801
5                        1            ...    2.344    1.118    0.435
6                        1            ...   -0.332   -1.286   -0.743
7                        1            ...   -1.696   -1.416   -0.682
8                        1            ...   -2.495   -0.504   -0.048
9                        1            ...   -1.879    0.575    0.540

   occupancy  b_factor  blank_4 segment_id element_symbol charge  line_idx
0        1.0       0.0                                  C    NaN         0
1        1.0       0.0                                  C    NaN         1
2        1.0       0.0                                  C    NaN         2
3        1.0       0.0                                  C    NaN         3
4        1.0       0.0                                  O    NaN         4
5        1.0       0.0                                  C    NaN         5
6        1.0       0.0                                  C    NaN         6
7        1.0       0.0                                  C    NaN         7
8        1.0       0.0                                  C    NaN         8
9        1.0       0.0                                  C    NaN         9

[10 rows x 21 columns]



Re: [Rdkit-discuss] ctest fails

2017-02-01 Thread 鈴木 倫太郎
Dear Paolo,

Thank you for your response.

I built boost with the following commands.


$ ./bootstrap.sh --prefix=/usr/local/boost_1_62_0 --with-python=/usr/local/bin/python --with-python-root=/usr/local
$ ./b2 threading=multi


/usr/local/bin/python is my self-built version of Python 2.7.

The command you suggested gives the following output, but I cannot 
interpret it.


$ ctest -I 7,7 -V
UpdateCTestConfiguration  from :/home/nmr/RDKit/build/DartConfiguration.tcl
Start processing tests
UpdateCTestConfiguration  from :/home/nmr/RDKit/build/DartConfiguration.tcl
Test project /home/nmr/RDKit/build
Constructing a list of tests
Done constructing a list of tests
Changing directory into /home/nmr/RDKit/build/External/INCHI-API
Changing directory into /home/nmr/RDKit/build/Code/RDGeneral
Changing directory into /home/nmr/RDKit/build/Code/DataStructs
Changing directory into /home/nmr/RDKit/build/Code/DataStructs/Wrap
  7/112 Testing pyBV
Test command: /usr/local/bin/python /home/nmr/RDKit/Code/DataStructs/Wrap/testBV.py
Test timeout computed to be: 9.99988e+06
..E...E..
==
ERROR: test3Bounds (__main__.TestCase)
--
Traceback (most recent call last):
  File "/home/nmr/RDKit/Code/DataStructs/Wrap/testBV.py", line 91, in test3Bounds
    bv1[11]
RuntimeError: IndexErrorException

==
ERROR: test7FPS (__main__.TestCase)
--
Traceback (most recent call last):
  File "/home/nmr/RDKit/Code/DataStructs/Wrap/testBV.py", line 172, in test7FPS
    self.assertRaises(ValueError, lambda: DataStructs.CreateFromFPSText("030082801"))
  File "/usr/local/lib/python2.7/unittest/case.py", line 475, in assertRaises
    callableObj(*args, **kwargs)
  File "/home/nmr/RDKit/Code/DataStructs/Wrap/testBV.py", line 172, in <lambda>
    self.assertRaises(ValueError, lambda: DataStructs.CreateFromFPSText("030082801"))
RuntimeError: ValueErrorException

--
Ran 13 tests in 0.753s

FAILED (errors=2)
-- Process completed
***Failed
Changing directory into /home/nmr/RDKit/build/Code/Geometry
Changing directory into /home/nmr/RDKit/build/Code/Geometry/Wrap
Changing directory into /home/nmr/RDKit/build/Code/Numerics
Changing directory into /home/nmr/RDKit/build/Code/Numerics/Alignment
Changing directory into /home/nmr/RDKit/build/Code/Numerics/Alignment/Wrap
Changing directory into /home/nmr/RDKit/build/Code/Numerics/Optimizer
Changing directory into /home/nmr/RDKit/build/Code/ForceField/UFF
Changing directory into /home/nmr/RDKit/build/Code/ForceField/MMFF
Changing directory into /home/nmr/RDKit/build/Code/ForceField/Wrap
Changing directory into /home/nmr/RDKit/build/Code/DistGeom
Changing directory into /home/nmr/RDKit/build/Code/DistGeom/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/Depictor
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/Depictor/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/SmilesParse
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/FileParsers
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/Substruct
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/ChemReactions
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/ChemReactions/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/ChemTransforms
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/Subgraphs
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/FilterCatalog
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/FilterCatalog/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/FragCatalog
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/FragCatalog/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/Descriptors
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/Descriptors/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/Fingerprints
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/PartialCharges
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/PartialCharges/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/MolTransforms
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/MolTransforms/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/ForceFieldHelpers/MMFF
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/ForceFieldHelpers/UFF
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/ForceFieldHelpers/CrystalFF
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/ForceFieldHelpers/Wrap
Changing directory into /home/nmr/RDKit/build/Code/GraphMol/DistGeomHelpers
Changing directory into 

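Both errors in the log above show a C++ exception name arriving in Python as a plain RuntimeError where the test expects IndexError or ValueError. A minimal sketch for reproducing this outside ctest follows; the `probe` helper is a made-up name for this illustration, and the two calls are the ones visible in the tracebacks. It reports only what is raised and does not diagnose the cause.

```python
from rdkit import DataStructs


def probe(expected, action):
    """Run `action` and report whether it raises the `expected` exception type."""
    try:
        action()
    except expected:
        return "ok"
    except Exception as exc:
        # A RuntimeError here would match what the failing tests reported.
        return "unexpected %s: %s" % (type(exc).__name__, exc)
    return "no exception raised"


# Out-of-range access on a 10-bit vector, as in test3Bounds.
print(probe(IndexError, lambda: DataStructs.ExplicitBitVect(10)[11]))

# Malformed FPS text, as in test7FPS.
print(probe(ValueError, lambda: DataStructs.CreateFromFPSText("030082801")))
```

Running this with the same /usr/local/bin/python used by ctest should print "ok" twice on a build where the Python wrappers translate exceptions as the tests expect.
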
[Rdkit-discuss] C++ MolPickler

2017-02-01 Thread David Cosgrove
Hi All,

I've got as far as 'Preserving Molecules' in the 'Getting Started with C++'
document I'm writing, and it appears that the MolPickler doesn't write
properties into the pickle.  Is that right?  If so, it means the molecule
name goes missing, which is an issue when putting multiple molecules in the
same file.

Cheers,
Dave


-- 
David Cosgrove
Freelance computational chemistry and chemoinformatics developer
http://cozchemix.co.uk


[Rdkit-discuss] ctest fails

2017-02-01 Thread 鈴木 倫太郎
Dear All,

I have been using RDKit 2015_03_1 with a self-built version of Python 2.7 and 
Boost 1.62 on CentOS 5.11.
Now I have tried to build and install 2016_09_3 in the same environment, and 
13 tests failed when I ran ctest.


88% tests passed, 13 tests failed out of 112

The following tests FAILED:
  7 - pyBV (Failed)
  8 - pyDiscreteValueVect (Failed)
  9 - pySparseIntVect (Failed)
 13 - testPyGeometry (Failed)
 64 - pyMolTransforms (Failed)
 70 - pyDistGeom (Failed)
 72 - pyMolAlign (Failed)
 74 - pyChemicalFeatures (Failed)
 95 - pyGraphMolWrap (Failed)
 97 - pyTestTrajectory (Failed)
102 - pySimDivPickers (Failed)
108 - pythonTestDirDataStructs (Failed)
112 - pythonTestDirChem (Failed)
Errors while running CTest


I checked with the following commands but could not find any problems.


$ ldd $RDBASE/rdkit/Chem/rdchem.so
linux-vdso.so.1 =>  (0x7fff5e3fd000)
libSmilesParse.so.1 => /home/nmr/RDKit/lib/libSmilesParse.so.1 (0x2b7ccfaa)
libChemTransforms.so.1 => /home/nmr/RDKit/lib/libChemTransforms.so.1 (0x2b7ccfcf6000)
libSubstructMatch.so.1 => /home/nmr/RDKit/lib/libSubstructMatch.so.1 (0x2b7ccff39000)
libGraphMol.so.1 => /home/nmr/RDKit/lib/libGraphMol.so.1 (0x2b7cd0167000)
libRDGeometryLib.so.1 => /home/nmr/RDKit/lib/libRDGeometryLib.so.1 (0x2b7cd04c9000)
libRDGeneral.so.1 => /home/nmr/RDKit/lib/libRDGeneral.so.1 (0x2b7cd06e4000)
libRDBoost.so.1 => /home/nmr/RDKit/lib/libRDBoost.so.1 (0x2b7cd08ff000)
libboost_python.so.1.62.0 => /usr/local/boost_1_62_0/lib/libboost_python.so.1.62.0 (0x2b7cd0ccb000)
libboost_thread.so.1.62.0 => /usr/local/boost_1_62_0/lib/libboost_thread.so.1.62.0 (0x2b7cd0f1f000)
libboost_system.so.1.62.0 => /usr/local/boost_1_62_0/lib/libboost_system.so.1.62.0 (0x2b7cd1145000)
libboost_serialization.so.1.62.0 => /usr/local/boost_1_62_0/lib/libboost_serialization.so.1.62.0 (0x2b7cd1349000)
libDataStructs.so.1 => /home/nmr/RDKit/lib/libDataStructs.so.1 (0x2b7cd1599000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x2b7cd18fd000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x2b7cd1b1a000)
libm.so.6 => /lib64/libm.so.6 (0x2b7cd1e1a000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x2b7cd209d000)
libc.so.6 => /lib64/libc.so.6 (0x2b7cd22ac000)
libdl.so.2 => /lib64/libdl.so.2 (0x2b7cd2605000)
libutil.so.1 => /lib64/libutil.so.1 (0x2b7cd280a000)
librt.so.1 => /lib64/librt.so.1 (0x2b7cd2a0d000)
/lib64/ld-linux-x86-64.so.2 (0x003ada20)


$ python -c 'from rdkit import rdBase; print rdBase.__file__'
/home/nmr/RDKit/rdkit/rdBase.so


Can you please suggest any solutions?

Thank you,
Rintarou


Suzuki, Rintarou
National Agriculture and Food Research Organization
Tsukuba, Japan
