Re: [Rdkit-discuss] Example of the new Coulomb Matrix Feature in RDKit

2020-09-20 Thread Max Pinheiro Jr
Dear Henrique,

You can try something like this:

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.rdMolDescriptors import CalcCoulombMat

# Caffeine as a test molecule
smiles = 'CN1C=NC2=C1C(=O)N(C(=O)N2C)C'
caffeine = Chem.MolFromSmiles(smiles, sanitize=True)
caffeine = Chem.AddHs(caffeine)

# Generate a 3D conformer and relax it with UFF
AllChem.EmbedMolecule(caffeine)
AllChem.UFFOptimizeMolecule(caffeine)

CM = CalcCoulombMat(caffeine)
list(CM[0])

The CM object contains a list of Coulomb matrices calculated for the
molecule; I think they are generated by randomizing the order of the
atoms, but I am not totally sure. It seems that you first need to generate
and optimize a 3D conformation of the molecule, and then you can pass the
molecule as input to the CalcCoulombMat function. One way to access the
values of each Coulomb matrix is to convert it into a list or a NumPy
array. I hope this helps.
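
For reference, the Coulomb matrix as it is usually defined in the
literature has 0.5 * Z_i^2.4 on the diagonal and Z_i * Z_j / |R_i - R_j|
off the diagonal. Below is a minimal sketch in plain Python, independent of
RDKit. The function name coulomb_matrix and the toy geometry are only
illustrative, and I have not checked whether RDKit uses exactly the same
conventions (units, atom ordering).

```python
import math

def coulomb_matrix(charges, coords):
    """Return the Coulomb matrix as a list of rows.

    charges: nuclear charges Z_i
    coords:  (x, y, z) tuples, in the same order as charges
    """
    n = len(charges)
    cm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term: 0.5 * Z_i ** 2.4
                cm[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal term: Z_i * Z_j / distance(i, j)
                dist = math.sqrt(sum((a - b) ** 2
                                     for a, b in zip(coords[i], coords[j])))
                cm[i][j] = charges[i] * charges[j] / dist
    return cm

# Toy example: two hydrogen atoms separated by 2.0 distance units
toy = coulomb_matrix([1, 1], [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)])
print(toy)  # [[0.5, 0.5], [0.5, 0.5]]
```

The matrix is symmetric and depends on the order of the atoms, which is
why some implementations return several matrices with permuted atom order.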

Max Pinheiro Jr
-
Postdoctoral researcher
Aix-Marseille Université, France
Institut de Chimie Radicalaire

On Sun, 20 Sep 2020 at 14:58, Henrique Castro <
henrique...@outlook.com> wrote:

> Dear colleagues, how are you?
> I have almost zero experience with RDKit, so forgive me for such a basic
> request.
> Could anyone provide me an example of the new Coulomb Matrix generator in
> RDKit? I intend to iterate through my organometallic molecules stored in a
> pandas DataFrame.
>
> Thank you in advance
>
> --
> Henrique C. S. Junior
>
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] RMSD between molecules

2020-06-09 Thread Max Pinheiro Jr
Hi Eduardo,

Have you tried the Python package rmsd (https://pypi.org/project/rmsd/)?
I have used it to calculate the RMSD between conformers and it works
pretty well. You just need to give the xyz coordinate matrices of each
molecule as input. There are a few examples on the GitHub page. I hope it
helps.
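
To illustrate what such a calculation does with the two coordinate
matrices, here is a minimal sketch in plain Python. It only removes the
translation (by centering both structures) and assumes the atoms are
listed in the same order; it does not search for the optimal rotation,
which as far as I know the rmsd package also handles. The function names
are my own.

```python
import math

def centered(coords):
    """Shift a list of (x, y, z) tuples so that its centroid is at the origin."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    return [(x - cx, y - cy, z - cz) for x, y, z in coords]

def rmsd_no_rotation(a, b):
    """RMSD between two equally ordered coordinate lists after centering."""
    if len(a) != len(b):
        raise ValueError("both structures need the same number of atoms")
    a_c, b_c = centered(a), centered(b)
    # Sum of squared distances between corresponding atoms
    total = sum((p - q) ** 2
                for pa, pb in zip(a_c, b_c)
                for p, q in zip(pa, pb))
    return math.sqrt(total / len(a))

# The same two-atom structure, simply translated: RMSD is zero
print(rmsd_no_rotation([(0, 0, 0), (1, 0, 0)], [(5, 5, 5), (6, 5, 5)]))  # 0.0
```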

Max Pinheiro Jr

On Tue, 9 Jun 2020 at 08:13, Eduardo Mayo <
eduardomayoya...@gmail.com> wrote:

> Hi, I'm trying to calculate the RMSD between conformers of the same
> molecule stored in separate mol files.
> I figured out a way:
>
> m1 = Chem.MolFromMolFile('1.mol')
> m2 = Chem.MolFromMolFile('2.mol')
>
> m1.AddConformer(m2.GetConformer(-1), assignId=True)
> AllChem.GetConformerRMS(m1, 0, 1)
>
> Is there another way??
>


[Rdkit-discuss] Removing solvent and ions from dataset

2020-06-06 Thread Max Pinheiro Jr
Hi RDKit team,

I am working on a chemically diverse dataset of SMILES strings and I need
to do some preprocessing to clean the data a bit before starting the
modeling part. So I was looking for tools or built-in functions in RDKit
for this kind of preprocessing, for instance removing solvent (water)
molecules and ions. I found the "SaltRemover" module, which may solve my
problem of removing ions from the database, but I could not find an
equivalent module for solvent molecules. Does anyone know a specific tool
in RDKit (or any other Python program) to do such preprocessing on the
SMILES strings? If so, could you please provide a simple example of how to
do it? I will be really thankful for any help you may provide.
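
To make the request concrete, here is a rough sketch in plain Python of
the behaviour I am after. It only splits the SMILES on "." and compares
the fragments as text, so the set of unwanted fragments (a made-up example
below) has to be written exactly as it appears in the data; it is not a
substitute for a proper RDKit-based tool.

```python
# Hypothetical fragments to drop, written exactly as they occur in the data
UNWANTED = {"O", "[Na+]", "[K+]", "[Cl-]"}

def strip_fragments(smiles, unwanted=UNWANTED):
    """Drop the dot-separated SMILES components listed in `unwanted`."""
    kept = [frag for frag in smiles.split(".") if frag not in unwanted]
    return ".".join(kept)

# Sodium acetate monohydrate: only the acetate fragment is kept
print(strip_fragments("CC(=O)[O-].[Na+].O"))  # CC(=O)[O-]
```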

Max Pinheiro Jr
-
Université Aix-Marseille, France
Institut de Chimie Radicalaire


Re: [Rdkit-discuss] Compilation problems on Linux

2020-04-17 Thread Max Pinheiro Jr
Hi Francois,

Thank you for your suggestions! Sorry, I forgot to mention that Paolo had
already helped me to fix the problem. It was simple to solve: RDKit was
being built with the older gcc version installed on the system, although I
had loaded the newest one using the module load command. So the trick was
just to provide the full path to the gcc-8.1 compiler, with something like
"CC=/opt/gcc8.1/bin/gcc CXX=/opt/gcc8.1/bin/g++ cmake", to force cmake to
pick up the specified compiler. After that, the program compiled nicely.

I would just like to comment on another issue I found later. When I tried
to use the latest version of tensorflow together with rdkit in the same
Python environment, importing the tensorflow package failed with:

AttributeError: type object 'NewBase' has no attribute 'is_abstract'

After some googling, I found a solution on StackOverflow (
https://stackoverflow.com/questions/41529526/tensorflow-attributerror-type-object-newbase-has-no-attribute-is-abstract/43712626#43712626)
that worked in my case. It seems that rdkit uses an old version of six.py
that is incompatible with tensorflow. So the solution was simply to copy
the newest six.py available in my Python environment over the one used by
rdkit, but I don't know if this may affect other things in rdkit. Do you
know if there is another way to solve this problem, or whether replacing
the six.py file may cause problems in rdkit?
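
In case it helps with the diagnosis, a small sketch of how to check which
file Python would actually load for a given module name, using only the
standard library. The helper name module_origin is my own; calling it with
"six" shows which six.py is picked up in the current environment.

```python
import importlib.util

def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin

# Path of the six.py that would be imported (None if six is not installed)
print(module_origin("six"))
```

Running this before and after changing PYTHONPATH should show whether an
older copy of the file shadows the one installed in the environment.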

All the best,

Max

On Fri, 17 Apr 2020 at 07:45, Francois Berenger wrote:

> Hi Max,
>
> Not sure if it will help, but on Debian and Ubuntu you need the
> following
> system packages to be installed in order to compile rdkit:
>
> curl
> wget
> libboost-all-dev
> cmake
> git
> g++
> libeigen3-dev
> python3
> libpython3-all-dev
> python3-numpy
> python3-pip
> python3-pil
> python3-six
> python3-pandas
>
> What Linux distro are you using?
>
> Doesn't your distribution provide python3-ready packages for rdkit?
>
> Ideally, this is what you would want, especially if you install rdkit on
> all nodes of a computing cluster.
>
> Regards,
> F.

Re: [Rdkit-discuss] Compilation problems on Linux

2020-04-15 Thread Max Pinheiro Jr
Hi Paolo,

Thank you for your very quick answer! Yes, I compiled Boost 1.67 using the
same gcc version, 8.1. I had already seen the GLIBCXX solution you mention
and tried it, but it didn't work: I got the same problem with the Boost
library and the compilation could not finish. I am wondering whether there
is any other solution. I can also provide more specific information if
that would help to pin down the problem and find a solution.

Thank you again!

Max Pinheiro Jr

On Wed, 15 Apr 2020 at 18:25, Paolo Tosco <
paolo.tosco.m...@gmail.com> wrote:

> Hi Max,
>
> you mention you are using gcc-8.1 and Boost 1.67. Did you compile Boost
> with the same compiler or was it compiled with an earlier version of
> gcc/g++?
>
> If Boost was compiled with an earlier version of gcc/g++, you will need to
> add to /home/mpinheiro/codes/rdkit-2020.09/CMakeLists.txt the following
> line:
>
> add_definitions("-D_GLIBCXX_USE_CXX11_ABI=0")
>
> or the linker will fail during the compilation; see
> https://github.com/rdkit/rdkit/issues/2013#issuecomment-553563418.
>
> HTH, cheers
> p.

[Rdkit-discuss] Compilation problems on Linux

2020-04-15 Thread Max Pinheiro Jr
Dear all,

I have tried repeatedly to compile rdkit (latest git version) on a Linux
cluster, but the compilation always failed at the same point with an error
message related to the Boost library. After searching the forum, the only
way I could get past the problem and finally compile the program was to
set the flag "RDK_USE_BOOST_SERIALIZATION" to OFF. However, when I do a
simple test by importing the Chem module, I get the following error:


from rdkit import Chem
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mpinheiro/codes/rdkit-2020.09/rdkit/Chem/__init__.py", line 20, in <module>
    from rdkit.Chem import rdchem
SystemError: initialization of rdchem raised unreported exception


I am using gcc-8.1, cmake-3.11.2 and version 1.67 of the Boost library to
build RDKit. The build commands I used are the following:

cmake -DPy_ENABLE_SHARED=1 \
  -DRDK_INSTALL_INTREE=ON \
  -DRDK_BUILD_CPP_TESTS=ON \
  -DRDK_INSTALL_STATIC_LIBS=ON \
  -DRDK_BUILD_AVALON_SUPPORT=ON \
  -DRDK_BUILD_CAIRO_SUPPORT=ON \
  -DRDK_BUILD_INCHI_SUPPORT=ON \
  -DRDK_BUILD_PYTHON_WRAPPERS=ON \
  -DRDK_BUILD_SWIG_CSHARP_WRAPPER=ON \
  -DPYTHON_EXECUTABLE=/home/mpinheiro/.pyenv/versions/3.8.2/bin/python \
  -DPYTHON_LIBRARY=/home/mpinheiro/.pyenv/versions/3.8.2/lib/libpython3.8.a \
  -DPYTHON_INCLUDE_DIR=/home/mpinheiro/.pyenv/versions/3.8.2/include/python3.8 \
  -DPYTHON_NUMPY_INCLUDE_PATH="$(python -c 'import numpy; print(numpy.get_include())')" \
  -DBOOST_ROOT=/home/mpinheiro/codes/boost-1.67/ \
  -DBOOST_INCLUDEDIR=/home/mpinheiro/codes/boost-1.67/include/boost \
  -DBOOST_LIBRARYDIR=/home/mpinheiro/codes/boost-1.67/lib ..

make -j 4 > make.log
make install

I have also checked the shared libraries that the rdBase.so file links
against, as shown below, and everything seems to be fine:

linux-vdso.so.1 =>  (0x2aaab000)
libRDKitRDBoost.so.1 => /home/mpinheiro/codes/rdkit-2020.09/lib/libRDKitRDBoost.so.1 (0x2adb1000)
libboost_python38.so.1.67.0 => /home/mpinheiro/codes/boost-1.67/lib/libboost_python38.so.1.67.0 (0x2afb5000)
libRDKitRDGeneral.so.1 => /home/mpinheiro/codes/rdkit-2020.09/lib/libRDKitRDGeneral.so.1 (0x2b1fb000)
libpthread.so.0 => /usr/lib64/libpthread.so.0 (0x2b423000)
libstdc++.so.6 => /trinity/shared/apps/custom/x86_64/gcc-8.1.0/lib64/libstdc++.so.6 (0x2b64)
libm.so.6 => /usr/lib64/libm.so.6 (0x2b9c4000)
libgcc_s.so.1 => /trinity/shared/apps/custom/x86_64/gcc-8.1.0/lib64/libgcc_s.so.1 (0x2bcc6000)
libc.so.6 => /usr/lib64/libc.so.6 (0x2bedf000)
librt.so.1 => /usr/lib64/librt.so.1 (0x2c2a2000)
libdl.so.2 => /usr/lib64/libdl.so.2 (0x2c4aa000)
libutil.so.1 => /usr/lib64/libutil.so.1 (0x2c6af000)
/lib64/ld-linux-x86-64.so.2 (0x4000)

As I said, I have tried many different tricks and suggestions found in the
forum, but none of them got the code working. So I would like to ask
whether someone has faced a similar problem and has some tips on how to
fix it. I would really appreciate any help you can provide on this issue.

Thanks!

Max Pinheiro Jr


Re: [Rdkit-discuss] transparency when highlighting atoms with defined colors

2018-12-20 Thread Max Pinheiro Jr
Hi Jose,

Have you tried the "DrawMolecules" method of the
"rdMolDraw2D.MolDraw2DSVG" class? I think this class has some other options
that let you change the highlight colors while still keeping the alpha
transparency. I ran a test here with your code and it seems to give the
result you are looking for:

from rdkit import Chem
from rdkit.Chem.Draw import rdMolDraw2D
from matplotlib.colors import ColorConverter
from IPython.display import SVG

m = Chem.MolFromSmiles("O1CCCC1") # tetrahydrofuran
highlight = [0] # oxygen

matches = m.GetSubstructMatches(Chem.MolFromSmiles('CO'))
print(matches)
color_1 = {}
color_2 = {}
# Map atom indices taken from the two C-O matches to RGB tuples
for i, j in zip(matches[0], matches[1]):
    color_1[i] = ColorConverter().to_rgb('lightgray')
    color_2[j] = ColorConverter().to_rgb('skyblue')
tm = rdMolDraw2D.PrepareMolForDrawing(m)
view = rdMolDraw2D.MolDraw2DSVG(660, 350)
option = view.drawOptions()
option.padding = 0.13
option.legendFontSize = 18
view.DrawMolecules([tm], highlightAtoms=[highlight],
                   highlightAtomColors=[color_2])
view.FinishDrawing()
svg = view.GetDrawingText()
with open('./example.svg', 'w') as f:
    f.write(svg)
SVG(svg.replace('svg:', ''))

I hope it works for you.
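
If the alpha value is still ignored in your setup, one possible workaround
(my own assumption, not something provided by the RDKit API) is to
pre-blend the highlight color with the white background and pass the
resulting paler RGB tuple in highlightAtomColors. A minimal sketch in
plain Python, with a helper name of my own choosing:

```python
def blend_with_white(rgb, alpha):
    """Composite an RGB color (components in 0..1) over a white background."""
    return tuple(alpha * c + (1.0 - alpha) * 1.0 for c in rgb)

# Pure red at 50% opacity over white gives a pale red
pale_red = blend_with_white((1.0, 0.0, 0.0), 0.5)
print(pale_red)  # (1.0, 0.5, 0.5)
```

This only mimics the look of transparency on a white canvas; overlapping
highlights will not mix the way real alpha blending would.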

All the best,

Max

On Thu, 20 Dec 2018 at 14:33, Jose Manuel Gally <
jose.manuel.ga...@gmail.com> wrote:

> Hi all,
>
> I am trying to highlight substructures found in a set of molecules.
>
> However, if I use a specific color in Draw.MolsToGridImage by defining the
> highlightAtomColors parameter, I lose the transparency effect, so some
> atoms can be hidden by the highlight.
>
> Is there a way to set the transparency level (alpha) with defined
> highlightAtomColors?
>
> I could not find any relevant parameter in the documentation (alpha is
> mentioned only once in the calcAtomGaussians docstring) or in this
> mailing list.
>
> I also tried to set an additional value in the RGB code (i.e.
> RGBa=(1, 0, 0, 0.5)), but it seems to be simply ignored.
>
> Please find attached a notebook with a very simple example.
>
> Am I missing something obvious?
>
> By default there seems to be some transparency applied when highlighting.
>
> Thank you for your help!
>
> Best regards,
> Jose Manuel


[Rdkit-discuss] Highlight bonds with translucent color

2018-12-14 Thread Max Pinheiro Jr
Dear RDKit team,

I started using the RDKit package quite recently, first to draw 2D
depictions of a large set of molecules with a Python script. My script is
finally working, but I am still trying to improve the styling of the
figures. Specifically, I am applying RDKit's highlighting to some special
bonds (or atoms) in order to get something similar to the examples I found
on the RDKit blog (
http://rdkit.blogspot.com/search?updated-max=2018-10-16T20:10:00-07:00=2).
However, in the figures created with my script the bonds are highlighted
in solid red, without the alpha transparency seen in the blog's examples.
I followed essentially the same procedure as described in the examples,
except that I am exporting the images as PNG rather than SVG. Does anyone
know how to reproduce the transparent bond highlighting shown in the RDKit
blog? Could this be a problem related to the version of RDKit I am using
(the most recent one)?

I will be grateful for any comment or suggestion that can help me to solve
this.

Below are the main lines of the Python script I am using to create the
images.

import numpy as np
from rdkit import Chem
from rdkit.Chem import Draw

# Column 0 holds the labels, column 1 the SMILES strings (header row skipped)
labels = np.genfromtxt('./smiles.dat', usecols=0, skip_header=1, dtype=str)
data = np.genfromtxt('./smiles.dat', usecols=1, skip_header=1, dtype=str)

mols = [Chem.MolFromSmiles(smiles) for smiles in data]

# Collect the indices of the aromatic atoms of each molecule for highlighting
heavy_atoms = []
for m in mols:
    idx = [atom.GetIdx() for atom in m.GetAtoms() if atom.GetIsAromatic()]
    heavy_atoms.append(idx)

opt = Draw.DrawingOptions()
#opt.bgColor = None
opt.bondLineWidth = 1.80
opt.clearBackground = False

img = Draw.MolsToGridImage(mols[:60], molsPerRow=5, subImgSize=(400, 150),
                           highlightAtomLists=heavy_atoms[:60],
                           options=opt, legends=[str(x) for x in labels[:60]])
img.save('test_grid.png')