Re: [Rdkit-discuss] MolFromPDBBlock and heterocycles

2016-09-07 Thread Sereina Riniker
Hi Steven,

The PDB reader in the RDKit doesn’t determine any bond orders - everything
is read as a single bond.
In order to set the bond orders, you need to call the
AssignBondOrdersFromTemplate() function using a reference molecule
generated from SMILES (or SDF).

Here is some example code from the docs:

>>> import os
>>> from rdkit import RDConfig
>>> from rdkit.Chem import AllChem
>>> template = AllChem.MolFromSmiles("CN1C(=NC(C1=O)(c2ccccc2)c3ccccc3)N")
>>> mol = AllChem.MolFromPDBFile(os.path.join(RDConfig.RDCodeDir, 'Chem',
...                                           'test_data', '4DJU_lig.pdb'))
>>> len([1 for b in template.GetBonds() if b.GetBondTypeAsDouble() == 1.0])
8
>>> len([1 for b in mol.GetBonds() if b.GetBondTypeAsDouble() == 1.0])
22

Now assign the bond orders based on the template molecule
>>> newMol = AllChem.AssignBondOrdersFromTemplate(template, mol)
>>> len([1 for b in newMol.GetBonds() if b.GetBondTypeAsDouble() == 1.0])
8

Note that the template molecule should have no explicit hydrogens
else the algorithm will fail.

Hope this helps.

Best,
Sereina


2016-09-07 17:16 GMT+02:00 Steven Combs :

> Hello!
>
> I have a pdb block that I am working with, which is attached to this
> email. The ligand has aromatic ring structures in it; however, when it is
> read into RDKit and converted into a smiles string, the aromatic rings are
> converted into aliphatic rings. Any thoughts?
>
> Here is the python code:
>
> def extract_data(filename):
>     extracted_info = ""
>     with open(filename) as f:
>         for line in f.readlines():
>             if "HETATM" in line:
>                 extracted_info += line
>     return extracted_info
>
> for index, filename in enumerate(solution_pdb_filenames):
>     row = extract_data(filename)
>     m = Chem.MolFromPDBBlock(row, sanitize=True, removeHs=False)
>     Chem.SetHybridization(m)
>     Chem.SetAromaticity(m)
>     Chem.SanitizeMol(m, sanitizeOps=Chem.rdmolops.SanitizeFlags.SANITIZE_ALL)
>     # not needed since sanitizing during read in, but trying to figure out
>     # if it actually worked
>     print("Parsing file " + str(index) + " of " + str(len(solution_pdb_filenames)))
>     print(Chem.MolToSmiles(m, kekuleSmiles=True, allHsExplicit=True))
>
> The output smile string is:
>
> [H][O][CH]1[NH][CH]([C]([H])([H])[CH]([OH])[OH])[CH]([C]([H]
> )([H])[C]([H])([H])[H])[CH]([CH]([OH])[CH]2[CH]([H])[CH]([
> H])[CH]([H])[CH]([N]([H])[H])[CH]2[H])[CH]1[N]([C]([H])([H])
> [H])[C]([H])([H])[H]
>
> Steven Combs
--
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Benchmarking platform

2013-07-05 Thread sereina riniker
Dear all,

The source code and compound lists of the benchmarking platform discussed
in J. Cheminf. 5, 26, 2013 (http://www.jcheminf.com/content/5/1/26) are now
available as a separate repository of RDKit on github
(rdkit/benchmarking_platform). The platform is based on RDKit and includes
88 data sets from three public data sources (MUV, DUD and ChEMBL) together
with precalculated training lists (i.e. indices of randomly selected
training molecules) for 5, 10 and 20 training actives.

I hope some of you find this interesting and if you have questions, please
don't hesitate to contact me.

Best regards,
Sereina


[Rdkit-discuss] USR/USRCAT implementation in RDKit

2013-08-28 Thread sereina riniker
Dear all,

A C++ implementation and Python wrappers of the ultrafast shape recognition
(USR) descriptor (Ballester and Richards, J. Comput. Chem. (2007), 28,
1711) and the USR CREDO atom types (USRCAT) descriptor (Schreyer and
Blundell, J. Cheminf. (2012), 4, 27) are now available for the RDKit. The
code is based on the Python implementations of Jan Domanski and Adrian
Schreyer.

The descriptors can be accessed from Python via
rdkit.Chem.rdMolDescriptors.GetUSR and
rdkit.Chem.rdMolDescriptors.GetUSRCAT.
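
A minimal usage sketch, assuming two arbitrary example molecules and a single embedded conformer each. The helper name is hypothetical, and GetUSRScore is not mentioned in the announcement above; it is the scoring function I know from the same module.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolDescriptors


def embed(smiles, seed=7):
    """Hypothetical helper: build a molecule with one 3D conformer."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMolecule(mol, randomSeed=seed)
    return mol


mol_a = embed("CC(=O)Oc1ccccc1C(=O)O")  # aspirin, arbitrary example
mol_b = embed("OC(=O)c1ccccc1O")        # salicylic acid, arbitrary example

# Both descriptors need 3D coordinates; by default the first conformer is used.
usr_a = rdMolDescriptors.GetUSR(mol_a)
usr_b = rdMolDescriptors.GetUSR(mol_b)
usrcat_a = rdMolDescriptors.GetUSRCAT(mol_a)

print(len(usr_a), len(usrcat_a))
# Similarity between two USR descriptors
print(rdMolDescriptors.GetUSRScore(usr_a, usr_b))
```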

Best regards,
Sereina


Re: [Rdkit-discuss] MMFF Problem

2013-10-08 Thread sereina riniker
Hi Nick (and Paolo),

It's not the molecules that trigger the infinite loop (I tried it with one
of mine and it hung as well). The problem is that no coordinates are
added for the hydrogens in the script. If you set addCoords=True when
adding the hydrogens, the minimization works (at least it did for me...).

for mol in suppl:
    molList.append(Chem.AddHs(mol, addCoords=True))
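
A self-contained sketch of the fix, assuming ethanol as an arbitrary stand-in for a molecule read from the SD file. It uses the MMFFOptimizeMolecule convenience wrapper rather than the force-field calls in Nick's script.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolTransforms

# Stand-in for a molecule from an SD file: heavy atoms with 3D coordinates only.
mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))
AllChem.EmbedMolecule(mol, randomSeed=11)
heavy_only = Chem.RemoveHs(mol)

# addCoords=True places each new hydrogen next to its parent atom
# instead of leaving it at the origin.
with_hs = Chem.AddHs(heavy_only, addCoords=True)

# Distance from every hydrogen to the atom it is bonded to.
conf = with_hs.GetConformer()
h_bond_lengths = [
    rdMolTransforms.GetBondLength(conf, atom.GetIdx(), atom.GetNeighbors()[0].GetIdx())
    for atom in with_hs.GetAtoms()
    if atom.GetAtomicNum() == 1
]
print(h_bond_lengths)

# Returns 0 if converged, 1 if more iterations are needed.
status = AllChem.MMFFOptimizeMolecule(with_hs, maxIters=500)
print(status)
```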

Best,
Sereina



2013/10/8 Paolo Tosco 

>  Hi Nick,
>
> would you mind sending me the SD file which is triggering the infinite
> loop? Then I'll come back to you as soon as I find out something.
>
> Best,
> p.
>
>
>
> On 10/08/2013 11:36 AM, Nicholas Firth wrote:
>
> Hi RDKitters,
>
>  I'm having an issue using the MMFF to minimise a CORINA conformation.
> I've written a little script which adds hydrogens to a molecule then tries
> to use the MMFF forcefield to minimise the conformer. The problem is that
> the script hangs on the minimise step.
>
>  This error only occurs when I add hydrogens to the conformation, I
> assume the reason for this is because the hydrogens are all added at the
> origin. Is there a way of getting round this (in RDKit, as I want to keep
> the AddHs function)?
>
>  I've included the script below. It works as written; however, if you
> swap the commented and uncommented lines in the first for loop, it no
> longer works.
>
>
> from rdkit import Chem
> from rdkit.Chem import AllChem
> from rdkit.Chem import ChemicalForceFields
> from sys import argv
>
> suppl = Chem.SDMolSupplier(argv[1])
> molList = []
>
> for mol in suppl:
>     #molList.append(Chem.AddHs(mol))
>     molList.append(mol)
> del suppl
>
> #w = Chem.SDWriter(argv[1])
> w = Chem.SDWriter('test.sdf')
>
> for mol in molList:
>     mp = ChemicalForceFields.MMFFGetMoleculeProperties(mol)
>     field = AllChem.MMFFGetMoleculeForceField(mol, mp)
>     field.Minimize()
>     w.write(mol)
>
> w.close()
>
>
> Thanks in advance.
>
>  Best,
> Nick

Re: [Rdkit-discuss] MMFF Problem

2013-10-08 Thread sereina riniker
By the way, UFF also hangs if too many hydrogens with zero coordinates are
present, so it seems to be a general problem. This will be fixed at some
point.

Best,
Sereina


2013/10/8 Nicholas Firth 

> Hi,
>
> Thanks Sereina, that's exactly what I'm after! I thought there must be a
> way to do that but didn't look at the function definition. Next time I'll
> remember to check.
>
> Thanks Paolo as well.
>
>
> Best,
> Nick

Re: [Rdkit-discuss] Beta of Q3 2013 release available

2013-10-25 Thread sereina riniker
Hi James,

Regarding the AssignBondOrdersFromTemplate() method:
As far as I understand, the PDB reader assigns bond orders to the amino
acids in a protein, but if a ligand is present it sets all of the ligand's
bonds to SINGLE, as automatic bond-type perception is not trivial (see
Roger's comments). However, one usually knows which ligand was crystallized
(i.e. the SMILES is available), so the AssignBondOrdersFromTemplate() method
can be used to set the bond orders based on the known ligand structure. This
is the idea of the method. Now, to your real-world application: I'm sorry,
but I don't think I understand it completely. Do you want to set only the
bond orders of a specific substructure? Or would you like to give the
function a set of ligands and a set of templates and have it figure out which
template belongs to which ligand and set the bond orders accordingly?

Best,
Sereina



2013/10/24 Greg Landrum 

> James,
>
> On Thu, Oct 24, 2013 at 7:27 PM, James Davidson 
> wrote:
>
>>  Hi Greg (et al.),
>>
>> Thanks for the beta!  I have been going through some of the
>> recently-added functionality, and had a couple of questions regarding the
>> PDB reading / writing.
>>
>
> Thanks for the bug reports!
>
>>
>> 1. Do I remember correctly that there was a proposal (from
>> Roger) to add some auto bond-type perception to the PDB parser for ligands
>> (or is that just wishful thinking!)?
>>
> Roger will have to confirm this, but I believe he said something along the
> lines of "that way lies madness".
>
>> 2. If not, I notice that there is an
>> AssignBondOrdersFromTemplate() method – but the example in the doc-string
>> only shows (I think) the case where the input PDB is just a single small
>> molecule – so the matching is pretty easy!  I think a more real-World case
>> is when one wants to set the bond orders for multiple ligands (HETATM
>> residues) based on substructure matches – which will then return an atom
>> index selection that can be used as a start point.  Is there any way to
>> have the AssignBondOrdersFromTemplate() convenience function optionally
>> accept a list of atom indexes to specify a substructure?
>>
> Sereina? Is that doable?
>
>>
>> 3. Is there some explanation for what the ‘flavor’ option does
>> for reading/writing PDB?
>>
> I'm not sure about the reader. Roger, can you answer that?
>
> This is what's in the C++ for the PDBWriter:
> // PDBWriter support multiple "flavors" of PDB output
> // flavor & 1 : Write MODEL/ENDMDL lines around each record
> // flavor & 2 : Don't write any CONECT records
> // flavor & 4 : Write CONECT records in both directions
> // flavor & 8 : Don't use multiple CONECTs to encode bond order
> // flavor & 16 : Write MASTER record
> // flavor & 32 : Write TER record
>
> This is now in the docs for both the Python and C++ code.
>
>>
>> 4. Having read in a PDB file I see the correct atoms flagged
>> as HETATM (from GetIsHeteroAtom()).  But when I call Chem.MolToPDBBlock()
>> these atoms get written as ATOM records…  Also, a Chem.MolToPDBFile()
>> method would be nice for completeness / symmetry : )
>>
> The HETATM thing was the result of a dumb copy and paste error from me.
> It's fixed.
>
> Re: Chem.MolToPDBFile()
> that's missing because there's no corresponding Chem.MolToMolFile()
> This is an odd oversight, which I've now fixed.
>
>>
>> 5. It seems to me that GetResidueNumber() and
>> GetSerialNumber() may have got mixed-up at some point(?).  At least, when I
>> call GetSerialNumber() I see what appears to be the residue number; and
>> when I call GetResidueNumber() I get “0”!
>>
> This was another dumb bug from me. It's fixed.
>
>>
>> 6. I also seem to be seeing all of the bonds (for all
>> residues) being written out in CONECT records – such that they all appear
>> as single bonds in e.g. PyMOL – is this expected behaviour at the moment?
>>
> Another one for Roger.
>
> -greg
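
The flavor bits Greg lists for the PDBWriter combine with bitwise OR. A small sketch of that: the enum and its member names are made up for illustration (RDKit itself takes a plain integer), and only the bit values come from the comment quoted above.

```python
from enum import IntFlag


class PDBWriterFlavor(IntFlag):
    """Hypothetical names for the PDBWriter flavor bits; not an RDKit class."""
    MODEL_RECORDS = 1      # write MODEL/ENDMDL lines around each record
    NO_CONECT = 2          # don't write any CONECT records
    CONECT_BOTH_WAYS = 4   # write CONECT records in both directions
    NO_MULTI_CONECT = 8    # don't use multiple CONECTs to encode bond order
    MASTER_RECORD = 16     # write MASTER record
    TER_RECORD = 32        # write TER record


def describe(flavor):
    """Return the names of the bits set in an integer flavor value."""
    return [flag.name for flag in PDBWriterFlavor if flavor & flag]


flavor = int(PDBWriterFlavor.NO_MULTI_CONECT | PDBWriterFlavor.TER_RECORD)
print(flavor)            # 40
print(describe(flavor))  # ['NO_MULTI_CONECT', 'TER_RECORD']
```

The resulting integer is what would be passed as the flavor argument, e.g. Chem.MolToPDBBlock(mol, flavor=flavor).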

Re: [Rdkit-discuss] Beta of Q3 2013 release available

2013-10-25 Thread sereina riniker
Hi James,

Okay, now it's clear. I somehow (wrongly) thought the PDB reader would give
you the protein and the ligand as two molecules, in which case it wouldn't
have been a problem... I will discuss with Greg how best to do this and get
back to you.

Best,
Sereina


2013/10/25 James Davidson 

> Hi Sereina,
>
> Sereina wrote:
> > Regarding the AssignBondOrdersFromTemplate() method:
> > As far as I understood, the PDB reader assigns bond orders to the amino
> > acids in a protein, but if a ligand is present it puts all bonds of it to
> > SINGLE bonds as auto bond-type perception is not trivial (see Roger's
> > comments).
> > However, usually one knows which ligand was crystallized (i.e. the
> > SMILES is available), so the AssignBondOrdersFromTemplate() method can be
> > used to set the bond orders based on the known ligand structure.
> > This is the idea of the method. Now, to your real-world application. I'm
> > sorry but I don't think I understand it completely. Do you want to set only
> > the bond orders of a specific substructure?
> > Or would you like to give the function a set of ligands and a set of
> > templates and it figures out which template belongs to which ligand and
> > sets the bonds orders accordingly?
>
> This is very likely to be me being stupid - so please bear with me!
> If I read in a complex (pdb), and already have my reference ligand (lig),
> then AllChem.AssignBondOrdersFromTemplate(lig, pdb) fails because the
> reference ligand has not been matched to the ligand in the pdb 'complex'
> (dot-separated list of molecules).
> The doc-string states that the method works on two molecules - but I want
> to work on a reference molecule (lig) and a *substructure* of the
> macromolecule (pdb).  How should I be getting the bound ligand out as a
> molecule object to then use the AssignBondOrdersFromTemplate() method?  Am
> I missing some new PDB-related methods, or have I forgotten some
> fundamental RDKit methods for dealing with multi-component molecules?
>
> I guess a sensible process would be:
> 1. Identify any HETATM residues
> 2. For each residue (or at least those that have bonds!) extract or copy
> the mol (unless it can be addressed 'in place'?)
> 3. Use AssignBondOrdersFromTemplate() - relying on lookup by e.g. residue
> name, etc.
> 4. Insert the molecule back into the complex (or update the info if it has
> been modified 'in place')
>
> Is this how the method is intended to be used with complexes (and if so,
> do you have an example for steps 2 and 4)?
>
> Thanks
>
> James
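
One possible way to approach steps 1 to 3 of James's list, sketched under assumptions: the complex is treated as a set of disconnected fragments, and the template is tried on each fragment with the right heavy-atom count. Both helper names are made up for illustration, the "complex" is a toy stand-in (a ligand plus one unrelated component), and step 4 (putting the ligand back into the complex) is not covered.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def single_bonded_from_pdb(smiles, seed=42):
    """Hypothetical helper: mimic a PDB read in which no bond orders are perceived."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMolecule(mol, randomSeed=seed)
    # flavor=2: no CONECT records, so bonds are perceived from distances only
    block = Chem.MolToPDBBlock(Chem.RemoveHs(mol), flavor=2)
    return Chem.MolFromPDBBlock(block)


def assign_ligand_bond_orders(complex_mol, template):
    """Hypothetical helper: try the template on each disconnected fragment in turn."""
    for frag in Chem.GetMolFrags(complex_mol, asMols=True):
        if frag.GetNumHeavyAtoms() != template.GetNumHeavyAtoms():
            continue
        try:
            return AllChem.AssignBondOrdersFromTemplate(template, frag)
        except ValueError:
            continue  # raised when the template does not match this fragment
    return None


template = Chem.MolFromSmiles("Oc1ccccc1")  # assumed ligand (phenol)
# Toy "complex": an unrelated component plus the ligand, both all-single-bond.
complex_mol = Chem.CombineMols(single_bonded_from_pdb("CCCCN"),
                               single_bonded_from_pdb("Oc1ccccc1"))

ligand = assign_ligand_bond_orders(complex_mol, template)
print(Chem.MolToSmiles(ligand))
```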


Re: [Rdkit-discuss] MolFromXYZ?

2013-11-04 Thread sereina riniker
Hi Michal,

Well, if you have your 3D coordinates as a PDB file, you can read them in
with the new PDB parser and assign the bond orders based on a template
(generated from the SMILES of your molecule):
from rdkit import Chem
from rdkit.Chem import AllChem
tmp = Chem.MolFromPDBFile(yourfilename)
template = Chem.MolFromSmiles(yoursmiles)
mol = AllChem.AssignBondOrdersFromTemplate(template, tmp)

I don't know if this is what you were looking for.

Best,
Sereina



2013/11/4 Michal Krompiec 

> Hello,
> Is it possible to construct a Mol (or EditableMol) object out of a
> list of 3D coordinates? I am trying to write a bridge between cclib
> and RDKit, and I need a function to convert 3D geometries to SDF.
> Thanks,
> Michal


Re: [Rdkit-discuss] Similarity map images save just bottom corner

2013-11-04 Thread sereina riniker
Hi Anthony,

It has something to do with the bounding boxes (they get scaled during the
map generation process).
Using bbox_inches='tight', however, worked for me, i.e.
image.savefig("out.png", bbox_inches='tight')
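
The effect can be reproduced with plain Matplotlib, without RDKit. This sketch assumes the cropping comes from axes that extend past the figure canvas (my reading of the bounding-box remark above) and imitates that with axes twice the size of the figure. The helper name is hypothetical.

```python
import io
import struct

import matplotlib
matplotlib.use("Agg")  # headless backend, so the sketch runs without a display
import matplotlib.pyplot as plt


def png_size(fig, **savefig_kwargs):
    """Save the figure to memory and read (width, height) in pixels from the PNG header."""
    buf = io.BytesIO()
    fig.savefig(buf, format="png", dpi=100, **savefig_kwargs)
    # PNG layout: 8-byte signature, 4-byte length, "IHDR", then width and height
    return struct.unpack(">II", buf.getvalue()[16:24])


fig = plt.figure(figsize=(2, 2))
# Axes twice as large as the figure, to imitate a plot scaled past the canvas.
ax = fig.add_axes([0, 0, 2, 2])
ax.plot([0, 1], [0, 1])

default_size = png_size(fig)                     # only the figure canvas is saved
tight_size = png_size(fig, bbox_inches="tight")  # bounding box grows to fit the axes
print(default_size, tight_size)
```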

I hope this helps.

Best,
Sereina



2013/11/4 Anthony Bradley 

> Hi all,
>
> I’m having some difficulty trying to save the new similarity map images.
> (They’re super cool by the way!)
>
> If I do the following in a python shell:
>
> from rdkit import Chem
> from rdkit.Chem.Draw import SimilarityMaps
> image = SimilarityMaps.GetSimilarityMapFromWeights(Chem.MolFromSmiles("CCC"),[1,2,3])
> # Just a dummy image
> image.savefig("out.png")
> # The outputted image is just the bottom left hand corner
>
> The image saved is cropped to the left hand corner (saved.png). It will
> render perfectly in an IPython notebook however. (ipython.png)
>
> Am I missing something here about saving Matplotlib images?
>
> Cheers,
>
> Anthony


[Rdkit-discuss] Extension of benchmarking platform

2013-11-25 Thread sereina riniker
Dear all,

The extension of the benchmarking platform to machine-learning (ML) methods
presented at the UGM 2013 and described in J. Chem. Inf. Model., 53, 2829,
2013 is now available from rdkit/benchmarking_platform. This includes
scripts for three different ML methods (random forest, Naive Bayes and
logistic regression) and classifier fusion, as well as the additional data
sets and training lists.

A tag/release has been created for the original version (as described in J.
Cheminf., 5, 26, 2013).

Please let me know if you have any questions or encounter any problems.

Best,
Sereina


Re: [Rdkit-discuss] PDB reader and bond perception

2014-01-13 Thread sereina riniker
Hi JP,

If you also have a SMILES of the molecule you want to read from PDB, you
can assign the bond orders based on this template:

from rdkit import Chem
from rdkit.Chem import AllChem
tmp = Chem.MolFromPDBFile(yourfilename)
template = Chem.MolFromSmiles(yoursmiles)
mol = AllChem.AssignBondOrdersFromTemplate(template, tmp)

Is this what you're looking for?

Best,
Sereina


2014/1/13 JP 

> RDKitters!
>
> Finally back on the mailing list!
>
> I am sure we've been through this at the UGM (my mind must have wandered
> off!), but a quick question about the PDB reader and bond perception.  Is
> this supported with the current PDB reader?  I remember that someone
> (PaulE, perhaps?) was saying bond perception was painful, but there was
> some dictionary for PDB ligands which helps (any idea the name of this
> dictionary?).
>
> To the technical details.
>
> I am reading in the following PDB file with a simple MolFromPDBFile() call:
>
> HETATM    1  O1P 84T A1862     -27.016   9.387 -72.564  1.00 20.81           O
> HETATM    2  P   84T A1862     -27.282   9.818 -73.968  1.00 19.65           P
> HETATM    3  O2P 84T A1862     -27.881  11.176 -74.182  1.00 21.49           O
> HETATM    4  N   84T A1862     -25.869   9.583 -74.813  1.00 19.78           N
> HETATM    5  C   84T A1862     -25.759  10.010 -76.075  1.00 19.97           C
> HETATM    6  CA  84T A1862     -24.493   9.748 -76.807  1.00 19.75           C
> HETATM    7  CB  84T A1862     -24.794   8.678 -77.847  1.00 19.73           C
> HETATM    8  CG  84T A1862     -23.571   8.324 -78.681  1.00 19.70           C
> HETATM    9  CD2 84T A1862     -23.309   9.519 -79.611  1.00 18.49           C
> HETATM   10  CD1 84T A1862     -23.863   6.932 -79.305  1.00 18.60           C
> HETATM   11  OHB 84T A1862     -25.210   7.467 -77.223  1.00 19.17           O
> HETATM   12  OH  84T A1862     -23.549   9.127 -75.984  1.00 20.33           O
> HETATM   13  O   84T A1862     -26.672  10.517 -76.692  1.00 20.26           O
> HETATM   14  O5' 84T A1862     -28.377   8.861 -74.619  1.00 19.39           O
> HETATM   15  C5' 84T A1862     -28.002   7.536 -74.954  1.00 18.47           C
> HETATM   16  C4' 84T A1862     -28.909   7.000 -76.012  1.00 18.24           C
> HETATM   17  C3' 84T A1862     -28.901   7.826 -77.298  1.00 18.28           C
> HETATM   18  C2' 84T A1862     -30.318   7.610 -77.768  1.00 18.69           C
> HETATM   19  O2' 84T A1862     -30.789   8.641 -78.581  1.00 19.64           O
> HETATM   20  O4' 84T A1862     -30.262   6.951 -75.529  1.00 18.80           O
> HETATM   21  C1' 84T A1862     -31.152   7.470 -76.521  1.00 19.01           C
> HETATM   22  N9  84T A1862     -31.753   8.732 -76.009  1.00 20.08           N
> HETATM   23  C4  84T A1862     -33.033   9.013 -76.158  1.00 21.10           C
> HETATM   24  N3  84T A1862     -34.018   8.339 -76.786  1.00 21.58           N
> HETATM   25  C2  84T A1862     -35.263   8.846 -76.830  1.00 21.95           C
> HETATM   26  C8  84T A1862     -31.223   9.701 -75.291  1.00 20.27           C
> HETATM   27  N7  84T A1862     -32.173  10.618 -75.019  1.00 21.28           N
> HETATM   28  C5  84T A1862     -33.315  10.213 -75.563  1.00 21.81           C
> HETATM   29  C6  84T A1862     -34.624  10.702 -75.627  1.00 22.85           C
> HETATM   30  N1  84T A1862     -35.550  10.010 -76.285  1.00 22.44           N
> HETATM   31  N6  84T A1862     -35.008  11.862 -75.052  1.00 23.86           N
> TER
> END
>
> But I am losing all the double bond (and aromatic) information:
>
> m = Chem.MolFromPDBFile(sys.argv[1])
> print(Chem.MolToSmiles(m))
>
> Gives me:
>
> CC(C)C(O)C(O)C(O)NP(O)(O)OCC1CC(O)C(N2CNC3C2NCNC3N)O1
>
> As usual, many thanks for your time,
>
> -
> Jean-Paul Ebejer
> Early Stage Researcher
>


Re: [Rdkit-discuss] PDB reader and bond perception

2014-01-14 Thread sereina riniker
Hi JP,

> However I am unable to get bond orders for the protein side - am I doing
> something wrong or is this the intended behaviour?
> I imagine I can use AssignBondOrdersFromTemplate() for the 20 amino acids
> and set these myself -- or is there a better way to do this?
>

I don't know why your protein doesn't get bond orders; the PDB parser should
know the standard amino acids. At least it worked for me when I tried
Chem.MolFromPDBFile() in the past. Which PDB structure are you trying to read?


>  Also, is there a way to make AssignBondOrdersFromTemplate assign bond
> orders to all matches?
>

The function was meant for assigning bonds based on an entire molecule. It
would probably not be so difficult to change this (with default = match
only one), if it is really needed.


> Also another thing I don't quite understand is in the following below
> code, I get a "WARNING: More than one matching pattern found - picking one"
> but how can my template match multiple times (this is not symmetrical)?
>

The way the AssignBondOrdersFromTemplate() function works is the following:
1) a copy of the template is generated where all bonds are set to single
bonds
2) this single-bonds copy is used for a substructure match with the query
molecule
3) bond orders are assigned based on this match and the original template

If you get this warning, it means that there is some symmetry in the
"all-single-bonds-stage" of your molecule. In your case, I guess it's the
carboxylic acids which can match two ways when there are only single bonds.
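
The three steps above can be sketched roughly as follows. This is only an
illustration, not the library's internal code: acetic acid is an arbitrary
example, pdb_like is a hypothetical stand-in for a ligand read from a PDB
file (all single bonds), and the use of uniquify=False to expose the two
symmetry-equivalent mappings is an assumption made for this sketch.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Template with the correct bond orders (acetic acid, illustrative only).
template = Chem.MolFromSmiles("CC(=O)O")

# Hypothetical stand-in for a PDB-read ligand: same atoms, single bonds only.
pdb_like = Chem.MolFromSmiles("CC(O)O")

# Step 1: copy the template and set every bond to a single bond.
single_template = Chem.RWMol(template)
for bond in single_template.GetBonds():
    bond.SetBondType(Chem.BondType.SINGLE)
    bond.SetIsAromatic(False)

# Step 2: substructure match of the single-bond copy against the molecule.
# uniquify=False keeps mappings that differ only by swapping the two oxygens.
matches = pdb_like.GetSubstructMatches(single_template, uniquify=False)
n_matches = len(matches)

# Step 3: let RDKit transfer the bond orders from the original template.
fixed = AllChem.AssignBondOrdersFromTemplate(template, pdb_like)
n_double = sum(1 for b in fixed.GetBonds()
               if b.GetBondType() == Chem.BondType.DOUBLE)

print(n_matches, Chem.MolToSmiles(fixed))
```

With only single bonds the two oxygens are equivalent, so more than one
mapping is found, which is the situation the warning reports.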

I hope this helps.

Best,
Sereina



>
>
>
> On 13 January 2014 21:02, JP  wrote:
> >
> > Thanks All - I think I am in a good place now.
> >
> > I can get the SMILES from Paul's mmcif links and then I can use Sereina
> magic three lines to do what I want.  I'd cross my fingers - but with RDKit
> you don't need to.
> > This works for all Chemical Components (or what other fashionable name
> they go by these days) in the PDB.
> >
> > For posterity: I have found a post in the mailing list started by James
> which sheds some light on this:
> >
> https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg03481.html
> >
> >
> >
> >
> > On 13 January 2014 19:46, sereina riniker 
> wrote:
> >>
> >> Hi JP,
> >>
> >> If you have also a SMILES of the molecule you want to read from PDB,
> you can assign the bond orders based on this template:
> >>
> >> tmp = Chem.MolFromPDBFile(yourfilename)
> >> template = Chem.MolFromSmiles(yoursmiles)
> >> mol = AllChem.AssignBondOrdersFromTemplate(template, tmp)
> >>
> >> Is this what you're looking for?
> >>
> >> Best,
> >> Sereina
> >>
> >>
> >> 2014/1/13 JP 
> >>>
> >>> RDKitters!
> >>>
> >>> Finally back on the mailing list!
> >>>
> >>> I am sure we've been through this at the UGM (my mind must have
> wandered off!), but a quick question about the PDB reader and bond
> perception.  Is this supported with the current PDB reader?  I remember
> that someone (PaulE, perhaps?) was saying bond perception was painful, but
> there was some dictionary for PDB ligands which helps (any idea the name of
> this dictionary?).
> >>>
> >>> To the technical details.
> >>>
> >>> I am reading in the following PDB file with a simple MolFromPDBFile()
> call:
> >>>
> >>> HETATM1  O1P 84T A1862 -27.016   9.387 -72.564  1.00 20.81
>   O
> >>> HETATM2  P   84T A1862 -27.282   9.818 -73.968  1.00 19.65
>   P
> >>> HETATM3  O2P 84T A1862 -27.881  11.176 -74.182  1.00 21.49
>   O
> >>> HETATM4  N   84T A1862 -25.869   9.583 -74.813  1.00 19.78
>   N
> >>> HETATM5  C   84T A1862 -25.759  10.010 -76.075  1.00 19.97
>   C
> >>> HETATM6  CA  84T A1862 -24.493   9.748 -76.807  1.00 19.75
>   C
> >>> HETATM7  CB  84T A1862 -24.794   8.678 -77.847  1.00 19.73
>   C
> >>> HETATM8  CG  84T A1862 -23.571   8.324 -78.681  1.00 19.70
>   C
> >>> HETATM9  CD2 84T A1862 -23.309   9.519 -79.611  1.00 18.49
>   C
> >>> HETATM   10  CD1 84T A1862 -23.863   6.932 -79.305  1.00 18.60
>   C
> >>> HETATM   11  OHB 84T A1862 -25.210   7.467 -77.223  1.00 19.17
>   O
> >>> HETATM   12  OH  84T A1862 -23.549   9.127 -75.984  1.00 20.33
>   O
> >>> HETATM   13  O   84T A1862 -26.672  10.51

Re: [Rdkit-discuss] How to get coordinates for each atom in molecule?

2014-01-24 Thread sereina riniker
Hi Michael,

You can get the atom positions via the conformer:

from rdkit import Chem
from rdkit.Chem import AllChem

m = Chem.MolFromSmiles('c1ccccc1')
AllChem.Compute2DCoords(m)
pos = m.GetConformer().GetAtomPosition(0) # position of atom 0

This gives you a rdGeometry.Point3D - e.g. the x coordinates you get with:

x = pos.x

I hope this is what you were looking for.
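
To get the coordinates of every atom, the same conformer call can be put in
a loop. A minimal sketch, assuming a molecule with a single conformer;
benzene and the variable names are chosen only for illustration:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles("c1ccccc1")  # benzene, illustrative only
AllChem.Compute2DCoords(mol)          # adds a 2D conformer to the molecule
conf = mol.GetConformer()             # the default (first) conformer

coords = []
for atom in mol.GetAtoms():
    # Conformer positions are addressed by the atom index.
    pos = conf.GetAtomPosition(atom.GetIdx())
    coords.append((atom.GetSymbol(), pos.x, pos.y, pos.z))
    print(atom.GetIdx(), atom.GetSymbol(), pos.x, pos.y, pos.z)
```

For 2D coordinates the z component is simply zero; the same loop works for
a 3D conformer read from a molfile.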

Best,
Sereina




2014/1/24 Michał Nowotka 

> Hi,
>
> Let's say I loaded a molfile containing coordinates to RDKit mol
> object or loaded it from smiles but called
> AllChem.Compute2DCoords(mol).
> Now I would like to get coordinates for each atom. Unfortunately Atom
> class doesn't have any GetCoords method but this is understandable
> since position is optional. I tried to look into properties but it
> seems that they are stored in some stage container exported from C++:
>
> for atom in mol.GetAtoms():
>     print(atom.GetPropNames())
>:
> 
> 
> ...
>
>
> Some blind guesses such as: atom.GetProp('x'), atom.GetProp('X')
> failed. Mol object itself doesn't provide any method that would
> suggest that it can return coordinates
>
> So is there any way to get this data without parsing original molfile?
>
>
> Regards,
>
> Michal Nowotka
>
>


Re: [Rdkit-discuss] similarity maps look strange when displayed

2014-03-21 Thread sereina riniker
Hi Michal

I think this is related to a previous mailing list item (
https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg03528.html
).

It has probably something to do with the bounding boxes (they get scaled
during the map generation process). In the previous case it was enough to
set bbox_inches='tight' when saving the image to solve the problem. Maybe
there is something similar for mpld3.
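
As a rough illustration of what bbox_inches='tight' does when saving, here
is a plain matplotlib sketch. It does not use a real similarity map: the
oversized axes are a deliberately constructed stand-in for content that
extends beyond the figure canvas, and the Agg backend is selected only so
that the example runs without a display.

```python
import io
import struct

import matplotlib
matplotlib.use("Agg")  # headless backend, no window needed
import matplotlib.pyplot as plt


def png_size(data):
    """Read (width, height) in pixels from the PNG IHDR header."""
    return struct.unpack(">II", data[16:24])


fig = plt.figure(figsize=(2, 2))
# Axes deliberately larger than the canvas, so a plain save clips them.
ax = fig.add_axes([0, 0, 2.5, 2.5])
ax.plot([0, 1], [0, 1])

clipped = io.BytesIO()
fig.savefig(clipped, format="png", dpi=100)

tight = io.BytesIO()
fig.savefig(tight, format="png", dpi=100, bbox_inches="tight")

clipped_size = png_size(clipped.getvalue())
tight_size = png_size(tight.getvalue())
print(clipped_size, tight_size)
```

The plain save keeps the canvas size and cuts off the content, while the
tight save grows the output to include everything that was drawn.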

Best,
Sereina





2014-03-21 12:12 GMT+01:00 Michał Nowotka :

> Look at the following example:
>
> import gi
> from rdkit import Chem
> from rdkit.Chem import Draw
> from rdkit.Chem.Draw import SimilarityMaps
> import matplotlib.pyplot as plt
> mol = Chem.MolFromSmiles('COc1cccc2cc(C(=O)NCCCCN3CCN(c4cccc5nccnc54)CC3)oc21')
> refmol = Chem.MolFromSmiles('CCCN(CCCCN1CCN(c2ccccc2OC)CC1)Cc1ccc2ccccc2c1')
> fp = SimilarityMaps.GetAPFingerprint(mol, fpType='normal')
> fig, maxweight = SimilarityMaps.GetSimilarityMapForFingerprint(refmol,
> mol, SimilarityMaps.GetMorganFingerprint)
> plt.show()
>
> This displays the similarity map. Unfortunately the image is not scaled to fit
> the available area and it's not centered. This causes problems with the mpld3
> library, which converts matplotlib figures to JavaScript:
>
> from rdkit import Chem
> from rdkit.Chem import Draw
> from rdkit.Chem.Draw import SimilarityMaps
> import mpld3
> mol = Chem.MolFromSmiles('COc1cccc2cc(C(=O)NCCCCN3CCN(c4cccc5nccnc54)CC3)oc21')
> refmol = Chem.MolFromSmiles('CCCN(CCCCN1CCN(c2ccccc2OC)CC1)Cc1ccc2ccccc2c1')
> fp = SimilarityMaps.GetAPFingerprint(mol, fpType='normal')
> fig, maxweight = SimilarityMaps.GetSimilarityMapForFingerprint(refmol,
> mol, SimilarityMaps.GetMorganFingerprint)
> mpld3.show_d3(fig)
>
> Again, the image is much larger than the drawing area and is not aligned.
>
> I've tried several options, such as changing the coordScale or scale
> parameter, but without success. Any help in displaying the image correctly
> using plt.show() and/or mpld3.show_d3 would be appreciated.
>
> Regards,
> Michal Nowotka
>
>


Re: [Rdkit-discuss] Drawing Gasteiger Maps

2014-07-16 Thread Sereina Riniker
Hi Ed,

I think this is related to a previous mailing list item (
https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg03528.html
).

It has something to do with the bounding boxes (they get scaled during the
map generation process). In the previous case it was enough to set
bbox_inches='tight' when saving the image to solve the problem.

Best,
Sereina


2014-07-16 17:04 GMT+02:00 Edward Pyzer-Knapp :

> Hi,
>
> I am trying to replicate the tutorial on visualising maps of gasteiger
> charges.
>
> >>> from rdkit.Chem.Draw import SimilarityMaps
> >>> mol = Chem.MolFromSmiles('COc1cccc2cc(C(=O)NCCCCN3CCN(c4cccc5nccnc54)CC3)oc21')
> >>> AllChem.ComputeGasteigerCharges(mol)
> >>> contribs = [float(mol.GetAtomWithIdx(i).GetProp('_GasteigerCharge')) for i in range(mol.GetNumAtoms())]
> >>> fig = SimilarityMaps.GetSimilarityMapFromWeights(mol, contribs, colorMap='jet', contourLines=10)
>
>  However when I visualise the image:
> >>> fig.show()
>
> The window only shows a small portion of the image.  Saving the figure has
> the same problem.
>
> Any help much appreciated!
>
> Ed Pyzer-Knapp
>
>


[Rdkit-discuss] ETKDG improvement for small and large rings

2020-05-11 Thread Sereina Riniker
Dear RDKit Users, 

For your information (and as a bit of advertisement): 
We have recently developed and published an extension of the ETKDG conformer 
generator to improve sampling of small and large rings, which is available in 
the 2020.03 release of the RDKit.

Shuzhe Wang, Jagna Witek, Greg Landrum, Sereina Riniker, J. Chem. Inf. Model., 
60, 2044 (2020)
"Improving Conformer Generation for Small Rings and Macrocycles Based 
on Distance Geometry and Experimental Torsional-Angle Preferences"
https://pubs.acs.org/doi/10.1021/acs.jcim.0c00025

If you want to try it out, Shuzhe has added a section in the RDKit cookbook to 
showcase the new functionalities:
https://github.com/rdkit/rdkit/blob/master/Docs/Book/Cookbook.rst#conformer-generation-with-etkdg
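
For a quick impression, a minimal sketch of enabling the small-ring
torsion preferences could look like the following. It assumes RDKit 2020.03
or later; cyclobutane and the random seed are arbitrary choices made for
this example, and the cookbook section linked above remains the reference.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Cyclobutane with explicit hydrogens (illustrative small ring).
mol = Chem.AddHs(Chem.MolFromSmiles("C1CCC1"))

params = AllChem.ETKDGv3()          # ETKDG version 3 parameter object
params.useSmallRingTorsions = True  # switch on the small-ring preferences
params.randomSeed = 42              # fixed seed for reproducibility

conf_id = AllChem.EmbedMolecule(mol, params)  # -1 would indicate failure
print(conf_id, mol.GetNumConformers())
```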

We hope that you find it useful and we’re happy for any feedback!

Best regards,
Sereina


 - - - 

Prof. Dr. Sereina Riniker
ETH Zürich
Laboratory of Physical Chemistry
HCI G225
Vladimir-Prelog-Weg 2
8093 Zürich
+41 44 633 42 39
srini...@ethz.ch
www.riniker.ethz.ch



Re: [Rdkit-discuss] ConstrainedEmbed issue

2020-07-07 Thread Sereina Riniker
Dear Pavel and Sunhwan,

Please note that hydrogens should always be added for the embedding algorithm 
to work properly (i.e. adding them is not a workaround but the correct procedure).
See also Section “Working with 3D Molecules” in 
https://www.rdkit.org/docs/GettingStartedInPython.html
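
In code, the recommended order is roughly the following. This is a minimal
sketch using the molecule from this thread; the random seed is an arbitrary
choice, and removing the hydrogens afterwards is optional.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

smiles = "C[C@@H]1CC[C@H](O)CC1=O"  # molecule discussed in this thread
mol = Chem.MolFromSmiles(smiles)

mol_h = Chem.AddHs(mol)                                # add hydrogens first
conf_id = AllChem.EmbedMolecule(mol_h, randomSeed=42)  # -1 means failure

# Optionally strip the hydrogens again; heavy-atom coordinates are kept.
mol_3d = Chem.RemoveHs(mol_h) if conf_id != -1 else None
print(conf_id)
```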

Best regards,
Sereina



> On 7 Jul 2020, at 21:26, Sunhwan Jo  wrote:
> 
> 
> The reason constraint embed didn’t work is the molecule simply can’t be 
> embedded using the rdkit’s algorithm.
> 
>> In [25]: mol_child = Chem.MolFromSmiles('C[C@@H]1CC[C@H](O)CC1=O')   
>>  
>> 
>> In [26]: AllChem.EmbedMolecule(mol_child)
>>  
>> Out[26]: -1
> 
> 
> See more discussion here:
> https://github.com/rdkit/rdkit/issues/2996 
> 
> 
> 
> The SMILES you posted looks valid to me and doesn’t look that complicated, 
> but I think RDKit’s algorithm somehow tripped up and couldn’t finish 
> embedding without some help. I hope someone with more in-depth insight can 
> help here.
> 
> 
> Anyway, as a workaround, adding Hs seems to do the trick:
> 
>> In [39]: mol = AllChem.AddHs(mol_child)  
>>  
>> 
>> In [40]: AllChem.EmbedMolecule(mol)  
>>  
>> Out[40]: 0 # worked
>> 
>> In [41]: AllChem.ConstrainedEmbed(mol, mol_parent)   
>>  
>> Out[41]:  # also worked
>> 
> 
> 
> 
> Sunhwan
> 
> 
> 
> 
>> On Jul 7, 2020, at 12:36 AM, Pavel Polishchuk wrote:
>> 
>> Hi all,
>> 
>>   I have an issue with ConstrainedEmbed and I cannot figure out what exactly 
>> causes it.
>>   I have a molecule C[C@@H]1CCCCC1=O with 3D coordinates in the 1.mol file 
>> (attached), and I want to generate coordinates for another structure with 
>> this core:
>> C[C@@H]1CC[C@H](O)CC1=O.
>> 
>>   This is usual way which causes issue with embedding and the corresponding 
>> error.
>> 
>> mol_parent = Chem.MolFromMolFile('1.mol')
>> mol_child = Chem.MolFromSmiles('C[C@@H]1CC[C@H](O)CC1=O')
>> try:
>>     mol = AllChem.ConstrainedEmbed(mol_child, mol_parent)
>> except ValueError as e:
>>     print(e)
>> 
>>   If I add explicit hydrogens the issue disappears.
>> 
>> mol_parent = Chem.MolFromMolFile('1.mol')
>> mol_child = Chem.MolFromSmiles('C[C@@H]1CC[C@H](O)CC1=O')
>> mol = AllChem.ConstrainedEmbed(Chem.AddHs(mol_child), mol_parent)
>> 
>>   If I do not use pre-defined coordinates - everything works well.
>> 
>> mol_parent = Chem.MolFromSmiles('C[C@@H]1CCCCC1=O')
>> AllChem.EmbedMolecule(mol_parent)
>> mol_child = Chem.MolFromSmiles('C[C@@H]1CC[C@H](O)CC1=O')
>> mol = AllChem.ConstrainedEmbed(mol_child, mol_parent)
>> 
>>   Do the ugly coordinates in the 1.mol file cause the embedding issue? Or is 
>> the issue caused by some implicit properties of the molecule? How can this 
>> be solved properly?
>> 
>> Kind regards,
>> Pavel.
>> <1.mol>


Re: [Rdkit-discuss] Failing when embeding molecule with several fragments

2020-11-05 Thread Sereina Riniker
Dear Pablo,

The RDKit conformer generator is not really suitable to generate coordinates 
for arrangements of multiple molecules. 
For this, I would go for tools implemented in MD packages.

Best regards,
Sereina


> On 5 Nov 2020, at 14:56, Pablo Ramos  wrote:
> 
> Hello everybody,
>  
> I am trying to generate 3D coordinates and optimize the system with MM.
> When optimizing, atoms overlap for one of the O=C(Cl)Cl fragments.
>  
> This is my code:
> smiles = 'Cc1ccc(N)cc1N.O=C(Cl)Cl.O=C(Cl)Cl'
> m = Chem.MolFromSmiles(smiles)
> m = Chem.AddHs(m)
> AllChem.EmbedMolecule(m, useRandomCoords = True)
> ffu = AllChem.UFFGetMoleculeForceField(m, ignoreInterfragInteractions = False)
> ffu.Initialize()
> ffu.Minimize(maxIts = 500)
>  
> To make sure that this is not a convergence problem, I tried a high value 
> of maxIts in ffu.Minimize() as well as a high number of maxAttempts for the 
> embedding, both without success.
>  
> Thanks a lot, 
>  
> Best regards,
>  
> Pablo Ramos
> Ph.D. at Covestro Deutschland AG
>  
> 
> 
>  
> covestro.com 
> Telephone
> +49 214 6009 7356
>  
> Covestro Deutschland AG
> COVDEAG-Chief Commer-PUR-R&D-EMEA-PMD
> B103, R164
> 51365 Leverkusen, Germany
> pablo.ra...@covestro.com 
>  
>  


Re: [Rdkit-discuss] TFD and RMSD for macrocycles

2022-01-19 Thread Sereina Riniker
Hi Paul,

TFD was developed for drug-like molecules with small rings. The torsions of 
ring bonds are therefore summed up into a single average value for each ring 
(see Figure 1 in J. Chem. Inf. Model., 52, 1499, 2012). This of course does 
not make much sense for a macrocycle and is likely the cause of your results.

If you are interested in the macrocycle conformation alone, I would recommend 
using either ringRMSD (with or without beta-atoms) or a torsional-angle RMSD 
(be careful with the periodicity). The latter has the advantage that no 
alignment is needed.
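
The periodicity point can be illustrated with a small plain-Python sketch.
The function names are made up for this example, and the torsion values are
assumed to be given in degrees and already matched pairwise between the two
conformations; how the ring torsions are extracted is left out here.

```python
import math


def wrap_degrees(delta):
    """Map an angle difference in degrees onto the interval [-180, 180)."""
    return (delta + 180.0) % 360.0 - 180.0


def torsion_rmsd(torsions_a, torsions_b):
    """RMSD between two equally long lists of torsion angles in degrees."""
    if len(torsions_a) != len(torsions_b) or not torsions_a:
        raise ValueError("need two non-empty lists of equal length")
    squared = [wrap_degrees(a - b) ** 2
               for a, b in zip(torsions_a, torsions_b)]
    return math.sqrt(sum(squared) / len(squared))


# 179 and -179 degrees are only 2 degrees apart on the circle.
naive = abs(179.0 - (-179.0))               # ignores periodicity
periodic = torsion_rmsd([179.0], [-179.0])  # respects periodicity
print(naive, periodic)
```

Without the wrapping step, two nearly identical torsions on either side of
180 degrees would be counted as almost maximally different.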

Examples for the use of ringRMSD with macrocycles are in the ETKDG version 3 
paper (J. Chem. Inf. Model., 60, 2044, 2020) or the recent noeETKDG paper 
(https://pubs.acs.org/doi/10.1021/acs.jcim.1c01165).

Hope this helps.

Best,
Sereina


> On 19 Jan 2022, at 16:37, mix_of_reasons via Rdkit-discuss 
>  wrote:
> 
> 
> Hi RDKitters,
> 
> I am using the RDKit implementation of TFD to examine conformational 
> differences between macrocycles and to cluster their conformations. In some 
> basic testing of a set of conformations of the same macrocycle from the PDB I 
> find something unexpected:
> 
> #mollist = list of conformations of the same molecule from the PDB
> 
> tfds = []
> rmsds = []
> for m in mollist:
>     for n in mollist:
>         tfd = TorsionFingerprints.GetTFDBetweenMolecules(m, n, maxDev='spec', 
>                                                          useWeights=False)
>         rmsd = GetBestRMS(m, n)
>         tfds.append(tfd)
>         rmsds.append(rmsd)
> 
> #Plot the two lists
> 
> 
> 
> Comfortingly, all cases of RMS = 0 also give TFD = 0.
> According to the original paper a TFD of 1 implies maximal torsional 
> deviation, yet here I see a very low RMSD (0.3-4A, essentially insignificant 
> for molecules of this size) at TFD = 1.
> Also useWeights = True in the code above gives TFDs > 1, which is clearly not 
> possible in the spirit of the original idea, but probably arises from there 
> not really being a graph centre in a macrocycle.
> 
> The idea of clustering macrocycles based on some measure of distance in 
> torsion space is very appealing, but I am concerned by TFD = 1 being 
> calculated for conformations that have essentially the same geometry. Any 
> suggestions on how to proceed?
> 
> 
> Paul.
> 
> 


Re: [Rdkit-discuss] atom indexing in mol and conformer

2022-06-19 Thread Sereina Riniker
Dear Ling,

Yes, the atom indexing is the same for all conformers of a molecule.
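
A small sketch of what this means in practice, assuming a molecule with
several embedded conformers; ethanol, the number of conformers and the
random seed are arbitrary choices for illustration:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))  # ethanol, illustrative only
conf_ids = list(AllChem.EmbedMultipleConfs(mol, numConfs=3, randomSeed=42))

idx = 2  # index of the oxygen atom in the parent molecule
atom = mol.GetAtomWithIdx(idx)

for conf in mol.GetConformers():
    # The same index addresses the same atom in every conformer.
    pos = conf.GetAtomPosition(idx)
    print(conf.GetId(), atom.GetSymbol(),
          round(pos.x, 3), round(pos.y, 3), round(pos.z, 3))
```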

Best regards,
Sereina



> On 19 Jun 2022, at 00:04, Ling Chan  wrote:
> 
> Dear colleagues,
> 
> Just wonder if the atom indexing in a conformer is always identical to that 
> of the parent molecule? I suspect it is but would like to confirm.
> 
> Specifically, I would like to confirm that
>   for conf in mol.GetConformers():
> conf.GetAtomPosition(idx)
> always corresponds to the atom
>   mol.GetAtomWithIdx(idx)
> 
> Thank you.
> 
> Ling