Re: [Rdkit-discuss] Question on substructure search

2023-06-14 Thread Patrick Walters
and RDKit, > > > > *rdkit.__version__ = 2023.03.1* > > > > Here is a slightly more explicit variant tried because neither worked to > find a match: > > > > > > Respectfully, > > Joey Storer > > > > General Business > > *From:*

Re: [Rdkit-discuss] Question on substructure search

2023-06-13 Thread Patrick Walters
Hi Joey, You can get the intended result like this:
    pat = Chem.MolFromSmarts("*=C1*C=C*1")
    mol = Chem.MolFromSmiles("C=C1SC=CS1")
    mol.HasSubstructMatch(pat)
Pat
On Tue, Jun 13, 2023 at 4:49 PM Storer, Joey (J) via Rdkit-discuss < rdkit-discuss@lists.sourceforge.net> wrote: > Hi RDKit masters, >

Re: [Rdkit-discuss] RDKit in Google Colab

2022-08-03 Thread Patrick Walters
Actually, you can now just !pip install rdkit From: Jan Halborg Jensen Sent: Wednesday, August 3, 2022 9:47:20 AM To: Eduardo Mayo Cc: RDKit Discuss Subject: Re: [Rdkit-discuss] RDKit in Google Colab !pip install rdkit-py No need to use anaconda for Colab RDKi

Re: [Rdkit-discuss] Permutation of multiple enumeration

2022-07-06 Thread Patrick Walters
Here's a simple example showing the enumeration of a 3 component library based on a reaction https://gist.github.com/PatWalters/7439099598b4f08a331a81b209f88baa On Wed, Jul 6, 2022 at 4:57 PM Andrew Dalke wrote: > Hi Carsten, > > How are the fragments expressed? With attachment points marked

Re: [Rdkit-discuss] Clustering

2022-05-01 Thread Patrick Walters
of actives and the dataset of cluster representatives. > > > > On Sun, 1 May 2022, 17:09 Patrick Walters, wrote: > >> For me, a lot of this depends on what you intend to do with the >> clustering. If you want to pick a "representative" subset from a larger >&

Re: [Rdkit-discuss] Clustering

2022-05-01 Thread Patrick Walters
For me, a lot of this depends on what you intend to do with the clustering. If you want to pick a "representative" subset from a larger dataset, k-means may do the trick. As Rajarshi mentioned, Practical Cheminformatics has a k-means implementation that runs with FAISS. Depending on your goal, ch

Re: [Rdkit-discuss] pharmacophore

2022-03-29 Thread Patrick Walters
One way to compare interactions (pharmacophores) in a binding site is to use interaction fingerprints. I've had a good experience with ProLIF. https://github.com/chemosim-lab/ProLIF On Tue, Mar 29, 2022 at 6:26 AM Muhammad Akram wrote: > Hello Everybody, > > > > I am looking if there is a way t

Re: [Rdkit-discuss] problem saving rdSubstructLibrary.

2022-03-13 Thread Patrick Walters
> Hi Pat, > > I don't think you're doing anything wrong. This looks like a bug in the > RDKit. > It seems to be connected to the PatternHolder... I will look into it. > > -greg > > > On Sat, Mar 12, 2022 at 10:26 PM Patrick Walters > wrote: > >>

[Rdkit-discuss] problem saving rdSubstructLibrary.

2022-03-12 Thread Patrick Walters
Hi All, I'd appreciate any insight on what I'm doing wrong. I'm trying to save an rdSubstructLibrary with library.toStream(). When the library is empty I can save it with library.toStream(); however, when I've added a molecule to the library, I get this error message. UnicodeDecodeError: 'ut

Re: [Rdkit-discuss] Find structures with "non-organic" atoms

2022-03-05 Thread Patrick Walters
Here's what I use.
    not_organic_pat = Chem.MolFromSmarts("[!#1;!C;!O;!N;!S;!P;!F;!Cl;!Br;!I;!c;!o;!n;!s;!p;!Na;!K;!Mg;!Ca;!Li]")
    cisplatin = Chem.MolFromSmiles("[NH3+]-[Pt-2](Cl)(Cl)[NH3+]")
    cisplatin.HasSubstructMatch(not_organic_pat)
On Sat, Mar 5, 2022 at 8:08 PM Rafael L via Rdkit-discuss <

Re: [Rdkit-discuss] File Formats with Partial Charges

2021-10-27 Thread Patrick Walters
Hi Hao, As a long-time file format geek, I feel the need to jump into this one. 1. mol2 I'm not a fan of using mol2. AFAIK, there is no definitive documentation for the atom typing rules or the aromaticity model. 2. sdf The RDKit has had a facility for storing atom properties in an SDF since

Re: [Rdkit-discuss] how to make a database fingerprint

2021-09-15 Thread Patrick Walters
numpy!
    import pandas as pd
    from descriptor_gen import DescriptorGen
    import numpy as np
    from rdkit import Chem, DataStructs
    from rdkit.Chem import AllChem
    def smi2fp(smi):
        mol = Chem.MolFromSmiles(smi)
        fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)
        arr = np.zeros((0,),

Re: [Rdkit-discuss] SMILES from sdf file

2021-09-12 Thread Patrick Walters
Hi Anthony, This is pretty easy and you don't need to use PandasTools (although PandasTools are very cool).
    #!/usr/bin/env python
    import sys
    from rdkit import Chem
    suppl = Chem.SDMolSupplier(sys.argv[1])
    for mol in suppl:
        if mol:
            print(Chem.MolToSmiles(mol), mol.GetProp("_Name"))
By

Re: [Rdkit-discuss] Cheminformatics Graduate School Recommendations?

2021-07-20 Thread Patrick Walters
If you're looking for a more ML-oriented program, I'd recommend David Koes' group at Pitt. https://www.csb.pitt.edu/people/faculty/david-koes/ Jacob Durrant is also doing interesting work in another department at Pitt https://durrantlab.pitt.edu/people/ On Tue, Jul 20, 2021 at 2:47 AM Stiefl,

Re: [Rdkit-discuss] Compare molecules, get matching atom indices

2021-06-09 Thread Patrick Walters
What about something like this? https://gist.github.com/PatWalters/15352d9007c33a214ac13d3dda814624 On Wed, Jun 9, 2021 at 9:28 PM Christopher Schlicksup wrote: > Hi Rdkit community, > > I have a case where I have pairs of molecules that are very similar, but > have different atom indices. I w

Re: [Rdkit-discuss] XYZ to mol ???

2021-06-06 Thread Patrick Walters
Hi Joey, Have you looked at this? https://github.com/jensengroup/xyz2mol Pat On Fri, Jun 4, 2021 at 8:57 PM Storer, Joey (J) wrote: > Dear all, > > > > For molecular modeling workflows and interoperability with QM/MM etc., > > > > Can RDKit gain a Chem.XyzToMol(xyz) functionality? > > > > Than

Re: [Rdkit-discuss] Are the path-based fingerprints formally described in the scientific literature?

2021-05-20 Thread Patrick Walters
There's also some information on path fingerprints in the Daylight Theory Manual https://www.daylight.com/dayhtml/doc/theory/theory.finger.html On Wed, May 19, 2021 at 10:47 PM Greg Landrum wrote: > Hi Francois, > > On Thu, May 20, 2021 at 3:19 AM Francois Berenger > wrote: > >> >> The other

Re: [Rdkit-discuss] Using the RDKit with Dask

2021-03-22 Thread Patrick Walters
> > By either explicitly doing the import in calc_bcut() or referencing the > function through the module, dask seems to be able to figure out how to do > the right thing. > > -greg > p.s. in case you see different behavior: > In [2]: dask.__version__ > Out[2]: '2020

Re: [Rdkit-discuss] [*External*] Re: Using the RDKit with Dask

2021-03-22 Thread Patrick Walters
lcExactMolWt > > to avoid another error. > > Which version of rdkit do you use ? > > > > BR > > > > Guillaume > > > > > > *De : *Patrick Walters > *Date : *lundi, 22 mars 2021 à 14:20 > *À : *Guillaume GODIN > *Cc : *rdkit-discuss > *Obj

Re: [Rdkit-discuss] [*External*] Re: Using the RDKit with Dask

2021-03-22 Thread Patrick Walters
use esol.csv for > example ? > > > > Thanks > > > > Guillaume > > > > *De : *Patrick Walters > *Date : *lundi, 22 mars 2021 à 13:51 > *À : *rdkit-discuss > *Objet : *[*External*] Re: [Rdkit-discuss] Using the RDKit with Dask > > Apologies, there

Re: [Rdkit-discuss] Using the RDKit with Dask

2021-03-22 Thread Patrick Walters
Apologies, there was a bug in the code I sent in my previous message. The problem is the same. Here is the corrected code in a gist. https://gist.github.com/PatWalters/ca41289a6990ebf7af1e5c44e188fccd On Mon, Mar 22, 2021 at 8:16 AM Patrick Walters wrote: > Hi All, > > I've

[Rdkit-discuss] Using the RDKit with Dask

2021-03-22 Thread Patrick Walters
Hi All, I've been trying to calculate BCUT2D descriptors in parallel with Dask and get this error with the code below. TypeError: cannot pickle 'Boost.Python.function' object Everything works if I call mw_df, which calculates molecular weight, but I get the error above if I call bcut_df. Does an

Re: [Rdkit-discuss] XGboost and fingerprint error

2021-02-16 Thread Patrick Walters
I'm not sure why this was sent to rdkit-discuss, but I just pushed a fix to github. Sorry for the hassles. Pat On Tue, Feb 16, 2021 at 10:15 AM Mandar Kulkarni < mandar.kulkarni.c...@gmail.com> wrote: > Hi, > > I am trying to repeat the xgboost tutorial from here: > https://github.com/PatWalter

Re: [Rdkit-discuss] visualize substructure matches in Molecule Grid Image

2020-09-29 Thread Patrick Walters
I have an example in this blog post that does what you're looking for. http://practicalcheminformatics.blogspot.com/2019/09/dissecting-hype-with-cheminformatics.html On Tue, Sep 29, 2020 at 6:04 PM Markus Metz wrote: > Thank you Kangway. > So it is list of lists for each molecule in the grid.

Re: [Rdkit-discuss] GenerateDepictionMatching2DStructure question

2019-05-23 Thread Patrick Walters
>> >> mol = Chem.MolFromMolFile(src, sanitize=True) >> >> matrix = numpy.zeros((4, 4), numpy.float) >> >> >> >> for i in range(3): >> >> matrix[i, i] = factor >> >> matrix[3, 3] = 1 >> >> >> &

[Rdkit-discuss] GenerateDepictionMatching2DStructure question

2019-05-23 Thread Patrick Walters
Hi All, I'm trying to align a set of structures to a template that I have as a molfile. When I call GenerateDepictionMatching2DStructure it appears that the coordinates for the template are directly copied. This results in a structure like the one below, where the bond lengths for the template are

Re: [Rdkit-discuss] Smarts conversion help

2019-03-26 Thread Patrick Walters
Hi Xiaobo, There's an explicit hydrogen in the SMARTS that shouldn't be there. I also wouldn't include the single bonds around the ring closures. '[#8]=[#6]-3-c1c2c(ccc1)2-[#6](-[#7]-3-*[#1]*)=[#8]') from rdkit import Chem from rdkit.Chem import Draw smi = "O=C(C1=C2C(C=CC=C23)=CC=C1)N([H]

Re: [Rdkit-discuss] Is there any way to protonate a molecule?

2019-03-25 Thread Patrick Walters
I haven't tried it yet, but this recent paper in the Journal of Cheminformatics looks interesting. The authors supply a git repo with code based on the RDKit. Dimorphite-DL: an open-source program for enumerating the ionization states of drug-like small molecules https://jcheminf.biomedcentral.c

Re: [Rdkit-discuss] General Smarts Language to select molecules without H elements

2019-02-13 Thread Patrick Walters
Are you interested in aromatic c-H? It looks like 3 of 4 molecules have hydrogens (if you count methyls) from rdkit import Chem from rdkit.Chem.Draw import MolsToGridImage buff = """N#C/C(C#N)=C(C(F)=C/1F)\C(F)=C(F)C1=C(C#N)\C#N N#C/C(C#C)=C(C=C/1F)\C(F)=C(F)C1=C(C#N)\C#N FC1=C(F)C(C#N)=C(F)C(OC(

Re: [Rdkit-discuss] conda install rdkit

2019-02-07 Thread Patrick Walters
I've been running the conda version with Python 3.6.6 on a couple of Macs with no issues. Pat On Thu, Feb 7, 2019 at 8:34 AM Greg Landrum wrote: > Hi Paul, > > That looks like some residual of the horrible problems caused by some > conda changes that happened last year but that were fixed. I wo

[Rdkit-discuss] SMILES validation question

2019-01-11 Thread Patrick Walters
Hi all, I ran into a case that I found confusing. If I convert this SMILES to an RDKit molecule, I get a valid molecule. In [2]: mol = Chem.MolFromSmiles("O=C(CC1SCCC1)c1c1N") In [3]: mol Out[3]: However, if I convert the molecule to SMILES and then convert it back to a molecule, it is no longer
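The round trip described in this message (SMILES to molecule, back to SMILES, back to molecule) can be written as a small check. This is a minimal sketch rather than the code from the original post; the helper name `survives_round_trip` is hypothetical, and it assumes RDKit is installed.

```python
from rdkit import Chem


def survives_round_trip(smiles):
    """Return True if the SMILES parses and the SMILES RDKit writes for it parses again."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # the first parse already failed
        return False
    # Write the molecule back out as SMILES and try to read that string again
    round_tripped = Chem.MolFromSmiles(Chem.MolToSmiles(mol))
    return round_tripped is not None


print(survives_round_trip("CCO"))
```

A molecule that parses once but fails this check is the kind of case the message is asking about.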

Re: [Rdkit-discuss] canonical atom mapping

2018-10-04 Thread Patrick Walters
I just wrote a blog post on this topic. https://practicalcheminformatics.blogspot.com/2018/09/assigning-bond-orders-to-pdb-ligands.html On Thu, Oct 4, 2018 at 3:35 PM MARIA BRANDL via Rdkit-discuss < rdkit-discuss@lists.sourceforge.net> wrote: > Hello Eric, > > RDKit can assign bond orders from

Re: [Rdkit-discuss] LogS (water solubility) descriptor

2018-08-30 Thread Patrick Walters
Hi Dimitar, I put an RDKit implementation of the ESOL method from the original paper by Delaney on my GitHub site. I also refit the coefficients to maximize performance with the RDKit calculated descriptors. https://github.com/PatWalters/solubility Note that solubility prediction is a really hard

Re: [Rdkit-discuss] Tanimoto Similarity

2018-07-04 Thread Patrick Walters
I would highly recommend this paper where the authors describe an alternative to arbitrary similarity cutoffs https://pubs.acs.org/doi/pdf/10.1021/ci7004498 Pat On Wed, Jul 4, 2018 at 9:31 AM Maciek Wójcikowski wrote: > Hi > > As Nils has mentioned this is fingerprint dependent. ECFP4 have the

Re: [Rdkit-discuss] question on rdRGroupDecomposition

2018-05-15 Thread Patrick Walters
r should I fire up the > debugger? > > -greg > > > > On Mon, May 14, 2018 at 4:24 AM Patrick Walters > wrote: > >> Hi All, >> >> I'm hoping someone can help me with rdRGroupDecomposition. I'd like to >> be able to specify specific R-group loca

[Rdkit-discuss] question on rdRGroupDecomposition

2018-05-13 Thread Patrick Walters
Hi All, I'm hoping someone can help me with rdRGroupDecomposition. I'd like to be able to specify specific R-group locations AND match cases where R=H. The example below illustrates what I'm talking about. When RGroupDecompositionParameters.onlyMatchAtRGroups = True, cases where R == H are skip

[Rdkit-discuss] SMARTS parsing error with isotopes

2018-05-04 Thread Patrick Walters
Hi All, I've been playing around with some of the structural alerts from ChEMBL and noticed that one alert was generating a SMARTS Parse Error with the RDKit. [2H,3H,13C,14C,15N,125I,23F,22Na,32P,33P,35S,45Ca,57Co,103Ru,141Ce] It appears that the issue was reported previously and that Greg fixed
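To see what this alert encodes, the bracketed list can be split into mass-number/element pairs with plain string handling. This is only an illustrative sketch: `isotope_terms` is a hypothetical helper, it is not a SMARTS parser, and it says nothing about which term triggered the RDKit parse error.

```python
import re

alert = "[2H,3H,13C,14C,15N,125I,23F,22Na,32P,33P,35S,45Ca,57Co,103Ru,141Ce]"


def isotope_terms(smarts):
    """Split a bracketed, comma-separated isotope list into (mass number, element) pairs."""
    inner = smarts.strip("[]")
    pairs = []
    for term in inner.split(","):
        # Each term is expected to be digits followed by an element symbol
        match = re.fullmatch(r"(\d+)([A-Z][a-z]?)", term)
        if match is None:
            raise ValueError(f"unexpected term: {term!r}")
        pairs.append((int(match.group(1)), match.group(2)))
    return pairs


terms = isotope_terms(alert)
print(terms[:3])
```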

[Rdkit-discuss] seg fault when importing Chem on OS-X 10.12

2018-04-16 Thread Patrick Walters
Hi All, I installed the latest RDKit using conda conda create -c rdkit -n rdkit_2017 rdkit When I import Chem I get a seg fault ➜ ~ source activate rdkit_2017 (rdkit_2017) ➜ ~ python Python 3.5.5 |Anaconda, Inc.| (default, Mar 12 2018, 16:25:05) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE

Re: [Rdkit-discuss] reassembling a molecule from R-groups

2018-04-15 Thread Patrick Walters
_weld = Chem.MolFromSmiles( "CN(C)CC(Br)c1cc([*:2])c([*:1])cn1.[H]C([H])([H])[*:1].[H][*:2]") welded_mol = weld_r_groups(mol_to_weld) print(Chem.MolToSmiles(welded_mol)) Best, Pat On Sun, Apr 15, 2018 at 12:16 PM, Patrick Walters wrote: > Hi All, > > I was about

[Rdkit-discuss] reassembling a molecule from R-groups

2018-04-15 Thread Patrick Walters
Hi All, I was about to write a function to reassemble a molecule from a core + R-groups, but I thought I'd check and see if such a function already exists. This is to work with the output of rdRGroupDecomposition. Given a core: CN(C)CC(Br)c1cc([*:2])c([*:1])cn1 Plus a set of R-groups [H]C([H])([H])

Re: [Rdkit-discuss] comparing two or more tables of molecules

2016-11-29 Thread Patrick Walters
The Layered InChI (LyChi), developed by Trung Nguyen at NCATS was designed to directly address the problem you describe. I don't have any first hand experience with this method (yet), but it looks intriguing. https://github.com/ncats/lychi Pat On Mon, Nov 28, 2016 at 11:25 AM, Stephen O'hagan

Re: [Rdkit-discuss] Clustering functions in Java API

2015-02-23 Thread Patrick Walters
I agree that there are plenty of implementations of clustering, machine learning, etc. It would be better for the RDKit developers to focus on cheminformatics. This being said, there are some opportunities for domain specific performance enhancement. One of the slow steps in many clustering alg

Re: [Rdkit-discuss] portable PostgreSQL + RDKit cartridge?

2014-08-28 Thread Patrick Walters
If you want everything in one nice package, you may want to look at MyChEMBL. This has a VM with PostgreSQL, KNIME, Python, the RDKit, and ChEMBL. http://chembl.blogspot.com/2013/10/chembl-virtual-machine-aka-mychembl.html Pat On Thu, Aug 28, 2014 at 10:11 AM, Michal Krompiec wrote: > Dear

Re: [Rdkit-discuss] Chem.AddHs() doesn't care about compound layout

2014-08-21 Thread Patrick Walters
Your input molfile lacks the 2D on line 2, e.g. RDKit 2D Pat On Thu, Aug 21, 2014 at 5:34 AM, Michał Nowotka wrote: > OK, I'm closer to finding bug in my code. So I have this ctab: > > >>> print ctab > > Converted by chembl_beaker ver. 0.5.20 > > 10 11 0 0 0 0 1 V
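A quick way to check for the missing dimension code is to look at the second header line of the mol block. The sketch below assumes the usual CTfile header layout (2 characters of initials, 8 of program name, 10 of date/time, then the 2-character dimension code); `dimension_code` is a hypothetical helper, not an RDKit function.

```python
def dimension_code(molblock):
    """Return '2D' or '3D' if header line 2 carries that code, otherwise ''."""
    lines = molblock.splitlines()
    if len(lines) < 2:
        return ""
    # Columns 21-22 (1-based) of line 2 hold the dimension code in this layout
    code = lines[1][20:22]
    return code if code in ("2D", "3D") else ""


# Build a header line 2 the way the layout describes it
header_line_2 = "  " + "RDKit".rjust(8) + " " * 10 + "2D"
with_code = "\n" + header_line_2 + "\n\n"
without_code = "name\n\n\n"

print(repr(dimension_code(with_code)), repr(dimension_code(without_code)))
```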

Re: [Rdkit-discuss] Sanitization Errors

2014-04-24 Thread Patrick Walters
istos Kannas > > Researcher > Ph.D Student > > Mob (UK): +44 (0) 7447700937 > Mob (Cyprus): +357 99530608 > > [image: View Christos Kannas's profile on > LinkedIn]<http://cy.linkedin.com/in/christoskannas> > > > On 24 April 2014 11:37, Patrick Walter

Re: [Rdkit-discuss] Sanitization Errors

2014-04-24 Thread Patrick Walters
It looks like the problem here is a covalent bond to the counter ion. Pat On Thu, Apr 24, 2014 at 6:04 AM, Christos Kannas wrote: > Hi all, > > I'm having a dozen of compounds, where some of them have a charged atom > (see the attached SMILES file). > > When I parse the file I get sanitization

Re: [Rdkit-discuss] build error

2014-04-04 Thread Patrick Walters
the inconvenience. > > -greg > > On Friday, April 4, 2014, Patrick Walters wrote: > >> Hi All, >> >> I ran into an error building the RDKit from the last git pull on OS-X 10.8 >> >> Has anyone else run into this? >> >> Thanks, >> >> Pat &

[Rdkit-discuss] build error

2014-04-04 Thread Patrick Walters
Hi All, I ran into an error building the RDKit from the last git pull on OS-X 10.8 Has anyone else run into this? Thanks, Pat Linking CXX shared library ../../../lib/libSubstructMatch.dylib [ 30%] Built target SubstructMatch [ 30%] [BISON][SmilesY] Building parser with bison 2.3 smiles.yy:48.

[Rdkit-discuss] 2D pharmacophore question

2013-09-03 Thread Patrick Walters
Hi All, I was working through the 2D pharmacophore example in the "Getting Started" docs http://www.rdkit.org/docs/GettingStartedInPython.html#d-pharmacophore-fingerprints and I hit an exception that I don't understand. Here's my code == #!/usr/bin/env pytho

Re: [Rdkit-discuss] random compound

2013-07-04 Thread Patrick Walters
I'm not an expert on this, but I think this function just randomizes the order of the atoms in a molecule. Generating a random molecule for a particular molecular formula that is consistent with rules of valence is kind of tricky. If you're interested in doing this sort of thing, you may want to

Re: [Rdkit-discuss] New module for RDKit - PANDAS integration

2013-04-22 Thread Patrick Walters
I just started playing around with the Pandas module; this is very cool stuff. Thanks so much Nikolas for the contribution. I definitely owe you a beer at the UGM. It might be worth noting that you need to install PIL in order to use the Pandas module. Everything will install without a prob

Re: [Rdkit-discuss] non-smallest rings

2013-01-22 Thread Patrick Walters
If you're just looking for 6-membered rings, you can define a SMARTS that matches 6-membered rings like this "*1~*~*~*~*~*1". You can also use this approach to identify all rings (at least those within reason). You can use an expression like this ["*1" + "".join(["*~"] * x) + "*1" for x in rang
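The idea of generating one ring SMARTS per ring size can be sketched in Python 3 as below. This is an illustration, not the expression from the message: `ring_smarts` is a hypothetical helper, and the size range 3 to 8 is an arbitrary choice for "within reason".

```python
def ring_smarts(size):
    """Build a SMARTS for a ring of `size` atoms of any element joined by any bond."""
    if size < 3:
        raise ValueError("a ring needs at least 3 atoms")
    # "*1" opens the ring, each further atom is "~*", and the trailing "1" closes it
    return "*1" + "~*" * (size - 1) + "1"


patterns = [ring_smarts(n) for n in range(3, 9)]
print(ring_smarts(6))  # *1~*~*~*~*~*1
```

Each string can then be passed to Chem.MolFromSmarts and used with HasSubstructMatch in the usual way.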

[Rdkit-discuss] SLN Parse Errors

2013-01-03 Thread Patrick Walters
Hi All, I've been trying to use RDKit to parse the SLN queries in a recent paper from Jonathan Baell at Monash http://pubs.acs.org/doi/abs/10.1021/ci300461a RDKit is able to successfully parse most of the queries, but is unable to handle 18 of 539. It looks like the problem is with the "&NOT" co