Re: [Rdkit-discuss] shape descriptor expressed as array of numbers

2017-05-19 Thread Greg Landrum
Hi Thomas, There isn't currently anything there. The RDKit had a USR (and USRCAT) implementation a few years ago, but we removed it because of the patent on USR. Now that the patent has lapsed, there's an active PR to re-integrate those descriptors. There's also a PR from Guillaume Godin that

Re: [Rdkit-discuss] Non-standard Heavy Atoms and CHemFP Fingerprints

2017-05-19 Thread Andrew Dalke
On May 19, 2017, at 21:59, Markus Heller wrote: > [In chemfp] I get the following error: > > [11:37:55] Explicit valence for atom # 6 Te, 4, is greater than permitted > ERROR: Cannot parse the SMILES > 'CC(C)(/C(=C\\Cl)/[Te-2](c1ccc(cc1)OC)(Cl)Cl)O' at line 155850 of >

[Rdkit-discuss] Non-standard Heavy Atoms and CHemFP Fingerprints

2017-05-19 Thread Markus Heller
Hi all, I'm trying to work with Chembl23, calculating ChemFP fingerprints. Some compounds contain non-standard heavy atoms, e.g. this one containing Te: CC(C)(/C(=C\\Cl)/[Te](c1ccc(cc1)OC)(Cl)Cl)O My workflow is to convert SDF format to SMILES as I find it easier to correct any minor errors.

[Rdkit-discuss] Depicting reactions to the same quality as molecules

2017-05-19 Thread Ed Griffen
Is there a reaction depiction option similar to MolDraw2DCairo, which produces much better depictions than the simple Chem.Draw PIL images? Or am I just doing this wrong? Attempting to push a reaction through MolDraw2DCairo fails with: Traceback (most recent call last): File

Re: [Rdkit-discuss] Fast similarity search

2017-05-19 Thread Tim Dudgeon
Greg, Nils, Andrew, Thanks for all that info. Gives me plenty to work on! Tim On 19/05/2017 09:27, Andrew Dalke wrote: > On May 19, 2017, at 08:33, Greg Landrum wrote: >> The best solution to this is to use chemfp. It's a remarkable piece of >> software. > Thanks,

Re: [Rdkit-discuss] How to transform SMARTS of aromatic structures so that their aromatic atoms could be any?

2017-05-19 Thread Alexis Parenty
Hi Christos, thank you so much! Your approach is much simpler and quicker than what I had, and it now works with polycyclic compounds. I did try your approach at first but I could not have an image representation in ChemDraw of the SMARTS I was creating with the "a" labels. I thought I was doing

Re: [Rdkit-discuss] How to transform SMARTS of aromatic structures so that their aromatic atoms could be any?

2017-05-19 Thread Christos Kannas
Hi Alexis, In SMARTS you can define an aromatic atom with "a". So I'm thinking that something like the following might produce more correct generalised SMARTS patterns. https://gist.github.com/CKannas/7a9e2768461260461155257fd30c2152 *Note: Please check if the chemistry is correct.* Best,
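The idea of replacing specific aromatic atoms with the generic "a" primitive can be illustrated with a plain string transform. This is only a minimal sketch and is not the code from the gist linked above: the function name `generalize_aromatic` is hypothetical, it does not use the RDKit, and it assumes that only unbracketed aromatic atoms (c, n, o, s, p) should be generalized, leaving anything inside square brackets untouched.

```python
# Hypothetical helper, not taken from the linked gist.
# Assumption: only the unbracketed aromatic symbols c, n, o, s, p are rewritten.
AROMATIC_ORGANIC = set("cnosp")


def generalize_aromatic(smarts: str) -> str:
    """Replace unbracketed aromatic atom symbols with the SMARTS 'a' primitive."""
    out = []
    depth = 0  # nesting depth of square brackets, e.g. [$([nH])]
    for ch in smarts:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if depth == 0 and ch in AROMATIC_ORGANIC:
            out.append("a")  # any aromatic atom
        else:
            out.append(ch)  # bonds, ring closures, branches, bracket atoms
    return "".join(out)


if __name__ == "__main__":
    print(generalize_aromatic("Cc1ccc(Cl)cc1"))
```

Bracket atoms such as [nH] are deliberately left as written, because rewriting them safely needs a real SMARTS parser; as the note in the message says, the chemistry of the result still has to be checked.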

[Rdkit-discuss] shape descriptor expressed as array of numbers

2017-05-19 Thread Thomas Evangelidis
Greetings, Is there any shape descriptor available in RDKit that can be expressed as an array of numbers (e.g. like the 2D similarity fingerprints)? Alternatively, is anyone aware of any other implementation of such a descriptor? The only one I know is the Ultrafast Shape Recognition (USR)
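For readers unfamiliar with USR, the descriptor is a 12-number array built from atomic coordinates alone. The following is a minimal sketch under stated assumptions, not the RDKit implementation: the function names are hypothetical, the input is a plain list of (x, y, z) tuples for a single conformer, and the moment definitions (mean, standard deviation, signed cube root of the third central moment) are one common choice that other implementations may vary.

```python
import math


def _moments(dists):
    """Mean, standard deviation and signed cube root of the third central moment."""
    n = len(dists)
    mean = sum(dists) / n
    var = sum((d - mean) ** 2 for d in dists) / n
    third = sum((d - mean) ** 3 for d in dists) / n
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return [mean, math.sqrt(var), skew]


def usr_descriptor(coords):
    """Return a 12-element USR-style descriptor for a list of (x, y, z) tuples."""
    n = len(coords)
    # Reference point 1: molecular centroid (ctd)
    ctd = tuple(sum(c[i] for c in coords) / n for i in range(3))
    d_ctd = [math.dist(c, ctd) for c in coords]
    # Reference points 2 and 3: atoms closest to (cst) and farthest from (fct) the centroid
    cst = coords[d_ctd.index(min(d_ctd))]
    fct = coords[d_ctd.index(max(d_ctd))]
    # Reference point 4: atom farthest from fct (ftf)
    d_fct = [math.dist(c, fct) for c in coords]
    ftf = coords[d_fct.index(max(d_fct))]
    desc = []
    for ref in (ctd, cst, fct, ftf):
        desc.extend(_moments([math.dist(c, ref) for c in coords]))
    return desc


def usr_similarity(a, b):
    """Inverse of one plus the mean absolute difference of two descriptors."""
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)) / len(a))


if __name__ == "__main__":
    # Made-up coordinates, for illustration only
    shape = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.0, 0.0), (0.2, 0.3, 2.0)]
    print(usr_descriptor(shape))
```

Because the descriptor depends only on interatomic distances to the four reference points, it is unchanged by translating or rotating the conformer, which is what makes it usable as a fixed-length array for similarity comparisons.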

[Rdkit-discuss] How to transform SMARTS of aromatic structures so that their aromatic atoms could be any?

2017-05-19 Thread Alexis Parenty
Hi everyone, I need a function that could generalize any aromatic rings from a SMARTS: [image: Inline images 1] I have noticed that it is possible to rearrange most SMARTS strings into general aromatic SMARTS strings by following these simple rules: 1 Exchange any

Re: [Rdkit-discuss] Fast similarity search

2017-05-19 Thread Andrew Dalke
On May 19, 2017, at 08:33, Greg Landrum wrote: > The best solution to this is to use chemfp. It's a remarkable piece of > software. Thanks, Greg. > If you aren't willing to license that, the RDKit's brute-force > fingerprint search capabilities aren't too bad

Re: [Rdkit-discuss] Fast similarity search

2017-05-19 Thread Greg Landrum
Hi Tim, First the best answer: The best solution to this is to use chemfp. It's a remarkable piece of software. If you aren't willing to license that, the RDKit's brute-force fingerprint search capabilities aren't too bad for in-memory fingerprints. There's some information in this slide
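As an illustration of what a brute-force search over in-memory fingerprints involves, here is a minimal sketch. It is neither chemfp nor the RDKit API: fingerprints are assumed to be stored as plain Python integers used as bit sets, and the names `tanimoto` and `search` as well as the default threshold are hypothetical choices for this example.

```python
def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity of two fingerprints stored as integer bit sets."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return bin(a & b).count("1") / union


def search(query: int, fingerprints: dict, threshold: float = 0.7):
    """Scan every fingerprint and return (name, score) hits, best first."""
    hits = []
    for name, fp in fingerprints.items():
        score = tanimoto(query, fp)
        if score >= threshold:
            hits.append((name, score))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits


if __name__ == "__main__":
    # Toy 4-bit fingerprints, for illustration only
    library = {"a": 0b1111, "b": 0b1100, "c": 0b0001}
    print(search(0b1110, library, threshold=0.5))
```

The scan is linear in the number of fingerprints, so it is only practical when the whole library fits in memory; dedicated tools add optimizations beyond this plain loop.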