Re: [Rdkit-discuss] Clustering 1M molecules

2015-08-26 Thread Greg Landrum
On Thu, Aug 27, 2015 at 3:00 AM, Jing Lu wrote:
> So, I wonder is there any way to convert fingerprint to a numpy vector?

Indeed there is:

In [11]: from rdkit import Chem
In [12]: from rdkit import DataStructs
In [13]: import numpy
In [14]: m = Chem.MolFromSmiles('C1CCC1')
In [15]: fp =
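
The message above is cut off in this listing, so the following is only a minimal sketch of one way to do the conversion, not necessarily the code the reply went on to show. It assumes a current RDKit and NumPy install; the fingerprint type (`Chem.RDKFingerprint`) and its size of 1024 bits are arbitrary choices made for illustration.

```python
import numpy as np
from rdkit import Chem, DataStructs

# Same example molecule as in the message (cyclobutane).
mol = Chem.MolFromSmiles("C1CCC1")

# Assumption: any ExplicitBitVect fingerprint would do; the RDKit topological
# fingerprint with 1024 bits is used here purely as an example.
fp = Chem.RDKFingerprint(mol, fpSize=1024)

# ConvertToNumpyArray fills the destination array in place,
# writing one element per fingerprint bit.
arr = np.zeros((fp.GetNumBits(),), dtype=np.uint8)
DataStructs.ConvertToNumpyArray(fp, arr)
```

After the call, `arr` is a 0/1 vector with one entry per bit, which can be stacked row by row into a matrix for tools such as scikit-learn.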

Re: [Rdkit-discuss] Clustering 1M molecules

2015-08-26 Thread Jing Lu
Sorry to bother you again... The most time-consuming part is now the clustering. Generating the fingerprints takes less than 1 h, but the clustering has already been running for more than 30 h, and I am not sure when it will finish. Currently, I use scikit-learn's DBSCAN, which has time comp

[Rdkit-discuss] problem of 3D flag when renumbering atms

2015-08-26 Thread Jose Manuel
Hi RDKitters, I would like to renumber the atoms of molecules according to a canonical order. I used the code below and noticed that the dimension flag in the mol block header changes from "RDKIT 2D" to "RDKIT 3D". Is this expected? How could I switch it back to "RDKIT 2D"? I remember seeing some function to toggle this flag, but just c
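
The poster's own code is not included in this listing. As a hedged sketch of the situation described, assuming a current RDKit, the snippet below renumbers atoms by canonical rank and then marks each conformer as 2D with `Conformer.Set3D(False)`. The example molecule and variable names are illustrative, and this may not be the function the poster had in mind.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Illustrative molecule (ethanol) with 2D depiction coordinates.
mol = Chem.MolFromSmiles("OCC")
AllChem.Compute2DCoords(mol)

# CanonicalRankAtoms gives a rank per atom; RenumberAtoms expects, for each
# new position, the index of the old atom that should be placed there.
ranks = list(Chem.CanonicalRankAtoms(mol))
new_order = sorted(range(mol.GetNumAtoms()), key=lambda i: ranks[i])
renumbered = Chem.RenumberAtoms(mol, new_order)

# Explicitly flag every conformer as 2D, whatever state renumbering left it in.
for conf in renumbered.GetConformers():
    conf.Set3D(False)

# The second line of the mol block carries the program name and the 2D/3D tag.
mol_block = Chem.MolToMolBlock(renumbered)
header_line = mol_block.splitlines()[1]
```

Setting the flag on the conformer, rather than editing the mol block text, keeps the change attached to the molecule object, so any later export reflects it.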

Re: [Rdkit-discuss] updated SMARTS filters for PAINS

2015-08-26 Thread Greg Landrum
Thanks Simon!

On Wed, Aug 26, 2015 at 10:35 AM, Simon Saubern wrote:
> I have the original Sybyl output from Johnathan. It's not in the most
> friendly format. All I did was run a few sed commands past it to extract
> the ID numbers, and also compile some frequency tables v. PAINS query.
>
> I'v

Re: [Rdkit-discuss] updated SMARTS filters for PAINS

2015-08-26 Thread Simon Saubern
I have the original Sybyl output from Johnathan. It's not in the most friendly format. All I did was run a few sed commands past it to extract the ID numbers, and also compile some frequency tables v. PAINS query.

I've sent a zip file to you directly.

Simon

On 26/08/2015 15:20, Greg Landrum