Dear RDKit community,
I was treating AllChem.GetMorganFingerprint(m1,2) the same as ECFP4. I am
writing a paper for an open source tool, so I need to be very accurate. I
have seen one open source implementation for ECFP, which is from CDK. Most
researchers are using Pipeline Pilot to calculate ECFP
Dear RDKit users,
I use Draw.MolsToFile for plotting 2D molecules. However, the atoms are
drawn relatively small. This is OK for nitrogen and oxygen, but it's not
very clear for atoms like F and Cl. Is there a better way to do it?
Thanks,
Jing
---
Just out of curiosity, I have seen many publications about scaffold tree
generation, such as the Scaffold Tree[1], Scaffold Hunter[2], inSARa,
Fragment-Augmented Molecular Hasse Diagrams, Snowflake Diagram[3]...
How do you guys choose among them? I haven't seen any comparison paper for
those methods,
>> Hi Jing,
>>
>> Most fingerprints are binary, thus can be stored as np.bool_, which
>> uses one byte per element and so should be 8 times more memory
>> efficient than double (64 times if the bits are also packed).
>>
>> Best,
>> Maciej
>>
>>
>> Best regards,
>> Maciek Wójciko
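
To put numbers on the memory question, here is a minimal sketch using only NumPy. The matrix size (100 x 2048) is just an illustrative assumption. A np.bool_ array takes one byte per element, so it is 8 times smaller than float64; getting the full 64-fold saving requires packing eight bits into each byte, for example with np.packbits.

```python
import numpy as np

n_mols, n_bits = 100, 2048  # hypothetical sizes, chosen only for illustration

as_double = np.zeros((n_mols, n_bits), dtype=np.float64)  # 8 bytes per bit
as_bool = np.zeros((n_mols, n_bits), dtype=np.bool_)      # 1 byte per bit
as_packed = np.packbits(as_bool, axis=1)                  # 8 bits per byte

print("float64:", as_double.nbytes, "bytes")
print("bool:   ", as_bool.nbytes, "bytes")
print("packed: ", as_packed.nbytes, "bytes")

# Unpacking restores the original boolean matrix when it is needed again.
restored = np.unpackbits(as_packed, axis=1).astype(np.bool_)
```

The packed form is only a storage format; most array operations expect the unpacked boolean matrix, so this trades a little CPU time for memory.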
Hi Greg,
Thanks! It works! But is it possible to fold the fingerprint to a smaller
size? np.zeros((100,2048)) still takes a lot of memory...
Best,
Jing
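
One way folding could be done once the fingerprints are already in a NumPy boolean matrix is sketched below. The helper name fold_fingerprints is hypothetical, not an RDKit function; it maps bit i to position i % new_size by OR-ing equal-sized chunks together. As far as I know, RDKit bit vectors can also be folded before conversion with DataStructs.FoldFingerprint.

```python
import numpy as np

def fold_fingerprints(fps, fold_factor=2):
    """Fold each row of a boolean (n_mols, n_bits) matrix by OR-ing chunks.

    Hypothetical helper for illustration; bit i lands at i % new_size.
    """
    n_mols, n_bits = fps.shape
    if n_bits % fold_factor != 0:
        raise ValueError("n_bits must be divisible by fold_factor")
    new_size = n_bits // fold_factor
    # Split every row into fold_factor chunks of new_size bits and OR them.
    return fps.reshape(n_mols, fold_factor, new_size).any(axis=1)

fps = np.zeros((100, 2048), dtype=np.bool_)
fps[0, [5, 1029, 2000]] = True  # 1029 % 1024 == 5 and 2000 % 1024 == 976

folded = fold_fingerprints(fps, fold_factor=2)
print(folded.shape)
print(np.flatnonzero(folded[0]).tolist())
```

Folding halves the memory per fold, but it also increases bit collisions (bits 5 and 1029 above become indistinguishable), so similarities computed on folded fingerprints are only approximations of the unfolded ones.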
On Wed, Aug 26, 2015 at 11:02 PM, Greg Landrum wrote:
>
> On Thu, Aug 27, 2015 at 3:00 AM, Jing Lu wrote:
>
>>
>
PM, Jing Lu wrote:
> > I hope the memory issue won't be a problem.
>
> That's up to you and your choice of threshold.
>
> > Most AgglomerativeClustering algorithms have time complexity on the
> > order of N^2. Will that be a problem?
>
> You have to decide for yourse
ayon/
> It's not a function of RDKit, but I think the library can cluster molecules
> using ECFP4.
>
> Unfortunately, the input file format of bayon is not a distance matrix, but
> the format is easy to prepare.
>
> Best regards.
>
> Takayuki
>
>
> Sun, Aug 23, 2015, 12:03
be a problem?
Best,
Jing
On Sun, Aug 23, 2015 at 3:13 AM, Andrew Dalke wrote:
> On Aug 23, 2015, at 3:43 AM, Jing Lu wrote:
> > If I want to cluster more than 1M molecules by ECFP4, how could I do it?
> > If I calculate the distance between every pair of molecules, the size of
>
cluster it and then put the respective scaffold compounds inside the
> cluster.
>
> Sent from my iPhone
>
> > On Aug 22, 2015, at 8:43 PM, Jing Lu wrote:
> >
> > Dear RDKit users,
> >
> > If I want to cluster more than 1M molecules by ECFP4, how could I do
Dear RDKit users,
If I want to cluster more than 1M molecules by ECFP4, how could I do it? If
I calculate the distance between every pair of molecules, the size of
distance matrix will be too big. Does RDKit support any heuristic
clustering algorithm without calculating the distance matrix of the
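
To illustrate what clustering without a full distance matrix can look like, here is a minimal pure-Python sketch of a single-pass "leader" clustering. This is not an RDKit function and not necessarily what anyone in this thread used. The names tanimoto and leader_cluster, the toy fingerprints, and the threshold are all assumptions made for the example. Each fingerprint is represented as a set of on-bit indices and is compared only against the current cluster leaders, so memory grows with the number of molecules plus the number of clusters rather than with N^2.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of on-bit indices."""
    inter = len(a & b)
    union = len(a) + len(b) - inter
    return inter / union if union else 0.0

def leader_cluster(fingerprints, threshold=0.7):
    """Assign each fingerprint to the first leader within the threshold.

    Hypothetical helper: a fingerprint that matches no existing leader
    becomes the leader of a new cluster. No pairwise matrix is stored.
    """
    leaders = []  # indices of cluster representatives
    labels = []   # cluster id for every fingerprint, in input order
    for idx, fp in enumerate(fingerprints):
        for cluster_id, leader_idx in enumerate(leaders):
            if tanimoto(fp, fingerprints[leader_idx]) >= threshold:
                labels.append(cluster_id)
                break
        else:
            leaders.append(idx)
            labels.append(len(leaders) - 1)
    return leaders, labels

# Toy on-bit sets standing in for real ECFP4 fingerprints.
toy_fps = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),
    frozenset({10, 11, 12}),
    frozenset({10, 11, 12, 13}),
]
leaders, labels = leader_cluster(toy_fps, threshold=0.5)
print(leaders, labels)
```

The running time is roughly N times the number of leaders, so the threshold controls both the cost and the granularity. The result also depends on the input order, which is the usual price of a single-pass heuristic.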