One small note from me: I would still use a different aggregation function
instead of sum to get a binary FP:
np.reshape(fpa, (4, -1)).any(axis = 0)
I guess it doesn't change anything with Tanimoto, but if you try other
distances you can get unexpected results (assuming there are bit clashes).
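
To illustrate the difference, here is a minimal sketch on a made-up 16-bit
vector. The helper names fold_binary and fold_counts are hypothetical (they
are not RDKit functions), and the example assumes the fingerprint length is
divisible by the fold factor:

```python
import numpy as np


def fold_binary(bits, fold_factor=4):
    """Fold a 0/1 vector by OR-ing the bits that land in the same column."""
    bits = np.asarray(bits)
    if bits.size % fold_factor != 0:
        raise ValueError("length must be divisible by fold_factor")
    # Each row is one contiguous chunk of the fingerprint; any() ORs the chunks.
    return bits.reshape(fold_factor, -1).any(axis=0)


def fold_counts(bits, fold_factor=4):
    """Same folding, but summing, so clashing bits give values above 1."""
    bits = np.asarray(bits)
    if bits.size % fold_factor != 0:
        raise ValueError("length must be divisible by fold_factor")
    return bits.reshape(fold_factor, -1).sum(axis=0)


# Toy fingerprint (made up): bits 1 and 5 clash when 16 bits are folded to 4.
fpa = np.zeros(16, dtype=np.double)
fpa[[1, 5, 10]] = 1

folded_binary = fold_binary(fpa)   # boolean vector, stays binary
folded_counts = fold_counts(fpa)   # float vector, holds 2.0 where bits clash
```

With any() the result stays a true binary vector, so metrics that assume
0/1 input behave as expected; with sum() the clashing position holds 2.0.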

----
Pozdrawiam,  |  Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl

2015-08-28 17:17 GMT+02:00 Jing Lu <ajin...@gmail.com>:

> Thanks, Greg,
>
> Yes, scikit-learn will automatically promote to arrays of float with the
> check_array() function. What I am currently doing is
>
>
> fpa = numpy.zeros((len(fp),),numpy.double)
> DataStructs.ConvertToNumpyArray(fp,fpa)
> np.sum(np.reshape(fpa, (4, -1)), axis = 0)
>
>
> Is this the same as FoldFingerprint()?
>
>
> Best,
> Jing
>
>
>
> On Fri, Aug 28, 2015 at 5:03 AM, Greg Landrum <greg.land...@gmail.com>
> wrote:
>
>> If that doesn't help (and it may not since some Scikit-Learn functions
>> automatically promote their arguments to arrays of doubles), you can always
>> just generate a shorter fingerprint from the beginning (all the
>> fingerprinting functions take an optional argument for this) or fold the
>> existing fingerprints to a new size using the function
>> rdkit.DataStructs.FoldFingerprint().
>>
>> Best,
>> -greg
>>
>>
>> On Thu, Aug 27, 2015 at 4:33 PM, Maciek Wójcikowski <
>> mac...@wojcikowski.pl> wrote:
>>
>>> Hi Jing,
>>>
>>> Most fingerprints are binary, thus can be stored as np.bool_, which
>>> compared to double should be 8 times more memory efficient (1 byte
>>> instead of 8 bytes per element).
>>>
>>> Best,
>>> Maciej
>>>
>>> ----
>>> Pozdrawiam,  |  Best regards,
>>> Maciek Wójcikowski
>>> mac...@wojcikowski.pl
>>>
>>> 2015-08-27 16:15 GMT+02:00 Jing Lu <ajin...@gmail.com>:
>>>
>>>> Hi Greg,
>>>>
>>>> Thanks! It works! But is it possible to fold the fingerprint to a
>>>> smaller size? np.zeros((1000000,2048)) still takes a lot of memory...
>>>>
>>>>
>>>> Best,
>>>> Jing
>>>>
>>>> On Wed, Aug 26, 2015 at 11:02 PM, Greg Landrum <greg.land...@gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>> On Thu, Aug 27, 2015 at 3:00 AM, Jing Lu <ajin...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>> So, I wonder whether there is any way to convert a fingerprint to a
>>>>>> numpy vector?
>>>>>>
>>>>>
>>>>> Indeed there is:
>>>>>
>>>>> In [11]: from rdkit import Chem
>>>>>
>>>>> In [12]: from rdkit import DataStructs
>>>>>
>>>>> In [13]: import numpy
>>>>>
>>>>> In [14]: m = Chem.MolFromSmiles('C1CCC1')
>>>>>
>>>>> In [15]: fp = Chem.RDKFingerprint(m)
>>>>>
>>>>> In [16]: fpa = numpy.zeros((len(fp),),numpy.double)
>>>>>
>>>>> In [17]: DataStructs.ConvertToNumpyArray(fp,fpa)
>>>>>
>>>>>
>>>>> Best,
>>>>> -greg
>>>>>
>>>>>
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>>
>>>> _______________________________________________
>>>> Rdkit-discuss mailing list
>>>> Rdkit-discuss@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>>
>>>>
>>>
>>
>
>