One small note from me: I would still use a different aggregating function instead of sum to get a binary FP: np.reshape(fpa, (4, -1)).any(axis = 0). I guess it doesn't change anything with Tanimoto, but if you try other distances you can get unexpected results (assuming there are bit clashes).
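To make the difference concrete, here is a minimal sketch on toy data. The 16-bit arrays and the helper names fold_sum and fold_any are made up for illustration; they only stand in for arrays filled by ConvertToNumpyArray, and I have not checked this against FoldFingerprint() itself.

```python
import numpy as np

def fold_sum(fpa, factor=4):
    # Summing version: colliding bits are added, so entries can exceed 1.
    return np.sum(np.reshape(fpa, (factor, -1)), axis=0)

def fold_any(fpa, factor=4):
    # Logical OR of colliding bits: the folded result stays binary.
    return np.reshape(fpa, (factor, -1)).any(axis=0)

# Toy 16-bit arrays (hypothetical data, not real fingerprints).
fpa = np.zeros(16, dtype=np.double)
fpa[[1, 5, 9, 14]] = 1.0   # bits 1, 5 and 9 all fold onto position 1
fpb = np.zeros(16, dtype=np.double)
fpb[[1, 2]] = 1.0

print(fold_sum(fpa))                   # counts, position 1 holds 3
print(fold_any(fpa).astype(np.uint8))  # binary

# Euclidean distance between the two folded vectors under each scheme.
d_sum = np.linalg.norm(fold_sum(fpa) - fold_sum(fpb))
d_any = np.linalg.norm(fold_any(fpa).astype(np.double)
                       - fold_any(fpb).astype(np.double))
print(d_sum, d_any)
```

In this toy case the two folded vectors are identical as bit vectors, yet they end up 2.0 apart in Euclidean distance when the clashing bits are summed.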
----
Pozdrawiam,  |  Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl

2015-08-28 17:17 GMT+02:00 Jing Lu <ajin...@gmail.com>:

> Thanks, Greg,
>
> Yes, scikit-learn will automatically promote to arrays of float with the
> check_array() function. What I am currently doing is
>
> fpa = numpy.zeros((len(fp),),numpy.double)
> DataStructs.ConvertToNumpyArray(fp,fpa)
> np.sum(np.reshape(fpa, (4, -1)), axis = 0)
>
> Is this the same as FoldFingerprint()?
>
> Best,
> Jing
>
> On Fri, Aug 28, 2015 at 5:03 AM, Greg Landrum <greg.land...@gmail.com>
> wrote:
>
>> If that doesn't help (and it may not since some Scikit-Learn functions
>> automatically promote their arguments to arrays of doubles), you can always
>> just generate a shorter fingerprint from the beginning (all the
>> fingerprinting functions take an optional argument for this) or fold the
>> existing fingerprints to a new size using the function
>> rdkit.DataStructs.FoldFingerprint().
>>
>> Best,
>> -greg
>>
>> On Thu, Aug 27, 2015 at 4:33 PM, Maciek Wójcikowski <
>> mac...@wojcikowski.pl> wrote:
>>
>>> Hi Jing,
>>>
>>> Most fingerprints are binary, thus can be stored as np.bool_, which
>>> compared to double should be 8 times more memory efficient (np.bool_
>>> takes one byte per element, double takes eight).
>>>
>>> Best,
>>> Maciej
>>>
>>> ----
>>> Pozdrawiam,  |  Best regards,
>>> Maciek Wójcikowski
>>> mac...@wojcikowski.pl
>>>
>>> 2015-08-27 16:15 GMT+02:00 Jing Lu <ajin...@gmail.com>:
>>>
>>>> Hi Greg,
>>>>
>>>> Thanks! It works! But is it possible to fold the fingerprint to a
>>>> smaller size? np.zeros((1000000,2048)) still takes a lot of memory...
>>>>
>>>> Best,
>>>> Jing
>>>>
>>>> On Wed, Aug 26, 2015 at 11:02 PM, Greg Landrum <greg.land...@gmail.com>
>>>> wrote:
>>>>
>>>>> On Thu, Aug 27, 2015 at 3:00 AM, Jing Lu <ajin...@gmail.com> wrote:
>>>>>
>>>>>> So, I wonder is there any way to convert fingerprint to a numpy
>>>>>> vector?
>>>>>
>>>>> Indeed there is:
>>>>>
>>>>> In [11]: from rdkit import Chem
>>>>>
>>>>> In [12]: from rdkit import DataStructs
>>>>>
>>>>> In [13]: import numpy
>>>>>
>>>>> In [14]: m = Chem.MolFromSmiles('C1CCC1')
>>>>>
>>>>> In [15]: fp = Chem.RDKFingerprint(m)
>>>>>
>>>>> In [16]: fpa = numpy.zeros((len(fp),),numpy.double)
>>>>>
>>>>> In [17]: DataStructs.ConvertToNumpyArray(fp,fpa)
>>>>>
>>>>> Best,
>>>>> -greg
------------------------------------------------------------------------------
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss