Hello everyone,
I am trying to create a dataframe in which each pharmacophore fingerprint
feature has its own column holding its value, for example:

Column 1: SMILES, Column 2: pharmacophore feature 1, Column 3: pharmacophore
feature 2, etc.
C1=CC=CC=C1, 0, 1, etc.

This is my code, but it does not produce the layout I described above.
Any help would be greatly appreciated.

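Main code:
import numpy as np
import pandas as pd
from rdkit import Chem
from rdkit.Chem import ChemicalFeatures, PandasTools
from rdkit.Chem.Fingerprints import FingerprintMols
from rdkit.Chem.Pharm2D import Generate
from rdkit.Chem.Pharm2D.SigFactory import SigFactory

df2 = pd.DataFrame()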
fdefName = 'BaseFeatures.fdef'
featFactory = ChemicalFeatures.BuildFeatureFactory(fdefName)
sigFactory = SigFactory(featFactory, minPointCount=2, maxPointCount=3)
sigFactory.SetBins([(0,2),(2,5),(5,8)])
sigFactory.Init()
sigFactory.GetSigSize()
for chunk in pd.read_csv(input2, delimiter = ',', header = 0, index_col = [''],
                 dtype = {'SMILES':str},
                 names = ['SMILES'], low_memory=False, chunksize = 500000):
    PandasTools.AddMoleculeColumnToFrame(chunk, smilesCol = 'SMILES')
    SMILES = []
    SMILES = chunk.iloc[0:,1]

    molecules = [Chem.MolFromSmiles(x) for x in SMILES]
    fingerprints = [FingerprintMols.FingerprintMol(x) for x in molecules]

    for row in chunk:
        pharmacophorefps = Generate.Gen2DFingerprint(molecules[-1], sigFactory)
        pharmacophorefps.GetNumOnBits()
        list1 = list(pharmacophorefps.GetOnBits())
        df2 = (np.array(list1).reshape(1,219))
    df2 = np.stack([list1 for i in range(len(SMILES))],axis=0)
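
In case it helps to show what I am after, here is a minimal sketch of the reshaping step only. As far as I understand, GetOnBits() returns just the indices of the set bits, so the list has a different length for each molecule and cannot be reshaped to a fixed width directly. The sketch expands such an index list into a fixed-length 0/1 row, one row per SMILES. The names on_bits_to_row, build_table, on_bits_for and the feature_N column labels are hypothetical, and the on-bit indices in the demo are made up rather than real fingerprints.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def on_bits_to_row(on_bits: Iterable[int], n_bits: int) -> List[int]:
    """Expand a collection of set-bit indices into a fixed-length 0/1 list."""
    row = [0] * n_bits
    for idx in on_bits:
        row[idx] = 1
    return row


def build_table(
    smiles_list: Sequence[str],
    on_bits_for: Callable[[str], Iterable[int]],
    n_bits: int,
) -> Tuple[List[str], List[list]]:
    """Return (header, rows): one row per SMILES, one column per bit.

    on_bits_for is a hypothetical callable that maps a SMILES string to the
    indices of its set fingerprint bits.
    """
    header = ["SMILES"] + [f"feature_{i + 1}" for i in range(n_bits)]
    rows = [
        [smi] + on_bits_to_row(on_bits_for(smi), n_bits)
        for smi in smiles_list
    ]
    return header, rows


if __name__ == "__main__":
    # Made-up on-bit indices standing in for real fingerprints (illustration only).
    fake_bits = {"C1=CC=CC=C1": [1], "CCO": [0, 3]}
    header, rows = build_table(list(fake_bits), fake_bits.__getitem__, n_bits=4)
    print(header)
    for row in rows:
        print(row)
```

With RDKit, I assume on_bits_for could wrap Generate.Gen2DFingerprint(Chem.MolFromSmiles(smi), sigFactory).GetOnBits(), n_bits could come from sigFactory.GetSigSize(), and pd.DataFrame(rows, columns=header) would then give the dataframe. I have not tested that part against my data, and SMILES that fail to parse (MolFromSmiles returns None) would need to be skipped first.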


Many thanks,
Antoine Dumas
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
