Hello Everyone,
I am trying to create a dataframe where all the different pharmacophore
fingerprint features have their own column in my dataframe with their values.
Column1 SMILES, Column2 Pharmacophore feature #1, Column3 pharmacophore feature
number 2, etc
C1=CC=CC=C1, 0, 1, etc
This is my code but it doesn't want to properly put itself in the nature I
described above.
Any help would be greatly appreciated.
Main Code
df2 = pd.DataFrame()
fdefName = 'BaseFeatures.fdef'
featFactory = ChemicalFeatures.BuildFeatureFactory(fdefName)
sigFactory = SigFactory(featFactory, minPointCount=2, maxPointCount=3)
sigFactory.SetBins([(0,2),(2,5),(5,8)])
sigFactory.Init()
sigFactory.GetSigSize()
for chunk in pd.read_csv(input2, delimiter = ',', header = 0, index_col = [''],
dtype = {'SMILES':str},
names = ['SMILES'], low_memory=False, chunksize = 500000):
PandasTools.AddMoleculeColumnToFrame(chunk, smilesCol = 'SMILES')
SMILES = []
SMILES = chunk.iloc[0:,1]
molecules = [Chem.MolFromSmiles(x) for x in SMILES]
fingerprints = [FingerprintMols.FingerprintMol(x) for x in molecules]
for row in chunk:
pharmacophorefps = Generate.Gen2DFingerprint(molecules[-1], sigFactory)
pharmacophorefps.GetNumOnBits()
list1 = list(pharmacophorefps.GetOnBits())
df2 = (np.array(list1).reshape(1,219))
df2 = np.stack([list1 for i in range(len(SMILES))],axis=0)
Much Thanks
Antoine Dumas
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss