Hello everyone, I am trying to create a dataframe in which each pharmacophore fingerprint feature has its own column, holding that feature's value for each molecule.
The layout I want is:

    Column1 SMILES, Column2 Pharmacophore feature #1, Column3 Pharmacophore feature #2, etc.
    C1=CC=CC=C1, 0, 1, etc.

This is my code, but it does not produce the layout described above. Any help would be greatly appreciated.

Main Code

    df2 = pd.DataFrame()
    fdefName = 'BaseFeatures.fdef'
    featFactory = ChemicalFeatures.BuildFeatureFactory(fdefName)
    sigFactory = SigFactory(featFactory, minPointCount=2, maxPointCount=3)
    sigFactory.SetBins([(0,2),(2,5),(5,8)])
    sigFactory.Init()
    sigFactory.GetSigSize()
    for chunk in pd.read_csv(input2, delimiter = ',', header = 0, index_col = [''], dtype = {'SMILES':str}, names = ['SMILES'], low_memory=False, chunksize = 500000):
        PandasTools.AddMoleculeColumnToFrame(chunk, smilesCol = 'SMILES')
        SMILES = []
        SMILES = chunk.iloc[0:,1]
        molecules = [Chem.MolFromSmiles(x) for x in SMILES]
        fingerprints = [FingerprintMols.FingerprintMol(x) for x in molecules]
        for row in chunk:
            pharmacophorefps = Generate.Gen2DFingerprint(molecules[-1], sigFactory)
            pharmacophorefps.GetNumOnBits()
            list1 = list(pharmacophorefps.GetOnBits())
            df2 = (np.array(list1).reshape(1,219))
            df2 = np.stack([list1 for i in range(len(SMILES))],axis=0)

Much thanks,
Antoine Dumas
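For reference, one likely reason the code above does not give one column per feature is that GetOnBits() returns only the indices of the bits that are set, so the list has a different length for each molecule and cannot be reshaped to a fixed width. A minimal sketch of the expansion step follows, written in plain Python so it runs without RDKit. The helper names bits_to_row and build_table, the feature_N column names, and the toy input are assumptions made for illustration, not part of the original code. In the real script the on-bit lists would presumably come from list(Generate.Gen2DFingerprint(mol, sigFactory).GetOnBits()) computed once per molecule (not molecules[-1] every time), and n_bits from sigFactory.GetSigSize().

```python
def bits_to_row(on_bits, n_bits):
    """Expand a list of on-bit indices into a fixed-length list of 0/1 values."""
    row = [0] * n_bits
    for idx in on_bits:
        row[idx] = 1
    return row


def build_table(smiles_list, on_bits_per_mol, n_bits):
    """Return (header, rows): one row per molecule, one column per fingerprint bit."""
    # Column names are hypothetical; any naming scheme would do.
    header = ["SMILES"] + [f"feature_{i}" for i in range(n_bits)]
    rows = [
        [smi] + bits_to_row(bits, n_bits)
        for smi, bits in zip(smiles_list, on_bits_per_mol)
    ]
    return header, rows


# Toy input, made up for illustration: two molecules and a 5-bit fingerprint.
# With RDKit, each inner list would be the on-bit indices of one molecule.
smiles = ["C1=CC=CC=C1", "CCO"]
on_bits = [[1, 3], []]

header, rows = build_table(smiles, on_bits, n_bits=5)
print(header)
for row in rows:
    print(row)
```

If pandas is available, pd.DataFrame(rows, columns=header) should turn the result into the dataframe described above. When reading the input in chunks, the rows from each chunk can be collected in one list and converted once at the end, instead of overwriting df2 on every iteration.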
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss