Hi Jeff,
There are a lot of people here with way more experience than me, so this may
not be the optimal solution... But here is what I would do in this case:

from rdkit import Chem, DataStructs
from rdkit.Chem import Draw, PandasTools, Descriptors, rdMolDescriptors
from IPython.display import HTML

def load_sdf_file(file, source, id_column):
    """
    Reads molecules from an SDF file, keeping only the molecules
    that RDKit can parse, and assigns a source field
    """
    df = PandasTools.LoadSDF(file)
    df['Source'] = source
    df['ID'] = df[id_column]
    df['SMILES'] = df['ROMol'].apply(Chem.MolToSmiles)
    df['LogP'] = df['ROMol'].apply(Descriptors.MolLogP)
    df['MolWt'] = df['ROMol'].apply(Descriptors.MolWt)
    df['LipinskyHBA'] = df['ROMol'].apply(rdMolDescriptors.CalcNumLipinskiHBA)
    df['LipinskyHBD'] = df['ROMol'].apply(rdMolDescriptors.CalcNumLipinskiHBD)
    # Keep only the columns of interest, in a fixed order
    df = df[['Source', 'ID', 'SMILES', 'LogP', 'MolWt',
             'LipinskyHBA', 'LipinskyHBD', 'ROMol']]
    return df

df = load_sdf_file("chembl-26_phase-1.sdf", "ChEMBL_Phase-1", "ID")
df.head()  # Should show the top of the DataFrame, with the properties and the structures.

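To then apply the filter itself, one option is a boolean mask over those
columns. This is only a sketch: the cutoffs are the usual rule-of-five values
(MolWt <= 500, LogP <= 5, HBD <= 5, HBA <= 10), while lipinski_filter,
max_violations and the small demo table (made-up placeholder numbers, not real
ligands) are hypothetical names and data used for illustration.

```python
import pandas as pd

# Usual rule-of-five upper limits, keyed by the column names used above.
LIPINSKI_LIMITS = {"MolWt": 500, "LogP": 5, "LipinskyHBD": 5, "LipinskyHBA": 10}

def lipinski_filter(df, max_violations=0):
    """Return the rows of df that break at most max_violations rules.

    Hypothetical helper: expects the numeric columns named in LIPINSKI_LIMITS.
    """
    # Count, per row, how many properties exceed their limit.
    violations = sum(
        (df[col] > limit).astype(int) for col, limit in LIPINSKI_LIMITS.items()
    )
    return df[violations <= max_violations]

# Placeholder values only, to show the call; not real ligand data.
demo = pd.DataFrame({
    "ID": ["lig_a", "lig_b", "lig_c"],
    "MolWt": [320.4, 612.8, 505.0],
    "LogP": [2.1, 6.3, 3.0],
    "LipinskyHBD": [2, 6, 1],
    "LipinskyHBA": [5, 12, 4],
})

strict = lipinski_filter(demo)                     # no violations allowed
lenient = lipinski_filter(demo, max_violations=1)  # tolerate one violation
```

With the DataFrame from load_sdf_file, the call would simply be
lipinski_filter(df). Whether to tolerate one violation is a judgment call,
which is why it is left as a parameter here.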
All the best,
--
Gustavo Seabra
-----Original Message-----
From: Jeff Saxon <[email protected]>
Sent: Tuesday, December 1, 2020 7:35 AM
To: [email protected]
Subject: [Rdkit-discuss] Applying Lipinsky filter on ligand data set
Dear All,
I've just started working with RDKit, focusing on applying the Lipinski rule
to my set of ligands. Basically, I take the 3D coordinates of each ligand
file (in SDF format) and then calculate the 4 required properties for it.
Here is my code:

# make a list of all .sdf files present in data folder:
dirlist = [os.path.basename(p) for p in glob.glob('data' + '/*.sdf')]

# create empty data file with 5 columns:
# name of the file, value of variable p, value of ac, value of don, value of wt
df = pd.DataFrame(columns=["key", "p", "ac", "don", "wt"])

# for each sdf file get its name and calculate 4 different properties: p, ac, don, wt
for sdf in dirlist:
    sdf_name=sdf.rsplit( ".", 1 )[ 0 ]
    key = f'{sdf_name}'
    mol = open(sdf,'rb')
    m = Chem.ForwardSDMolSupplier(mol)
    for conf in m:
        if conf is None: continue
        p = MolLogP(conf) # coeff conc-perm
        ac = CalcNumLipinskiHBA(conf)#
        don = CalcNumLipinskiHBD(conf)
        wt = MolWt(conf)
        #two=AllChem.Compute2DCoords(conf)
        Draw.MolToFile(conf,results+f'/{key}.png')
        #df[key] = [p, ac, don, wt]

Could you suggest how I can summarize the calculations for each ligand in a
pandas-like DataFrame and then apply the Lipinski filter to it?
Is it possible to convert the 3D coordinates to 2D so that I can draw the
ligand (at present it makes a sketch based on the 3D coordinates directly
from the SDF)?
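
On the 2D question, one possible approach is to let RDKit generate depiction
coordinates before drawing. The sketch below assumes that stripping explicit
hydrogens and calling AllChem.Compute2DCoords is acceptable for the picture;
to_2d is a hypothetical helper name, and the embedded ethanol molecule only
stands in for a ligand read from an SDF file.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

def to_2d(mol):
    """Return a hydrogen-stripped copy of mol carrying only 2D coordinates."""
    flat = Chem.RemoveHs(mol)      # new molecule; the input keeps its 3D conformer
    AllChem.Compute2DCoords(flat)  # by default replaces existing conformers with a 2D one
    return flat

# Stand-in for a ligand with 3D coordinates (assumption: ethanol, embedded here).
mol3d = Chem.AddHs(Chem.MolFromSmiles("CCO"))
AllChem.EmbedMolecule(mol3d, randomSeed=42)

mol2d = to_2d(mol3d)
```

In the loop above, the drawing call would then take to_2d(conf) instead of
conf, so the property calculations still see the original molecule.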
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss