This line of code works for me on a data frame with over 6M compounds:

    PandasTools.AddMoleculeColumnToFrame(df, 'smiles', 'mol', includeFingerprints=True)

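
For anyone following along, here is a minimal sketch of that call on a toy frame, followed by one way to pick out the rows that failed to parse. The toy SMILES and the names `failed`, `bad_rows` and `good_rows` are illustrative only and not from the thread. It assumes failed parses are left as None in the new column, which is what Chem.MolFromSmiles returns for invalid input.

```python
import pandas as pd
from rdkit.Chem import PandasTools

# Toy data (illustrative only): the last two SMILES are deliberately invalid.
df = pd.DataFrame({"smiles": ["CCO", "c1ccccc1", "not_a_smiles", "C1CC"]})

# Parse every SMILES into a new 'mol' column; entries that fail to parse
# are expected to come back as None rather than raising an exception.
PandasTools.AddMoleculeColumnToFrame(df, "smiles", "mol")

# Boolean mask of the rows whose molecule is missing.
failed = df["mol"].isna()

# SMILES strings that could not be parsed, e.g. for logging or inspection.
bad_rows = df.loc[failed, "smiles"].tolist()

# A copy of the frame with the failed rows dropped.
good_rows = df.loc[~failed].reset_index(drop=True)
```

If the row count has to stay unchanged, the None entries can simply be left in place and the `failed` mask used to skip them in later steps.
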
‘smiles’ is the name of the column containing the SMILES, and ‘mol’ is the name of the new column holding the mol objects. Once that’s done, you can deal with the rows where mol is None …

From: Mike Mazanetz <mi...@novadatasolutions.co.uk>
Sent: Thursday, October 31, 2019 8:54 AM
To: 'Fiorella Ruggiu' <ruggiu.fiore...@gmail.com>
Cc: 'RDKit Discuss' <rdkit-discuss@lists.sourceforge.net>
Subject: Re: [Rdkit-discuss] failed mols in converting SMILES to Pandas dataframe Molecule

Hi Fio,

Thanks for the tips. I’ve found that I need PandasTools to convert a SMILES to a mol though; I’ve not had MolFromSmiles work on a dataframe. Have you found that this works?

Cheers,
mike

From: Fiorella Ruggiu <ruggiu.fiore...@gmail.com>
Sent: 31 October 2019 15:48
To: Mike Mazanetz <mi...@novadatasolutions.co.uk>
Cc: Jan Halborg Jensen <jhjen...@chem.ku.dk>; RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Subject: Re: [Rdkit-discuss] failed mols in converting SMILES to Pandas dataframe Molecule

Hello Mike,

you could create a function with your if/else structure and use apply on the pandas dataframe. For example, if you have a SMILES column in your df:

    from rdkit import Chem

    def addMol(smiles):
        # Parse once and reuse the result; MolFromSmiles returns None on failure
        mol = Chem.MolFromSmiles(smiles)
        if mol is None:
            # Etc
            return None  # or whatever you wish to return when it fails
        else:
            # Etc
            return mol

    df['RDKitMol'] = df.apply(lambda row: addMol(row['SMILES']), axis=1)

Might not be as efficient as the built-in PandasTools though.

Best,
Fio

On Thu, Oct 31, 2019 at 8:07 AM Mike Mazanetz <mi...@novadatasolutions.co.uk> wrote:

Dear RDKit’ers,

I’ve been trying to skip failed molecules in PandasTools.AddMoleculeColumnToFrame. This is possible if I chuck each row to a different processor, but what I really want to do is return a missing row entry.

Normally I’d go:

    if mol is None:
        Etc
    else:
        Etc

But Pandas DFs seem to be playing hardball. Any thoughts?

Cheers,
mike

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss