Hi Marawan,

I'm not sure this is the cause of the problem, but regarding your line

    input_smiles_df.append(new_row,ignore_index=True)

in contrast to appending items to lists in Python, the df.append() function returns a new dataframe instead of adding a row in place. So perhaps you need to reassign the dataframe like:

    input_smiles_df = input_smiles_df.append(new_row,ignore_index=True)

although it is actually recommended to append to a list and then use pd.concat() instead, according to the notes here:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.append.html

I also noticed that, according to the documentation, EnumerateStereoisomers only returns multiple isomers if the stereocenters are undefined (http://rdkit.org/docs/source/rdkit.Chem.EnumerateStereoisomers.html). I don't think I can see the SMILES you are actually working with, but see the difference in this example:

    from rdkit import Chem
    from rdkit.Chem.EnumerateStereoisomers import EnumerateStereoisomers

    # Stereocenter left undefined in the SMILES
    thalidomide = Chem.MolFromSmiles('O=C1CCC(N2C(=O)c3ccccc3C2=O)C(=O)N1')
    isomers = tuple(EnumerateStereoisomers(thalidomide))
    print(len(isomers))

    # Stereocenter already assigned ([C@H])
    thalidomide2 = Chem.MolFromSmiles('O=C1CC[C@H](N2C(=O)c3ccccc3C2=O)C(=O)N1')
    isomers = tuple(EnumerateStereoisomers(thalidomide2))
    print(len(isomers))

Output:

    2
    1

The option onlyUnassigned=False changes this behaviour (see documentation). Could this be why you are only getting back 1 every time from your

    print("Number of stereoisomer is: ", len(isomers_list))

line? Not sure this solves your problem, but perhaps worth checking.

Regards,
Ines

________________________________
From: Marawan Hussien via Rdkit-discuss <rdkit-discuss@lists.sourceforge.net>
Sent: 30 August 2020 05:05
To: rdkit-discuss@lists.sourceforge.net <rdkit-discuss@lists.sourceforge.net>
Subject: [Rdkit-discuss] appending new rows in dataframe with stereo-isomers

Hi,

I am trying to append (update) a pandas dataframe (created by PandasTools from a CSV file) with potential stereoisomers for each molecule in the dataframe.
My understanding is that the EnumerateStereoisomers function returns a generator that I can loop through and use the mol object (or smiles created using the Chem.MolToSmiles function) to create new rows and then append this row to the end of the data frame. I tried the following code but nothing is appended:

    from rdkit.Chem.EnumerateStereoisomers import EnumerateStereoisomers, StereoEnumerationOptions

    def generate_Stereoisomers(x):
        opts = StereoEnumerationOptions(tryEmbedding=True)
        isomers = tuple(EnumerateStereoisomers(x, options=opts))
        return isomers

    input_smiles_df["stereo_isomers"] = input_smiles_df["Cannonical_tautomer"].apply(lambda m:generate_Stereoisomers(m))

    for index, row in input_smiles_df.iterrows():
        isomers_list = row["stereo_isomers"]
        print("Number of stereoisomer is: ", len(isomers_list))  ##This line always gives 1 back, although the molecules have many stereocenters
        for smi in sorted(rdkit.Chem.MolToSmiles(x,isomericSmiles=True) for x in isomers_list):
            print(smi)
            new_row = {'Cannonical_tautomer':None, id_col_name:str(row[id_col_name]),
                       smiles_col_name:row[smiles_col_name], 'standardized_smiles':smi,
                       'num_stereo_isomers':row["num_stereo_isomers"]}
            input_smiles_df.append(new_row,ignore_index=True)

Any suggestion?

Thanks
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
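
As a follow-up to the advice in the reply to collect rows in a list instead of calling append() on the dataframe inside the loop, here is a minimal sketch of that pattern. It may be more than a style preference: DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0. The dataframe contents and the column names "mol_id" and "isomer_smiles" are hypothetical stand-ins, not the data from the thread, and the strings are placeholders rather than real SMILES.

```python
import pandas as pd

# Hypothetical stand-in for input_smiles_df: each row holds a tuple of
# placeholder isomer strings (not real SMILES, not the poster's data).
input_smiles_df = pd.DataFrame(
    {
        "mol_id": ["m1", "m2"],
        "isomer_smiles": [("isomer_a", "isomer_b"), ("isomer_c",)],
    }
)

# Appending to a plain Python list happens in place.
new_rows = []
for _, row in input_smiles_df.iterrows():
    for smi in sorted(row["isomer_smiles"]):
        new_rows.append({"mol_id": row["mol_id"], "standardized_smiles": smi})

# pd.concat returns a NEW dataframe, so the result must be assigned.
expanded_df = pd.concat(
    [input_smiles_df, pd.DataFrame(new_rows)], ignore_index=True
)

print(len(input_smiles_df), len(expanded_df))
```

The original frame is left untouched and the combined frame is built once, which avoids copying the whole dataframe on every iteration.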