Re: [Rdkit-discuss] Chem.PandasTools
Dear Grégori,

when storing the image into a new data frame:

    MMP_reaction = Chem.rdChemReactions.ReactionFromSmarts("[*:1][H]>>[*:1]C")
    newnew_df = pd.DataFrame(columns=['fig'], index=[1])
    newnew_df['fig'].ix[1] = Draw.ReactionToImage(MMP_reaction)

apparently the image can be stored in a data frame, but in the IPython notebook it is displayed as a PIL image.

Cheers, and thanks so far (in particular for the impressive speed in response!),
Paul

> Hi Paul,
>
> You first have to read the MMP into a reaction object (Chem.ReactionFromSmarts).
>
> Greg
>
> On Friday, May 9, 2014, paul.czodrow...@merckgroup.com wrote:
>
>> Dear Gregori, Samo,
>>
>> thanks for your hints. I just tried running
>>
>>     Draw.ReactionToImage("[*:1][H]>>[*:1]C")
>>     => AttributeError: 'str' object has no attribute 'GetNumReactantTemplates'
>>
>> BTW, how would I finally add a picture to a Pandas data frame?
>>
>> Cheers,
>> Paul

This message and any attachment are confidential and may be privileged or otherwise protected from disclosure. If you are not the intended recipient, you must not copy this message or attachment or disclose the contents to any other person. If you have received this transmission in error, please notify the sender immediately and delete the message and any attachment from your system. Merck KGaA, Darmstadt, Germany and any of its subsidiaries do not accept liability for any omissions or errors in this message which may arise as a result of E-Mail-transmission or for damages resulting from any unauthorized changes of the content of this message and any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its subsidiaries do not guarantee that this message is free of viruses and does not accept liability for any damages caused by any virus transmitted therewith. Click http://www.merckgroup.com/disclaimer to access the German, French, Spanish and Portuguese versions of this disclaimer.

--
Is your legacy SCM system holding you back?
Join Perforce May 7 to find out:
• 3 signs your SCM is hindering your productivity
• Requirements for releasing software faster
• Expert tips and advice for migrating your SCM now
http://p.sf.net/sfu/perforce
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] RDKit cartridge similarity search speeds(?)
Hi James,

[fair warning before I start: we quickly hit the limits of my postgresql expertise here]

On Thu, May 8, 2014 at 2:35 PM, James Davidson j.david...@vernalis.com wrote:

> Dear All,
>
> I have recently been spending a bit more time with the RDKit cartridge, and have what is probably a very naïve question…
>
> Having built some RDKit fingerprints for ChEMBL_18, I see the following behaviour (for clarification – ‘ecfp4_bv’ is the column in my rdk.fps table that has been generated using morganbv_fp(mol, 2)):
>
>     chembl_18=# \timing on
>     Timing is on.
>     chembl_18=# set rdkit.tanimoto_threshold=0.5;
>     SET
>     Time: 0.167 ms
>     chembl_18=# select chembl_id from rdk.fps where ecfp4_bv % morganbv_fp('c1nnccc1'::mol,2);
>       chembl_id
>     -------------
>      CHEMBL15719
>     (1 row)
>
>     Time: 2033.348 ms
>     chembl_18=# select chembl_id from rdk.fps where tanimoto_sml(ecfp4_bv, morganbv_fp('c1nnccc1'::mol, 2)) > 0.5;
>       chembl_id
>     -------------
>      CHEMBL15719
>     (1 row)
>
>     Time: 6843.605 ms
>
> I can see that the query plans are different in the two cases, but I don’t fully understand why – see below:
>
> *QUERY 1 (with explain analyze)*
>
>     chembl_18=# explain analyze select chembl_id from rdk.fps where ecfp4_bv % morganbv_fp('c1nnccc1'::mol,2);
>     QUERY PLAN
>     Bitmap Heap Scan on fps  (cost=106.91..5298.31 rows=1352 width=13) (actual time=1774.986..1774.987 rows=1 loops=1)
>       Recheck Cond: (ecfp4_bv % '\x0100084200048204'::bfp)
>       ->  Bitmap Index Scan on fps_ecfp4bv_idx  (cost=0.00..106.57 rows=1352 width=0) (actual time=1774.969..1774.969 rows=1 loops=1)
>             Index Cond: (ecfp4_bv % '\x0100084200048204'::bfp)
>     Total runtime: 1775.035 ms
>     (5 rows)
>
>     Time: 1776.133 ms
>
> *QUERY 2 (with explain analyze)*
>
>     chembl_18=# explain analyze select chembl_id from rdk.fps where tanimoto_sml(ecfp4_bv, morganbv_fp('c1nnccc1'::mol, 2)) > 0.5;
>     QUERY PLAN
>     Seq Scan on fps  (cost=0.00..388808.17 rows=450793 width=13) (actual time=1278.115..6953.977 rows=1 loops=1)
>       Filter: (tanimoto_sml(ecfp4_bv, '\x0100084200048204'::bfp) > 0.5::double precision)
>       Rows Removed by Filter: 1352377
>     Total runtime: 6954.010 ms
>     (4 rows)
>
>     Time: 6955.103 ms

What these are telling you is that the second query is not using the index: it's a sequential scan, so it has to test all rows of the database. This happens because the index is defined for the operator %, but not for the function tanimoto_sml(). There may be an approach to get the index set up using that function, but there we reach the limits of my expertise.

> It seems conceptually ‘easier’ to add the similarity value as part of the query, rather than setting it as a variable ahead of the query; but clearly I should be doing it the latter way for performance reasons. So even if I don’t fully understand why at the moment, am I correct in thinking that queries of this sort should always be run with the similarity operators (%, #)? And if so, is the rdkit.tanimoto_threshold variable set at the level of the session, the user, or the database?

It's set at the session level.

When doing similarity searches, I find it generally helpful to also include the <%> operator in an order by clause so that the results come back in sorted order. So instead of this:

    chembl_17=# select molregno from rdk.fps where mfp2 % morganbv_fp('Cc1ccc2nc(N(C)CC(=O)O)sc2c1');
     molregno
    ----------
       412312
       412302
       412310
       441378
       470082
       773946
       775269
       911501
      1015485
      1034321
      1040255
      1040496
      1042958
      1043871
      1044892
      1045663
      1047691
      1049393
    (18 rows)

    Time: 1042.310 ms

I do this:

    chembl_17=# select molregno from rdk.fps where mfp2 % morganbv_fp('Cc1ccc2nc(N(C)CC(=O)O)sc2c1') order by morganbv_fp('Cc1ccc2nc(N(C)CC(=O)O)sc2c1') <%> mfp2;
     molregno
    ----------
       412312
       470082
      1040255
       773946
      1044892
      1049393
      1040496
       441378
      1047691
      1042958
       412302
      1043871
       412310
      1045663
       911501
       775269
      1015485
      1034321
    (18 rows)

    Time: 1032.266 ms

Notice that this doesn't make things any slower.

It's nice to see the actual similarity values:

    chembl_17=# select molregno,tanimoto_sml(morganbv_fp('Cc1ccc2nc(N(C)CC(=O)O)sc2c1'),mfp2) from rdk.fps where
Re: [Rdkit-discuss] Chem.PandasTools
Hi,

You can create a new object that stores the MMP and has a default pandas and IPython representation as a base64-encoded PNG. This usually works for me, but I'm not sure why in this case it works only for the IPython representation and not for pandas.

The code:

    # codecell
    import pandas as pd
    import rdkit.Chem as Chem
    from rdkit.Chem import PandasTools
    from rdkit.Chem import Draw
    from rdkit.Chem.Draw import IPythonConsole

    # codecell
    from base64 import b64encode
    from StringIO import StringIO

    class Reaction():
        def __init__(self, reaction=None):
            self.reaction = reaction

        def _repr_html_(self):
            sio = StringIO()
            Draw.ReactionToImage(self.reaction).save(sio, format='PNG')
            s = b64encode(sio.getvalue())
            return '<img src="data:image/png;base64,%s"/>' % s

        def __str__(self):
            sio = StringIO()
            Draw.ReactionToImage(self.reaction).save(sio, format='PNG')
            s = b64encode(sio.getvalue())
            return '<img src="data:image/png;base64,%s"/>' % s

    # codecell
    MMP_reaction = Chem.rdChemReactions.ReactionFromSmarts("[*:1][H]>>[*:1]C")

    # codecell
    mmp = Reaction(MMP_reaction)

    # codecell
    mmp

    # codecell
    newnew_df = pd.DataFrame(columns=['fig'], index=[1])
    newnew_df['fig'].ix[1] = mmp

    # codecell
    newnew_df

Regards,
Samo
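The wrapper idea can be sketched for Python 3 without RDKit, where `base64.b64encode` returns bytes that have to be decoded before they go into the tag. This is only a minimal illustration of the mechanism, not Samo's class: `PngCell` and `tiny_png` are hypothetical names, and the 1x1 PNG built here merely stands in for the bytes that `Draw.ReactionToImage(...).save(buf, format='PNG')` writes in the thread's setup.

```python
import base64
import struct
import zlib


def tiny_png():
    """Build a valid 1x1 grayscale PNG in memory (stand-in for a rendered reaction)."""
    def chunk(tag, data):
        body = tag + data
        crc = zlib.crc32(body) & 0xFFFFFFFF
        return struct.pack(">I", len(data)) + body + struct.pack(">I", crc)

    # width=1, height=1, bit depth 8, colour type 0 (grayscale), no interlacing
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    # one scanline: filter byte 0 followed by a single white pixel
    idat = zlib.compress(b"\x00\xff")
    signature = b"\x89PNG\r\n\x1a\n"
    return signature + chunk(b"IHDR", ihdr) + chunk(b"IDAT", idat) + chunk(b"IEND", b"")


class PngCell:
    """Hypothetical wrapper: holds PNG bytes and renders them as an inline <img> tag."""

    def __init__(self, png_bytes):
        self.png_bytes = png_bytes

    def _img_tag(self):
        b64 = base64.b64encode(self.png_bytes).decode("ascii")
        return '<img src="data:image/png;base64,%s"/>' % b64

    def _repr_html_(self):
        # IPython/Jupyter calls this hook when the object is a cell's result.
        return self._img_tag()

    def __str__(self):
        # The thread's class returns the same tag from __str__, presumably so
        # that the string form used in a data frame cell carries the image too.
        return self._img_tag()


cell = PngCell(tiny_png())
print(cell._repr_html_()[:40])
```

One possible reason the tag did not render inside the data frame is that pandas HTML-escapes cell text by default and truncates long strings; `DataFrame.to_html(escape=False)` together with a larger `display.max_colwidth` option is something to try, though the thread does not confirm that this was the cause.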
Re: [Rdkit-discuss] RDKit cartridge similarity search speeds(?)
Hi Greg,

> What these are telling you is that the second query is not using the index: it's a sequential scan, so it has to test all rows of the database. This happens because the index is defined for the operator %, but not for the function tanimoto_sml(). There may be an approach to get the index set up using that function, but there we reach the limits of my expertise.

Well, I will stick to the recommended operator use then!

> One final advanced topic: if you are planning on making regular use of the similarity features in the cartridge and are running on a linux system or Mac I would recommend recompiling the cartridge with some optimizations for tanimoto similarity. To do this, you need to edit the cartridge Makefile from:
>
>     PG_CPPFLAGS = -I${BOOSTHOME} -I${RDKIT}/Code -DRDKITVER='"007200"' ${INCHIFLAGS} #-DUSE_BUILTIN_POPCOUNT -msse4.2
>
> to:
>
>     PG_CPPFLAGS = -I${BOOSTHOME} -I${RDKIT}/Code -DRDKITVER='"007200"' ${INCHIFLAGS} -DUSE_BUILTIN_POPCOUNT -msse4.2
>
> (I just removed a comment character here). This speeds the Tanimoto calculation up a fair bit (it's still not nearly as fast as Andrew's chemfp, but it's better than the default behavior).

I'm on linux (Ubuntu), and have just re-built with the above recommendation. I'll see what the speeds look like afterwards (out of interest, I presume the timings in your examples were with this optimisation in place?). Does this also affect dice?

And final question - after rebuilding the cartridge, does the extension need to be dropped and then re-created in all databases; does postgreSQL server need restarting; or neither?

> Hope this helps,
> -greg

It does - thanks!

Kind regards

James

__
PLEASE READ: This email is confidential and may be privileged. It is intended for the named addressee(s) only and access to it by anyone else is unauthorised. If you are not an addressee, any disclosure or copying of the contents of this email or any action taken (or not taken) in reliance on it is unauthorised and may be unlawful.
If you have received this email in error, please notify the sender or postmas...@vernalis.com. Email is not a secure method of communication and the Company cannot accept responsibility for the accuracy or completeness of this message or any attachment(s). Please check this email for virus infection for which the Company accepts no responsibility. If verification of this email is sought then please request a hard copy. Unless otherwise stated, any views or opinions presented are solely those of the author and do not represent those of the Company. The Vernalis Group of Companies 100 Berkshire Place Wharfedale Road Winnersh, Berkshire RG41 5RD, England Tel: +44 (0)118 938 To access trading company registration and address details, please go to the Vernalis website at www.vernalis.com and click on the Company address and registration details link at the bottom of the page..
__
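The recommendation in this thread (set the threshold for the session, filter with the `%` operator so the index is used, and order with the cartridge's `<%>` operator) could be driven from Python roughly as below. This is a sketch under assumptions: the function name, its defaults and the identifier check are made up for illustration, and the placeholders follow the psycopg2 convention, in which parameters are written `%s` and a literal percent sign in the SQL text has to be doubled.

```python
import re

# Accept only plain or schema-qualified identifiers, since they are pasted
# into the SQL text rather than passed as parameters.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def _check_ident(name):
    if not _IDENT.match(name):
        raise ValueError("unexpected SQL identifier: %r" % name)
    return name


def similarity_search(smiles, threshold=0.5, table="rdk.fps",
                      id_col="chembl_id", fp_col="ecfp4_bv", radius=2):
    """Return (sql, params) pairs to execute in order on ONE connection.

    The threshold is a session-level setting, so both statements must go
    through the same database session.
    """
    for name in (table, id_col, fp_col):
        _check_ident(name)
    # The query fingerprint is computed from the SMILES parameter.
    fp = "morganbv_fp(%s::mol, {0:d})".format(radius)
    set_sql = "SET rdkit.tanimoto_threshold = %s"
    query = (
        "SELECT {id_col} FROM {table} "
        "WHERE {fp_col} %% {fp} "        # operator form: can use the index
        "ORDER BY {fp} <%%> {fp_col}"    # most similar rows first
    ).format(id_col=id_col, table=table, fp_col=fp_col, fp=fp)
    return [(set_sql, (threshold,)), (query, (smiles, smiles))]


for sql, params in similarity_search("c1nnccc1"):
    print(sql, params)
```

With psycopg2, each pair would be passed to `cursor.execute(sql, params)` on the same connection. Writing the filter as `tanimoto_sml(...) > 0.5` instead would give the same rows but, as the query plans in the thread show, via a sequential scan.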
Re: [Rdkit-discuss] RDKit cartridge similarity search speeds(?)
James,

On Fri, May 9, 2014 at 10:25 AM, James Davidson j.david...@vernalis.com wrote:

>> One final advanced topic: if you are planning on making regular use of the similarity features in the cartridge and are running on a linux system or Mac I would recommend recompiling the cartridge with some optimizations for tanimoto similarity. To do this, you need to edit the cartridge Makefile from:
>>
>>     PG_CPPFLAGS = -I${BOOSTHOME} -I${RDKIT}/Code -DRDKITVER='"007200"' ${INCHIFLAGS} #-DUSE_BUILTIN_POPCOUNT -msse4.2
>>
>> to:
>>
>>     PG_CPPFLAGS = -I${BOOSTHOME} -I${RDKIT}/Code -DRDKITVER='"007200"' ${INCHIFLAGS} -DUSE_BUILTIN_POPCOUNT -msse4.2
>>
>> (I just removed a comment character here). This speeds the Tanimoto calculation up a fair bit (it's still not nearly as fast as Andrew's chemfp, but it's better than the default behavior).
>
> I'm on linux (Ubuntu), and have just re-built with the above recommendation. I'll see what the speeds look like afterwards (out of interest, I presume the timings in your examples were with this optimisation in place?).

I would have thought so, but it turns out that I was using a build on my Mac without the optimization. It's about 20% faster for those sample queries when I use the rebuilt version.

> Does this also affect dice?

It should, but, stupidly, it looks like it doesn't. I'll fix that.

> And final question - after rebuilding the cartridge, does the extension need to be dropped and then re-created in all databases; does postgreSQL server need restarting; or neither?

You do not need to drop the extension. This is just a change to the shared library and doesn't affect the API. You just need to do a make install to copy in the new shared lib and then restart the server.

-greg
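As its name suggests, the `-DUSE_BUILTIN_POPCOUNT` flag concerns how set bits are counted, which is the core operation when comparing bit-vector fingerprints. The pure-Python (Python 3) sketch below only illustrates that arithmetic for Tanimoto and for the Dice variant James asks about; it is not the cartridge's implementation, and the function names and the two example fingerprints are made up for illustration.

```python
def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def tanimoto(a, b):
    """Bits in common divided by bits set in either fingerprint."""
    union = popcount(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return popcount(a & b) / union


def dice(a, b):
    """Twice the bits in common divided by the total bits set in both."""
    total = popcount(a) + popcount(b)
    if total == 0:
        return 0.0
    return 2.0 * popcount(a & b) / total


# Two toy 8-bit fingerprints stored as Python integers.
fp_a = 0b10110010
fp_b = 0b10010110

print(tanimoto(fp_a, fp_b), dice(fp_a, fp_b))
```

Here the fingerprints share 3 bits, 5 bits are set in at least one of them, and each has 4 bits set, so the Tanimoto value is 3/5 and the Dice value is 6/8. A similarity search repeats this over every candidate row, which is why a faster bit count shows up in the query timings.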
Re: [Rdkit-discuss] autodock vina pdbqt file to mol2
Hi Jan,

AutoDock has a set of tools (MGLTools) that can convert pdb to pdbqt and vice versa. If I recall correctly, it can also convert pdbqt to mol2. See this discussion:
http://autodock.1369657.n2.nabble.com/ADL-pdbqt-to-mol2-td6755769.html

Best,

Christos

Christos Kannas
Researcher
Ph.D Student
Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608
http://cy.linkedin.com/in/christoskannas

On 9 May 2014 20:17, Jan Domanski jan...@gmail.com wrote:

> Hi guys,
>
> I'm really stuck here: I have some output from autodock vina in a rather obscure pdbqt format. It's a little bit like pdb but not quite. I'm trying to get back a mol2 file. The autodock pdbqt file has only the polar hydrogens in it – part of the trick is to re-add the hydrogens.
>
> Example autodock vina output is attached (it's a conformer of the ACE native ligand DUDE).
>
> First of all, I convert that to a PDB file by doing a simple sed:
>
>     sed -e '/ROOT/d' -e '/BRANCH/d'
>
> Then I reorder the atoms to match those of the original crystal_ligand.mol2 (because autodock re-orders the atoms, duh). Finally, I save a mol2 file out (attached) ordered as the original crystal_ligand and with polar hydrogens (for each pose of a conformer).
>
> Let's go to rdkit and try to add hydrogens:
>
>     mol = Chem.MolFromMol2File(output, removeHs=False)
>     mol2 = AllChem.AddHs(mol, addCoords=True)
>     print mol.GetNumAtoms(), mol2.GetNumAtoms()
>     44 44
>
> So, only the implicit hydrogens are present. Calling AddHs doesn't raise an error and it doesn't really change the number of hydrogens...
>
> Now this may not be the best way of doing things: what I care for is to get a mol2 from autodock vina that I can compare to the original mol2 from DUD (same atom order, same number of atoms). Maybe there are other ways to achieve this: one idea would be to inject the docked pose coordinates into the original mol2 atoms (heavy and polar hydrogens) and somehow adjust the non-polar hydrogens.
>
> Thanks,
>
> - Jan
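Jan's sed step deletes every line that contains ROOT or BRANCH, which also catches the matching ENDROOT and ENDBRANCH lines. A minimal Python equivalent might look like the following; the function name and the two-atom sample text are made up for illustration and are not taken from the attached file.

```python
def strip_torsion_tree(pdbqt_text):
    """Drop every line containing ROOT or BRANCH.

    Mirrors: sed -e '/ROOT/d' -e '/BRANCH/d'
    (substring match, so ENDROOT and ENDBRANCH lines are removed as well).
    """
    kept = [
        line for line in pdbqt_text.splitlines()
        if "ROOT" not in line and "BRANCH" not in line
    ]
    return "\n".join(kept) + "\n"


# Illustrative input only: a made-up fragment in the general shape of Vina output.
sample = """\
MODEL 1
REMARK VINA RESULT:      -7.1      0.000      0.000
ROOT
ATOM      1  C1  LIG     1       1.000   2.000   3.000  0.00  0.00     0.100 C
ENDROOT
BRANCH   1   2
ATOM      2  O1  LIG     1       2.000   3.000   4.000  0.00  0.00    -0.300 OA
ENDBRANCH   1   2
TORSDOF 1
ENDMDL
"""

cleaned = strip_torsion_tree(sample)
print(cleaned)
```

The result is still not a strict PDB file: the pdbqt-specific columns at the end of each ATOM line and records such as TORSDOF are left in place, so a downstream reader has to tolerate them.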
Re: [Rdkit-discuss] autodock vina pdbqt file to mol2
Thanks for the quick reply Christos!

I found the pdbqt_to_pdb script that you mentioned, but a Google search for a pdbqt-to-mol2 converter yields nothing (other than this thread). The pdbqt_to_pdb converter is very crude: it retains only the best pose from _out.pdbqt and it basically just strips the BRANCH and ROOT tags deposited by autodock (which I was doing anyway with the sed).

The main problems remaining are atom order (I can fix that) and missing hydrogens (can't fix that). There is a mode where I can prevent prepare_ligand4.py from removing the hydrogens - but the output poses then have really weird geometry.

But let's refocus a little bit: this is not an autodock vina question (although many folks here are knowledgeable enough to help me). This is a question about a mol2 file to which it should be possible to add Hs with rdkit, and it's somehow not happening (at least not in my hands). My mol2 could be somehow malformatted.
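Since the converter mentioned above keeps only the best pose, one way to keep all of them is to split the Vina output before any further processing. The sketch below assumes that poses are delimited by MODEL and ENDMDL records, as in Vina's multi-pose output; the function name and the sample text are hypothetical.

```python
def split_poses(pdbqt_text):
    """Return one list of lines per MODEL ... ENDMDL block, in file order."""
    poses = []
    current = None  # lines of the pose being read, or None outside a block
    for line in pdbqt_text.splitlines():
        if line.startswith("MODEL"):
            current = []
        elif line.startswith("ENDMDL"):
            if current is not None:
                poses.append(current)
            current = None
        elif current is not None:
            current.append(line)
    return poses


# Illustrative input only: two made-up single-atom poses.
vina_out = """\
MODEL 1
REMARK VINA RESULT:      -7.1      0.000      0.000
ATOM      1  C1  LIG     1       1.000   2.000   3.000  0.00  0.00     0.100 C
ENDMDL
MODEL 2
REMARK VINA RESULT:      -6.8      1.532      2.104
ATOM      1  C1  LIG     1       1.500   2.500   3.500  0.00  0.00     0.100 C
ENDMDL
"""

poses = split_poses(vina_out)
print(len(poses))
```

Each returned pose is a plain list of lines, so the ROOT/BRANCH stripping and any atom reordering can then be applied pose by pose.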
Re: [Rdkit-discuss] autodock vina pdbqt file to mol2
Jan,

On Fri, May 9, 2014 at 9:17 PM, Jan Domanski jan...@gmail.com wrote:

> Finally, I save a mol2 file out (attached) ordered as the original crystal_ligand and with polar hydrogens (for each pose of a conformer).
>
> Let's go to rdkit and try to add hydrogens:
>
>     mol = Chem.MolFromMol2File(output, removeHs=False)
>     mol2 = AllChem.AddHs(mol, addCoords=True)
>     print mol.GetNumAtoms(), mol2.GetNumAtoms()
>     44 44
>
> So, only the implicit hydrogens are present. Calling AddHs doesn't raise an error and it doesn't really change the number of hydrogens...

The mol2 parser assumes that all Hs are present, so it sets a flag on every atom saying that it has no implicit Hs. You can manually undo this:

    In [23]: m = Chem.MolFromMol2File('Downloads/conformer_132_out_1.mol2',sanitize=False)

    In [24]: for atom in m.GetAtoms(): atom.SetNoImplicit(False)

    In [25]: m.UpdatePropertyCache()

    In [26]: mh = Chem.AddHs(m,addCoords=True)

    In [27]: print mh.GetNumAtoms()
    69

I skipped sanitization entirely when I read the molecule in. This is to prevent the assignment of radicals.

Note that the mol2 parser is really only even remotely tested with output from Corina; if you end up with alternate atom types, you should anticipate further problems.

-greg
Re: [Rdkit-discuss] autodock vina pdbqt file to mol2
Babel can read and write both pdbqt and mol2 files. I'm not sure how the atom ordering might be accomplished though.

TJ
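On the atom-ordering question, one simple approach is to match atoms by name and emit the docked coordinates in the reference order. This only works under a strong assumption that is not confirmed anywhere in the thread: that atom names are unique and identical in the reference mol2 and in the docked pose. The function name and the toy data are hypothetical.

```python
def reorder_by_name(reference_names, docked_atoms):
    """Return (name, xyz) pairs in the order given by reference_names.

    reference_names: atom names in the desired (reference) order.
    docked_atoms:    iterable of (name, (x, y, z)) from the docked pose.
    """
    by_name = {}
    for name, xyz in docked_atoms:
        if name in by_name:
            raise ValueError("duplicate atom name in docked pose: %s" % name)
        by_name[name] = xyz
    missing = [n for n in reference_names if n not in by_name]
    if missing:
        raise KeyError("atoms missing from docked pose: %s" % ", ".join(missing))
    return [(n, by_name[n]) for n in reference_names]


# Toy example: the docked pose lists the same three atoms in another order.
reference = ["C1", "O1", "N1"]
docked = [("N1", (0.0, 0.0, 1.0)), ("C1", (1.0, 0.0, 0.0)), ("O1", (0.0, 1.0, 0.0))]

for name, xyz in reorder_by_name(reference, docked):
    print(name, xyz)
```

Because the docked file carries only polar hydrogens, the reference list would have to be restricted to heavy atoms and polar hydrogens; otherwise the function reports the non-polar hydrogens as missing.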