Re: [Rdkit-discuss] Warning as error
I've had similar problems; none of the claimed methods to switch off RDKit logging of warnings has worked for me.

I ended up just redirecting stderr when running the script, like this:

    python myfile.py 2> myErrorLog.txt

Dr. Steve O'Hagan

-----Original Message-----
From: Jean-Marc Nuzillard [mailto:jm.nuzill...@univ-reims.fr]
Sent: 21 January 2019 12:33
To: RDKit Discuss
Subject: [Rdkit-discuss] Warning as error

Dear all,

The minimalist Python code:

    from rdkit import Chem

    reader = Chem.SDMolSupplier('my_file.sdf')
    for mol in reader:
        pass

gives me warning messages when run on a particular SD file.

How can I simply run a specific action for the molecules that cause problems, possibly using try/except statements?

Best,
Jean-Marc

--
Jean-Marc Nuzillard
Research Director at CNRS
Institut de Chimie Moléculaire de Reims
CNRS UMR 7312
Moulin de la Housse
CPCBAI, Bâtiment 18
BP 1039
51687 REIMS Cedex 2
France
Tel : 03 26 91 82 10
Fax : 03 26 91 31 66
http://www.univ-reims.fr/ICMR
http://eos.univ-reims.fr/LSD/CSNteam.html
http://www.univ-reims.fr/LSD/
http://www.univ-reims.fr/LSD/JmnSoft/

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
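The shell redirection above can also be applied from inside the script, one molecule at a time, which gets closer to the "specific action per problem molecule" that was asked for. What follows is only a minimal sketch and not an RDKit feature: captured_stderr is a hypothetical helper that temporarily points file descriptor 2 at a temporary file, on the assumption that the warnings really are written to the process-level stderr (as the 2> workaround suggests). The os.write call merely stands in for whatever RDKit call emits the message.

```python
import os
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def captured_stderr():
    """Collect everything written to file descriptor 2 inside the block.

    Hypothetical helper: it works at the OS level, so it also sees messages
    written by compiled extensions, not only by Python's sys.stderr.
    """
    result = {"text": ""}
    sys.stderr.flush()
    saved_fd = os.dup(2)                      # remember the real stderr
    with tempfile.TemporaryFile(mode="w+b") as tmp:
        os.dup2(tmp.fileno(), 2)              # fd 2 now points at the temp file
        try:
            yield result
        finally:
            sys.stderr.flush()
            os.dup2(saved_fd, 2)              # restore the real stderr
            os.close(saved_fd)
            tmp.seek(0)
            result["text"] = tmp.read().decode(errors="replace")


# Stand-in for one loop iteration. With RDKit, the call that parses a single
# record (assumed here to be something like `mol = reader[i]`) would go inside.
with captured_stderr() as log:
    os.write(2, b"Warning: simulated message\n")

# Anything captured means this record produced a message: act on it here.
flagged = bool(log["text"])
```

Whether this catches RDKit's messages depends on how the installed version routes its logging, so it is worth checking on one known-bad record first.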
[Rdkit-discuss] smarts substructure query match = FALSE?
Hi folks,

This looks as if HasSubstructMatch should return True, so why is it False?

[Python 3.6, RDKit 2017.09.3]

    from rdkit import Chem
    from rdkit.Chem import Draw

    # Raw string, so the backslashes in the SMARTS are not read as Python escapes
    patt = Chem.MolFromSmarts(r"[*,#1]-[#7]-1-[#6]-[#6]-[#7](-[#6]-[#6]-1)-[#6](\[*,#1])=[#7]\[#6]-1=[#6]-[#6](-[*,#1])=[#6](-[*,#1])-[#6]=[#6]1-[*,#1]")
    mol = Chem.MolFromSmiles("O=C(N3CCN(c2nc1cc(OC)c(OC)cc1c(n2)N)CC3)c4occc4")
    fig = Draw.MolToMPL(patt)
    fig2 = Draw.MolToMPL(mol)
    mol.HasSubstructMatch(patt)  # why is this False?

Cheers,
Steve
Re: [Rdkit-discuss] comparing two or more tables of molecules
Thanks for the interesting links.

MolVS looks good, but failed on ‘NC(CC(=O)O)C(=O)[O-].O.O.[Na+]’, which isn’t that extraordinary…

Couldn’t get Standardise to work at all, even on the example given; API not intuitive or docs wrong or out of date.

I will have a look at the info in the UniChem paper, though I'm not inclined to use a web service for what I want to do.

Cheers,
Steve.

From: George Papadatos [mailto:gpapada...@gmail.com]
Sent: 01 December 2016 14:26
To: Greg Landrum <greg.land...@gmail.com>
Cc: Stephen O'hagan <soha...@manchester.ac.uk>; rdkit-discuss@lists.sourceforge.net; Francis Atkinson <fran...@ebi.ac.uk>
Subject: Re: [Rdkit-discuss] comparing two or more tables of molecules

Hi Stephen,

Further to Greg's excellent reply, see this paper on how InChI strings and keys can be used in practice to map together tautomer (ones covered by InChI at least), isotope, stereo and parent-salt variants:
http://rd.springer.com/article/10.1186/s13321-014-0043-5

Francis (cc'ed) has a nice notebook somewhere illustrating these InChI splits to find these variants.

For educational purposes, there have been other approaches like the NCI's identifiers; discussion here:
http://acscinf.org/docs/meetings/237nm/presentations/237nm17.pdf

For pure structure standardization using RDKit see here:
https://github.com/flatkinson/standardiser
and
https://github.com/mcs07/MolVS

Cheers,
George

On 29 November 2016 at 17:02, Greg Landrum <greg.land...@gmail.com> wrote:

Wow, this is a great question and quite a fun thread.

It's hard to really make much of a contribution here without writing a book/review article (something that I'm really not willing to do!), but I have a few thoughts. Most of this is repeating/rephrasing things others have already said.

I'm going to propose some things as facts.
I think that these won't be controversial:

fact 1: if the structures are coming from different sources, they need to be standardized/normalized before you compare them. This is true regardless of how you want to compare them. The details of the standardization process are not incredibly important, but it does need to take care of the things you care about when comparing molecules. For example, if you don't care about differences between salts, it should strip salts. If you don't care about differences between tautomers, it should normalize tautomers.

fact 2: The InChI algorithm includes a standardization step that normalizes some tautomers, but does not remove salts.

fact 3: The InChI representation contains a number of layers defining the structure in increasing detail (this isn't strictly true, because some of the choices about how layers are ordered are arbitrary, but it's close).

fact 4: canonicalization, the way I define it, produces a canonical atom numbering for a given structure, but it does *not* standardize.

fact 5: the RDKit has essentially no well-documented standardization code.

fact X: we don't have any standard, broadly accepted approach for standardization, canonicalization or representation that is fool-proof or that works for even all of organic chemistry, never mind organometallics. InChI, useful as it is for some things, completely fails to handle things like atropisomers (they are working on this kind of thing, but it's not out yet).

Given all of this, if I wanted to have flexible duplicate checking *right* now, I think I would use the AvalonTools struchk functionality that the RDKit provides (the new pure-RDKit version still needs a bit more testing) to handle basic standardization and salt stripping and then produce a table that includes the InChI in a couple of different forms.
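One way to read "the InChI in a couple of different forms" is to store, next to the full InChI, a copy with selected layers removed. The sketch below is only an illustration of that idea, not what the InChI software or the RDKit does internally: drop_layers is a hypothetical helper, and it simply removes every layer whose prefix letter is listed, wherever it occurs (so stereo sublayers attached to an isotopic layer go too). The two input strings are the standard InChIs of L- and D-alanine.

```python
STEREO_PREFIXES = ("b", "t", "m", "s")   # double-bond, tetrahedral and stereo-type layers


def drop_layers(inchi, prefixes):
    """Return the InChI string without the layers whose prefix letter is listed."""
    head, *layers = inchi.split("/")
    # layers[0] is the chemical formula and carries no prefix letter: always keep it
    kept = layers[:1] + [layer for layer in layers[1:] if not layer.startswith(prefixes)]
    return "/".join([head] + kept)


l_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
d_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"

flat_l = drop_layers(l_alanine, STEREO_PREFIXES)
flat_d = drop_layers(d_alanine, STEREO_PREFIXES)

print(l_alanine == d_alanine)   # False: the full strings differ in the /m layer
print(flat_l == flat_d)         # True: identical once the stereo layers are gone
```

The same helper with ("i",) as the prefix list would give a form that ignores isotopic labels; the truncated string is meant as a comparison key only and should not be passed on as a valid InChI.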
I'd want to be able to recognize molecules that differ only by stereochemistry, molecules that differ only by location of tautomeric Hs, and molecules that differ only by the location of isotopic labels. You can do this with various clever splits of the InChI (how to do it is left as an exercise for the reader and/or a future RDKit blog post).

I think there's something fun to be done here with SMILES variants, borrowing heavily from some of the things that Roger has written about:
https://nextmovesoftware.com/blog/2013/04/25/finding-all-types-of-every-mer/
Here's a more recent application of that from Noel:
https://nextmovesoftware.com/blog/2016/06/22/fishing-for-matched-series-in-a-sea-of-structure-representations/

If I didn't really care about details and just wanted something that I could explain easily to others, I'd skip all the complication and just use InChIs (or InChI keys) to recognize duplicates. There would be times when that would be the wrong answer, but it would be a broadly accepted kind of wrong.[1]

Regardless of the approach, I would not, under most any circumstances, discard the original input structures that I had. It's really good to be able to figure out what the original data looked like lat
[Rdkit-discuss] comparing two or more tables of molecules
Has anyone come up with a fool-proof way of matching structurally equivalent molecules?

Unique SMILES or InChI string comparisons don't appear to work, presumably because there are different but equivalent structures, e.g. explicit vs non-explicit H's, Kekule vs aromatic, isomeric vs non-isomeric forms, tautomers etc.

I also expect that comparing InChI strings might need something more than just a simple string comparison, such as masking off stereo information when you don't care about stereo isomers.

I assume there are suitable tools within RDKit that can do this?

N.B. I need to collate tables from several sources that have a mix of SMILES / InChI / SDF molecular representations.

I usually use RDKit via Python and/or Knime.

Cheers,
Steve.
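For the collation step itself, one hedged option is to key each row on its InChIKey and, for a stereo-insensitive match, on the first block of the key only. This is a sketch under assumptions: the table names and labels are made up, the keys are quoted for illustration and should be regenerated from the real structures, and the first 14 characters are taken to be a hash of the main (formula, connectivity, hydrogen) layers, so two rows sharing it agree on the skeleton but not necessarily on stereo or isotopes.

```python
from collections import defaultdict


def skeleton(inchikey):
    """First InChIKey block (14 characters), used here as a stereo-blind key."""
    return inchikey.split("-")[0]


# Hypothetical rows: (source table, compound label, standard InChIKey)
rows = [
    ("table_A", "L-alanine", "QNAYBMKLOCPYGJ-REOHSZJGSA-N"),
    ("table_B", "D-alanine", "QNAYBMKLOCPYGJ-UWTATZPHSA-N"),
    ("table_B", "alanine (no stereo)", "QNAYBMKLOCPYGJ-UHFFFAOYSA-N"),
    ("table_C", "glycine", "DHMQDGOQFOQNFH-UHFFFAOYSA-N"),
]

exact = defaultdict(list)      # identical full keys: same structure including stereo
relaxed = defaultdict(list)    # same first block: same skeleton, stereo ignored
for source, label, key in rows:
    exact[key].append((source, label))
    relaxed[skeleton(key)].append((source, label))

for block, members in relaxed.items():
    print(block, [label for _, label in members])
```

Keeping both dictionaries side by side makes it easy to report which rows are exact duplicates and which only match once stereo is ignored. Salts and tautomers outside InChI's normalization would still need the standardization step discussed in the replies.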
Re: [Rdkit-discuss] MolWt of substructure hit?
Hi,

Thanks for this, the clue that I needed was that there's a method:

    matches = mol.GetSubstructMatches(pat)

This should work fine for what I need.

Cheers,
Steve.

-----Original Message-----
From: Andrew Dalke [mailto:da...@dalkescientific.com]
Sent: 07 September 2016 12:10
To: Stephen O'hagan <soha...@manchester.ac.uk>
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] MolWt of substructure hit?

On Sep 7, 2016, at 11:53 AM, Stephen O'hagan wrote:
> How would I find the molecular weight (fraction) of that substructure within
> a compound expressed as a SMILES string, e.g.:

I don't know of a built-in function which does this. It's possible to write one.

Here's a function which will compute the molecular weight given the molecule and the atom indices for the fragment.

    def get_fragment_molwt(mol, atom_indices):
        assert len(atom_indices) == len(set(atom_indices))  # quick duplicate check
        molwt = 0.0
        for atom_index in atom_indices:
            atom = mol.GetAtomWithIdx(atom_index)
            molwt += atom.GetMass()
        return molwt

If you want to include the hydrogen mass, then use this variant:

    from rdkit import Chem

    _H_mass = Chem.Atom(1).GetMass()

    def get_fragment_molwt(mol, atom_indices):
        assert len(atom_indices) == len(set(atom_indices))  # quick duplicate check
        molwt = 0.0
        for atom_index in atom_indices:
            atom = mol.GetAtomWithIdx(atom_index)
            # heavy-atom mass plus the mass of the hydrogens attached to it
            molwt += atom.GetMass() + atom.GetTotalNumHs() * _H_mass
        return molwt

Here's an example of how to use the function:

    #==
    from rdkit import Chem
    from rdkit.Chem.Descriptors import MolWt

    # def get_fragment_molwt(mol, atom_indices): ... as above ...

    smiles = "CC(=O)O[C@H]1CC[C@@]2(C)C(=CCC3C4CC=C(c5cccnc5)[C@@]4(C)CCC32)C1"
    smarts = "[#6](:,-[#6]:,-[#6](-[#6]):,-[#6]-[#6](:[#6]:[#7]):[#6]:[#6]):,-[#6]:,-[#6]"

    mol = Chem.MolFromSmiles(smiles)
    assert mol is not None, smiles
    pat = Chem.MolFromSmarts(smarts)
    assert pat is not None, smarts

    matches = mol.GetSubstructMatches(pat)
    molwt = MolWt(mol)
    for match_no, match in enumerate(matches, 1):
        fragment_molwt = get_fragment_molwt(mol, match)
        print("#{}: {:.2%}".format(match_no, fragment_molwt/molwt))
    #==

If I don't include the hydrogens in the fragment weight calculation then I get:

    #1: 37.32%
    #2: 37.32%
    #3: 37.32%
    ...

If I include the hydrogens, then I get:

    #1: 40.15%
    #2: 39.64%
    #3: 40.15%
    ...

Cheers,

Andrew
da...@dalkescientific.com
[Rdkit-discuss] MolWt of substructure hit?
Hi,

Supposing I have identified a substructure as a SMARTS string, e.g.

    [#6](:,-[#6]:,-[#6](-[#6]):,-[#6]-[#6](:[#6]:[#7]):[#6]:[#6]):,-[#6]:,-[#6]

In general, this may have wild card atoms.

How would I find the molecular weight (fraction) of that substructure within a compound expressed as a SMILES string, e.g.:

    CC(=O)O[C@H]1CC[C@@]2(C)C(=CCC3C4CC=C(c5cccnc5)[C@@]4(C)CCC32)C1

I may or may not wish to count multiple hits in one target.

Cheers,
Steve.
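On the last point, whether multiple hits are counted matters mostly when hits overlap. The sketch below shows both choices on plain Python data and makes some assumptions: fragment_fraction is a hypothetical helper, atom_masses is a simple list standing in for the per-atom masses one would read from the molecule, and matches is a tuple of tuples of atom indices, one tuple per hit.

```python
def fragment_fraction(atom_masses, matches, count_overlaps=False):
    """Mass fraction of a molecule covered by one or more substructure hits.

    atom_masses: per-atom masses, indexed by atom index.
    matches: tuple of tuples of atom indices, one tuple per hit.
    count_overlaps: if True, atoms shared by several hits count once per hit.
    """
    if count_overlaps:
        indices = [i for match in matches for i in match]
    else:
        indices = sorted({i for match in matches for i in match})
    return sum(atom_masses[i] for i in indices) / sum(atom_masses)


# Toy heavy-atom masses for a five-atom chain C-C-N-C-O (hydrogens ignored)
masses = [12.011, 12.011, 14.007, 12.011, 15.999]
hits = ((0, 1, 2), (1, 2, 3))   # two overlapping hits

once = fragment_fraction(masses, hits)
per_hit = fragment_fraction(masses, hits, count_overlaps=True)
print("{:.2%} {:.2%}".format(once, per_hit))
```

Counting each atom once keeps the fraction at or below 100%, whereas counting per hit can exceed it when hits share atoms, so the union is probably the safer default for a "fraction of the molecule" figure.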
Re: [Rdkit-discuss] RDKit compile is successful, but python does see RDKit?
Hi,

I have it working now. Two problems were causing the errors.

1) I hadn’t fully purged the RDKit libraries from an earlier apt-get install.

2) I had assumed that LD_LIBRARY_PATH being set in the usual places (such as ~/.bashrc) would work. It seems Ubuntu has a “feature” whereby LD_LIBRARY_PATH is automatically reset. To get RDKit to work, one needs to add an entry to /etc/ld.so.conf.d/ and do ‘sudo ldconfig’.

Cheers,
Steve.

From: JP [mailto:jeanpaul.ebe...@inhibox.com]
Sent: 17 February 2015 20:43
To: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] RDKit compile is successful, but python does see RDKit?

Hi Stephen,

As Christos pointed out, it is almost always the environment variables which get you. What is the error message you are getting?

Some installation instructions specific to Ubuntu may be found at (works and tested up to version 14.04):
http://www.blopig.com/blog/2013/02/how-to-install-rdkit-on-ubuntu-12-04/

Take Care,
JP
-
Jean-Paul Ebejer
Early Stage Researcher

On 17 February 2015 at 17:20, Stephen O'hagan <soha...@manchester.ac.uk> wrote:

Hi,

On one of our Ubuntu machines, I’ve installed RDKit (compiled from source to get the latest version); ctest passed all tests. CMake seemed to detect the correct Python version and Boost libs.

However, Python does not see the RDKit module(s).

Any ideas what might be going wrong?

Dr. Steve O'Hagan,
Computer Officer,
Bioanalytical Sciences Group,
School of Chemistry,
Manchester Institute of Biotechnology,
University of Manchester,
131, Princess St,
MANCHESTER M1 7DN.
Email: soha...@manchester.ac.uk
Phone: 0161 306 4562
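When Python "does not see" a freshly built module, a quick first check is to ask the interpreter what it can find and which environment settings it actually inherited. The following is a small diagnostic sketch using only the standard library; diagnose_import is a hypothetical helper, and the choice of PYTHONPATH and LD_LIBRARY_PATH as the variables worth printing is an assumption based on the problems described in this thread.

```python
import importlib.util
import os
import sys


def diagnose_import(module_name):
    """Report whether a module can be located and which settings influence that."""
    spec = importlib.util.find_spec(module_name)
    return {
        "module": module_name,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
        "PYTHONPATH": os.environ.get("PYTHONPATH"),
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH"),
        "sys.path": list(sys.path),
    }


report = diagnose_import("rdkit")
for name, value in report.items():
    print(f"{name}: {value}")
```

Note that "found" only means the package directory is on the search path. A shared library that fails to load would still show up as an ImportError at import time, which is where the ldconfig step described above comes in.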
Re: [Rdkit-discuss] conda-rdkit fails to install Win7.
Hi Riccardo,

Thanks for the quick reply.

Closer scrutiny revealed that CMake was not finding the correct version of the MS compiler, so compilations were failing.

Tried “conda build rdkit” again using the VS x64 Command Prompt (2010), and it now appears to have worked [mostly].

Some tests failed, but the info has since scrolled off into hyperspace… is there an easy method to re-run the RDKit test suite?

BTW, ‘LastTestsFailed.log’ contains:

    7:pyDiscreteValueVect
    8:pySparseIntVect
    34:testMolSupplier
    50:pyPartialCharges
    71:pyGraphMolWrap
    77:pyRanker
    79:pyFeatures
    80:pythonTestDbCLI
    81:pythonTestDirML
    86:pythonTestDirChem

Cheers,
Steve.

From: Riccardo Vianello [mailto:riccardo.viane...@gmail.com]
Sent: 18 November 2014 19:46
To: Stephen O'hagan
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] conda-rdkit fails to install Win7.

Hi Steve,

On Tue, Nov 18, 2014 at 2:33 PM, Stephen O'hagan <soha...@manchester.ac.uk> wrote:
> Trying to install conda-rdkit on win7 64-bit as per instructions
> https://github.com/rdkit/conda-rdkit
> ‘conda build boost’ appears to work.

Yes, the conda recipe for boost on Windows currently simply performs a repackaging of the official boost binary distribution, so it should work in almost all cases.

> ‘conda build rdkit’ appears to download and re-install boost during installation.

This is probably expected: during the build process boost is installed together with the other build-time dependencies into a temporary environment which is automatically created by conda. A message saying that some packages are being downloaded most likely refers to packages that are not already available from the local conda cache. These packages are usually downloaded from a remote distribution channel, but the list may also include packages that are copied from the local build directory (which I think was probably the case for boost).

> It then fails with cmake unable to find boost, and subsequently ‘nmake error U1073’.

And this is quite unexpected, but also difficult to interpret with the provided amount of information. Could you please send a copy of the actual cmake command line that was issued, and/or the CMakeCache.txt file that should have been created inside the top-level RDKit source distribution directory at <path to your anaconda installation>\conda-bld\work?

Finally, I don't know if it may be of help, but some Windows 64-bit packages for the latest RDKit release should now also be available from the binstar rdkit channel (in order to fetch them, you would just need to add '-c rdkit' to the conda create/install/update command line). The build for these packages passed all tests but the two related to the avalon tools.

Best,
Riccardo
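On re-running only the tests that failed: the entries quoted from LastTestsFailed.log have the form number:name, so the names can be pulled out and turned into a regular expression for ctest's -R (test-name regex) option. This is a hedged sketch, not part of the RDKit build: failed_test_names and ctest_regex are hypothetical helpers, and the log text below is just the first few entries from the message above.

```python
import re


def failed_test_names(log_text):
    """Extract test names from entries of the form '<number>:<name>'."""
    return re.findall(r"\b\d+:(\S+)", log_text)


def ctest_regex(names):
    """Build an anchored alternation that matches exactly the given test names."""
    return "^(" + "|".join(re.escape(n) for n in names) + ")$"


log_text = "7:pyDiscreteValueVect 8:pySparseIntVect 34:testMolSupplier 50:pyPartialCharges"
names = failed_test_names(log_text)
pattern = ctest_regex(names)
print("ctest -R", repr(pattern))
```

The printed command would be run from the build directory. Newer CMake versions also offer a ctest --rerun-failed option, which may make this unnecessary if it is available in the installed version.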
[Rdkit-discuss] conda-rdkit fails to install Win7.
Trying to install conda-rdkit on win7 64-bit as per instructions:
https://github.com/rdkit/conda-rdkit

'conda build boost' appears to work.

'conda build rdkit' appears to download and re-install boost during installation.

It then fails with cmake unable to find boost, and subsequently 'nmake error U1073'.

Any ideas what might be wrong?

Cheers,
Steve.
Re: [Rdkit-discuss] remove redundant bits from bitvector fingerprints
OK, thanks for this – I’ll have a go and see if it works for me.

Cheers,
Steve.

From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 13 June 2014 13:23
To: Stephen O'hagan
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] remove redundant bits from bitvector fingerprints

Hmm, this got lost in my mailbox. Sorry.

You can do what I think you want to do using the information theory machinery that the RDKit has available. Here's a short snippet that finds the bits that are not redundant in a data set (redundancy here calculated using information entropy):

    In [48]: ms = [Chem.MolFromSmiles(x.split()[1]) for x in open('./Target_no_107_58879.txt')]
    In [49]: nbits = 2048
    In [50]: fps = [rdMolDescriptors.GetHashedAtomPairFingerprintAsBitVect(x,nbits) for x in ms]
    In [58]: entropies = []
    In [59]: for i in range(nbits):
       ....:     arr = numpy.array([x[i] for x in fps])
       ....:     e = InfoTheory.InfoEntropy(arr)
       ....:     entropies.append(e)
       ....:
    In [60]: entropies = numpy.array(entropies)
    In [61]: goodbits = numpy.array(range(nbits))[entropies>0.0]
    In [62]: len(goodbits)
    Out[62]: 891

Your case is pretty big, so this may take a bit, but it shouldn't be too slow.

-greg

On Wed, Jun 4, 2014 at 5:08 AM, Stephen O'hagan <soha...@manchester.ac.uk> wrote:
> Hi,
>
> I have a set of say 1000 generated fingerprints each of length 39972; across all 1000 fingerprints many bits are the same – they contain no information about the differences between the 1000 molecules.
>
> e.g. for list
> 01011
> 010110100
> 010101110
> 010100010
>
> The first four bits are redundant, I could just record them as:
> 1
> 10100
> 01110
> 00010
>
> In reality, the redundant bits are distributed through the bit string, so I need a method to determine which bits are redundant, and then remove them from each fingerprint.
>
> Cheers,
> Steve.
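The entropy criterion in the reply above can be illustrated without any RDKit objects. The sketch below is a plain-Python stand-in, not the InfoTheory implementation: bit_entropy is a hypothetical helper that computes the Shannon entropy of one bit position across all fingerprints, and the fingerprints are assumed to be available as bit strings such as those returned by ToBitString().

```python
import math


def bit_entropy(column):
    """Shannon entropy (in bits) of one fingerprint position across all molecules."""
    p = sum(column) / len(column)          # fraction of molecules with the bit set
    if p in (0.0, 1.0):
        return 0.0                         # constant bit: carries no information
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


fps = ["010110100", "010101110", "010100010"]     # one bit string per molecule
columns = [[int(fp[i]) for fp in fps] for i in range(len(fps[0]))]
entropies = [bit_entropy(col) for col in columns]
good_bits = [i for i, e in enumerate(entropies) if e > 0.0]
print(good_bits)
```

A position with zero entropy is set (or unset) in every molecule, which is exactly the "redundant" case described in the question. A threshold above zero could additionally drop bits that are almost constant, at the cost of losing some information.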
Re: [Rdkit-discuss] remove redundant bits from bitvector fingerprints
Hi,

I have a set of say 1000 generated fingerprints each of length 39972; across all 1000 fingerprints many bits are the same – they contain no information about the differences between the 1000 molecules.

e.g. for list
01011
010110100
010101110
010100010

The first four bits are redundant, I could just record them as:
1
10100
01110
00010

In reality, the redundant bits are distributed through the bit string, so I need a method to determine which bits are redundant, and then remove them from each fingerprint.

Cheers,
Steve.

From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 04 June 2014 04:40
To: Stephen O'hagan
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] remove redundant bits from bitvector fingerprints

Hi Steve,

On Tue, Jun 3, 2014 at 2:08 PM, Stephen O'hagan <soha...@manchester.ac.uk> wrote:
> I have a fragment of code generating fingerprints for a long list of molecules (length ~ 1000)
>
>     for index in range(0,len(smi)):
>         smiles=smi[index]
>         mol=Chem.MolFromSmiles(smiles)
>         AllChem.EmbedMolecule(mol)
>         AllChem.UFFOptimizeMolecule(mol)
>         dm = Chem.Get3DDistanceMatrix(mol)
>         fp = Generate.Gen2DFingerprint(mol,factory, dMat=dm)
>         fp = fp.ToBitString()
>         bs[index]=fp
>
> The length of each bit vector generated is 39972, and the list has a lot of redundant ‘1’s and ‘0’s.
>
> Is there an easy method to filter out these redundant bits?

What do you mean by redundant bits?

The length of the bit vectors is determined by the parameters you provide for building the pharmacophore fingerprints (number of points, number of features, and number of distance bins). The length of the strings that you get from fp.ToBitString() should be equal to this number of bits.

-greg
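The removal step described in this message can be sketched directly on the bit strings. This is a minimal illustration rather than an RDKit function: strip_constant_bits is a hypothetical helper, and its input here is the three complete strings from the example above (the shorter first entry is left out).

```python
def strip_constant_bits(bit_strings):
    """Drop positions whose value is identical in every bit string.

    Returns (kept_positions, reduced_strings); kept_positions maps each
    reduced bit back to its index in the original fingerprint.
    """
    length = len(bit_strings[0])
    assert all(len(s) == length for s in bit_strings), "fingerprints must have equal length"
    kept = [i for i in range(length) if len({s[i] for s in bit_strings}) > 1]
    reduced = ["".join(s[i] for i in kept) for s in bit_strings]
    return kept, reduced


kept, reduced = strip_constant_bits(["010110100", "010101110", "010100010"])
print(kept)      # [4, 5, 6, 7]
print(reduced)   # ['1010', '0111', '0001']
```

Keeping the list of retained positions matters if the reduced bits ever need to be traced back to the original fingerprint definition, or if new molecules have to be reduced with the same mask later.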
[Rdkit-discuss] remove redundant bits from bitvector fingerprints
I have a fragment of code generating fingerprints for a long list of molecules (length ~ 1000)

    for index in range(0,len(smi)):
        smiles=smi[index]
        mol=Chem.MolFromSmiles(smiles)
        AllChem.EmbedMolecule(mol)
        AllChem.UFFOptimizeMolecule(mol)
        dm = Chem.Get3DDistanceMatrix(mol)
        fp = Generate.Gen2DFingerprint(mol,factory, dMat=dm)
        fp = fp.ToBitString()
        bs[index]=fp

The length of each bit vector generated is 39972, and the list has a lot of redundant '1's and '0's.

Is there an easy method to filter out these redundant bits?

Cheers,
Steve.
Re: [Rdkit-discuss] 3D-Pharmacophore fingerprints ?
It appears that Eclipse PyDev code completion and syntax colouring were fooling me! Get3DDistanceMatrix is flagged as “undefined”, but the code runs just fine!?

Cheers,
Steve.

From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 08 May 2014 02:52
To: Stephen O'hagan
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] 3D-Pharmacophore fingerprints ?

Hmm, it is definitely there. If you built from source and are using the new build it should be available as:

    Chem.Get3DDistanceMatrix()

-greg

On Wed, May 7, 2014 at 3:48 PM, Stephen O'hagan <soha...@manchester.ac.uk> wrote:
> I still don’t see it in the beta of the Q1 2014 release?
>
> From: Greg Landrum [mailto:greg.land...@gmail.com]
> Sent: 02 May 2014 15:00
> To: Stephen O'hagan
> Cc: rdkit-discuss@lists.sourceforge.net
> Subject: Re: [Rdkit-discuss] 3D-Pharmacophore fingerprints ?
>
>> I can find no Get3DDistanceMatrix defined?
>
> It is, unfortunately, a new feature. It's in the github version of the rdkit and will be in the next release (available next week).
Re: [Rdkit-discuss] 3D-Pharmacophore fingerprints ?
I still don’t see it in the beta of the Q1 2014 release?

From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 02 May 2014 15:00
To: Stephen O'hagan
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] 3D-Pharmacophore fingerprints ?

> I can find no Get3DDistanceMatrix defined?

It is, unfortunately, a new feature. It's in the github version of the rdkit and will be in the next release (available next week).
Re: [Rdkit-discuss] 3D-Pharmacophore fingerprints ?
Hi Greg,

Should that be:

    dm = Chem.GetDistanceMatrix(mol)

I can find no Get3DDistanceMatrix defined?

For a list of molecules, do we recalculate the dm for each one, or do we use one molecule’s dm as a ‘reference’?

Without trawling through the source code, I’m not clear what’s actually being done here as the documentation is a bit spartan. Is there any reference to a journal article?

Cheers,
Steve.

From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 01 May 2014 14:57
To: Stephen O'hagan
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] 3D-Pharmacophore fingerprints ?

Steve,

On Thu, May 1, 2014 at 12:23 PM, Stephen O'hagan <soha...@manchester.ac.uk> wrote:
> Would it be possible to generate 3D-pharmacophore fingerprints similar to the existing 2D ones?

Yes. The function Generate.Gen2DFingerprint() takes an optional argument dMat which can be used to provide the distance matrix. If you pass this a 3D distance matrix, you get a 3D pharmacophore fingerprint.

Here's a crude example:

    In [34]: m = Chem.MolFromSmiles('OCN')
    In [35]: AllChem.EmbedMolecule(m)
    Out[35]: 0
    In [36]: dm = Chem.Get3DDistanceMatrix(m)
    In [37]: from rdkit.Chem.Pharm2D import Gobbi_Pharm2D,Generate
    In [38]: factory = Gobbi_Pharm2D.factory
    In [39]: sig1 = Generate.Gen2DFingerprint(m,factory)
    In [40]: sig2 = Generate.Gen2DFingerprint(m,factory,dMat=dm)
    In [41]: sig1==sig2
    Out[41]: False
    In [42]: sig1.GetOnBits()[0]
    Out[42]: 116
    In [43]: sig2.GetOnBits()[0]
    Out[43]: 115
    In [44]: factory.GetBitDescription(115)
    Out[44]: 'BG HA |0 3|3 0|'
    In [45]: factory.GetBitDescription(116)
    Out[45]: 'BG HA |0 4|4 0|'

I hope this helps,
-greg
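To make the difference between the two kinds of distance matrix concrete: the default matrix counts bonds along the shortest path, while a 3D matrix holds Euclidean distances between atom positions in one conformer. The sketch below computes the latter from plain coordinates. It is not the RDKit routine: distance_matrix_3d is a hypothetical helper and the coordinates are invented for a bent three-atom chain.

```python
import math


def distance_matrix_3d(coords):
    """Pairwise Euclidean distances between atoms, given (x, y, z) coordinates."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


# Hypothetical coordinates (in angstroms) for a bent three-atom chain O-C-N
coords = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0), (1.4, 1.5, 0.0)]
dm = distance_matrix_3d(coords)

for row in dm:
    print(["{:.2f}".format(d) for d in row])
```

Because the values come from the coordinates of one particular molecule, such a matrix has to be computed per molecule (and per conformer) rather than reused as a reference. The two end atoms here are two bonds apart topologically but a little over 2 angstroms apart in space, which is the kind of difference that can move a feature pair into another distance bin.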
[Rdkit-discuss] 3D-Pharmacophore fingerprints ?
Would it be possible to generate 3D-pharmacophore fingerprints similar to the existing 2D ones?

Dr. Steve O'Hagan,
Computer Officer,
Bioanalytical Sciences Group,
School of Chemistry,
Manchester Institute of Biotechnology,
University of Manchester,
131, Princess St,
MANCHESTER M1 7DN.
Email: soha...@manchester.ac.uk
Phone: 0161 306 4562
Re: [Rdkit-discuss] RDKit pharmacophore features
OK,

Adding:

    AllChem.EmbedMolecule(m1)
    AllChem.UFFOptimizeMolecule(m1)

fixed the problem.

Now to work out what it all means!

From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 24 April 2014 04:39
To: Stephen O'hagan
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] RDKit pharmacophore features

Hi Steve,

On Wed, Apr 23, 2014 at 5:41 PM, Stephen O'hagan <soha...@manchester.ac.uk> wrote:
> I’m trying to understand how the RDKit pharmacophore features work; tried this fragment from a previous post:
>
>     import os
>     from rdkit import Chem
>     from rdkit.Chem import ChemicalFeatures
>     from rdkit import Geometry
>     from rdkit import RDConfig
>     from rdkit.Chem import AllChem
>     from rdkit.Chem.Pharm3D import Pharmacophore, EmbedLib
>
>     m1 = Chem.MolFromSmiles('Cc1c1')
>     FEATURE_DEF_FILE = os.path.join(RDConfig.RDDataDir,'BaseFeatures.fdef')
>     feat_factory = ChemicalFeatures.BuildFeatureFactory(FEATURE_DEF_FILE)
>     feats = feat_factory.GetFeaturesForMol(m1)
>     pcophore = Pharmacophore.Pharmacophore(feats)
>
> I get an immediate CTD on the call to Pharmacophore.Pharmacophore(feats)

There are two bugs here:

1) in your code the molecule has no conformations generated, so trying to create a pharmacophore from the features associated with that molecule should not work

2) instead of an error message or exception you get a crash (seg fault on linux or the mac). The RDKit should never do that...

If you add coordinates (2D or 3D) to your molecule before constructing the pharmacophore, your code should work.

Best,
-greg
[Rdkit-discuss] RDKit pharmacophore features
I'm trying to understand how the RDKit pharmacophore features work; tried this fragment from a previous post:

    import os
    from rdkit import Chem
    from rdkit.Chem import ChemicalFeatures
    from rdkit import Geometry
    from rdkit import RDConfig
    from rdkit.Chem import AllChem
    from rdkit.Chem.Pharm3D import Pharmacophore, EmbedLib

    m1 = Chem.MolFromSmiles('Cc1c1')
    FEATURE_DEF_FILE = os.path.join(RDConfig.RDDataDir,'BaseFeatures.fdef')
    feat_factory = ChemicalFeatures.BuildFeatureFactory(FEATURE_DEF_FILE)
    feats = feat_factory.GetFeaturesForMol(m1)
    pcophore = Pharmacophore.Pharmacophore(feats)

I get an immediate CTD on the call to Pharmacophore.Pharmacophore(feats)

32-bit Python 2.7; Windows 7; RDKit binaries from RDKit_2013_09_2.win32.py27

Any ideas?

Cheers,
Steve.