Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018

2018-01-16 Thread George Papadatos
Same here. I would also add the standardisation work done by Francis Atkinson 
at the EBI as an additional starting point. 

George. 

Sent from my giPhone

> On 16 Jan 2018, at 17:19, JP  wrote:
> 
> Joining the fray, +1 for MolVS
> 
>> On 16 January 2018 at 16:00, Brian Cole  wrote:
>> +1 to the MolVS project as well. 
>> 
>> Perhaps an easy bite-size project is to incorporate the open source mae 
>> parser code into core RDKit: https://github.com/schrodinger/maeparser
>> 
>> 
>>> On Mon, Jan 15, 2018 at 9:08 PM, Francois BERENGER 
>>>  wrote:
>>> On 01/16/2018 05:51 AM, Tim Dudgeon wrote:
>>> > Incorporating and "industrialising" Matt's MolVS tautomer and
>>> > standardizer code?
>>> > http://molvs.readthedocs.io/en/latest/index.html
>>> 
>>> If we can vote, I would vote for this one.
>>> 
>>> > On 15/01/18 07:09, Greg Landrum wrote:
>>> >> Dear all,
>>> >>
>>> >> We've been invited again to participate in the OpenChemistry
>>> >> application for Google Summer of Code.
>>> >>
>>> >> In order to participate we need ideas for projects and mentors to go
>>> >> along with them.
>>> >>
>>> >> The current list of RDKit ideas is being maintained here:
>>> >> http://wiki.openchemistry.org/GSoC_Ideas_2018#RDKit_Project_Ideas
>>> >>
>>> >> (Note: at the point that I'm pressing "send", that's still a copy of
>>> >> last year's project ideas).
>>> >>
>>> >> If you're willing to be a mentor (please ask me about the ~5
>>> >> hours/week required here) or have ideas, please reply to this thread.
>>> >>
>>> >> Best,
>>> >> -greg
>>> >>
>>> >>
>>> >> --
>>> >> Check out the vibrant tech community on one of the world's most
>>> >> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>> >>
>>> >>
>>> >> ___
>>> >> Rdkit-discuss mailing list
>>> >> Rdkit-discuss@lists.sourceforge.net
>>> >> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Default behavior of certain calls

2017-10-12 Thread George Papadatos
Great example of functools.partial. 
For those who like functional programming, it can also be used with map and 
imap when a function needs more than one parameter. 
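
A minimal sketch of that idea, using a made-up two-parameter function
(scale and its arguments are hypothetical, not RDKit calls). partial fixes
the extra parameters so that map only has to supply one argument. Note that
itertools.imap exists only in Python 2; in Python 3 the built-in map is
already lazy.

```python
from functools import partial


def scale(value, factor, offset=0.0):
    """Toy multi-parameter function standing in for any real call."""
    return value * factor + offset


# Fix the extra parameters once; the resulting callable takes a single
# argument, which is the shape map() expects.
scale_by_ten = partial(scale, factor=10, offset=1.0)

values = [1, 2, 3]
scaled = list(map(scale_by_ten, values))
print(scaled)
```

The original function stays reachable as scale_by_ten.func, the same
attribute Paolo uses below to get back the unmodified SDMolSupplier.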

George. 

Sent from my giPhone

> On 12 Oct 2017, at 19:04, Andy Jennings  wrote:
> 
> Hi Paolo,
> 
> That's outstanding - thanks very much.
> 
> Best,
> Andy
> 
>> On Thu, Oct 12, 2017 at 10:27 AM, Paolo Tosco  wrote:
>> Dear Andy,
>> 
>> you may accomplish that within the scope of a Python script using 
>> functools.partial:
>> 
>> In [1]: from rdkit import Chem
>> 
>> In [2]: import functools
>> 
>> In [3]: # redefine Chem.SDMolSupplier to include a custom default parameter
>> 
>> In [4]: Chem.SDMolSupplier = functools.partial(Chem.SDMolSupplier, removeHs=False)
>> 
>> In [5]: suppl = Chem.SDMolSupplier('/home/paolo/sdf/bilastine.sdf')
>> 
>> In [6]: # hydrogens have not been stripped
>> 
>> In [7]: suppl[0].GetNumAtoms()
>> Out[7]: 71
>> 
>> In [8]: # If you wish to invoke the original function with the original default parameter:
>> 
>> In [9]: suppl = Chem.SDMolSupplier.func('/home/paolo/sdf/bilastine.sdf')
>> 
>> In [10]: # hydrogens have been stripped as the original function was invoked
>> 
>> In [11]: suppl[0].GetNumAtoms()
>> Out[11]: 34
>> 
>> HTH, cheers
>> p.
>>> On 10/12/17 18:09, Andy Jennings wrote:
>>> Hi,
>>> 
>>> First off: great work on the RDKit - a great resource for those of us that 
>>> like to cook up our own solutions to problems.
>>> 
>>> Certain calls (e.g. Chem.SDMolSupplier, Chem.MolToSmiles) have default 
>>> behavior that is the opposite of what I would 
>>> generally want. For instance I might be processing docking files and want 
>>> to keep those pesky hydrogens, or I want to keep the stereochemical 
>>> information when I dump a smiles string.
>>> 
>>> I can understand why the current defaults might have been arrived at so I'm 
>>> not advocating the change in default behavior. Rather, I'm curious if one 
>>> could set the default behavior for an entire script (I write mostly 
>>> Python). It may be (is) lazy of me, but every so often I get caught out and 
>>> have to backtrack through a workflow.
>>> 
>>> Best,
>>> Andy


Re: [Rdkit-discuss] RDKit-fingerprints set all bits for complex molecules?

2017-06-01 Thread George Papadatos
Example: https://www.surechembl.org/chemical/SCHEMBL1895

George. 

Sent from my giPhone

> On 1 Jun 2017, at 17:05, Greg Landrum  wrote:
> 
> Hi Nils,
> 
> Can you please send me the SMILES for those structures (or point me to an 
> easy way to lookup a SCHEMBL id)?
> 
> I will take a look at these, but I don't currently have a convenient copy of 
> SCHEMBL.
> 
> -greg
> 
> 
> 
>> On Thu, Jun 1, 2017 at 4:28 PM, Nils Weskamp  wrote:
>> Dear RDKitters,
>> 
>> I just calculated RDKit "Daylight-like" fingerprints for a number of public 
>> compound databases and found quite a number of examples where the resulting 
>> fingerprints have *all* bits set to 1. This happens in both KNIME 3.2.1 
>> (1024/1/7) and also via the command line (2048/1/7/4) for RDKit 2016.03. 
>> 
>> Examples include (from SureChEMBL):
>> 
>> SCHEMBL5141968
>> SCHEMBL13916889
>> SCHEMBL16257315
>> SCHEMBL16257310
>> SCHEMBL16257297
>> SCHEMBL16257215
>> SCHEMBL16257169
>> SCHEMBL8232906
>> SCHEMBL16257312
>> SCHEMBL13011081
>> SCHEMBL12570100
>> SCHEMBL14524878
>> SCHEMBL6370886
>> SCHEMBL15305169
>> SCHEMBL16912871
>> SCHEMBL13290179
>> 
>> Now, these are obviously some very large and complex molecules, so I would 
>> expect that they contain many features and thus set many bits - but all of 
>> them?
>> 
>> So, in short: Are these compounds so ugly that it is normal for the 
>> fingerprints to have all bits set or are they so ugly that they trigger some 
>> rare bug in RDKit?
>> 
>> Any ideas / suggestions / comments?
>> 
>> Thanks a lot,
>> Nils


Re: [Rdkit-discuss] substructure of a fingerprint position

2017-02-01 Thread George Papadatos
https://iwatobipen.wordpress.com/2017/01/08/get-bit-information-with-rdkit/

George. 

Sent from my giPhone

> On 26 Jan 2017, at 11:02, Gonzalo Colmenarejo  
> wrote:
> 
> Hi,
> 
> is there a way in RDKit to retrieve the substructure(s) corresponding to a 
> (hashed or unhashed) Morgan fingerprint position? 
> 
> Thanks a lot in advance
> 
> Gonzalo


Re: [Rdkit-discuss] Extracting SMILES from text

2016-12-02 Thread George Papadatos
:)

George. 

Sent from my giPhone

> On 2 Dec 2016, at 22:11, Dimitri Maziuk  wrote:
> 
>> On 12/02/2016 03:12 PM, George Papadatos wrote:
>> Here's a pragmatic idea:
> ... would it not be safe to
>> assume that *any *word containing more than 4 'C' or 'c' characters would
>> only be a SMILES string?
> 
> pneumonoultramicroscopicsilicovolcanoconiosis
> 
> 
> -- 
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu


Re: [Rdkit-discuss] Extracting SMILES from text

2016-12-02 Thread George Papadatos
Here's a pragmatic idea:

If Alexis wants to search for valid SMILES strings representing typical
*organic* molecules among text of plain English words, would it not be safe
to assume that *any* word containing 4 or more 'C' or 'c' characters would
only be a SMILES string?
This simple filter (word.lower().count('c') >= 4) would quickly eliminate all
normal English words, leaving only SMILES to parse. No need for regexes,
unless you really care for ISIS or IOPS molecules. :)
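
Spelled out as a small script, the filter might look like the sketch below.
The function name, the min_c parameter and the sample sentence are made up
for illustration. Anything that survives is only a candidate and would still
need to go through a real SMILES parser (for instance RDKit's
Chem.MolFromSmiles, which returns None when parsing fails).

```python
def looks_like_smiles(word, min_c=4):
    """Heuristic only: keep words with at least min_c 'C'/'c' characters."""
    return word.lower().count('c') >= min_c


text = ("Benzene is c1ccccc1 and ethanol is CCO "
        "according to the chemistry lecture")

# Split on whitespace and keep only the words that pass the heuristic.
candidates = [word for word in text.split() if looks_like_smiles(word)]
print(candidates)
```

Two limitations are worth keeping in mind: short SMILES such as CCO never
reach the parser, and a few very long English words still get through.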

George

On 2 December 2016 at 19:36, Andrew Dalke  wrote:

> On Dec 2, 2016, at 11:11 AM, Greg Landrum wrote:
> > An initial start on some regexps that match SMILES is here:
> https://gist.github.com/lsauer/1312860/264ae813c2bd2c27a769d261c8c6b38da34e22fb
> >
> > that may also be useful
>
>
> I've put together a more gnarly regular expression to find possible SMILES
> strings. It's configured for at least 4 atom terms, but that's easy to
> change (there's a "{3,}" which can be changed as desired.)
>
> It follows the SMILES specification a bit more closely, which means
> there should be fewer false positives than the regular expression Greg
> pointed out.
>
> The file which constructs the regular expression, and an example driver,
> is attached. Here's what the output looks like:
>
>
>
>
> % python detect_smiles.py ~/talks/*.txt
> /Users/dalke/talks/ICCS_2014_paper.txt:528:532 'IOPS'
> /Users/dalke/talks/ICCS_2014_paper.txt:30150:30183
> 'CC12CCC3C(CCC4=CC(O)CCC34C)C1CCC2'
> /Users/dalke/talks/ICCS_2014_paper2.txt:3270:3274 'CBCC'
> /Users/dalke/talks/ICCS_2014_paper2.txt:10229:10239 'CC(=O)[O-]'
> /Users/dalke/talks/ICCS_2014_paper2.txt:32766:32770 'ISIS'
> /Users/dalke/talks/Sheffield2013.txt:25002:25013 'C1=CC=CC=C1'
> /Users/dalke/talks/Sheffield2013.txt:25039:25047 'c1c1'
> /Users/dalke/talks/Sheffield_2016.txt:2767:2771 'CBCC'
> /Users/dalke/talks/Sheffield_2016.txt:10295:10301 'O0'
> /Users/dalke/talks/Sheffield_2016_talk.txt:7302:7306 'CBCC'
> /Users/dalke/talks/Sheffield_2016_talk.txt:7564:7568 'CBCC'
> /Users/dalke/talks/Sheffield_2016_talk.txt:7716:7720 'CBCC'
> /Users/dalke/talks/Sheffield_2016_v2.txt:2874:2878 'soon'
> /Users/dalke/talks/Sheffield_2016_v2.txt:7312:7317 'O'
> /Users/dalke/talks/Sheffield_2016_v2.txt:22770:22774 'ICCS'
> /Users/dalke/talks/Sheffield_2016_v3.txt:2982:2986 'soon'
> /Users/dalke/talks/Sheffield_2016_v3.txt:7627:7632 'O'
> /Users/dalke/talks/Sheffield_2016_v3.txt:24546:24550 'ICCS'
> /Users/dalke/talks/tdd_part_2.txt:7547:7551 'scop'
>
> You can also modify the code for line-by-line processing rather than an
> entire block of text like I did.
>
>
> As others have pointed out, this is a well-trodden path. Follow their
> warnings and advice.
>
> Also, I didn't fully test it.
>
>
>
> Andrew
> da...@dalkescientific.com
>
>
> P.S.
>
> Here's the regular expression:
>
> (? term
>
> (
>
> (
> (
>  Cl? | # Cl and Br are part of the organic subset
>  Br? |
>  [NOSPFIbcnosp*] |  # as are these single-letter elements
>
>  # bracket atom
>  \[\d*  # optional atomic mass
>(# valid element names
> C[laroudsemf]? |
> Os?|N[eaibdpos]? |
> S[icernbmg]? |
> P[drmtboau]? |
> H[eofgas]? |
> c|n|o|s|p |
> A[lrsgutcm] |
> B[eraik]? |
> Dy|E[urs] |
> F[erm]? |
> G[aed] |
> I[nr]? |
> Kr? |
> L[iaur] |
> M[gnodt] |
> R[buhenaf] |
> T[icebmalh] |
> U|V|W|Xe |
> Yb?|Z[nr]
>)
>[^]]*   # ignore anything up to the ']'
> \]
> )
># allow 0 or more closures directly after any atom
> (
>   [-=#$/\\]?  # optional bond type
>   (
> [0-9] |# single digit closure
> (%[0-9][0-9])  # two digit closure
>   )
> ) *
> )
>
> (
>
> (
>  (
>   \( [-=#$/\\]?   # a '(', which can have an optional bond (no dot)
>  ) | (
>\)*   # any number of close parens, followed by
>(
>  ( \( [-=#$/\\]? ) |  # an open parens and optional bond (no dot)
>  [.-=#$/\\]?  # or a dot disconnect or bond
>)
>  )
> )
> ?
>
> (
> (
>  Cl? | # Cl and Br are part of the organic subset
>  Br? |
>  [NOSPFIbcnosp*] |  # as are these single-letter elements
>
>  # bracket atom
>  \[\d*  # optional atomic mass
>(# valid element names
> C[laroudsemf]? |
> Os?|N[eaibdpos]? |
> S[icernbmg]? |
> P[drmtboau]? |
> H[eofgas]? |
> c|n|o|s|p |
> A[lrsgutcm] |
> B[eraik]? |
> Dy|E[urs] |
> F[erm]? |
> G[aed] |
> I[nr]? |
> Kr? |
> L[iaur] |
> M[gnodt] |
> R[buhenaf] |
> T[icebmalh] |
> U|V|W|Xe |
> Yb?|Z[nr]
>)
>[^]]*   # ignore anything up to the ']'
> \]
> )
># allow 0 or more closures directly after any atom
> (
>   [-=#$/\\]?  # optional bond type
>   (
> [0-9] |# single digit closure
> (%[0-9][0-9])  # two digit closure
>   )
> ) *
> )
>
> ){3,}  # must have at least 4 a

Re: [Rdkit-discuss] Extracting SMILES from text

2016-12-02 Thread George Papadatos
I think Alexis was referring to converting actual SMILES strings found in
random text. Chemical entity recognition and name to structure conversion
is another story altogether and nowadays one can quickly go a long way with
open tools such as OSCAR + OPSIN in KNIME or with something like this:
http://chemdataextractor.org/docs/intro

George

On 2 December 2016 at 17:35, Brian Kelley  wrote:

> This was why they started using the dictionary lookup as I recall :). The
> iupac system they ended up using was Roger's when at OpenEye.
>
> 
> Brian Kelley
>
> On Dec 2, 2016, at 12:33 PM, Igor Filippov 
> wrote:
>
> I could be wrong but I believe IBM system had a preprocessing step which
> removed all known dictionary words - which would get rid of "submarine" etc.
> I also believe this problem has been solved multiple times in the past,
> NextMove software comes to mind, chemical tagger -
> http://chemicaltagger.ch.cam.ac.uk/, etc.
>
> my 2 cents,
> Igor
>
>
>
>
> On Fri, Dec 2, 2016 at 11:46 AM, Brian Kelley 
> wrote:
>
>> I hacked a version of RDKit's smiles parser to compute heavy atom count,
>> perhaps some version of this could be used to check smiles validity without
>> making the actual molecule.
>>
>> From a fun historical perspective:  IBM had an expert system to find
>> IUPAC names in documents.  They ended up finding things like "submarine"
>> which was amusing.  It turned out that just parsing all words with the
>> IUPAC parser was by far the fastest and best solution.  I expect the same
>> will be true for finding smiles.
>>
>> It would be interesting to put the common OCR errors into the parser as
>> well (l's and 1's are hard for instance).
>>
>>
>> On Fri, Dec 2, 2016 at 10:46 AM, Peter Gedeck 
>> wrote:
>>
>>> Hello Alexis,
>>>
>>> Depending on the size of your document, you could consider limiting the
>>> storage of already tested strings by word length and only memoizing shorter words.
>>> SMILES tend to be longer, so everything above a given number of characters
>>> has a higher probability of being a SMILES. Large words probably also
>>> contain a lot of chemical names. They often contain commas (,), so they are
>>> easy to remove quickly.
>>>
>>> Best,
>>>
>>> Peter
>>>
>>>
>>> On Fri, Dec 2, 2016 at 5:43 AM Alexis Parenty <
>>> alexis.parenty.h...@gmail.com> wrote:
>>>
 Dear Pavel And Greg,



 Thanks Greg for the regexps link. I’ll use that too.


 Pavel, I need to track which document the SMILES come from,
 but I will indeed make a set of unique words for each document before
 looping. Thanks!

 Best,

 Alexis

 On 2 December 2016 at 11:21, Pavel  wrote:

 Hi, Alexis,

   if you do not need to track which document the SMILES come from, you may just
 combine all words from all documents in a list, take only unique words and
 try to test them. Thus, you would not need to store and check for valid/non-valid
 strings. That would reduce problem complexity as well.

 Pavel.
 On 12/02/2016 11:11 AM, Greg Landrum wrote:

 An initial start on some regexps that match SMILES is here:
 https://gist.github.com/lsauer/1312860/264ae813c2bd2c27a769d261c8c6b38da34e22fb

 that may also be useful

 On Fri, Dec 2, 2016 at 11:07 AM, Alexis Parenty <
 alexis.parenty.h...@gmail.com> wrote:

 Hi Markus,


 Yes! I might discover novel compounds that way!! Would be interesting
 to see what they look like…


 Good suggestion to also store the words that were correctly identified
 as SMILES. I’ll add that to the script.


 I also like your “distribution of word” idea. I could safely skip any
 words that occur more than 1% of the time and could try to play around with
 the threshold to find an optimum.


 I will try every suggestion and time each one to see what is best. I’ll
 keep everyone in the loop and will share the script and results.


 Thanks,


 Alexis

 On 2 December 2016 at 10:47, Markus Sitzmann wrote:

 Hi Alexis,

 you may also find some "novel" compounds by this approach :-).

 Whether your tuple solution improves performance strongly depends on
 the content of your text documents and how often they repeat the same words
 again - but my guess would be it will help. Probably the best way is even
 to look at the distribution of words before you feed them to RDKit. You
 should also "memorize" those ones that successfully generated a structure,
 doesn't make sense to do it again, then.

 Markus

 On Fri, Dec 2, 2016 at 10:21 AM, Maciek Wójcikowski <
 mac...@wojcikowski.pl> wrote:

 Hi Alexis,

 You may want to use a regex to filter out strings containing invalid
 characters (i.e. there is a small subset of atoms that may appear without
 brackets). See "Atoms" section: http://www.daylig

Re: [Rdkit-discuss] comparing two or more tables of molecules

2016-12-01 Thread George Papadatos
Hi Stephen,

Further to Greg's excellent reply, see this paper on how InChI strings and
keys can be used in practice to map together tautomer (ones covered by
InChI at least), isotope, stereo and parent-salt variants.
http://rd.springer.com/article/10.1186/s13321-014-0043-5

Francis (cc'ed) has a nice notebook somewhere illustrating these nice InChI
splits to find these variants.
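
As a rough illustration of what such a split can look like (this is not
Francis's notebook, just a sketch with hypothetical names), the function
below drops selected layers from a standard InChI string using plain string
handling. It assumes the usual one-letter layer prefixes: b, t, m and s for
stereo, i for isotopes. It is aimed at standard InChI only, since cutting at
/i would also drop any non-standard layers that follow it.

```python
STEREO_LAYERS = ("b", "t", "m", "s")  # double-bond and tetrahedral stereo


def strip_layers(inchi, drop=STEREO_LAYERS, drop_isotopes=False):
    """Return the InChI without the layers whose prefix is listed in drop.

    The first two '/'-separated fields (version and formula) are always
    kept. With drop_isotopes=True, everything from '/i' onwards is cut.
    """
    parts = inchi.split("/")
    head, layers = parts[:2], parts[2:]
    kept = []
    for layer in layers:
        if drop_isotopes and layer.startswith("i"):
            break
        if layer[:1] in drop:
            continue
        kept.append(layer)
    return "/".join(head + kept)


# Standard InChI strings for L- and D-alanine; they differ only in /m.
l_ala = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
d_ala = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"

flat_l = strip_layers(l_ala)
flat_d = strip_layers(d_ala)
print(flat_l == flat_d)
```

Storing the stripped strings as extra columns next to the full InChI lets a
simple string match group stereo or isotope variants together.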

For educational purposes, there have been other approaches like the NCI's
identifiers - discussion here:
http://acscinf.org/docs/meetings/237nm/presentations/237nm17.pdf

For pure structure standardization using RDKit see here:
https://github.com/flatkinson/standardiser
and
https://github.com/mcs07/MolVS


Cheers,

George




On 29 November 2016 at 17:02, Greg Landrum  wrote:

> Wow, this is a great question and quite a fun thread.
>
> It's hard to really make much of a contribution here without writing a
> book/review article (something that I'm really not willing to do!), but I
> have a few thoughts. Most of this is repeating/rephrasing things others
> have already said.
>
> I'm going to propose some things as facts. I think that these won't be
> controversial:
> fact 1: if the structures are coming from different sources, they need to
> be standardized/normalized before you compare them. This is true regardless
> of how you want to compare them. The details of the standardization process
> are not incredibly important, but it does need to take care of the things
> you care about when comparing molecules. For example, if you don't care
> about differences between salts, it should strip salts. If you don't care
> about differences between tautomers, it should normalize tautomers.
> fact 2: The InChI algorithm includes a standardization step that
> normalizes some tautomers, but does not remove salts.
> fact 3: The InChI representation contains a number of layers defining the
> structure in increasing detail (this isn't strictly true, because some of
> the choices about how layers are ordered are arbitrary, but it's close).
> fact 4: canonicalization, the way I define it, produces a canonical atom
> numbering for a given structure, but it does *not* standardize
> fact 5: the RDKit has essentially no well-documented standardization code
>
> fact X: we don't have any standard, broadly accepted approach for
> standardization, canonicalization or representation that is fool-proof or
> that works for even all of organic chemistry, never mind organometallics.
> InChI, useful as it is for some things, completely fails to handle things
> like atropisomers (they are working on this kind of thing, but it's not out
> yet).
>
> Given all of this, if I wanted to have flexible duplicate checking *right*
> now, I think I would use the AvalonTools struchk functionality that the
> RDKit provides (the new pure-RDKit version still needs a bit more testing)
> to handle basic standardization and salt stripping and then produce a table
> that includes the InChI in a couple of different forms. I'd want to be able
> to recognize molecules that differ only by stereochemistry, molecules that
> differ only by location of tautomeric Hs, and molecules that differ only by
> the location of isotopic labels. You can do this with various clever splits
> of the InChI (how to do it is left as an exercise for the reader and/or a
> future RDKit blog post).
>
> I think there's something fun to be done here with SMILES variants,
> borrowing heavily from some of the things that Roger has written about:
> https://nextmovesoftware.com/blog/2013/04/25/finding-all-types-of-every-mer/
> here's a more recent application of that from Noel:
> https://nextmovesoftware.com/blog/2016/06/22/fishing-for-matched-series-in-a-sea-of-structure-representations/
>
> If I didn't really care about details and just wanted something that I
> could explain easily to others, I'd skip all the complication and just use
> InChIs (or InChI keys) to recognize duplicates. There would be times when
> that would be the wrong answer, but it would be a broadly accepted kind of
> wrong.[1]
>
> Regardless of the approach, I would not, under most any circumstances,
> discard the original input structures that I had. It's really good to be
> able to figure out what the original data looked like later.
>
> -greg
> [1] I'm crying as I write this...
>
>
>
>
> On Mon, Nov 28, 2016 at 5:25 PM, Stephen O'hagan wrote:
>
>> Has anyone come up with fool-proof way of matching structurally
>> equivalent molecules?
>>
>>
>>
>> Unique SMILES or InChI string comparisons don’t appear to work, presumably
>> because there are different but equivalent structures, e.g. explicit vs
>> non-explicit H’s, Kekule vs Aromatic, isomeric forms vs non-isomeric form,
>> tautomers etc.
>>
>>
>>
>> I also expect that comparing InChI strings might need something more than
>> just a simple string comparison, such as masking off stereo information
>> when you don’t care about stereo isomers.
>>
>>
>>
>> I assume there are suitable to

Re: [Rdkit-discuss] Fingerprints_calculation

2016-10-02 Thread George Papadatos
Hi Sahil,

You'll find the same documentation as a Jupyter Notebook here:
http://nbviewer.jupyter.org/github/chembl/mychembl/blob/master/ipython_notebooks/02_myChEMBL_RDKit_tutorial.ipynb#Morgan-Fingerprints-(Circular-Fingerprints)

Cheers,

George

On 1 October 2016 at 04:17, Greg Landrum  wrote:

> Hi Sahil,
>
> The documentation includes some detail about the calculation of the Morgan
> fingerprints, along with a pointer to the original publication describing
> the method: http://rdkit.org/docs/GettingStartedInPython.html#morgan-fingerprints-circular-fingerprints
>
> Does the information there answer your question?
> -greg
>
>
>
> On Fri, Sep 30, 2016 at 6:37 AM, Sahil Kharangarh <
> sahilkharang...@gmail.com> wrote:
>
>>
>> I am having trouble understanding how the RDKit calculates Morgan (circular)
>> fingerprints: on what basis are the fingerprints computed once a radius is
>> chosen?
>> How should the radius for circular fingerprints be chosen?
>> And what does the option useFeatures=True mean?
>>
>>
>>
>>
>> *With Warm Regards,*
>> *SAHIL*
>> *M.S. Research Scholar, *
>> *Department of Pharmacoinformatics,*
>> *National Institute of Pharmaceutical Education and Research (NIPER), *
>> *sector-67, S.A.S Nagar, Mohali,*
>> *Punjab- 160062, INDIA*
>> *contact no: +917508142749 <%2B917508142749>,+919813153122
>> <%2B919813153122>*
>> *email: sahilkharang...@gmail.com *


Re: [Rdkit-discuss] OCEAN: Our Target Prediction Paper (including Source Code)

2016-09-27 Thread George Papadatos
Hi guys,

Congrats - great use of ChEMBL and myChEMBL too :)


George



On 27 September 2016 at 05:13, Paul Czodrowski <
paul.czodrow...@merckgroup.com> wrote:

> Dear RDKitters,
>
>
>
> Our target prediction method – fully based on RDKit – is now online:
>
> OCEAN: *O*ptimized *C*ross r*EA*ctivity estimatio*N*
>
> http://pubs.acs.org/doi/abs/10.1021/acs.jcim.6b00067
>
>
>
> The source code can be found here:
>
> https://github.com/rdkit/OCEAN
>
>
>
> We will give a talk as well as a hands-on workshop at the upcoming RDKit UGM
> at the end of October.
>
>
>
> Cheers,
>
> Guido & Paul


[Rdkit-discuss] myChEMBL 20

2015-07-21 Thread George Papadatos
Hi RDKitters,

In case you didn't notice, we've just released myChEMBL 20.
Apart from the database upgrade, the new version contains updates to RDKit
and web services, along with 2 new IPython notebooks.
Wrt distros and virtualisation, in addition to Ubuntu and vmdk format,
myChEMBL also comes in CentOS *via* KVM, Vagrant and docker.

More info here:
http://chembl.blogspot.co.uk/2015/07/mychembl-20-has-landed.html
and
http://chembl.blogspot.co.uk/2015/07/mychembl-docker.html


Cheers,

Michał & George
--
Don't Limit Your Business. Reach for the Cloud.
GigeNET's Cloud Solutions provide you with the tools and support that
you need to offload your IT needs and focus on growing your business.
Configured For All Businesses. Start Your Cloud Today.
https://www.gigenetcloud.com/___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] RDKit Tools for the IPython Notebook

2015-07-02 Thread George Papadatos
Axel, this is seriously cool!
Many thanks!

George

On 2 July 2015 at 13:31, Axel Pahl  wrote:

>  Dear fellow RDKitters,
>
> the RDKit community is always so helpful that I wanted to share back two
> functions that I use in the IPython Notebook, which I thought
> could be of use to others as well.
>
> - show_table:
> Display a list of molecules in a table with molecule properties as
> columns.
> When an ID property is given, the table becomes interactive and compounds
> can be selected.
> I know that this can be also done with PandasTools but that might be
> overkill in some situations. Also the table from Pandas is not interactive
> to my knowledge.
>
> - jsme:
> Display Peter Ertl's JavaScript Molecule Editor to enter a molecule
> directly in the IPython notebook (how cool is that??)
>
> If you are interested, please have a look at the GitHub repo and the example
> notebook.
>
> Kind regards,
> Axel


Re: [Rdkit-discuss] Molecular dis / similarity using fingerprints

2015-05-26 Thread George Papadatos
Hi JP,

Aha, so you're looking for a threshold that will exhibit the optimal
balance between the false positives and false negatives in the *biological*
*activity* space. This threshold varies depending on the fingerprint and
the dataset of course.
See here for some generalised insights:

(1) Papadatos, G.; Cooper, A. W. J.; Kadirkamanathan, V.; Macdonald, S. J.
F.; McLay, I. M.; Pickett, S. D.; Pritchard, J. M.; Willett, P.; Gillet, V.
J. Analysis of Neighborhood Behavior in Lead Optimization and Array Design. *J.
Chem. Inf. Model.* *2009*, *49*, 195–208.

especially Figure 17, and

(2) Muchmore, S. W.; Debe, D. A.; Metz, J. T.; Brown, S. P.; Martin, Y. C.;
Hajduk, P. J. Application of Belief Theory to Similarity Data Fusion for
Use in Analog Searching and Lead Hopping. *J. Chem. Inf. Model.* *2008*,
*48*, 941–948.

and also Greg's blog post:

http://rdkit.blogspot.co.uk/2013/10/fingerprint-thresholds.html


The TL;DR version is that for ECFP_4, this threshold should be around
0.45-0.55.
Wrt methodology, are you trying to score/rank the
intra-diversity/heterogeneity for different structure sets?
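
To make the threshold question concrete, here is a minimal Python sketch of Tanimoto similarity combined with a simple sphere-exclusion (leader) pass that counts clusters at several cut-offs. It assumes each fingerprint is available as a set of on-bit indices; the toy fingerprints and the function names are invented for illustration and are not RDKit calls.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def sphere_exclusion(fps, threshold):
    """Pick cluster centres greedily: a fingerprint starts a new cluster
    only if it is less similar than `threshold` to every existing centre."""
    centres = []
    for fp in fps:
        if all(tanimoto(fp, centre) < threshold for centre in centres):
            centres.append(fp)
    return centres


# Toy data (hypothetical on-bit sets, not real ECFP_4 fingerprints).
fps = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},      # similarity 0.6 to the first
    {1, 2, 3, 4, 6},   # similarity 0.8 to the first
    {10, 11, 12},      # shares no bits with the others
]

# Count clusters at a few candidate thresholds.
for threshold in (0.5, 0.7, 0.9):
    print(threshold, len(sphere_exclusion(fps, threshold)))
```

A lower threshold merges more compounds into each cluster, so the cluster count at a fixed threshold gives a rough heterogeneity score for a set. The result of this greedy pass depends on the input order, which is worth keeping in mind when comparing sets.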


Cheers,

George



On 26 May 2015 at 11:59, JP  wrote:

>
> On 25 May 2015 at 22:23, Tim Dudgeon  wrote:
>
> >> Maybe a clustering approach would work? Something like sphere exclusion
> >> clustering, counting the number of clusters at 0.9 - 0.8 similarity?
> >> With 30K structures it sounds computationally tractable?
>
>
> Thanks Tim for this idea.  I hadn't heard of sphere exclusion.  The
> problem is we still need a distance / similarity function (which using ECFP
> with high similarity 0.8-0.9 would result in very few compounds being
> thrown out).  I think the real issue here is selecting a sensible
> similarity threshold which defines my idea of "similarity".  But that is a
> tricky number to get right - too high and you remove nothing, too low and
> you start catching "different" molecules.  I guess the best thing is to try a
> few values (0.5, 0.6, 0.7, 0.8, 0.9) and have a visual look at the
> remaining compounds.
>
> -
> JP
>
>


Re: [Rdkit-discuss] generating scaffold trees

2015-05-22 Thread George Papadatos
Hi all,

Coincidentally, we had a chat about this with James the other day.
Maybe the good colleagues at the ICR have implemented this already with
RDKit? Nick?

Cheers,

g


On 22 May 2015 at 13:38, Axel Pahl  wrote:

> Dear RDKitters,
>
> has someone used the RDKit to generate scaffold trees from molecules as
> described in this paper:
> Schuffenhauer, A., Ertl, P., Roggo, S., Wetzel, S., Koch, M. A.,
> Waldmann, H., J. Chem. Inf. Model. 2007, 47, 47-58
>
> I know that this is possible with ScaffoldHunter and that there is a
> Pipeline Pilot component for it, but being able to do it in RDKit would
> fit especially well in my workflow...
>
> Kind regards and have a nice weekend,
> Axel
>
>
>


Re: [Rdkit-discuss] rdkit.version()

2014-12-11 Thread George Papadatos
Hi Soren,

from rdkit import rdBase
print(rdBase.rdkitVersion)

Cheers,

George


On 11 December 2014 at 18:22, Soren Wacker  wrote:

>  Hi,
>
> I would like to find out the currently installed version on my system.
> However, I cannot find a version string in RDKit. Something like
> rdkit.version() would be nice. Is something like this implemented?
>
> kind regards
> Soren
>
>  --
> *From:* James Davidson [j.david...@vernalis.com]
> *Sent:* Wednesday, December 10, 2014 10:48 AM
> *To:* greg.land...@gmail.com
> *Cc:* rdkit-discuss@lists.sourceforge.net
> *Subject:* Re: [Rdkit-discuss] Avalon test failing(?)
>
>   Hi Greg,
>
>
>
> > The new version of the test code is targeting the 1.2 avalon toolkit
>
> > version.
>
> > Here's the commit that did that.
>
> > https://github.com/rdkit/rdkit/commit/42dab414ee6fbe5489078e5e52046608bbf785cb
>
> >
>
> > As an FYI, to make these tests pass on windows, you need to edit the code
>
> > to fix a bug:
>
> >
>
> > you need to comment out line 1446 of reaccsio.c:
>
> >//MyFree((char *)tempdir);
>
>
>
> Following your advice, I downloaded the 1.2 source from Sourceforge (
> http://sourceforge.net/projects/avalontoolkit/files/AvalonToolkit_1.2/);
> commented-out the line in reaccsio.c; and then reconfigured in cmake and
> rebuilt in VS.  The tests pass now – thanks!
>
>
>
> Kind regards
>
>
>
> James
>
>
>


Re: [Rdkit-discuss] question on MatchedPairs

2014-08-27 Thread George Papadatos
Hi Paul,
A fingerprint 'anchored' at the attachment point is how I've done it in the
past too. It can give you more granularity than the simple
aromatic/non-aromatic classification.
It's easy to do with RDKit but I don't have the snippet at hand.
Cheers,
George  
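
One way the 'anchored' fingerprint idea could look in Python is sketched below. It is not the snippet referred to above: the function name and the two context fragments are hypothetical, and it assumes a reasonably recent RDKit that ships rdFingerprintGenerator. The fingerprint is restricted to environments centred on the dummy atom via fromAtoms, and the aromatic/aliphatic flag is read from the dummy's neighbour.

```python
try:
    from rdkit import Chem
    from rdkit.Chem import rdFingerprintGenerator
except ImportError:  # lets the sketch load even where RDKit is not installed
    Chem = None


def attachment_environment(context_smiles, radius=2):
    """Describe the atom next to the [*:1] attachment point of a fragment.

    Returns (is_aromatic, anchored_fp), where anchored_fp maps Morgan
    environment ids centred on the dummy atom to their counts.
    """
    mol = Chem.MolFromSmiles(context_smiles)
    # The attachment point is the dummy atom (atomic number 0).
    dummy = next(a for a in mol.GetAtoms() if a.GetAtomicNum() == 0)
    neighbour = dummy.GetNeighbors()[0]
    generator = rdFingerprintGenerator.GetMorganGenerator(radius=radius)
    # fromAtoms limits the fingerprint to environments rooted at the dummy atom.
    fp = generator.GetSparseCountFingerprint(mol, fromAtoms=[dummy.GetIdx()])
    return neighbour.GetIsAromatic(), fp.GetNonzeroElements()


if Chem is not None:
    # Hypothetical context fragments, chosen only for illustration.
    for smiles in ("[*:1]c1ccccc1", "[*:1]CCO"):
        aromatic, fp = attachment_environment(smiles)
        print(smiles, aromatic, len(fp))
```

Storing the anchored fingerprint at more than one radius, as described for LUCID in the quoted message, would allow filtering at different levels of detail.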



Sent from my gPad

On 27 Aug 2014, at 13:21, greg...@gerebtzoff.com wrote:

>> 
>> Dear RDKitters,
>> 
>> I'm using Jameed's wonderful code for a matched pair analysis.
>> 
>> Given such a transformation string "[*:1]C>>[*:1][H]"
>> => How do I check if [*:1] is an aromatic or an aliphatic atom?
>> 
>> 
>> I fear that this can only be done by going back into the original
>> data/output, or am I wrong ?
>> 
>> 
>> Cheers & Thanks,
>> Paul
> 
> Hi Paul,
> 
> I think you're right, since Jameed's MMP algorithm does not cut ring 
> systems and does not capture the environment at the cutting point.
> Two solutions come to mind:
>  - either you go back to the original data, or
>  - you search for more specific replacements, i.e. [*:1]CC>>[*:1]C[H], 
> [*:1]c1c1C>>[*:1]c1c1[H] etc.
> In LUCID for each fragment I calculate and store circular fingerprints 
> centered at the dummy atom ([*]) of different sizes, which allow me to 
> filter the results very easily (exactly for the kind of questions you 
> wanted to answer).
> 
> Cheers,
> 
> Grégori
> 


[Rdkit-discuss] myChEMBL

2014-06-12 Thread George Papadatos
Hi all,

Further to Michał's announcement earlier, let me use this opportunity to
announce the new release of myChEMBL, as I'm sure it will be relevant to
many of you.

myChEMBL is an open platform which consists of a Linux (Ubuntu) Virtual
Machine featuring a PostgreSQL schema with the latest version of the ChEMBL
database, the latest RDKit toolkit and cartridge, along with several Python
tools and libraries for scientific computing and data mining.

myChEMBL offers several ways to interact with ChEMBL data locally and
provides a free and secure environment for application development,
teaching and learning.

More information here:
http://chembl.blogspot.co.uk/2014/06/mychembl-launchpadlaunched.html


Cheers,

George


Re: [Rdkit-discuss] flexmatch in RDKit cartridge?

2014-02-21 Thread George Papadatos
Many thanks Jan, that's very helpful.

Cheers,

George
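
The chirality gotcha in Jan's quoted cartridge session below can also be seen from Python. The following is only a sketch: it compares canonical SMILES as a stand-in for an exact-match test and makes no claim about how the cartridge's @= operator is implemented; the helper name is made up.

```python
try:
    from rdkit import Chem
except ImportError:  # lets the sketch load even where RDKit is not installed
    Chem = None


def same_structure(smiles_a, smiles_b, use_chirality=True):
    """Exact-match test via canonical SMILES, optionally ignoring stereo."""
    mol_a = Chem.MolFromSmiles(smiles_a)
    mol_b = Chem.MolFromSmiles(smiles_b)
    return (Chem.MolToSmiles(mol_a, isomericSmiles=use_chirality)
            == Chem.MolToSmiles(mol_b, isomericSmiles=use_chirality))


if Chem is not None:
    # The two alanine enantiomers from the quoted example.
    ala, d_ala = "C[C@H](N)C(=O)O", "C[C@@H](N)C(=O)O"
    print(same_structure(ala, d_ala, use_chirality=False))  # stereo ignored
    print(same_structure(ala, d_ala, use_chirality=True))   # stereo compared
```

With stereo ignored the two enantiomers compare equal, which mirrors the behaviour Jan sees before rdkit.do_chiral_sss is switched on.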



On 20 February 2014 21:32, Jan Holst Jensen  wrote:

>  Hi George et al,
>
> flexmatch(... 'all') is the most strict exact match that the
> Symyx/Accelrys cartridge has. You can relax the matching behavior to
> varying degrees by passing it different options, e.g. using 'tau' instead
> of 'all' will make the identity check tautomer-agnostic (to the extent that
> the cartridge will perceive tautomers "correctly" - an interesting
> discussion topic in itself).
>
> The various options to flexmatch() are well documented in the Accelrys
> documentation for the cartridge, but I don't know if that is publicly
> available.
>
> The short answer in my opinion: Yes, @= should be the equivalent of
> flexmatch(m1, m2, 'all'). To emulate flexmatch(..., 'all') with rdkit, I
> find a small gotcha with regards to chiral matching:
>
> -- Clearly not identical.
> postgres=# select mol('CCC') @= mol('CCF');
>  ?column?
> --
>  f
> (1 row)
>
> -- Clearly identical.
> postgres=# select mol('CCC') @= mol('CCC');
>  ?column?
> --
>  t
> (1 row)
>
> -- Ala versus dAla - should *not* be identical ?
> postgres=# select mol('C[C@H](N)C(=O)O') @= mol('C[C@@H](N)C(=O)O');
>  ?column?
> --
>  t
> (1 row)
>
> To get the expected behavior of @= you need to turn on chiral matching.
> Even though the parameter says that it controls SSS behavior it apparently
> also has an effect on exact matching:
>
> postgres=# set rdkit.do_chiral_sss=true;
> SET
> -- Ala versus dAla - no longer identical.
> postgres=# select mol('C[C@H](N)C(=O)O') @= mol('C[C@@H](N)C(=O)O');
>  ?column?
> --
>  f
> (1 row)
>
> -- Ala versus Ala - phew, identical.
> postgres=# select mol('C[C@H](N)C(=O)O') @= mol('C[C@H](N)C(=O)O');
>  ?column?
> --
>  t
> (1 row)
>
> Cheers
> -- Jan
>
>
>
> On 2014-02-20 13:46, George Papadatos wrote:
>
> Hi there,
> Wouldn't that be (at least partly) possible with an exact structure search?
>
>- @= : returns whether or not two molecules are the same.
>
> Cheers,
> George
>
>
> On 20 February 2014 11:59, Greg Landrum  wrote:
>
>> Sounds interesting. Can anyone provide a pointer to a doc with more
>> specific info about what this actually does?
>>
>>
>> On Thursday, February 20, 2014, Michał Nowotka  wrote:
>>
>>>   Hi,
>>>
>>>  Symyx cartridge defines something called flexmatch - "Finds records
>>> that are an exact match of the 2D or 3D structure that you specify in the
>>> query."
>>>  Is there anything similar in RDKit cartridge? I looked into
>>> documentation and couldn't find this feature.
>>>
>>>  Regards,
>>>  Michal Nowotka
>>>
>>
>>


Re: [Rdkit-discuss] flexmatch in RDKit cartridge?

2014-02-20 Thread George Papadatos
Hi there,
Wouldn't that be (at least partly) possible with an exact structure search?

   - @= : returns whether or not two molecules are the same.

Cheers,
George


On 20 February 2014 11:59, Greg Landrum  wrote:

> Sounds interesting. Can anyone provide a pointer to a doc with more
> specific info about what this actually does?
>
>
> On Thursday, February 20, 2014, Michał Nowotka  wrote:
>
>> Hi,
>>
>> Symyx cartridge defines something called flexmatch - "Finds records that
>> are an exact match of the 2D or 3D structure that you specify in the query.
>> "
>> Is there anything similar in RDKit cartridge? I looked into documentation
>> and couldn't find this feature.
>>
>> Regards,
>> Michal Nowotka
>>
>
>


Re: [Rdkit-discuss] InChI roundtrip

2014-01-30 Thread George Papadatos
I agree; that's why I tried to minimise 'doctoring' as much as I could in
this case.
George


On 30 January 2014 19:46, Dimitri Maziuk  wrote:

> On 01/30/2014 01:07 PM, George Papadatos wrote:
> > OK just to add some fuel to this fire: A colleague of mine and I looked
> at
> > the inchi roundtrip using KNIME 2.9 and the latest versions of indigo and
> > rdkit nodes.
>
> > Rdkit had 10 times more discrepancies
>
> If it's any consolation, OpenBabel stereo perception does not do CIP
> ordering, so for any input that didn't have correct stereochemistry, or
> where it was removed during whatever processing you did, the output InChI
> will have a wrong stereo layer. I expect with properly doctored input
> you'll get 100% discrepancies there.
>
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>
>
>


Re: [Rdkit-discuss] InChI roundtrip

2014-01-30 Thread George Papadatos
Hi Igor,
Thanks for the quick reply.
I just did in my workflow. The number of discrepancies increased from 200
to 950 :(
George


On 30 January 2014 19:19, Igor Filippov  wrote:

> George,
>
> Have you added coordinates to the mols converted from InChI?
> It made a huge difference for the examples I've tried.
>
> Igor
>
>
> On Thu, Jan 30, 2014 at 2:07 PM, George Papadatos wrote:
>
>> OK just to add some fuel to this fire: A colleague of mine and I looked
>> at the inchi roundtrip using KNIME 2.9 and the latest versions of indigo
>> and rdkit nodes. We used ~90,000 inchis from chembl_17, converted them to
>> mols (sanitise + remove Hs), removed the ones that fail to convert, and
>> then we converted back to inchis (standard ones, no extra parameters). We
>> assessed the discrepancies between indigo and rdkit inchis compared to the
>> original input inchis that are stored in chembl.
>> Rdkit had 10 times more discrepancies with 200 failures as opposed to 21
>> from indigo. This rate (~0.2%) was also confirmed using ~1 million inchis.
>>
>> I had a closer look at a couple of cases here:
>> http://nbviewer.ipython.org/gist/madgpap/8715974
>>
>> It seems that there is more than one reason for the failure. I totally
>> understand Greg's caution about the inchi2mol conversion, but given the
>> difference between rdkit and indigo, there might be room for improvement. Any
>> insights would be very much appreciated.
>>
>> Btw, the KNIME workflow and full list of fails are available to you.
>>
>> Cheers,
>>
>> George
>>
>>
>>
>> On 30 January 2014 04:11, Greg Landrum  wrote:
>>
>>> Yeah, I have been tempted several times to remove the InChI->RDKit
>>> functionality entirely
>>>
>>>
>>>
>>> On Thu, Jan 30, 2014 at 5:05 AM, Igor Filippov <
>>> igor.v.filip...@gmail.com> wrote:
>>>
>>>> Thank you, Greg!
>>>> Very nice explanation and I think this issue has confused people before
>>>> me as well. I am going to have to keep reminding myself about it as the
>>>> subject comes up every now and then.
>>>>
>>>> Igor
>>>> On Jan 29, 2014 10:59 PM, "Greg Landrum" 
>>>> wrote:
>>>>
>>>>> Hi Igor,
>>>>>
>>>>> On Wed, Jan 29, 2014 at 2:04 PM, Igor Filippov <
>>>>> igor.v.filip...@gmail.com> wrote:
>>>>>
>>>>>> Greg et al,
>>>>>>
>>>>>> Here is a little script that demonstrates a problem with fingerprints
>>>>>> after the roundtrip through InChI.
>>>>>> My input mol file is also attached.
>>>>>> As you can see the similarity between "before" and "after" is not 1
>>>>>> in 45 out of 100 cases.
>>>>>> In one case it is as low as 0.29. Could someone take a look and tell
>>>>>> me what I'm doing wrong?
>>>>>>
>>>>>
>>>>> Ah! Now I see what you're doing and understand the problem.
>>>>>
>>>>> It's really important when using InChI to remember that InChI is
>>>>> designed to be an identifier, not an interchange format. The InChI
>>>>> algorithm modifies the molecule as part of its canonicalization step. This
>>>>> modification includes standardizing tautomers.
>>>>>
>>>>> Here's an example of the type of substructure modification that
>>>>> happens in your molecules:
>>>>> input smiles c1c1C(=O)Nc1c1 on being converted to InChI and
>>>>> back yields: OC(=Nc1c1)c1c1
>>>>>
>>>>> Basically: If you think you know what your molecules are, you probably
>>>>> should be building them from SMILES or CTAB, not InChI.
>>>>>
>>>>> Apologies that I didn't think of this before; I was just focusing on
>>>>> the stereochemistry.
>>>>>
>>>>> -greg
>>>>>
>>>>
>>>
>>>

Re: [Rdkit-discuss] InChI roundtrip

2014-01-30 Thread George Papadatos
OK just to add some fuel to this fire: A colleague of mine and I looked at
the inchi roundtrip using KNIME 2.9 and the latest versions of indigo and
rdkit nodes. We used ~90,000 inchis from chembl_17, converted them to mols
(sanitise + remove Hs), removed the ones that fail to convert, and then we
converted back to inchis (standard ones, no extra parameters). We assessed
the discrepancies between indigo and rdkit inchis compared to the original
input inchis that are stored in chembl.
Rdkit had 10 times more discrepancies with 200 failures as opposed to 21
from indigo. This rate (~0.2%) was also confirmed using ~1 million inchis.

I had a closer look at a couple of cases here:
http://nbviewer.ipython.org/gist/madgpap/8715974

It seems that there is more than one reason for the failure. I totally
understand Greg's caution about the inchi2mol conversion, but given the
difference between rdkit and indigo, there might be room for improvement. Any
insights would be very much appreciated.

Btw, the KNIME workflow and full list of fails are available to you.

Cheers,

George
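
For readers who want to try the round trip outside KNIME, a minimal Python sketch follows. It assumes an RDKit build with InChI support and is not the workflow used for the numbers above; the function name and the ethanol test molecule are chosen purely for illustration.

```python
try:
    from rdkit import Chem
    from rdkit.Chem import inchi
    HAVE_INCHI = inchi.INCHI_AVAILABLE
except ImportError:  # lets the sketch load even where RDKit is not installed
    Chem = None
    HAVE_INCHI = False


def inchi_roundtrip(smiles):
    """Return (inchi_before, inchi_after, smiles_after) for one molecule."""
    mol = Chem.MolFromSmiles(smiles)
    inchi_before = inchi.MolToInchi(mol)
    # MolFromInchi sanitises and removes hydrogens by default.
    mol_back = inchi.MolFromInchi(inchi_before)
    inchi_after = inchi.MolToInchi(mol_back)
    return inchi_before, inchi_after, Chem.MolToSmiles(mol_back)


if HAVE_INCHI:
    before, after, smiles_after = inchi_roundtrip("CCO")
    # A discrepancy in the sense used above is a case where before != after.
    print(before == after, smiles_after)
```

Running this over a list of structures and counting the cases where the two InChIs differ gives the kind of discrepancy rate discussed in this thread. A production version would also need to handle molecules for which MolFromInchi returns None.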



On 30 January 2014 04:11, Greg Landrum  wrote:

> Yeah, I have been tempted several times to remove the InChI->RDKit
> functionality entirely
>
>
>
> On Thu, Jan 30, 2014 at 5:05 AM, Igor Filippov 
> wrote:
>
>> Thank you, Greg!
>> Very nice explanation and I think this issue has confused people before
>> me as well. I am going to have to keep reminding myself about it as the
>> subject comes up every now and then.
>>
>> Igor
>> On Jan 29, 2014 10:59 PM, "Greg Landrum"  wrote:
>>
>>> Hi Igor,
>>>
>>> On Wed, Jan 29, 2014 at 2:04 PM, Igor Filippov <
>>> igor.v.filip...@gmail.com> wrote:
>>>
>>>> Greg et al,
>>>>
>>>> Here is a little script that demonstrates a problem with fingerprints
>>>> after the roundtrip through InChI.
>>>> My input mol file is also attached.
>>>> As you can see the similarity between "before" and "after" is not 1 in
>>>> 45 out of 100 cases.
>>>> In one case it is as low as 0.29. Could someone take a look and tell me
>>>> what I'm doing wrong?
>>>>
>>>
>>> Ah! Now I see what you're doing and understand the problem.
>>>
>>> It's really important when using InChI to remember that InChI is
>>> designed to be an identifier, not an interchange format. The InChI
>>> algorithm modifies the molecule as part of its canonicalization step. This
>>> modification includes standardizing tautomers.
>>>
>>> Here's an example of the type of substructure modification that happens
>>> in your molecules:
>>> input smiles c1c1C(=O)Nc1c1 on being converted to InChI and back
>>> yields: OC(=Nc1c1)c1c1
>>>
>>> Basically: If you think you know what your molecules are, you probably
>>> should be building them from SMILES or CTAB, not InChI.
>>>
>>> Apologies that I didn't think of this before; I was just focusing on the
>>> stereochemistry.
>>>
>>> -greg
>>>
>>
>
>


[Rdkit-discuss] MDS using RDKit, SciKit and Pandas

2014-01-21 Thread George Papadatos
Hi RDKitters,

This is not a question, more like an FYI.
Inspired by Noel's related post:
http://baoilleach.blogspot.co.uk/2014/01/convert-distance-matrix-to-2d.html,
I've put together an iPython Notebook example that performs MDS on a bunch
of ChEMBL compounds (i.e. visualises their chemical space in 2D).

http://nbviewer.ipython.org/gist/madgpap/8538507

Enjoy,

George


EMBL-EBI


Re: [Rdkit-discuss] How to import RDkit into the ipython notebook

2013-12-23 Thread George Papadatos
Hi Mark,

Good to hear the workshop was effective!
Not sure what OS you use or how your Python is set up, but try adding the
RDKit folder path to your PYTHONPATH. Then launch a Python prompt and type:
>>> from rdkit import rdBase
>>> print(rdBase.rdkitVersion)

If this works, you should be able to use RDKit with IPython and the
IPython notebook.
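
If the import fails, a quick way to see what Python is actually searching is sketched below. It uses only the standard library; the helper name is made up for this example.

```python
import importlib.util
import sys


def locate_package(name):
    """Return the file a package would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return spec.origin


location = locate_package("rdkit")
if location is None:
    # Entries from PYTHONPATH should show up in this list.
    print("rdkit not importable; sys.path is:")
    for entry in sys.path:
        print("  ", entry)
else:
    print("rdkit found at", location)
```

Running the same lines inside a notebook cell shows whether the notebook kernel sees the same search path as the plain Python prompt.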

Hope this helps.

George

-------
George Papadatos
EMBL-EBI




On 23 December 2013 12:42, Mark Forster  wrote:

> RDkit team and users
>
> I have just joined the RDkit discuss mailing list and as a newbie to RDkit
> I have some entry level questions.
> I did a quick search of the mailing list and some google searching, but no
> clear solution popped up.
> Can I get some pointers to this simple question?
>
> If the RDkit tools are installed as described here:
> http://rdkit.readthedocs.org/en/latest/Install.html
>
> What additional steps are required to get the RDkit functionality working
> in an ipython notebook?
>
> Experience suggests it might be something like setting PYTHONPATH but does
> a step by step guide exist?
> I saw the functionality demonstrated and used extensively at a recent EBI
> training workshop. Fantastic stuff.
> It is available with the MyCHEMBL virtual machine download, but at 18GB
> this is a network hog.
> Hence I would just like to get the installation steps needed to get RDkit
> and iPython NB working in harmony.
>
> regards
>
> Mark Forster
>
>


[Rdkit-discuss] postgres FPs in python

2013-11-08 Thread George Papadatos
Hi there,
DB-related question again:
When I retrieve fps from a postgres db, they look like this:
\x020c00102204810001040001981408420180400040048088c020800423a192001814002021044200092400040208

Is there a way to convert them to RDKit bitvector fingerprint objects or
at least bitvector strings in python?
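
As a starting point, the textual form can at least be decoded with the standard library. The sketch below assumes the value is PostgreSQL's hex bytea output (a leading \x followed by two hex digits per byte) and, as a further unverified assumption, that bits are stored least-significant-bit first within each byte; the function name and the short test value are hypothetical.

```python
def bytea_hex_to_bits(text):
    """Turn a hex bytea string such as '\\x0280' into a '0'/'1' string.

    Assumption: within each byte the least significant bit comes first.
    This should be checked against a fingerprint with known on-bits.
    """
    hex_part = text[2:] if text.startswith("\\x") else text
    raw = bytes.fromhex(hex_part)
    # format(b, "08b") is most-significant-bit first, so reverse each byte.
    return "".join(format(byte, "08b")[::-1] for byte in raw)


# Hypothetical two-byte value, not a real fingerprint.
print(bytea_hex_to_bits("\\x0280"))
```

If the bit order assumption holds, a string like this could be handed to RDKit's DataStructs.CreateFromBitString to obtain a bit vector object. The cartridge's own conversion functions are the safer route where available.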


Thanks,

George


Re: [Rdkit-discuss] Friday pandas q

2013-10-25 Thread George Papadatos
It worked! Many thanks!
g
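
For anyone repeating this on Python 3, where database drivers tend to return bytea values as bytes or memoryview rather than str, the conversion step in Greg's quoted code below might be adapted along the following lines. This is a sketch under assumptions: the helper name is invented, and the database value is simulated with ToBinary() on the assumption that it yields a pickle the Chem.Mol constructor accepts in the same way as the mol_send output.

```python
try:
    from rdkit import Chem
except ImportError:  # lets the sketch load even where RDKit is not installed
    Chem = None


def mol_from_pickle(value):
    """Rebuild an RDKit molecule from a binary pickle (bytes or memoryview)."""
    return Chem.Mol(bytes(value))


if Chem is not None:
    # Stand-in for a value fetched from the database (hypothetical data).
    fetched = memoryview(Chem.MolFromSmiles("c1ccccc1O").ToBinary())
    mol = mol_from_pickle(fetched)
    print(Chem.MolToSmiles(mol))
```

In a DataFrame the same helper could be applied column-wise, for example with df['pkl'].map(mol_from_pickle).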


On 25 October 2013 16:18, Greg Landrum  wrote:

> Hi George,
>
> Nikolas is really the expert here, but this just worked for me:
>
> curs.execute('select molregno,mol_send(m) from rdk.mols where m@>%s',('c12c1nncc2',))
>
> d = curs.fetchall()
>
> df2 = pd.DataFrame(d,columns=('molregno','pkl'))
>
> df2['romol']=df2.apply(lambda x:Chem.Mol(str(x['pkl'])),axis=1)
>
> PandasTools.RenderImagesInAllDataFrames()
> del df2['pkl']
> df2.head(2)
>
> -greg
>
>
>
> On Fri, Oct 25, 2013 at 4:43 PM, George Papadatos wrote:
>
>> Question to rdkit pandas users (pandaskitters?):
>>
>> I managed to have the mol_send(m) object in a pandas frame:
>> [image: Inline images 1]
>> if I do this: data['mol'].map(str).map(Chem.Mol)
>> I get the mol in base64 PNG:
>>
>> [image: Inline images 2]
>>
>> How do I display the column as rendered images (and keep them internally
>> as a Series of rdmols) ?
>>
>> PandasTools.ChangeMoleculeRendering seems relevant but I can't get it to
>> display the mols
>>
>> Cheers,
>>
>> George
>>
>


[Rdkit-discuss] Friday pandas q

2013-10-25 Thread George Papadatos
Question to rdkit pandas users (pandaskitters?):

I managed to have the mol_send(m) object in a pandas frame:
[image: Inline images 1]
if I do this: data['mol'].map(str).map(Chem.Mol)
I get the mol in base64 PNG:

[image: Inline images 2]

How do I display the column as rendered images (and keep them internally as
a Series of rdmols) ?

PandasTools.ChangeMoleculeRendering seems relevant but I can't get it to
display the mols

Cheers,

George


Re: [Rdkit-discuss] rdkit mol objects from sql

2013-10-23 Thread George Papadatos
Yes it does; many thanks!
I've just found the notebook I mentioned:
http://nbviewer.ipython.org/4316426/
(Scroll to bottom)
I prefer Greg's first solution though, as it avoids the conversion from smiles 
completely. 

Best, 

George 

Sent from my gPad

> On 23 Oct 2013, at 20:39, JP  wrote:
> 
> Does the following help you george?
> http://comments.gmane.org/gmane.science.chemistry.rdkit.user/860
> 
> 
> 
>> On 23 October 2013 17:11, George Papadatos  wrote:
>> Hi RDKitters,
>> I must have seen this in an ipython notebook but can't find it right now:
>> If I have a table of rdkit mols generated by the cartridge, is there a way 
>> to retrieve them using a psycopg2 connection within python - ideally inside 
>> a pandas dataframe?
>> 
>> I've got this snippet:
>> import pandas as pd
>> import psycopg2
>> conn = psycopg2.connect("port=5432 user=chembl dbname=chembl_17")
>> data = pd.read_sql(sql, conn)
>> 
>> ...but I'm missing the step where I retrieve rdkit mol objects somehow 
>> instead of smiles. 
>> 
>> Many thanks in advance,
>> George
>> 
>> 


[Rdkit-discuss] rdkit mol objects from sql

2013-10-23 Thread George Papadatos
Hi RDKitters,
I must have seen this in an ipython notebook but can't find it right now:
If I have a table of rdkit mols generated by the cartridge, is there a way
to retrieve them using a psycopg2 connection within python - ideally inside
a pandas dataframe?

I've got this snippet:
import pandas as pd
import psycopg2
conn = psycopg2.connect("port=5432 user=chembl dbname=chembl_17")
data = pd.read_sql(sql, conn)

...but I'm missing the step where I retrieve rdkit mol objects somehow
instead of smiles.

Many thanks in advance,
George


Re: [Rdkit-discuss] Notes from the 2013 UGM

2013-10-23 Thread George Papadatos
Hi all,

I'd also like to thank you all for attending this UGM at the EBI and
contributing to its success.
And, of course, a big thanks to Greg for, well, you know.
:)

See you all next year - or sooner!

George



On 22 October 2013 09:09, Greg Landrum  wrote:

> Hi,
>
> Looks like I'm never going to have time to do a really thorough write up
> of the UGM. In the interests of getting something out there, I guess I will
> do something short.
>
> From my point of view, the UGM was a great success. George did a great job
> of getting everything organized, and everything went very smoothly. We had
> an interesting set of talks, some good questions and discussions during the
> talks, and a couple of very nice social activities at the pub.
>
> The slides and ipython notebooks for many of the talks are available in
> github:
> https://github.com/rdkit/UGM_2013
>
> A few things to note from the talks:
> 1) The code for PDB handling, MMFF94, and Open3DAlign is now all on the
> trunk. It will be in the upcoming release.
> 2) Jameed updated the MMPA code in Contrib; the new version is definitely
> worth checking out, as is Jameed's tutorial on how to use it (part of the
> materials linked to above).
> 3) Jameed (and his employer) also contributed an implementation of the
> Fraggle similarity algorithm described in his talk. The command line tools
> are now in Contrib and the main similarity code is in
> $RDBASE/rdkit/Chem/Fraggle. This will be in the upcoming release.
>
> The roundtable produced a long list of ideas for future features/changes.
> Some of these are already done, the rest will land in github as I manage to
> find time.
>
> We also had a discussion about the frequency of RDKit releases. It seems
> that the quarterly release cycle creates extra work for the community as
> well as me, so we're going to switch to doing releases every six months. If
> a critical bug is found (and fixed!) I'll do a patch release, but new
> features and improvements will only be released twice a year. Anyone who
> wants to stay on the "bleeding edge" can, of course, track the version of
> the code in github. That doesn't get checked in without passing tests on at
> least one platform. If this slower release cycle ends up creating problems,
> we can always go back to three or four times a year.
>
> Many many thanks to everyone for participating; in particular everyone who
> did a presentation or tutorial and George for the organization. I'm already
> looking forward to next year!
>
> -greg
>
>
>
>


Re: [Rdkit-discuss] RDKit UGM 2013 - a few pictures

2013-10-16 Thread George Papadatos
Lovely pics Paul.
Many thanks,
g


On 14 October 2013 16:56, Paul Emsley  wrote:

>
> https://www.dropbox.com/sh/a3s55kmxa37yx7e/vLC5uea1xP
>
> Paul.
>
>
>
>


Re: [Rdkit-discuss] Problem drawing molecules under windows

2013-09-26 Thread George Papadatos
Hi Andrea,
Seems like a font problem to me, which could indicate the lack of
cairo/pango libraries.
George


On 26 September 2013 17:29, Greg Landrum  wrote:

> Hi Andrea,
>
> On Thu, Sep 26, 2013 at 10:59 AM, Andrea Volkamer  wrote:
>
>>
>> I am relatively new to rdkit, and just started using IPython notebook
>> under Windows.
>>
>> I installed WinPython-64bit-2.7.5.3 and RDKit_2013_06_1 as well as
>> Pillow-2.1.0.win-amd64-py2.7 to do so.
>>
>> Anyhow, I have some trouble drawing molecules:
>>
>> For some reason, drawing this molecule Chem.MolFromSmiles('C11CC')
>> works, adding, e.g.,  a nitrogen (Chem.MolFromSmiles('C11CCN')) doesn’t
>> work ()?
>>
>> This happens for many other examples as well. 
>>
>> Any suggestions?
>>
>
> Does it happen exclusively for molecules with heteroatoms?
>
> -greg
>
>
>


Re: [Rdkit-discuss] failed tests for github master version on ubuntu

2013-09-25 Thread George Papadatos
Hi Paolo,
Thanks a lot for the quick response and the tips.
My problem was actually tk and the lack of the $DISPLAY variable.
Now everything has passed.
See you next week!
George
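
For anyone hitting the same failure, a quick pre-flight check can be sketched as below. It only looks at the two things mentioned in this thread, a display for tk and the PIL module. The function name missing_prereqs and the module names passed to it are assumptions for illustration (the tk module is Tkinter on Python 2 and tkinter on Python 3).

```python
import importlib.util
import os


def missing_prereqs(modules, env=None):
    """Return a list of human-readable problems: modules that cannot be
    found and an unset DISPLAY variable."""
    env = os.environ if env is None else env
    problems = []
    for name in modules:
        if importlib.util.find_spec(name) is None:
            problems.append("module not found: " + name)
    if not env.get("DISPLAY"):
        problems.append("DISPLAY is not set (tk needs an X display)")
    return problems


# Module names here are an assumption; adjust to the Python version in use.
for problem in missing_prereqs(["PIL", "tkinter"]):
    print(problem)
```

An empty result does not guarantee that pythonTestDirChem passes; it only rules out the two causes reported here.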



On 25 September 2013 10:36, Paolo Tosco  wrote:

>  Dear George,
>
> that test depends on the PIL Python module, which was not present in my
> Linux distro (Scientific Linux 6); once I installed it, the test ran fine.
>
> Regarding Ubuntu 13.04 and PIL, I just googled this thread:
> https://plus.google.com/112555004333838485342/posts/H8iRnbmdv7a
>
> Maybe this is your case too.
>
> Anyway, I suggest trying
>
> $ cd rdkit/Chem
> $ python test_list.py
>
> and see what is actually failing.
>
> HTH,
> Paolo
>
>
> On 09/25/2013 11:27 AM, George Papadatos wrote:
>
> Hello again,
> Sorry for the false alarm - that was me messing up with the env variables
> and not having enough coffee to realise it earlier!
> However, there is still one fail:
>  *99% tests passed, 1 tests failed out of 79*
>
>  Total Test time (real) =  88.37 sec
>
>  The following tests FAILED:
>  79 - pythonTestDirChem (Failed)
> Errors while running CTest
>
>  Do you have any ideas why that might be? Is it safe to ignore it?
>
>  George
>
>
>
> On 25 September 2013 10:04, George Papadatos  wrote:
>
>> Hi there,
>> I tried to install RDKit on a fresh Ubuntu 13.04 VM today.
>> I checked out the source from GitHub master but I got the following
>> errors after ctest:
>>  65% tests passed, 28 tests failed out of 79
>>
>>  Total Test time (real) =  23.19 sec
>>
>>  The following tests FAILED:
>>   4 - pyBV (Failed)
>>   5 - pyDiscreteValueVect (Failed)
>>   6 - pySparseIntVect (Failed)
>>   9 - testPyGeometry (Failed)
>>  12 - pyAlignment (Failed)
>>  17 - pyDistGeom (Failed)
>>  27 - pyDepictor (Failed)
>>  37 - pyChemReactions (Failed)
>>  42 - pyFragCatalog (Failed)
>>  44 - pyMolDescriptors (Failed)
>>  46 - pyPartialCharges (Failed)
>>  48 - pyMolTransforms (Failed)
>>  51 - pyForceFieldHelpers (Failed)
>>  53 - pyDistGeom (Failed)
>>  55 - pyMolAlign (Failed)
>>  57 - pyChemicalFeatures (Failed)
>>  59 - pyShapeHelpers (Failed)
>>  61 - pyMolCatalog (Failed)
>>  63 - pySLNParse (Failed)
>>  64 - pyGraphMolWrap (Failed)
>>  65 - pyTestConformerWrap (Failed)
>>  68 - pyMatCalc (Failed)
>>  69 - pyCMIM (Failed)
>>  70 - pyRanker (Failed)
>>  72 - pyFeatures (Failed)
>>  73 - pythonTestDbCLI (Failed)
>>  74 - pythonTestDirML (Failed)
>>  79 - pythonTestDirChem (Failed)
>> Errors while running CTest
>>
>>  Any ideas why?
>>
>>  Thanks in advance,
>>
>>  George
>>
>>
>
>
>
>
>
> --
> ==
> Paolo Tosco, Ph.D.
> Department of Drug Science and Technology
> Via Pietro Giuria, 9 - 10125 Torino (Italy)
> Tel: +39 011 670 7680 | Mob: +39 348 5537206
> Fax: +39 011 670 7687 | E-mail: paolo.tosco@unito.it
> http://open3dqsar.org | http://open3dalign.org
> ==
>
>
>


Re: [Rdkit-discuss] failed tests for github master version on ubuntu

2013-09-25 Thread George Papadatos
Hello again,
Sorry for the false alarm - that was me messing up with the env variables
and not having enough coffee to realise it earlier!
However, there is still one fail:
*99% tests passed, 1 tests failed out of 79*

Total Test time (real) =  88.37 sec

The following tests FAILED:
 79 - pythonTestDirChem (Failed)
Errors while running CTest

Do you have any ideas why that might be? Is it safe to ignore it?

George



On 25 September 2013 10:04, George Papadatos  wrote:

> Hi there,
> I tried to install RDKit on a fresh Ubuntu 13.04 VM today.
> I checked out the source from GitHub master but I got the following
> errors after ctest:
> 65% tests passed, 28 tests failed out of 79
>
> Total Test time (real) =  23.19 sec
>
> The following tests FAILED:
>   4 - pyBV (Failed)
>   5 - pyDiscreteValueVect (Failed)
>   6 - pySparseIntVect (Failed)
>   9 - testPyGeometry (Failed)
>  12 - pyAlignment (Failed)
>  17 - pyDistGeom (Failed)
>  27 - pyDepictor (Failed)
>  37 - pyChemReactions (Failed)
>  42 - pyFragCatalog (Failed)
>  44 - pyMolDescriptors (Failed)
>  46 - pyPartialCharges (Failed)
>  48 - pyMolTransforms (Failed)
>  51 - pyForceFieldHelpers (Failed)
>  53 - pyDistGeom (Failed)
>  55 - pyMolAlign (Failed)
>  57 - pyChemicalFeatures (Failed)
>  59 - pyShapeHelpers (Failed)
>  61 - pyMolCatalog (Failed)
>  63 - pySLNParse (Failed)
>  64 - pyGraphMolWrap (Failed)
>  65 - pyTestConformerWrap (Failed)
>  68 - pyMatCalc (Failed)
>  69 - pyCMIM (Failed)
>  70 - pyRanker (Failed)
>  72 - pyFeatures (Failed)
>  73 - pythonTestDbCLI (Failed)
>  74 - pythonTestDirML (Failed)
>  79 - pythonTestDirChem (Failed)
> Errors while running CTest
>
> Any ideas why?
>
> Thanks in advance,
>
> George
>
>


[Rdkit-discuss] failed tests for github master version on ubuntu

2013-09-25 Thread George Papadatos
Hi there,
I tried to install RDKit on a fresh Ubuntu 13.04 VM today.
I checked out the source from GitHub master but I got the following errors
after ctest:
65% tests passed, 28 tests failed out of 79

Total Test time (real) =  23.19 sec

The following tests FAILED:
  4 - pyBV (Failed)
  5 - pyDiscreteValueVect (Failed)
  6 - pySparseIntVect (Failed)
  9 - testPyGeometry (Failed)
 12 - pyAlignment (Failed)
 17 - pyDistGeom (Failed)
 27 - pyDepictor (Failed)
 37 - pyChemReactions (Failed)
 42 - pyFragCatalog (Failed)
 44 - pyMolDescriptors (Failed)
 46 - pyPartialCharges (Failed)
 48 - pyMolTransforms (Failed)
 51 - pyForceFieldHelpers (Failed)
 53 - pyDistGeom (Failed)
 55 - pyMolAlign (Failed)
 57 - pyChemicalFeatures (Failed)
 59 - pyShapeHelpers (Failed)
 61 - pyMolCatalog (Failed)
 63 - pySLNParse (Failed)
 64 - pyGraphMolWrap (Failed)
 65 - pyTestConformerWrap (Failed)
 68 - pyMatCalc (Failed)
 69 - pyCMIM (Failed)
 70 - pyRanker (Failed)
 72 - pyFeatures (Failed)
 73 - pythonTestDbCLI (Failed)
 74 - pythonTestDirML (Failed)
 79 - pythonTestDirChem (Failed)
Errors while running CTest

Any ideas why?

Thanks in advance,

George


Re: [Rdkit-discuss] name generator

2013-08-27 Thread George Papadatos
OPSIN wouldn't help very much here, as it deals with the inverse problem, i.e. 
name to structure. 

George. 


Sent from my giPhone

On 27 Aug 2013, at 19:40, Vladimir Chupakhin  wrote:

> Hi,
> 
> did you try http://opsin.ch.cam.ac.uk/ ?
> 
> Vladimir Chupakhin
> 
> 
> 
> On Tue, Aug 27, 2013 at 6:48 PM, Markus Hartenfeller 
>  wrote:
>> Hi Sergio,
>> 
>> here is a solution that uses a free web service offered by the NIH. 
>> 
>> It's independent of the rdkit but rather slow. Anyway, if you don't need to 
>> process too many molecules at a time or if time is not the critical factor 
>> maybe it could serve as an intermediate solution:
>> 
>> 
>> # Python 2 code (urllib2, print statements)
>> import urllib2
>> 
>> def smi_to_iupac(smi):
>>     try:
>>         url = 'http://cactus.nci.nih.gov/chemical/structure/' + smi + '/iupac_name'
>>         iupacName = urllib2.urlopen(url).read()
>>         #print iupacName
>>         return iupacName
>>     except urllib2.HTTPError, e:
>>         print "HTTP error: %d" % e.code
>>         return None
>>     except urllib2.URLError, e:
>>         print "Network error: %s" % e.reason.args[1]
>>         return None
>>     except:
>>         print "conversion failed for smiles "+ smi
>>         return None
>> 
>> smiles = ["CC(O)C", "CC(=O)O",
>>           "O=C2OCC(=C2\c1ccccc1)\c3ccc(cc3)S(=O)(=O)C"]
>> 
>> for s in smiles:
>>     print smi_to_iupac(s)
>> 
>> 
>> returns
>> Propan-2-ol
>> acetic acid
>> 4-(4-methylsulfonylphenyl)-3-phenyl-5H-furan-2-one
>> 
>> By the way, this service offers conversions between many different 
>> molecule formats/identifiers. I have used it in the past for CAS number 
>> look-up.
>> 
>> Best,
>> Markus
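
For readers on Python 3, the same idea might look roughly like the sketch below. It is an adaptation, not Markus's code: the helper build_iupac_url, the timeout value and the percent-encoding of the SMILES are assumptions added here, on the expectation that characters such as '#', '/' and '\' need escaping inside a URL path. The service URL is the one given in the message above.

```python
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "http://cactus.nci.nih.gov/chemical/structure/"


def build_iupac_url(smiles):
    """Build the resolver URL, percent-encoding characters such as '#' and '/'."""
    return BASE_URL + urllib.parse.quote(smiles, safe="") + "/iupac_name"


def smi_to_iupac(smiles, timeout=10):
    """Return the name reported by the service, or None if the lookup fails."""
    try:
        with urllib.request.urlopen(build_iupac_url(smiles), timeout=timeout) as resp:
            return resp.read().decode("utf-8").strip()
    except urllib.error.HTTPError as err:  # e.g. 404 when no name is found
        print("HTTP error: %d" % err.code)
    except urllib.error.URLError as err:   # DNS failure, refused connection, ...
        print("Network error: %s" % err.reason)
    return None
```

Calling smi_to_iupac("CC(C)O") would then query the service once; as noted above, this is slow for large sets of molecules.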
>> 
>> 
>> On 08/27/2013 05:21 PM, Sergio Martinez Cuesta wrote:
>>> 
>>> Hi,
>>> 
>>> is there any IUPAC name generator in RDKit?
>>> 
>>> e.g. for transforming "CC(C)O" into "propan-2-ol" ?
>>> 
>>> Many thanks
>>> Sergio
>>> 
>>> 
>> 
>> 


Re: [Rdkit-discuss] name generator

2013-08-27 Thread George Papadatos
I think this is not an actual structure-to-name converter but a look-up service
based on a predefined dictionary.
If this is true, then it won't return anything for novel/unseen structures.
Give it a try and let us know.

George. 

Sent from my giPhone

On 27 Aug 2013, at 18:39, David Hall  wrote:

> Not sure what software is behind it, but the NCI's Chemical Identifier 
> Resolver may suit your needs.
> 
> For your example, the URL:
> 
> http://cactus.nci.nih.gov/chemical/structure/CC(C)O/iupac_name
> 
> returns Propan-2-ol
> 
> -David
> 
> On Aug 27, 2013, at 11:54 AM, Sergio Martinez Cuesta  
> wrote:
> 
>> thanks Greg,
>> 
>> indeed, I only found commercial software for it
>> 
>> http://www.chemaxon.com/marvin/help/applications/molconvert.html
>> 
>> cheers
>> Sergio
>> 
>> 
>> On 27 August 2013 16:45, Greg Landrum  wrote:
>>> Dear Sergio,
>>> 
>>> 
>>> On Tue, Aug 27, 2013 at 5:21 PM, Sergio Martinez Cuesta 
>>>  wrote:
>>>> is there any IUPAC name generator in RDKit?
>>>> 
>>>> e.g. for transforming "CC(C)O" into "propan-2-ol" ?
>>> 
>>> There is not. In fact, I'm not aware of any open source structure->name 
>>> converters.
>>> 
>>> -greg
>> 


Re: [Rdkit-discuss] MMP analysis - active vs. inactive compounds

2013-05-03 Thread George Papadatos
Hi Paul,

I guess you first have to generate the list of MMPs with Jameed's code,
then join your property values for MolID1 and MolID2, and finally
calculate the property difference/ratio for each MMP.
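
The three steps above can be sketched in plain Python. This is only an illustration under assumptions: the pair list, its (MolID1, MolID2, transformation) layout, the property values and the function name annotate_pairs are all made up here and are not the output format of Jameed's scripts.

```python
def annotate_pairs(pairs, prop):
    """Attach the property values of both molecules to each pair and compute
    the difference (mol2 - mol1) and the ratio (mol2 / mol1)."""
    rows = []
    for mol_id1, mol_id2, transform in pairs:
        if mol_id1 not in prop or mol_id2 not in prop:
            continue  # no measurement for one of the two molecules
        p1, p2 = prop[mol_id1], prop[mol_id2]
        rows.append({
            "pair": (mol_id1, mol_id2),
            "transform": transform,
            "delta": p2 - p1,
            "ratio": p2 / p1 if p1 != 0 else None,
        })
    return rows


# Hypothetical input: matched pairs plus one activity value per compound ID.
pairs = [("m1", "m2", "[*:1]H>>[*:1]F"), ("m1", "m3", "[*:1]H>>[*:1]Cl")]
activity = {"m1": 5.0, "m2": 6.5}

rows = annotate_pairs(pairs, activity)
for row in rows:
    print(row)
```

For the active vs. inactive case, one option would be to use a 0/1 label as the property and keep only the pairs whose delta is non-zero.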

Best regards,

George


On 3 May 2013 12:10,  wrote:

> Dear RDKitters,
>
> has anyone applied Jameed's great code to the following scenario:
> -> Perform a MMP analysis with respect to a particular property (e.g.
> activity)
>
> Given the current code, I do not see any chance to consider any property
> besides the compound ID.
>
> It is also not possible to provide 2 files (one for the active compounds,
> one for the inactive compounds) - or am I wrong?
>
>
> Cheers & Thanks,
> Paul
>
>
>


Re: [Rdkit-discuss] Cartridge problems (again) Ubuntu 64-bit

2013-03-21 Thread George Papadatos
Ah, many thanks for the clarification Greg.
Are these changes related to the problematic phenanthrene substructure
query?
When is the new release scheduled for?

Cheers,

George
EMBL-EBI



On 21 March 2013 17:58, greg landrum  wrote:

> These are due to some ongoing changes in the rdkit fingerprint. Don't
> worry about them.
> I will fix those tests after the fingerprint changes settle down,
> definitely before the next release.
>
> -greg
>
> On Mar 21, 2013, at 1:22 PM, George Papadatos 
> wrote:
>
> Hi RDKitters,
>
> So I've successfully installed RDKit from the *svn trunk* on a brand new
> Ubuntu Server 12.10 64-bit VM.
> All 77/77 tests passed. Yay.
>
> When I tried to build the cartridge against psql 9.1.8, 4/8 tests failed:
>
> ## Build RDKit Cartridge
> cd $RDBASE/Code/PgSQL/rdkit
> make
> sudo make install
> make installcheck
>
> == dropping database "contrib_regression" ==
> NOTICE:  database "contrib_regression" does not exist, skipping
> DROP DATABASE
> == creating database "contrib_regression" ==
> CREATE DATABASE
> ALTER DATABASE
> == running regression test queries==
> test rdkit-91 ... FAILED
> test props... ok
> test btree... FAILED
> test molgist  ... ok
> test bfpgist-91   ... FAILED
> test sfpgist  ... ok
> test slfpgist ... ok
> test fps  ... FAILED
>
> ==
>  4 of 8 tests failed.
> ==
>
> However, the following works:
>
> createdb test
> psql test
>
> psql (9.1.8)
> Type "help" for help.
>
> test=# create extension "rdkit";
> CREATE EXTENSION
> test=# show rdkit.tanimoto_threshold;
>  rdkit.tanimoto_threshold
> --------------------------
>  0.5
> (1 row)
>
> test=# select 'c1ccccc1O'::mol;
>     mol
> -----------
>  Oc1ccccc1
> (1 row)
>
> Any ideas?
>
> Many thanks in advance,
>
> George
> EMBL-EBI
>
>
>
>


[Rdkit-discuss] Cartridge problems (again) Ubuntu 64-bit

2013-03-21 Thread George Papadatos
Hi RDKitters,

So I've successfully installed RDKit from the *svn trunk* on a brand new
Ubuntu Server 12.10 64-bit VM.
All 77/77 tests passed. Yay.

When I tried to build the cartridge against psql 9.1.8, 4/8 tests failed:

## Build RDKit Cartridge
cd $RDBASE/Code/PgSQL/rdkit
make
sudo make install
make installcheck

== dropping database "contrib_regression" ==
NOTICE:  database "contrib_regression" does not exist, skipping
DROP DATABASE
== creating database "contrib_regression" ==
CREATE DATABASE
ALTER DATABASE
== running regression test queries==
test rdkit-91 ... FAILED
test props... ok
test btree... FAILED
test molgist  ... ok
test bfpgist-91   ... FAILED
test sfpgist  ... ok
test slfpgist ... ok
test fps  ... FAILED

==
 4 of 8 tests failed.
==

However, the following works:

createdb test
psql test

psql (9.1.8)
Type "help" for help.

test=# create extension "rdkit";
CREATE EXTENSION
test=# show rdkit.tanimoto_threshold;
 rdkit.tanimoto_threshold
--
 0.5
(1 row)

test=# select 'c1c1O'::mol;
mol
---
 Oc1c1
(1 row)

Any ideas?

Many thanks in advance,

George
EMBL-EBI


Re: [Rdkit-discuss] windows binary installation

2013-02-22 Thread George Papadatos
Missing % after RDBASE?
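
This can be checked from Python before touching the system settings. A rough sketch, assuming only that every %VAR% reference on Windows needs a closing '%', so an entry with an odd number of '%' signs was never expanded. The helper name entries_with_unclosed_percent is made up for illustration:

```python
def entries_with_unclosed_percent(value, sep=";"):
    """Return the entries of a Windows-style path list that contain an
    odd number of '%' characters, i.e. a %VAR reference with no closing '%'."""
    return [entry for entry in value.split(sep) if entry.count("%") % 2 == 1]


# Values as posted in the quoted message below.
pythonpath = r"C:\Python27;%RDBASE"
path = r"C:\Python27;%RDBASE%\lib"

bad_pythonpath = entries_with_unclosed_percent(pythonpath)
bad_path = entries_with_unclosed_percent(path)
print(bad_pythonpath)  # entries that look unexpanded
print(bad_path)
```

On a real machine the strings would come from os.environ.get("PYTHONPATH", "") and os.environ.get("PATH", "") rather than being typed in.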

Sent from my giPhone

On 22 Feb 2013, at 07:41, paul.czodrow...@merckgroup.com wrote:

> ImportError: No module named rdkit 
> 
> 
> RDBASE      C:\RDKit_2012_12_1
> PYTHONPATH  C:\Python27;%RDBASE
> PATH        C:\Python27;%RDBASE%\lib
> 
> 
> 
> cheers & thanks, 
> paul 
> 
> > Paul, 
> > 
> > What's the failure? 
> > 
> > As a reminder, you need three things set: 
> > RDBASE 
> > PYTHONPATH should include RDBASE 
> > PATH should include RDBASE/lib 
> > 
> > Also: to see the contents of the PYTHONPATH, you should do: 
> > import sys 
> > print sys.path 
> > 
> > -greg 


Re: [Rdkit-discuss] problems with RDKit and Mountain Lion

2012-10-10 Thread George Papadatos
Hello again,

Success at last! I managed to build rdkit using brew and boost 1.49.
I think the cause of the problem was a strange combination of Mountain
Lion, boost 1.51 and an out-of-date rdkit svn checkout in Homebrew.

So to summarise:

brew update
brew uninstall boost
brew versions boost
cd /usr/local
git checkout e40bc41 /usr/local/Library/Formula/boost.rb #version 1.49.0
cd

brew install boost --build-from-source

brew untap edc/homebrew-rdkit
brew tap edc/homebrew-rdkit
brew uninstall rdkit
brew install --HEAD rdkit

Thanks for all the tips,

George



On 10 October 2012 12:32, George Papadatos  wrote:

> Hi Greg,
>
> I built boost 1.49 from source and tried again. There is now a similar
> error but elsewhere:
>
> [ 44%] Building CXX object
> Code/GraphMol/SmilesParse/CMakeFiles/SmilesParse.dir/SmartsWrite.cpp.o
> Linking CXX shared library ../../../lib/libSmilesParse.dylib
> ld: warning: path
> '//Library/Frameworks/Python.framework/Versions/2.7/Python' following -L
> not a directory
> Undefined symbols for architecture x86_64:
>   "yysmarts_parse(char const*, std::vector std::allocator >*, void*)", referenced from:
>   RDKit::(anonymous namespace)::smarts_parse(std::string const&,
> std::vector >&) in
> SmilesParse.cpp.o
>   "yysmiles_parse(char const*, std::vector std::allocator >*, std::list std::allocator >*, void*)", referenced from:
>   RDKit::(anonymous namespace)::smiles_parse(std::string const&,
> std::vector >&) in
> SmilesParse.cpp.o
>   "yysmarts_lex_init(void**)", referenced from:
>   RDKit::(anonymous namespace)::smarts_parse(std::string const&,
> std::vector >&) in
> SmilesParse.cpp.o
>   "yysmiles_lex_init(void**)", referenced from:
>   RDKit::(anonymous namespace)::smiles_parse(std::string const&,
> std::vector >&) in
> SmilesParse.cpp.o
>   "setup_smarts_string(std::string const&, void*)", referenced from:
>   RDKit::(anonymous namespace)::smarts_parse(std::string const&,
> std::vector >&) in
> SmilesParse.cpp.o
>   "setup_smiles_string(std::string const&, void*)", referenced from:
>   RDKit::(anonymous namespace)::smiles_parse(std::string const&,
> std::vector >&) in
> SmilesParse.cpp.o
>   "yysmarts_lex_destroy(void*)", referenced from:
>   RDKit::(anonymous namespace)::smarts_parse(std::string const&,
> std::vector >&) in
> SmilesParse.cpp.o
>   "yysmiles_lex_destroy(void*)", referenced from:
>   RDKit::(anonymous namespace)::smiles_parse(std::string const&,
> std::vector >&) in
> SmilesParse.cpp.o
>   "_yysmarts_debug", referenced from:
>   RDKit::SmartsToMol(std::string, int, bool, std::map std::string, std::less, std::allocator const, std::string> > >*) in SmilesParse.cpp.o
>   "_yysmiles_debug", referenced from:
>   RDKit::SmilesToMol(std::string, int, bool, std::map std::string, std::less, std::allocator const, std::string> > >*) in SmilesParse.cpp.o
> ld: symbol(s) not found for architecture x86_64
> clang: error: linker command failed with exit code 1 (use -v to see
> invocation)
> make[2]: *** [lib/libSmilesParse.2012.09.1beta.dylib] Error 1
> make[1]: *** [Code/GraphMol/SmilesParse/CMakeFiles/SmilesParse.dir/all]
> Error 2
> make: *** [all] Error 2
>
>
> On 10 October 2012 02:35, Greg Landrum  wrote:
>
>> Just an FYI, not sure if it's relevant or not: I have not yet done an
>> rdkit build with boost 1.51, so I am not sure that the problem isn't there
>>
>> -greg
>>
>>
>>
>> On Tuesday, October 9, 2012, George Papadatos wrote:
>>
>>> Hi James,
>>>
>>> You're right. I checked out the true HEAD which is 2234 but it still
>>> failed!
>>>  This is the make log:
>>>
>>> MS-Verdun:build georgep$ cmake -D PYTHON_LIBRARY=/${PYTHON_ROOT}/Python
>>> -DPYTHON_INCLUDE_DIR=${PYTHON_ROOT}/Headers .. 2>&1 | tee cmake.log
>>> -- The C compiler identification is GNU 4.2.1
>>> -- The CXX compiler identification is Clang 4.1.0
>>> -- Checking whether C compiler has -isysroot
>>> -- Checking whether C compiler has -isysroot - yes
>>> -- Checking whether C compiler supports OSX deployment target flag
>>> -- Checking whether C compiler supports OSX deployment target flag - yes
>>> -- Check for working C compiler: /usr/bin/gcc
>>> -- Check for working C compiler: /usr/bin/gcc -- works
>>> -- Detecting C compiler ABI info
>>> -- Detecting C compiler ABI info - done
>>> -- Check for working CXX compiler: /us

Re: [Rdkit-discuss] problems with RDKit and Mountain Lion

2012-10-10 Thread George Papadatos
Hi Greg,

I built boost 1.49 from source and tried again. There is now a similar
error but elsewhere:

[ 44%] Building CXX object
Code/GraphMol/SmilesParse/CMakeFiles/SmilesParse.dir/SmartsWrite.cpp.o
Linking CXX shared library ../../../lib/libSmilesParse.dylib
ld: warning: path
'//Library/Frameworks/Python.framework/Versions/2.7/Python' following -L
not a directory
Undefined symbols for architecture x86_64:
  "yysmarts_parse(char const*, std::vector >*, void*)", referenced from:
  RDKit::(anonymous namespace)::smarts_parse(std::string const&,
std::vector >&) in
SmilesParse.cpp.o
  "yysmiles_parse(char const*, std::vector >*, std::list >*, void*)", referenced from:
  RDKit::(anonymous namespace)::smiles_parse(std::string const&,
std::vector >&) in
SmilesParse.cpp.o
  "yysmarts_lex_init(void**)", referenced from:
  RDKit::(anonymous namespace)::smarts_parse(std::string const&,
std::vector >&) in
SmilesParse.cpp.o
  "yysmiles_lex_init(void**)", referenced from:
  RDKit::(anonymous namespace)::smiles_parse(std::string const&,
std::vector >&) in
SmilesParse.cpp.o
  "setup_smarts_string(std::string const&, void*)", referenced from:
  RDKit::(anonymous namespace)::smarts_parse(std::string const&,
std::vector >&) in
SmilesParse.cpp.o
  "setup_smiles_string(std::string const&, void*)", referenced from:
  RDKit::(anonymous namespace)::smiles_parse(std::string const&,
std::vector >&) in
SmilesParse.cpp.o
  "yysmarts_lex_destroy(void*)", referenced from:
  RDKit::(anonymous namespace)::smarts_parse(std::string const&,
std::vector >&) in
SmilesParse.cpp.o
  "yysmiles_lex_destroy(void*)", referenced from:
  RDKit::(anonymous namespace)::smiles_parse(std::string const&,
std::vector >&) in
SmilesParse.cpp.o
  "_yysmarts_debug", referenced from:
  RDKit::SmartsToMol(std::string, int, bool, std::map, std::allocator > >*) in SmilesParse.cpp.o
  "_yysmiles_debug", referenced from:
  RDKit::SmilesToMol(std::string, int, bool, std::map, std::allocator > >*) in SmilesParse.cpp.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see
invocation)
make[2]: *** [lib/libSmilesParse.2012.09.1beta.dylib] Error 1
make[1]: *** [Code/GraphMol/SmilesParse/CMakeFiles/SmilesParse.dir/all]
Error 2
make: *** [all] Error 2


On 10 October 2012 02:35, Greg Landrum  wrote:

> Just an FYI, not sure if it's relevant or not: I have not yet done an
> rdkit build with boost 1.51, so I am not sure that the problem isn't there
>
> -greg
>
>
>
> On Tuesday, October 9, 2012, George Papadatos wrote:
>
>> Hi James,
>>
>> You're right. I checked out the true HEAD which is 2234 but it still
>> failed!
>>  This is the make log:
>>
>> MS-Verdun:build georgep$ cmake -D PYTHON_LIBRARY=/${PYTHON_ROOT}/Python
>> -DPYTHON_INCLUDE_DIR=${PYTHON_ROOT}/Headers .. 2>&1 | tee cmake.log
>> -- The C compiler identification is GNU 4.2.1
>> -- The CXX compiler identification is Clang 4.1.0
>> -- Checking whether C compiler has -isysroot
>> -- Checking whether C compiler has -isysroot - yes
>> -- Checking whether C compiler supports OSX deployment target flag
>> -- Checking whether C compiler supports OSX deployment target flag - yes
>> -- Check for working C compiler: /usr/bin/gcc
>> -- Check for working C compiler: /usr/bin/gcc -- works
>> -- Detecting C compiler ABI info
>> -- Detecting C compiler ABI info - done
>> -- Check for working CXX compiler: /usr/bin/c++
>> -- Check for working CXX compiler: /usr/bin/c++ -- works
>> -- Detecting CXX compiler ABI info
>> -- Detecting CXX compiler ABI info - done
>> -- Check if the system is big endian
>> -- Searching 16 bit integer
>> -- Looking for sys/types.h
>> -- Looking for sys/types.h - found
>> -- Looking for stdint.h
>> -- Looking for stdint.h - found
>> -- Looking for stddef.h
>> -- Looking for stddef.h - found
>> -- Check size of unsigned short
>> -- Check size of unsigned short - done
>> -- Using unsigned short
>> -- Check if the system is big endian - little endian
>> -- Found PythonLibs:
>> //Library/Frameworks/Python.framework/Versions/2.7/Python (found version
>> "2.7.3")
>> -- Found PythonInterp:
>> /Library/Frameworks/Python.framework/Versions/2.7/bin/python (found version
>> "2.7.3")
>> -- Boost version: 1.51.0
>> -- Found the following Boost libraries:
>> --   python
>> -- Found BISON: /usr/bin/bison
>> -- Found FLEX: /usr/bin/flex
>> -- Looking for 

Re: [Rdkit-discuss] problems with RDKit and Mountain Lion

2012-10-09 Thread George Papadatos
Hi James,

You're right. I checked out the true HEAD which is 2234 but it still
failed!
 This is the make log:

MS-Verdun:build georgep$ cmake -D PYTHON_LIBRARY=/${PYTHON_ROOT}/Python
-DPYTHON_INCLUDE_DIR=${PYTHON_ROOT}/Headers .. 2>&1 | tee cmake.log
-- The C compiler identification is GNU 4.2.1
-- The CXX compiler identification is Clang 4.1.0
-- Checking whether C compiler has -isysroot
-- Checking whether C compiler has -isysroot - yes
-- Checking whether C compiler supports OSX deployment target flag
-- Checking whether C compiler supports OSX deployment target flag - yes
-- Check for working C compiler: /usr/bin/gcc
-- Check for working C compiler: /usr/bin/gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check if the system is big endian
-- Searching 16 bit integer
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of unsigned short
-- Check size of unsigned short - done
-- Using unsigned short
-- Check if the system is big endian - little endian
-- Found PythonLibs:
//Library/Frameworks/Python.framework/Versions/2.7/Python (found version
"2.7.3")
-- Found PythonInterp:
/Library/Frameworks/Python.framework/Versions/2.7/bin/python (found version
"2.7.3")
-- Boost version: 1.51.0
-- Found the following Boost libraries:
--   python
-- Found BISON: /usr/bin/bison
-- Found FLEX: /usr/bin/flex
-- Looking for include file pthread.h
-- Looking for include file pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - found
-- Found Threads: TRUE
-- Boost version: 1.51.0
-- Found the following Boost libraries:
--   regex
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/georgep/rdkit/rdkit-code/build

*and the error is:*

Linking CXX shared library ../../lib/libGraphMol.dylib
ld: warning: path
'//Library/Frameworks/Python.framework/Versions/2.7/Python' following -L
not a directory
Undefined symbols for architecture x86_64:
  "boost::system::system_category()", referenced from:
  __GLOBAL__I_a in QueryAtom.cpp.o
  __GLOBAL__I_a in QueryBond.cpp.o
  __GLOBAL__I_a in ROMol.cpp.o
  __GLOBAL__I_a in QueryOps.cpp.o
  boost::mutex::mutex() in MolPickler.cpp.o
  __GLOBAL__I_a in MolPickler.cpp.o
  __GLOBAL__I_a in AtomIterators.cpp.o
  ...
  "boost::system::generic_category()", referenced from:
  __GLOBAL__I_a in QueryAtom.cpp.o
  __GLOBAL__I_a in QueryBond.cpp.o
  __GLOBAL__I_a in ROMol.cpp.o
  __GLOBAL__I_a in QueryOps.cpp.o
  __GLOBAL__I_a in MolPickler.cpp.o
  __GLOBAL__I_a in AtomIterators.cpp.o
  __GLOBAL__I_a in AddHs.cpp.o
  ...
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see
invocation)
make[2]: *** [lib/libGraphMol.2012.09.1beta.dylib] Error 1
make[1]: *** [Code/GraphMol/CMakeFiles/GraphMol.dir/all] Error 2
make: *** [all] Error 2

Any more ideas?

Regards,

George




On 9 October 2012 22:33, James Swetnam  wrote:

> George-
>
> My templating fix was submitted as 2155, and HEAD in SVN is at 2234.  I'm
> not terribly familiar with homebrew, or why it thinks 2148 is HEAD
>
> James
>
>
> On Tue, Oct 9, 2012 at 2:27 PM, George Papadatos wrote:
>
>> Hi James,
>>
>> Many thanks for the quick answer. I'm afraid I'm already using the trunk:
>> brew install -v --HEAD rdkit --with-inchi
>> (revision 2148)
>>
>> Regards,
>>
>> George
>>
>>
>> On 9 October 2012 21:30, James Swetnam  wrote:
>>
>>> George-
>>>
>>> I believe you're running into an issue that was raised on the developer
>>> list.  I submitted a patch for this issue, which has been applied in the
>>> SVN trunk.  If you install from trunk you should be fine.
>>>
>>> Best
>>> James
>>>
>>> On Tue, Oct 9, 2012 at 12:52 PM, George Papadatos 
>>> wrote:
>>>
>>>> Hi RDKitters,
>>>>
>>>> I get compilation errors when I try to build RDKit on a new Mountain
>>>> Lion Mac OS machine.
>>>> I've tried both Eddie's brew formula and manual installation with
>>>> cmake. I also tried both the beta 2012_09 versions and the 2012_06 one.
>>>> Apart from the system python, I use the python.org version (2.7.3)
>>>> I also used brew to build boost from source. I copied the error I get
>>>> at the bottom of this message.

Re: [Rdkit-discuss] problems with RDKit and Mountain Lion

2012-10-09 Thread George Papadatos
Hi James,

Many thanks for the quick answer. I'm afraid I'm already using the trunk:
brew install -v --HEAD rdkit --with-inchi
(revision 2148)

Regards,

George


On 9 October 2012 21:30, James Swetnam  wrote:

> George-
>
> I believe you're running into an issue that was raised on the developer
> list.  I submitted a patch for this issue, which has been applied in the
> SVN trunk.  If you install from trunk you should be fine.
>
> Best
> James
>
> On Tue, Oct 9, 2012 at 12:52 PM, George Papadatos wrote:
>
>> Hi RDKitters,
>>
>> I get compilation errors when I try to build RDKit on a new Mountain Lion
>> Mac OS machine.
>> I've tried both Eddie's brew formula and manual installation with cmake.
>> I also tried both the beta 2012_09 versions and the 2012_06 one.
>> Apart from the system python, I use the python.org version (2.7.3)
>> I also used brew to build boost from source. I copied the error I get at
>> the bottom of this message.
>>
>> Has anyone had a similar problem? Any ideas for troubleshooting?
>>
>> Many thanks,
>>
>> George
>>
>>
>> Linking CXX shared library ../../lib/libGraphMol.dylib
>> cd /tmp/rdkit-urlC/Code/GraphMol &&
>> /usr/local/Cellar/cmake/2.8.9/bin/cmake -E cmake_link_script
>> CMakeFiles/GraphMol.dir/link.txt --verbose=1
>> /usr/local/Library/ENV/4.3/c++   -shared   -compatibility_version 1.0.0
>> -current_version 2012.9.1 -o ../../lib/libGraphMol.2012.09.1pre.dylib
>> -install_name /tmp/rdkit-urlC/lib/libGraphMol.1.dylib
>> CMakeFiles/GraphMol.dir/Atom.cpp.o CMakeFiles/GraphMol.dir/QueryAtom.cpp.o
>> CMakeFiles/GraphMol.dir/QueryBond.cpp.o CMakeFiles/GraphMol.dir/Bond.cpp.o
>> CMakeFiles/GraphMol.dir/MolOps.cpp.o
>> CMakeFiles/GraphMol.dir/FindRings.cpp.o CMakeFiles/GraphMol.dir/ROMol.cpp.o
>> CMakeFiles/GraphMol.dir/RWMol.cpp.o
>> CMakeFiles/GraphMol.dir/PeriodicTable.cpp.o
>> CMakeFiles/GraphMol.dir/atomic_data.cpp.o
>> CMakeFiles/GraphMol.dir/QueryOps.cpp.o
>> CMakeFiles/GraphMol.dir/MolPickler.cpp.o
>> CMakeFiles/GraphMol.dir/Canon.cpp.o
>> CMakeFiles/GraphMol.dir/AtomIterators.cpp.o
>> CMakeFiles/GraphMol.dir/BondIterators.cpp.o
>> CMakeFiles/GraphMol.dir/Aromaticity.cpp.o
>> CMakeFiles/GraphMol.dir/Kekulize.cpp.o
>> CMakeFiles/GraphMol.dir/MolDiscriminators.cpp.o
>> CMakeFiles/GraphMol.dir/ConjugHybrid.cpp.o
>> CMakeFiles/GraphMol.dir/AddHs.cpp.o CMakeFiles/GraphMol.dir/RankAtoms.cpp.o
>> CMakeFiles/GraphMol.dir/Matrices.cpp.o
>> CMakeFiles/GraphMol.dir/Chirality.cpp.o
>> CMakeFiles/GraphMol.dir/RingInfo.cpp.o
>> CMakeFiles/GraphMol.dir/Conformer.cpp.o
>> -L/System/Library/Frameworks/Python.framework/Versions/2.7/Python
>> ../../lib/libRDGeometryLib.2012.09.1pre.dylib
>> ../../lib/libRDGeneral.2012.09.1pre.dylib
>> ../../lib/libDataStructs.2012.09.1pre.dylib
>> ../../lib/libRDGeneral.2012.09.1pre.dylib
>> Undefined symbols for architecture x86_64:
>>   "boost::any RDKit::Dict::toany
>> >(boost::shared_array) const", referenced from:
>>   void RDKit::Dict::setVal >(std::string
>> const&, boost::shared_array&) in Matrices.cpp.o
>>   "boost::any RDKit::Dict::toany
>> >(boost::shared_array) const", referenced from:
>>   void RDKit::Dict::setVal >(std::string
>> const&, boost::shared_array&) in Matrices.cpp.o
>>   "boost::any RDKit::Dict::toany(std::string) const",
>> referenced from:
>>   void RDKit::Dict::setVal(std::string const&,
>> std::string&) in Chirality.cpp.o
>>   "boost::any RDKit::Dict::toany >
>> >(std::list >) const", referenced from:
>>   void RDKit::Dict::setVal >
>> >(std::string const&, std::list >&) in Canon.cpp.o
>>   "boost::any RDKit::Dict::toany> std::allocator >, std::allocator
>> > > > >(std::vector >,
>> std::allocator > > >) const",
>> referenced from:
>>   void RDKit::Dict::setVal> std::allocator >, std::allocator
>> > > > >(std::string const&, std::vector> std::allocator >, std::allocator
>> > > >&) in FindRings.cpp.o
>>   "boost::any RDKit::Dict::toany> std::allocator > >(std::vector> std::allocator >) const", referenced from:
>>   void RDKit::Dict::setVal> std::allocator > >(std::string const&,
>> std::vector >&) in Atom.cpp.o
>>   void RDKit::Dict::setVal> std::allocator > >(std::string const&,
>> std::vector >&) in FindRings.cpp.o
>>   

[Rdkit-discuss] problems with RDKit and Mountain Lion

2012-10-09 Thread George Papadatos
Hi RDKitters,

I get compilation errors when I try to build RDKit on a new Mountain Lion
Mac OS machine.
I've tried both Eddie's brew formula and manual installation with cmake. I
also tried both the beta 2012_09 versions and the 2012_06 one.
Apart from the system python, I use the python.org version (2.7.3)
I also used brew to build boost from source. I copied the error I get at
the bottom of this message.

Has anyone had a similar problem? Any ideas for troubleshooting?

Many thanks,

George


Linking CXX shared library ../../lib/libGraphMol.dylib
cd /tmp/rdkit-urlC/Code/GraphMol && /usr/local/Cellar/cmake/2.8.9/bin/cmake
-E cmake_link_script CMakeFiles/GraphMol.dir/link.txt --verbose=1
/usr/local/Library/ENV/4.3/c++   -shared   -compatibility_version 1.0.0
-current_version 2012.9.1 -o ../../lib/libGraphMol.2012.09.1pre.dylib
-install_name /tmp/rdkit-urlC/lib/libGraphMol.1.dylib
CMakeFiles/GraphMol.dir/Atom.cpp.o CMakeFiles/GraphMol.dir/QueryAtom.cpp.o
CMakeFiles/GraphMol.dir/QueryBond.cpp.o CMakeFiles/GraphMol.dir/Bond.cpp.o
CMakeFiles/GraphMol.dir/MolOps.cpp.o
CMakeFiles/GraphMol.dir/FindRings.cpp.o CMakeFiles/GraphMol.dir/ROMol.cpp.o
CMakeFiles/GraphMol.dir/RWMol.cpp.o
CMakeFiles/GraphMol.dir/PeriodicTable.cpp.o
CMakeFiles/GraphMol.dir/atomic_data.cpp.o
CMakeFiles/GraphMol.dir/QueryOps.cpp.o
CMakeFiles/GraphMol.dir/MolPickler.cpp.o
CMakeFiles/GraphMol.dir/Canon.cpp.o
CMakeFiles/GraphMol.dir/AtomIterators.cpp.o
CMakeFiles/GraphMol.dir/BondIterators.cpp.o
CMakeFiles/GraphMol.dir/Aromaticity.cpp.o
CMakeFiles/GraphMol.dir/Kekulize.cpp.o
CMakeFiles/GraphMol.dir/MolDiscriminators.cpp.o
CMakeFiles/GraphMol.dir/ConjugHybrid.cpp.o
CMakeFiles/GraphMol.dir/AddHs.cpp.o CMakeFiles/GraphMol.dir/RankAtoms.cpp.o
CMakeFiles/GraphMol.dir/Matrices.cpp.o
CMakeFiles/GraphMol.dir/Chirality.cpp.o
CMakeFiles/GraphMol.dir/RingInfo.cpp.o
CMakeFiles/GraphMol.dir/Conformer.cpp.o
-L/System/Library/Frameworks/Python.framework/Versions/2.7/Python
../../lib/libRDGeometryLib.2012.09.1pre.dylib
../../lib/libRDGeneral.2012.09.1pre.dylib
../../lib/libDataStructs.2012.09.1pre.dylib
../../lib/libRDGeneral.2012.09.1pre.dylib
Undefined symbols for architecture x86_64:
  "boost::any RDKit::Dict::toany
>(boost::shared_array) const", referenced from:
  void RDKit::Dict::setVal >(std::string
const&, boost::shared_array&) in Matrices.cpp.o
  "boost::any RDKit::Dict::toany
>(boost::shared_array) const", referenced from:
  void RDKit::Dict::setVal >(std::string
const&, boost::shared_array&) in Matrices.cpp.o
  "boost::any RDKit::Dict::toany(std::string) const",
referenced from:
  void RDKit::Dict::setVal(std::string const&,
std::string&) in Chirality.cpp.o
  "boost::any RDKit::Dict::toany >
>(std::list >) const", referenced from:
  void RDKit::Dict::setVal >
>(std::string const&, std::list >&) in Canon.cpp.o
  "boost::any RDKit::Dict::toany >, std::allocator
> > > >(std::vector >,
std::allocator > > >) const",
referenced from:
  void RDKit::Dict::setVal >, std::allocator
> > > >(std::string const&, std::vector >, std::allocator
> > >&) in FindRings.cpp.o
  "boost::any RDKit::Dict::toany > >(std::vector >) const", referenced from:
  void RDKit::Dict::setVal > >(std::string const&,
std::vector >&) in Atom.cpp.o
  void RDKit::Dict::setVal > >(std::string const&,
std::vector >&) in FindRings.cpp.o
  void RDKit::Dict::setVal > >(std::string const&,
std::vector >&) in ROMol.cpp.o
  void RDKit::Dict::setVal > >(std::string const&,
std::vector >&) in MolPickler.cpp.o
  void RDKit::Dict::setVal > >(std::string const&,
std::vector >&) in Canon.cpp.o
  void RDKit::Dict::setVal > >(std::string const&,
std::vector >&) in
Aromaticity.cpp.o
  void RDKit::Dict::setVal > >(std::string const&,
std::vector >&) in
MolDiscriminators.cpp.o
  ...
  "boost::any RDKit::Dict::toany >
>(std::vector >) const", referenced from:
  void RDKit::Dict::setVal >
>(std::string const&, std::vector >&) in
Chirality.cpp.o
  "boost::any RDKit::Dict::toany(bool) const", referenced from:
  void RDKit::Dict::setVal(std::string const&, bool&) in
Canon.cpp.o
  void RDKit::Dict::setVal(std::string const&, bool&) in
AddHs.cpp.o
  void RDKit::Dict::setVal(std::string const&, bool&) in
Chirality.cpp.o
  "boost::any RDKit::Dict::toany(double) const", referenced from:
  void RDKit::Dict::setVal(std::string const&, double&) in
MolDiscriminators.cpp.o
  "boost::any RDKit::Dict::toany(int) const", referenced from:
  void RDKit::Dict::setVal(std::string const&, int&) in
MolPickler.cpp.o
  void RDKit::Dict::setVal(std::string const&, int&) in
Aromaticity.cpp.o
  void RDKit::Dict::setVal(std::string const&, int&) in AddHs.cpp.o
  void RDKit::Dict::setVal(std::string const&, int&) in
Chirality.cpp.o
  "boost::any RDKit::Dict::toany(unsigned int) const",
referenced from:
  void RDKit::Dict::setVal(std::string const&, unsigned
int&) in Canon.cpp.o
  "boost::shared_array
RDKit::Dict::fromany >(boost::any const&)
cons

Re: [Rdkit-discuss] parallel conformation generation

2012-10-05 Thread George Papadatos
Hi Andrew,

Thanks for this. I didn't know about the futures and progressbar modules.

You wrote:
---
*I have to use the "zip" because map(f, iterable, [chunksize=None]) only
takes a single iterable. This also means I need to change the
"generateconformations"
function so that it takes a single element as input, which is a 2-element
tuple of the molecule and the count.*
---

For such cases, there is a more elegant and pythonic way: functools.partial
http://docs.python.org/library/functools.html#functools.partial
It just freezes some of the arguments of a function, so you can use map
with a single argument.

In your case:
from functools import partial
newfunc = partial(generateconformations, n=n)
map(newfunc, mols)
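
A minimal, self-contained sketch of the same idea. The generateconformations below is only a hypothetical stand-in that takes (m, n) like the function in the quoted message; the real one would call RDKit to embed the conformers. The keyword passed to partial has to match the parameter name in the function's signature:

```python
from functools import partial


def generateconformations(m, n):
    # Hypothetical stand-in: pretend to generate n conformer ids for molecule m.
    return m, list(range(n))


mols = ["mol_a", "mol_b"]  # placeholders for real molecule objects

# Freeze the second argument so that map only has to supply the molecule.
newfunc = partial(generateconformations, n=3)

# list() is only needed on Python 3, where map returns a lazy iterator.
results = list(map(newfunc, mols))
print(results)
```

The same newfunc can, as far as I know, be handed to a pool's map as well, provided the wrapped function is defined at module level so that it can be pickled.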


Best regards,

George P.



On 4 October 2012 22:47, Andrew Dalke  wrote:

> Hi again,
>
>  Greg asked why I used the concurrent.futures module rather than
> the multiprocessing module which is standard with Python 2.6.
>
>
> There are a few differences in the API which makes the futures
> module more interesting. First off, here's how you could write
> the same process pool part using the existing multiprocessing module:
>
>
> from multiprocessing import Pool
> p = Pool(5)
> for mol, ids in p.map(generateconformations, zip(suppl, [n]*len(suppl))):
>     for id in ids:
>         writer.write(mol, confId=id)
>
> I have to use the "zip" because map(f, iterable, [chunksize=None]) only
> takes a single iterable. This also means I need to change the
> "generateconformations"
> function so that it takes a single element as input, which is a 2-element
> tuple of the molecule and the count. (That is, change from
>
> def generateconformations(m, n):
>   ...
>
> to
>
> def generateconformations((m, n)):
>   ...
>
> ).
>
> That's a touch uglier, but doable.
>
> Now, when I posted the code yesterday, I should have posted the simplest
> version of the code, which is:
>
> with futures.ProcessPoolExecutor(max_workers=max_workers) as executor:
>     for mol, ids in executor.map(generateconformations, suppl,
>                                  [n]*len(suppl)):
>         for id in ids:
>             writer.write(mol, confId=id)
>
>
> Then Greg wouldn't have asked me about how complex my code was. ;)
>
>
> This is the easiest to understand. You can see that this API supports
> multiple iterators. I used [n]*len(suppl) to make a new list containing
> repeats of the count, so I could have the twin iterators of the molecules
> and the count. This is a bit simpler than the multiprocessing code.
>
> In addition, the "with" statement knows how to work with an executor. Here
> it means that all submitted jobs must finish before leaving the with block,
> and the process pool will be shut down; even if there's an exception.
> With the multiprocessing module, you need to manage that yourself, or
> trust in the memory manager.
>
>
> But I yesterday wrote something more like this:
>
>    # Submit a set of asynchronous jobs
>    jobs = []
>    for mol in suppl:
>        if mol:
>            job = executor.submit(generateconformations, mol, n)
>            jobs.append(job)
>
>    # Process the job results (in submission order) and save the conformers.
>    for job in jobs:
>        mol, ids = job.result()
>        for id in ids:
>            writer.write(mol, confId=id)
>
>
> The "submit" immediately returns a 'future' object, which is called a
> "promise" in some other languages. You can ask for its .result() to
> get its result. That call will block (up to a timeout) if the result
> isn't there. You can also check to see if there is a result.
>
> The reason I did this is because I usually 1) show a progress bar
> and 2) have enough memory to store all the results in memory.
>
> I've enjoyed using the 'progressbar' module, from
>  http://pypi.python.org/pypi/progressbar/
>
> I have code which looks like this:
>
>    with futures.ProcessPoolExecutor(max_workers=4) as executor:
>        for (collection, first_id, last_id) in blocks:
>            jobs.append(executor.submit(process_block, tmpdir, config,
>                                        collection, first_id, last_id))
>
>        widgets = ["Fingerprinting ", progressbar.Percentage(), " ",
>                   progressbar.ETA(), " ", progressbar.Bar()]
>        pbar = progressbar.ProgressBar(widgets=widgets, maxval=len(jobs))
>        for job in pbar(futures.as_completed(jobs)):
>            job.result()
>
>
> This submits all of the fingerprinting jobs to the process pool.
> The "futures.as_completed()" function takes an iterable of jobs
> and returns each one as they become available, no matter what the
> order is. Then the ProgressBar sees the new item, updates the
> terminal display to show progress information and an ETA, only
> to return the original object itself as an iterator. Finally,
> I call job.result() in the loop, since .result() will forward
> any exceptions if one had happened during the original call.
>
> Then if I want the results I iterate over them again:
>
>    for job in jobs:
>        ... do something with job.result() ...
>
>
>
> BTW, you don't need to keep things around in m

Re: [Rdkit-discuss] Reading files (SmilesMolSupplier, SDMolSupplier

2012-09-07 Thread George Papadatos
Hi Fabian,

The first one is easy: the function expects a header in the file by default. 
There is a parameter that toggles this but I don't have access to a computer 
right now. There is an example in the documentation. 
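
To illustrate the behaviour without needing RDKit installed, here is a small pure-Python stand-in (read_smiles_records and its title_line flag are made up for this sketch, not RDKit API). It mimics a reader that skips the first line as a header unless told otherwise. As far as I know, the real keyword on SmilesMolSupplier is titleLine, which defaults to True:

```python
def read_smiles_records(lines, title_line=True):
    """Parse 'SMILES name' lines. When title_line is True the first
    non-empty line is treated as a column header and skipped."""
    rows = [ln.split() for ln in lines if ln.strip()]
    if title_line:
        rows = rows[1:]
    return [(smiles, name) for smiles, name in rows]


lines = ["C mola", "CC molb", "CCC molc"]

with_header_skip = read_smiles_records(lines)                       # first record lost
without_header_skip = read_smiles_records(lines, title_line=False)  # all records kept

# With RDKit itself the equivalent call should be:
#   suppl = Chem.SmilesMolSupplier('test.smi', titleLine=False)
print(with_header_skip)
print(without_header_skip)
```

The alternative is to leave the default alone and add a header line (for example "smiles name") at the top of the file.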

Best regards, 

George 

Sent from my gPad

On 7 Sep 2012, at 13:34, Fabian Dey  wrote:

> Hi 
> 
> I found two issues when reading files:
> 
> 1)  I might be getting something wrong here, but it seems as if 
> SmilesMolSupplier misses the very first Smiles:
> 
> input smiles file "test.smi":
> C mola
> CC molb
> CCC molc
>  mold
> 
> 
> # python script
> from rdkit import Chem
> suppl = Chem.SmilesMolSupplier('test.smi')
> 
> # TEST-1: random access to the first molecule
> print("TEST-1 : %s %s" % (Chem.MolToSmiles(suppl[0]), suppl[0].GetProp("_Name")))
> print("")
> 
> # TEST-2: plain iteration over the supplier
> for mol in suppl:
>     print("TEST-2 : %s %s" % (Chem.MolToSmiles(mol), mol.GetProp("_Name")))
> 
> print("")
> 
> # TEST-3: iteration with enumerate
> for i, mol in enumerate(suppl):
>     print("TEST-3 : %s %s" % (Chem.MolToSmiles(mol), mol.GetProp("_Name")))
> 
> 
> #output 
> TEST-1 : CC molb
> 
> TEST-2 : CC molb
> TEST-2 : CCC molc
> TEST-2 :  mold
> 
> TEST-3 : CC molb
> TEST-3 : CCC molc
> TEST-3 :  mold
> 
> The first molecule "mola" is not available through the supplier (also happens 
> with other smiles files).
> 
> 
> 2) SDMolSupplier: I have a script which calculates properties from SD files 
> read in through the corresponding supplier,
> and RDKit occasionally reported the following errors:
> 
> 
> Pre-condition Violation
> Atomic number not found
> Violation occurred on line 56 in file 
> /home/dey/Downloads/RDKit_2012_06_1/Code/GraphMol/PeriodicTable.h
> Failed Expression: atomicNumber 
> 
> [12:25:23] Unexpected error hit on line 6
> [12:25:23] ERROR: moving to the begining of the next molecule
> ERROR for molecule at position 0
> 
> 
> It turned out that for the corresponding SD file the element symbols were 
> written in all capital letters (e.g. CL); once these
> were changed to the proper format (Cl), RDKit read the file without throwing an 
> error. Although I can preprocess the SD files
> with a script, it would be nice if RDKit could handle these cases internally.
> 
> Best
> Fabian
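
The preprocessing step mentioned for the second issue (upper-case element symbols such as CL) could look roughly like the sketch below. It is only an illustration under assumptions: it handles a single V2000 mol block, relies on the fixed-column layout of that format (atom count in the first three columns of the counts line, element symbol in columns 32-34 of each atom line), and the function name and example record are made up.

```python
def normalize_element_case(molblock):
    """Rewrite atom symbols in a V2000 mol block as e.g. 'Cl' instead of 'CL'."""
    lines = molblock.split("\n")
    # The fourth line is the counts line; the atom count is in its first 3 columns.
    num_atoms = int(lines[3][:3])
    for index in range(4, 4 + num_atoms):
        line = lines[index]
        # The element symbol sits in columns 32-34 (0-based slice 31:34).
        symbol = line[31:34].strip()
        lines[index] = line[:31] + symbol.capitalize().ljust(3) + line[34:]
    return "\n".join(lines)


# Hypothetical two-atom record with an upper-case chlorine symbol.
ATOM_LINE = "%10.4f%10.4f%10.4f %-3s 0  0  0  0  0  0  0  0  0  0  0  0"
example = "\n".join([
    "chloromethane",
    "  hypothetical",
    "",
    "  2  1  0  0  0  0  0  0  0  0999 V2000",
    ATOM_LINE % (0.0, 0.0, 0.0, "C"),
    ATOM_LINE % (1.5, 0.0, 0.0, "CL"),
    "  1  2  1  0",
    "M  END",
])

fixed = normalize_element_case(example)
print(fixed)
```

For a multi-record SD file the same function could be applied to each record after splitting on the `$$$$` delimiter lines.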
> 
> 
> 
> 
> --
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and 
> threat landscape has changed and how IT managers can respond. Discussions 
> will include endpoint security, mobile security and the latest in malware 
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Re: [Rdkit-discuss] Faulty valence for nitrogen in aromatic ring

2012-08-16 Thread George Papadatos
Wow, this almost makes me want to rewrite my thesis in LaTeX. Almost!
:)

George


On 15 August 2012 16:26, Greg Landrum  wrote:

> On Wed, Aug 15, 2012 at 5:10 PM, Michael Palmer 
> wrote:
> >
> >> Now that I've at least tried to clear up what is going on, maybe I can
> >> be more helpful: was there a specific question you were trying to
> >> answer that led you to your discovery that the RDKit behaves strangely
> >> in this special case?
> >
> >
> > What I'm trying to do can be inspected here:
> >
> > http://chimpsky.uwaterloo.ca/mol2chemfig/index
> >
> > Briefly, I'm building a program for converting molecular structures from
> > smiles or molfile format to TeX code, using the syntax defined by the
> > chemfig package as the target.
>
> coool!
>
> > rdkit does all the heavy lifting. I was using
> > the GetImplicitHs method to determine how many hydrogens to attach to
> > carbons and heteroatoms and then noticed that the number of hydrogens on
> > nitrogen in rings was off.
> >
> > From your answer, it seems I should be using GetTotalNumHs. However, I would
> > still like to be able to distinguish between hydrogens that were specified
> > in a molfile, with coordinates, and those that weren't.
>
> the answer to this isn't super straightforward, so it probably won't
> come until tomorrow.
>
> >
> > Another question I ran into was accessing the coordinates of an atom, either
> > loaded from molfile or, with smiles, computed with AllChem.Compute2DCoords.
> > Does the atom object have a method to get at those? Right now, I'm using
> > some embarrassing workaround.
>
> This one I can answer quickly. You need the molecule's conformer:
> In [7]: AllChem.Compute2DCoords(m)
> Out[7]: 0
>
> In [8]: conf = m.GetConformer()
>
> In [9]: for atom in m.GetAtoms():
>    ...:     aid = atom.GetIdx()
>    ...:     print aid,list(conf.GetAtomPosition(aid))
>    ...:
> 0 [0.15858546683951269, -1.1294387542967057, 0.0]
> 1 [-1.3046720119188515, -1.4594047386916416, 0.0]
> 2 [-2.3220596761687866, -0.35716958200679838, 0.0]
> 3 [-1.8761898616603592, 1.07503155907298, 0.0]
> 4 [-0.41293238290199596, 1.4049975434679163, 0.0]
> 5 [0.60445528134793969, 0.30276238678307321, 0.0]
> 6 [2.0677127601063026, 0.63272837117800962, 0.0]
> 7 [3.0851004243562379, -0.46950678550683356, 0.0]
>
> -greg
>
>


Re: [Rdkit-discuss] windows binary install

2012-08-11 Thread George Papadatos
Hi Alan,

You're almost there but it seems now that you need to upgrade your numpy from 
1.4 to 1.6. 

Regards,

George P. 

Sent from my gPhone

On 11 Aug 2012, at 21:38, stanley5101  wrote:

> I've gone to this archived message and installed the library mentioned.  It 
> just installed itself rather than asking me where I wanted to put it.  
> However, rdkit seems to be recognising it as I now get a new error (pasted 
> below) .  Do I have to go to an older rdkit which likes numpy version 4? 
>  
>  
> Python 2.7.1 |EPD 7.0-2 (32-bit)| (r271:86832, Dec  2 2010, 10:35:02) [MSC 
> v.1500 32 bit (Intel)] on win32
> Type "copyright", "credits" or "license()" for more information.
> >>> import rdkit
> >>> from rdkit import Chem
> RuntimeError: module compiled against API version 6 but this version of numpy 
> is 4
> RuntimeError: module compiled against API version 6 but this version of numpy 
> is 4
> 
> From: James Davidson 
> To: stanley5...@yahoo.co.uk 
> Cc: rdkit-discuss@lists.sourceforge.net 
> Sent: Saturday, 11 August 2012, 7:58
> Subject: Re: [Rdkit-discuss] windows binary install
> 
> Hi Alan,
>  
> My guess is that your problem is missing DLLs, available in the MS C++ 
> Redistributable package – solution described by George for a very similar 
> problem:  
> http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg02381.html.
>  
> I now tend to explicitly just put a copy of these two DLLs into the RDKit lib 
> folder when installing for others, and I can reproduce your error if I remove 
> one of these DLLs from there on my system.
>  
> Cheers
>  
> James
>  
> 
> 
> 


Re: [Rdkit-discuss] cartridge problems

2012-05-30 Thread George Papadatos
Hi all,

Well, I'm glad to say that everything works like a charm at the first attempt
on an Ubuntu 12.04 *64-bit* virtual machine:

georgep@georgep-VB:~/local/rdkit/rdkit_trunk/Code/PgSQL/rdkit$ make
installcheck
/usr/lib/postgresql/9.1/lib/pgxs/src/makefiles/../../src/test/regress/pg_regress
--inputdir=. --psqldir='/usr/lib/postgresql/9.1/bin'
--dbname=contrib_regression rdkit-91 props btree molgist bfpgist-91 sfpgist
slfpgist fps
(using postmaster on Unix socket, default port)
== dropping database "contrib_regression" ==
NOTICE:  database "contrib_regression" does not exist, skipping
DROP DATABASE
== creating database "contrib_regression" ==
CREATE DATABASE
ALTER DATABASE
== running regression test queries==
test rdkit-91 ... ok
test props... ok
test btree... ok
test molgist  ... ok
test bfpgist-91   ... ok
test sfpgist  ... ok
test slfpgist ... ok
test fps  ... ok

=
 All 8 tests passed.
=


This solves my problem, but the 32-bit mystery remains.

Many thanks to Jan, Adrian and Greg for the troubleshooting ideas.


George


On 30 May 2012 15:50, Greg Landrum  wrote:

> On Wed, May 30, 2012 at 4:13 PM, Jan Holst Jensen 
> wrote:
> > My failing Linux Mint is 32-bit like George's 12.04. Don't know if it is
> > significant but it could be that the problem only occurs on 32-bit.
> >
> > Greg mentioned that he has successfully built and tested on Ubuntu 12.04
> -
> > was that 64-bit or 32-bit ?
>
> 64bit. I'll try it on a 32bit VM tonight.
>
> -greg
>


Re: [Rdkit-discuss] cartridge problems

2012-05-30 Thread George Papadatos
Compare it with this one:

georgep@george-VB:~/local/rdkit/rdkit_trunk/Code/PgSQL/rdkit$ make
installcheck
/usr/lib/postgresql/9.1/lib/pgxs/src/makefiles/../../src/test/regress/pg_regress
--inputdir=. --psqldir='/usr/lib/postgresql/9.1/bin'
--dbname=contrib_regression rdkit-91 props btree molgist bfpgist-91 sfpgist
slfpgist fps
(using postmaster on Unix socket, default port)
== dropping database "contrib_regression" ==
DROP DATABASE
== creating database "contrib_regression" ==
CREATE DATABASE
ALTER DATABASE
== running regression test queries==
test rdkit-91 ... FAILED
test props... FAILED
test btree... FAILED
test molgist  ... FAILED
test bfpgist-91   ... FAILED
test sfpgist  ... FAILED
test slfpgist ... FAILED
test fps  ... FAILED

==
 8 of 8 tests failed.
==

The differences that caused some tests to fail can be viewed in the
file
"/home/georgep/local/rdkit/rdkit_trunk/Code/PgSQL/rdkit/regression.diffs".
 A copy of the test summary that you see
above is saved in the file
"/home/georgep/local/rdkit/rdkit_trunk/Code/PgSQL/rdkit/regression.out".

make: *** [installcheck] Error 1

gpdb=# create extension rdkit with schema rdkit;
FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
The connection to the server was lost. Attempting reset: Succeeded.
gpdb=#


On 30 May 2012 14:49, Adrian Schreyer  wrote:

> 64-bit, PostgreSQL packages are from the official archive. I simply do
> 'make' followed by 'sudo make install' and then
>
> create schema rdkit;
> create extension rdkit with schema rdkit;
>
> and that's it.
>
> $ make installcheck
>
> /usr/lib/postgresql/9.1/lib/pgxs/src/makefiles/../../src/test/regress/pg_regress
> --inputdir=. --psqldir='/usr/lib/postgresql/9.1/bin'
> --dbname=contrib_regression rdkit-91 props btree molgist bfpgist-91
> sfpgist slfpgist fps
> (using postmaster on Unix socket, default port)
> == dropping database "contrib_regression" ==
> DROP DATABASE
> == creating database "contrib_regression" ==
> CREATE DATABASE
> ALTER DATABASE
> == running regression test queries==
> test rdkit-91 ... ok
> test props... ok
> test btree... ok
> test molgist  ... ok
> test bfpgist-91   ... ok
> test sfpgist  ... ok
> test slfpgist ... ok
> test fps  ... ok
>
> =
>  All 8 tests passed.
> =
>
> On Wed, May 30, 2012 at 2:43 PM, Jan Holst Jensen 
> wrote:
> > How odd. Adrian, are you using a 32-bit or 64-bit version of Ubuntu
> 12.04 ?
> >
> > Cheers
> > -- Jan
> >
> >
> > On 2012-05-30 15:26, Adrian Schreyer wrote:
> >>
> >> Yes, I could build and install the cartridge without problems
> >> (Release_2012.03.1) on 12.04.
> >>
> >> On Wed, May 30, 2012 at 2:23 PM, George Papadatos
> >>  wrote:
> >>>
> >>> Hi Jan,
> >>>
> >>> Mine is exactly the same:
> >>> gcc test.c -I/usr/include/postgresql/9.1/server;./a.out
> >>> 90103
> >>>
> >>> So, I am back to square 1!
> >>>
> >>> I am starting to get a bit desperate here, has anyone ever successfully
> >>> built the cartridge from the trunk on a plain Ubuntu 12.04?
> >>>
> >>> Many thanks for your help,
> >>>
> >>> George
> >
> >
>


Re: [Rdkit-discuss] cartridge problems

2012-05-30 Thread George Papadatos
Hi Jan,

Mine is exactly the same:
gcc test.c -I/usr/include/postgresql/9.1/server;./a.out
90103
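
For readers wondering what that number encodes: PG_VERSION_NUM packs the server version into a single integer. For releases before PostgreSQL 10 the layout is major * 10000 + minor * 100 + patch, so 90103 corresponds to 9.1.3 and satisfies the `PG_VERSION_NUM >= 90100` condition in guc.c. A small sketch of the decoding (the function name is made up for illustration, and the scheme shown applies only to pre-10 version numbers):

```python
def decode_pg_version_num(version_num):
    """Split a pre-PostgreSQL-10 PG_VERSION_NUM into (major, minor, patch)."""
    major = version_num // 10000
    minor = (version_num // 100) % 100
    patch = version_num % 100
    return major, minor, patch


version = decode_pg_version_num(90103)
print("%d.%d.%d" % version)

# The check that guc.c makes, expressed on the raw integer.
uses_new_guc_api = 90103 >= 90100
print(uses_new_guc_api)
```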

So, I am back to square 1!

I am starting to get a bit desperate here, has anyone ever successfully
built the cartridge from the trunk on a plain Ubuntu 12.04?

Many thanks for your help,

George


On 30 May 2012 12:48, Jan Holst Jensen  wrote:

> On 2012-05-30 13:24, George Papadatos wrote:
>
>> Thanks to both of you.
>>
>> I do not know how to check PG_VERSION_NUM. I tried editing
>> guc.c by removing the conditional check of the PG_VERSION, but with the same
>> results.
>> Adrian, is this what you meant?
>>
>>  DefineCustomRealVariable(
>>    "rdkit.tanimoto_threshold",
>>    "Lower threshold of Tanimoto similarity",
>>    "Molecules with similarity lower than threshold are not similar by % operation",
>>    &rdkit_tanimoto_smlar_limit,
>>    0.5,
>>    0.0,
>>    1.0,
>>    PGC_USERSET,
>>    0,
>>    (GucRealCheckHook)TanimotoLimitAssign,
>>    NULL,
>>    NULL
>>  );
>>
>>  DefineCustomRealVariable(
>>    "rdkit.dice_threshold",
>>    "Lower threshold of Dice similarity",
>>    "Molecules with similarity lower than threshold are not similar by # operation",
>>    &rdkit_dice_smlar_limit,
>>    0.5,
>>    0.0,
>>    1.0,
>>    PGC_USERSET,
>>    0,
>>    (GucRealCheckHook)DiceLimitAssign,
>>    NULL,
>>    NULL
>>  );
>>
>> Regards,
>>
>> George
>>
>
> Hi George,
>
> I just tried the same on my VM, with no change for the better either. My
> version of guc.c now looks like this:
>
> static void
> initRDKitGUC()
> {
>   if (rdkit_guc_inited)
>     return;
>
>   DefineCustomRealVariable(
>     "rdkit.tanimoto_threshold",
>     "Lower threshold of Tanimoto similarity",
>     "Molecules with similarity lower than threshold are not similar by % operation",
>     &rdkit_tanimoto_smlar_limit,
>     0.5,
>     0.0,
>     1.0,
>     PGC_USERSET,
>     0,
> //if PG_VERSION_NUM >= 90100
>     (GucRealCheckHook)TanimotoLimitAssign,
>     NULL,
> //else
> //   TanimotoLimitAssign,
> //endif
>     NULL
>   );
>
>   DefineCustomRealVariable(
>     "rdkit.dice_threshold",
>     "Lower threshold of Dice similarity",
>     "Molecules with similarity lower than threshold are not similar by # operation",
>     &rdkit_dice_smlar_limit,
>     0.5,
>     0.0,
>     1.0,
>     PGC_USERSET,
>     0,
> //if PG_VERSION_NUM >= 90100
>     (GucRealCheckHook)DiceLimitAssign,
>     NULL,
> //else
> //   DiceLimitAssign,
> //endif
>     NULL
>   );
>
>   rdkit_guc_inited = true;
> }
>
> Did a cartridge "make clean", "make", "sudo make install", and it still
> fails for me with
>
> postgres=# create extension rdkit;
>
> FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
> FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
> The connection to the server was lost. Attempting reset: Succeeded.
> postgres=#
>
> Cheers
> -- Jan
>


Re: [Rdkit-discuss] cartridge problems

2012-05-30 Thread George Papadatos
Thanks to both of you.

I do not know how to check PG_VERSION_NUM. I tried editing guc.c
by removing the conditional check of the PG_VERSION, but with the same
results.
Adrian, is this what you meant?

  DefineCustomRealVariable(
    "rdkit.tanimoto_threshold",
    "Lower threshold of Tanimoto similarity",
    "Molecules with similarity lower than threshold are not similar by % operation",
    &rdkit_tanimoto_smlar_limit,
    0.5,
    0.0,
    1.0,
    PGC_USERSET,
    0,
    (GucRealCheckHook)TanimotoLimitAssign,
    NULL,
    NULL
  );

  DefineCustomRealVariable(
    "rdkit.dice_threshold",
    "Lower threshold of Dice similarity",
    "Molecules with similarity lower than threshold are not similar by # operation",
    &rdkit_dice_smlar_limit,
    0.5,
    0.0,
    1.0,
    PGC_USERSET,
    0,
    (GucRealCheckHook)DiceLimitAssign,
    NULL,
    NULL
  );

Regards,

George

On 30 May 2012 12:00, Jan Holst Jensen  wrote:

>  On 2012-05-30 11:28, George Papadatos wrote:
>
> Hi Jan,
>
>  I followed your advice and I added the new repo, however the problem
> still persists:
>
>  [...]
>
>
>  Again, *all* the tests fail as do the "create extension" attempts.
>
>  I even tried explicit postgresql-9.1 and postgresql-9.2 (beta version)
> but with the same sad results.
>
>  Do I do something wrong here, like still installing the default
> postgresql packages and not the "good" ones?
>
>
> Hi George,
>
> Sorry for leading you on a wild goose chase here, but I wasn't sure
> whether you were using custom postgres packages or the standard Ubuntu
> package. As Adrian Schreyer has pointed out, if you are using the
> Ubuntu-supplied packages then you are already using Martin Pitt's build so
> it actually shouldn't make a difference.
>
> > On 12.04 you do not need to add the PPA, PostgreSQL 9.1 is the
> > official package there. In addition, the official packages are also
> > provided by Martin Pitt, so the packages in the PPA and in the distro
> > are actually the same. It only makes sense on 10.04 (and other releases
> > that do not ship with 9.1).
>
> I would go with Adrian's suggestion to check if the PG_VERSION_NUM is
> somehow mis-reported on your system. In fact, I will see if I still have
> the Linux Mint VM (that also has this behavior) somewhere and check it on
> that system too.
>
> Cheers
> -- Jan
>


Re: [Rdkit-discuss] cartridge problems

2012-05-30 Thread George Papadatos
l/hunspell packages...
  en_au
  en_gb
  en_us
  en_za
Setting up postgresql-9.1 (9.1.3-2) ...
Creating new cluster (configuration: /etc/postgresql/9.1/main, data:
/var/lib/postgresql/9.1/main)...
Moving configuration file /var/lib/postgresql/9.1/main/postgresql.conf to
/etc/postgresql/9.1/main...
Moving configuration file /var/lib/postgresql/9.1/main/pg_hba.conf to
/etc/postgresql/9.1/main...
Moving configuration file /var/lib/postgresql/9.1/main/pg_ident.conf to
/etc/postgresql/9.1/main...
Configuring postgresql.conf to use port 5432...
update-alternatives: using
/usr/share/postgresql/9.1/man/man1/postmaster.1.gz to provide
/usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode.
 * Starting PostgreSQL 9.1 database server

   [ OK ]
Setting up postgresql (9.1+130~precise) ...
Setting up postgresql-server-dev-9.1 (9.1.3-2) ...
Setting up postgresql-server-dev-all (130~precise) ...
Processing triggers for libc-bin ...
ldconfig deferred processing now taking place

Again, *all* the tests fail as do the "create extension" attempts.

I even tried explicit postgresql-9.1 and postgresql-9.2 (beta version) but
with the same sad results.

Am I doing something wrong here, like still installing the default postgresql
packages and not the "good" ones?

Regards,

George

On 29 May 2012 23:27, George Papadatos  wrote:

> Hi Jan,
>
> Many thanks for the reply.
> Yes, I used apt-get and the default repositories to install postgresql on
> a  Ubuntu 12.04.
> I'll follow your guidelines and the new repos tomorrow and I'll let you
> know.
>
> Many thanks again,
>
> George
>
>
> On 29 May 2012 23:15, Jan Holst Jensen  wrote:
>
>>  On 2012-05-29 17:45, George Papadatos wrote:
>>
>> Hi RDKitters,
>>
>>  Today I tried to install the RDKit and cartridge to a brand new Ubuntu
>> 12.04 32-bit running on a Virtual Box.
>>
>>  [...]
>>
>>  Then when I tried to install the extension:
>>  georgep@george-VB:~$ psql -c 'CREATE EXTENSION rdkit' gpdb
>> FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
>> FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
>> connection to server was lost
>>
>>  or even:
>> georgep@george-VB:~/local/rdkit/rdkit_trunk/Code/PgSQL/rdkit$ psql gpdb
>> psql (9.1.3)
>> Type "help" for help.
>>
>>  gpdb=# create extension "rdkit";
>> FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
>> FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
>> The connection to the server was lost. Attempting reset: Succeeded.
>> gpdb=# show rdkit.tanimoto_threshold;
>> ERROR:  unrecognized configuration parameter "rdkit.tanimoto_threshold"
>>
>>
>>  Any ideas would be much appreciated!
>>
>>
>> Hi George,
>>
>> Sounds exactly like the behavior I described in this thread:
>>
>>
>> http://sourceforge.net/mailarchive/forum.php?thread_name=CAD4fdRRHdpqDRCWd5AjEzDWJia5WM6zsq%3DosvVmb%3DYHe%3DpmR7A%40mail.gmail.com&forum_name=rdkit-discuss
>>
>> My issue seemed to be related to the OpenSCG version of PostgreSQL and
>> for my purposes the issue was solved by using Martin Pitt's postgres
>> packages instead. However, on one machine, a Linux Mint box, I never got it
>> working. Are you using all plain vanilla packages from Ubuntu 12.04 ?
>>
>> Cheers
>> -- Jan
>>
>
>


Re: [Rdkit-discuss] cartridge problems

2012-05-29 Thread George Papadatos
Hi Jan,

Many thanks for the reply.
Yes, I used apt-get and the default repositories to install postgresql on
Ubuntu 12.04.
I'll follow your guidelines and the new repos tomorrow and I'll let you
know.

Many thanks again,

George

On 29 May 2012 23:15, Jan Holst Jensen  wrote:

>  On 2012-05-29 17:45, George Papadatos wrote:
>
> Hi RDKitters,
>
>  Today I tried to install the RDKit and cartridge to a brand new Ubuntu
> 12.04 32-bit running on a Virtual Box.
>
>  [...]
>
>  Then when I tried to install the extension:
>  georgep@george-VB:~$ psql -c 'CREATE EXTENSION rdkit' gpdb
> FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
> FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
> connection to server was lost
>
>  or even:
> georgep@george-VB:~/local/rdkit/rdkit_trunk/Code/PgSQL/rdkit$ psql gpdb
> psql (9.1.3)
> Type "help" for help.
>
>  gpdb=# create extension "rdkit";
> FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
> FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
> The connection to the server was lost. Attempting reset: Succeeded.
> gpdb=# show rdkit.tanimoto_threshold;
> ERROR:  unrecognized configuration parameter "rdkit.tanimoto_threshold"
>
>
>  Any ideas would be much appreciated!
>
>
> Hi George,
>
> Sounds exactly like the behavior I described in this thread:
>
>
> http://sourceforge.net/mailarchive/forum.php?thread_name=CAD4fdRRHdpqDRCWd5AjEzDWJia5WM6zsq%3DosvVmb%3DYHe%3DpmR7A%40mail.gmail.com&forum_name=rdkit-discuss
>
> My issue seemed to be related to the OpenSCG version of PostgreSQL and for
> my purposes the issue was solved by using Martin Pitt's postgres packages
> instead. However, on one machine, a Linux Mint box, I never got it working.
> Are you using all plain vanilla packages from Ubuntu 12.04 ?
>
> Cheers
> -- Jan
>


[Rdkit-discuss] cartridge problems

2012-05-29 Thread George Papadatos
Hi RDKitters,

Today I tried to install the RDKit and cartridge to a brand new Ubuntu
12.04 32-bit running on a Virtual Box.

I installed RDKit from trunk and it passed all the tests.
When I tried to build and use the cartridge I ran into trouble, even
though I followed this to the letter:
http://code.google.com/p/rdkit/wiki/BuildingTheCartridge

First of all, it failed all the tests - Greg mentioned that it might be
normal:
georgep@george-VB:~/local/rdkit/rdkit_trunk/Code/PgSQL/rdkit$ sudo make
install
/bin/mkdir -p '/usr/lib/postgresql/9.1/lib'
/bin/mkdir -p '/usr/share/postgresql/9.1/extension'
/bin/sh
/usr/lib/postgresql/9.1/lib/pgxs/src/makefiles/../../config/install-sh -c
-m 755  rdkit.so '/usr/lib/postgresql/9.1/lib/rdkit.so'
/bin/sh
/usr/lib/postgresql/9.1/lib/pgxs/src/makefiles/../../config/install-sh -c
-m 644 ./rdkit.control '/usr/share/postgresql/9.1/extension/'
/bin/sh
/usr/lib/postgresql/9.1/lib/pgxs/src/makefiles/../../config/install-sh -c
-m 644 ./rdkit--3.1.sql  '/usr/share/postgresql/9.1/extension/'
georgep@george-VB:~/local/rdkit/rdkit_trunk/Code/PgSQL/rdkit$ make
installcheck
/usr/lib/postgresql/9.1/lib/pgxs/src/makefiles/../../src/test/regress/pg_regress
--inputdir=. --psqldir='/usr/lib/postgresql/9.1/bin'
--dbname=contrib_regression rdkit-91 props btree molgist bfpgist-91 sfpgist
slfpgist fps
(using postmaster on Unix socket, default port)
== dropping database "contrib_regression" ==
DROP DATABASE
== creating database "contrib_regression" ==
CREATE DATABASE
ALTER DATABASE
== running regression test queries==
test rdkit-91 ... FAILED (test process exited with exit
code 2)
test props... FAILED
test btree... FAILED
test molgist  ... FAILED
test bfpgist-91   ... FAILED
test sfpgist  ... FAILED
test slfpgist ... FAILED
test fps  ... FAILED

==
 8 of 8 tests failed.
==

Then when I tried to install the extension:
georgep@george-VB:~$ psql -c 'CREATE EXTENSION rdkit' gpdb
FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
connection to server was lost

or even:
georgep@george-VB:~/local/rdkit/rdkit_trunk/Code/PgSQL/rdkit$ psql gpdb
psql (9.1.3)
Type "help" for help.

gpdb=# create extension "rdkit";
FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
The connection to the server was lost. Attempting reset: Succeeded.
gpdb=# show rdkit.tanimoto_threshold;
ERROR:  unrecognized configuration parameter "rdkit.tanimoto_threshold"


Any ideas would be much appreciated!

Many thanks,

George


[Rdkit-discuss] Python 2.7 binaries for WinXP VirtualBox - FYI

2012-05-10 Thread George Papadatos
Hello RDKiters,

Just a quick note to say that I had some problems installing the latest
2012_03 binaries on a virtual WinXP Pro machine.
The error was deceptively familiar:

In [1]: from rdkit.Chem import AllChem as Chem
---
ImportError   Traceback (most recent call last)
C:\Documents and Settings\georgep\ in
()
> 1 from rdkit.Chem import AllChem as Chem

C:\RDKit_2012_03_1\rdkit\Chem\__init__.py in ()
 16
 17 """
---> 18 from rdkit import rdBase
 19 from rdkit import RDConfig
 20

ImportError: DLL load failed: The specified module could not be found.

...and it is usually attributed to not setting the PATH properly. After
making sure that this was fine, I ran Dependency Walker against
rdBase.pyd, which showed that a couple of DLLs were missing
(msvcp100 and msvcr100). Everything was solved after installing the MS
C++ redistributable package I found here:
http://www.microsoft.com/en-us/download/details.aspx?id=

I hope this will prevent somebody else from wasting their morning with
troubleshooting!
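
A quick first check that does not need Dependency Walker is to ask Python itself whether the runtime libraries can be loaded. The sketch below is only an illustration: the helper name is made up, it tests loading by name through `ctypes`, and it says nothing about which module actually depends on the libraries.

```python
import ctypes


def find_missing_libraries(names, load=ctypes.CDLL):
    """Return the subset of library names that cannot be loaded."""
    missing = []
    for name in names:
        try:
            load(name)
        except OSError:
            # ctypes raises OSError when the library cannot be found or loaded.
            missing.append(name)
    return missing


# The two runtime DLLs named above; on a non-Windows machine both are
# expected to be reported as missing.
print(find_missing_libraries(["msvcp100.dll", "msvcr100.dll"]))
```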

Regards,

George Papadatos


Re: [Rdkit-discuss] Failed Expression: pick >= 0

2012-05-02 Thread George Papadatos
Hi Andrew,

Since you don't have access to the database in-house you may want to check
the web services for simple queries:
https://www.ebi.ac.uk/chembldb/index.php/ws
With respect to the names, you're right: there's no explicit name for the
structures in the SD file, and the ChEMBL ID is included just as a property.
This is how it has been historically, but I'll pass on your request.

Regards,

George
EMBL-EBI

On 2 May 2012 12:34, Andrew Dalke  wrote:

> Hi George,
>
> > This is probably not going to solve the problem at hand but it may be
> useful to you or others in the future:
> > ChEMBLdb maintains a molecular hierarchy table where you can retrieve
> the parent (=desalted - using Pipeline Pilot) structures for each molecular
> entity.
> > You may try something like this:
> >
> > select distinct cs.molregno, cs.molfile, cs.canonical_smiles
> > from compound_structures cs, molecule_hierarchy mh
> > where cs.molregno = mh.parent_molregno
>
> I confess pure ignorance here. While I've worked with databases, it's far
> from the list of things I know well. Reading the ERD is not simple for me,
> I don't have MySQL or Oracle installed on my machines, and I don't know how
> to browse through the schema and tables like I've seen those who are more
> database proficient than I do. So while I have an idea of what you are
> talking about, it's not something I can easily put into place.
>
> But as you say, it's not the problem, because RDKit's failure exception
> comes even using the original, unprocessed/un-de-salted record.
>
>
> Since you're here -- how come ChEMBL doesn't put an identifier on the
> first line of the SD record? Nearly all of them are blank; the exceptions
> are a dozen with mostly useless titles like:
>
> Acetic acid 6-(1-phenyl-ethyl)-6-aza-bicyclo[3.2.1]oct-3-yl ester
> 4-(4-Fluoro-phenyl)-2-methylsulfanyl-thiophene-3-carbonitrile
>
> 6-amino-9-(5-{[(1,2,3,3-tetrahydroxy-1,2,3-trioxidotriphosphanyl)oxy]methyl}tetr
> 2-Methyl-2,3-dihydro-benzofuran-7-carboxylic acid
> 8-methyl-8-aza-bicyclo[3.2.1]o
>
> (S)-N-((S)-1,6-diamino-1-oxohexan-2-yl)-1-((S)-5-guanidino-2-((2S,3S)-2-((S)-5-g
> Acetic acid 6-(1-phenyl-ethyl)-6-aza-bicyclo[3.2.1]oct-3-yl ester
>
>
> I end up doing a mol.SetProp("_Name", mol.GetProp("chembl_id")) so that my
> output SMILES have an identifier tied to them, and that seems like a
> needless extra step.
>
>
>Andrew
>da...@dalkescientific.com
>
>
>
>


Re: [Rdkit-discuss] Failed Expression: pick >= 0

2012-05-02 Thread George Papadatos
Hi Andrew,

This is probably not going to solve the problem at hand but it may be
useful to you or others in the future:
ChEMBLdb maintains a molecular hierarchy table where you can retrieve the
parent (=desalted - using Pipeline Pilot) structures for each molecular
entity.
You may try something like this:

select distinct cs.molregno, cs.molfile, cs.canonical_smiles
from compound_structures cs, molecule_hierarchy mh
where cs.molregno = mh.parent_molregno

This will give you all the *unique* desalted structures in ChEMBL.
If you also want to keep track of the molregnos of the salt forms for
each parent structure, try (MySQL-specific):

select cs.molregno, group_concat(mh.molregno), cs.molfile,
cs.canonical_smiles
from compound_structures cs, molecule_hierarchy mh
where cs.molregno = mh.parent_molregno
group by cs.molregno
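
For readers without a ChEMBL installation, the grouping query can be tried on toy data with Python's built-in sqlite3 module, which also provides group_concat. This is a sketch under stated assumptions: the two tables below are cut down to the columns used in the query, the three rows are invented for illustration, and `parents_with_salt_forms` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE compound_structures (
        molregno INTEGER PRIMARY KEY,
        canonical_smiles TEXT
    );
    CREATE TABLE molecule_hierarchy (
        molregno INTEGER PRIMARY KEY,
        parent_molregno INTEGER
    );
    -- Invented rows: 1 is a parent, 2 is a salt form of 1, 3 is a parent with no salts.
    INSERT INTO compound_structures VALUES (1, 'CCN'), (2, 'CCN.Cl'), (3, 'c1ccccc1');
    INSERT INTO molecule_hierarchy VALUES (1, 1), (2, 1), (3, 3);
""")


def parents_with_salt_forms(connection):
    """Return (parent molregno, member molregnos, parent SMILES) per parent."""
    rows = connection.execute("""
        SELECT cs.molregno, group_concat(mh.molregno), cs.canonical_smiles
        FROM compound_structures cs
        JOIN molecule_hierarchy mh ON cs.molregno = mh.parent_molregno
        GROUP BY cs.molregno
        ORDER BY cs.molregno
    """).fetchall()
    # group_concat returns a comma-separated string; turn it into sorted ints.
    return [
        (parent, sorted(int(m) for m in members.split(",")), smiles)
        for parent, members, smiles in rows
    ]


result = parents_with_salt_forms(conn)
```

Only parents appear in the result; the salt form (molregno 2) shows up solely in the member list of its parent.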

I hope it helps.

Best regards,

George Papadatos
EMBL-EBI


On 30 April 2012 21:32, Andrew Dalke  wrote:

> I'm desalting the ChEMBL data set and generating the corresponding
> de-salted SD and SMILES files. I found a problem in the conversion step,
> and found that the problem has nothing to do with the de-salting.
>
> My code failed with CHEMBL1269997, which is record ~750,200 out of
> 1,142,974. (In other words, it took a while to get to this point.) Here's a
> reproducible:
>
> >>> from rdkit import Chem
> >>> writer = Chem.SDWriter("/dev/stdout")
> >>> for mol in Chem.ForwardSDMolSupplier("CHEMBL1269997.sdf"):
> ...   writer.write(mol)
> ...
> [22:11:05]
>
> 
> Invariant Violation
>
> Violation occurred on line 388 in file
> /tmp/homebrew-rdkit-HEAD-Ebdo/Code/GraphMol/FileParsers/MolFileStereochem.cpp
> Failed Expression: pick >= 0
> 
>
> Traceback (most recent call last):
>  File "", line 2, in 
> RuntimeError: Invariant Violation
> >>> Chem.MolToSmiles(mol)
> 'OCC1=CC2OC(CC(C)C)(CC(C)C)C3C456C(OC(C)(C)O5)C1(O)C46C23'
> >>> Chem.MolToSmiles(mol, isomericSmiles=True)
> 'OCC1=C[C@@H]2OC(CC(C)C)(CC(C)C)[C@@H]3[C@H]4CCC[C@@]56[C@@H](OC(C)(C)O5)[C@]1(O)[C@]46[C@H]23'
> >>>
>
> You can see that the molecule was read in, is not None, and can be used to
> generate a SMILES.
>
> The CHEMBL1269997.sdf is attached.
>
> This error was previously reported in the thread JP started, titled
> "Invariant violation...", dated July 6, 2011. Greg replied:
>
> > Wow that is certainly an error I never expected to see. From the code,
> > I guess the molecule has a stereocenter that is surrounded by other
> > stereocenters and something extremely unfortunate is happening with
> > the way decisions are being made about which bonds to wedge. As Eddie
> > requested in an earlier message, it would be helpful to have the input
> > that produced the error so that it can be added to the test cases (and
> > so that I can be sure the problem is fixed once I figure out how to).
>
> but I see no posting of a failing structure. I hope the attached structure
> helps resolve this problem.
>
>
>
>Andrew
>da...@dalkescientific.com
>
>
>
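
Since the failure quoted above surfaces as a RuntimeError about 750,000 records into the run, one way to keep a long conversion going is to catch that exception per record and note the identifier. This is a generic sketch, not RDKit code: `process_records` and the (identifier, record) pair layout are hypothetical, and with RDKit the `process` callable would be something like the writer's write method.

```python
def process_records(records, process):
    """Apply `process` to each (identifier, record) pair.

    Records whose processing raises RuntimeError are skipped, and their
    identifiers are returned together with the error message.
    """
    failures = []
    for identifier, record in records:
        try:
            process(record)
        except RuntimeError as err:
            failures.append((identifier, str(err)))
    return failures
```

The returned list gives the identifiers to attach to a bug report, which is what was missing from the earlier thread.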


Re: [Rdkit-discuss] Strange SMILES behaviour

2012-03-12 Thread George Papadatos
Thanks for the prompt reply Greg; this is what I suspected too!

Regards,

George

On 12 March 2012 17:22, Greg Landrum  wrote:

> Hi George,
>
> On Mon, Mar 12, 2012 at 5:58 PM, George Papadatos 
> wrote:
> >
> > Could anyone please explain this:
> >
> > In [21]: Chem.CanonSmiles('C1=CC=C2C(=C1)NC=S2')
> > Out[21]: 'c1[nH]c2c2s1'
> >
> > In [22]: Chem.MolFromSmiles(Out[21])
> > [16:47:14] Can't kekulize mol
> >
> > In other words, how is it possible that a valid RDKit SMILES output
> fails to
> > be converted to molecule again?
>
> I'm sure the general answer isn't a surprise: it's a bug
>
> It may actually be more than one bug.
>
> The SMILES 'C1=CC=C2C(=C1)NC=S2' probably shouldn't produce a legal
> molecule. It certainly shouldn't produce one with an aromatic ring.
> That's not really a valid/reasonable resonance structure for
> benzothiazole. This would be ok: S1C=NC2=CC=CC=C12 o
>
> The output smiles:  'c1[nH]c2c2s1' is also not a reasonable
> molecule, which the RDKit recognizes when it tries to read it back in.
>
> I'm going to have to think about where the right place to fix this is.
>
> -greg
>
--
Try before you buy = See our experts in action!
The most comprehensive online learning library for Microsoft developers
is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3,
Metro Style Apps, more. Free future releases when you subscribe now!
http://p.sf.net/sfu/learndevnow-dev2___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] Strange SMILES behaviour

2012-03-12 Thread George Papadatos
Hello all,

Could anyone please explain this:

In [21]: Chem.CanonSmiles('C1=CC=C2C(=C1)NC=S2')
Out[21]: 'c1[nH]c2c2s1'

In [22]: Chem.MolFromSmiles(Out[21])
[16:47:14] Can't kekulize mol

In other words, how is it possible that a valid RDKit SMILES output fails
to be converted back into a molecule?
I'm sure this has to do with aromaticity and kekulization for benzothiazole,
but still...


Many thanks in advance,
George


Re: [Rdkit-discuss] 2011.09 (Q3 2011) RDKit release

2011-10-16 Thread George Papadatos
Hi Greg,

This should work - this is how I solved a similar problem with the latest
RDKit version for Windows.

Cheers,

George

On 16 October 2011 16:32, Greg Landrum  wrote:

> I'm traveling and don't have access to the machine where I normally do
> windows builds, but I tried to create an alternate binary using dlls
> from an older RDKit distribution.
>
> Please give this a try:
>
> http://code.google.com/p/rdkit/downloads/detail?name=RDKit_2011_09_1.win32.py27.pkg2.zip
> and let me know if it works. If so I will go ahead and replace the
> current binaries with this one.
>
> Sorry for the hassle and thanks for the help,
> -greg
>
>
> --
> All the data continuously generated in your IT infrastructure contains a
> definitive record of customers, application performance, security
> threats, fraudulent activity and more. Splunk takes this data and makes
> sense of it. Business sense. IT sense. Common sense.
> http://p.sf.net/sfu/splunk-d2d-oct
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>


Re: [Rdkit-discuss] 2011.09 (Q3 2011) RDKit release

2011-10-16 Thread George Papadatos
Hi James,

It looks like some DLLs are missing from the lib folder. The easiest
solution would be to copy all the files from the lib folder of the previous
(working) RDKit version and paste them into the lib folder of the current one
(without overwriting the existing files).
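
The copy-without-overwriting step can also be scripted. A minimal sketch, assuming both lib folders are flat directories of files; `copy_missing_files` is a hypothetical helper name and the folder paths are whatever your two installations use.

```python
import shutil
from pathlib import Path


def copy_missing_files(old_lib, new_lib):
    """Copy files from old_lib into new_lib, leaving existing files alone.

    Returns the names of the files that were actually copied.
    """
    old_lib, new_lib = Path(old_lib), Path(new_lib)
    copied = []
    for source in sorted(old_lib.iterdir()):
        target = new_lib / source.name
        # Skip subdirectories and anything the new release already ships.
        if source.is_file() and not target.exists():
            shutil.copy2(source, target)
            copied.append(source.name)
    return copied
```

The returned list doubles as a record of which DLLs the new distribution was missing.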

Regards,

George


On 16 October 2011 15:56, James Davidson  wrote:

> **
> Hi Greg,
>
> I probably should have picked this up in the beta (but didn't...)  When I
> try to import AllChem, I see the following:
>
> >>> from rdkit import Chem
> >>> from rdkit.Chem import AllChem
>
> Traceback (most recent call last):
>   File "", line 1, in 
> from rdkit.Chem import AllChem
>   File "C:\Python27\RDKit_2011_09_1\rdkit\Chem\AllChem.py", line 28, in
> 
> from rdkit.Chem.rdSLNParse import *
> ImportError: DLL load failed: The specified module could not be found.
>
> Any advice?
>
> Kind regards
>
> James
>
> __
> PLEASE READ: This email is confidential and may be privileged. It is
> intended for the named addressee(s) only and access to it by anyone else is
> unauthorised. If you are not an addressee, any disclosure or copying of the
> contents of this email or any action taken (or not taken) in reliance on it
> is unauthorised and may be unlawful. If you have received this email in
> error, please notify the sender or postmas...@vernalis.com. Email is not a
> secure method of communication and the Company cannot accept responsibility
> for the accuracy or completeness of this message or any attachment(s).
> Please check this email for virus infection for which the Company accepts no
> responsibility. If verification of this email is sought then please request
> a hard copy. Unless otherwise stated, any views or opinions presented are
> solely those of the author and do not represent those of the Company.
>
> The Vernalis Group of Companies
> Oakdene Court
> 613 Reading Road
> Winnersh, Berkshire
> RG41 5UA.
> Tel: +44 118 977 3133
>
> To access trading company registration and address details, please go to
> the Vernalis website at www.vernalis.com and click on the "Company address
> and registration details" link at the bottom of the page..
> __
>
>


Re: [Rdkit-discuss] Partial/Rooted/Anchored Morgan Fingerprint

2011-10-06 Thread George Papadatos
Hi Greg,

That's great, thanks a lot for your help.

Regards,

George

On 30 September 2011 07:01, Greg Landrum  wrote:

> Hi George,
>
> On Thu, Sep 29, 2011 at 1:11 PM, George Papadatos 
> wrote:
> > I'd like to calculate the *rooted* Morgan fingerprint for a set of
> > molecules. By rooted I mean the subset of the whole-molecule fingerprint
> > which contains just the bits which correspond to circular atom layers (up
> to
> > N bond lengths) that include a specific atom.
> > So let's say that there is a single Uranium atom in each molecule. What I
> > want to calculate is the subset of the Morgan fingerprint (let's say with
> a
> > radius of 3) which contains the bits set on by layers including my U
> atom.
> > This should include not only the bits where U was the root of the layer,
> but
> > also the bits where U was in the layer of neighboring atoms, up to 3
> bonds
> > away.
>
> A minor point: I wouldn't call this the rooted fingerprint since it
> includes bits that are set by layers that are not centered at your U
> atom.
>
> > After checking the super-helpful "Getting Started with the RDKit in
> Python"
> > (Q2 2011) tutorial, section 5.4.1, I can see one way of doing this:
> > calculating the Morgan fp and then enumerating all the sub-molecules (or
> > layers) that set the corresponding bits on and then checking if U is in
> any
> > one of these submolecules. If it is then the corresponding bit is part of
> > the root Morgan fp.
> > Is there any other more efficient way???
>
> If you only want the bits that are set by a particular atom (i.e.
> those that are centered at that atom), you can use the fromAtoms
> argument:
> >>> from rdkit import Chem
> >>> from rdkit.Chem import rdMolDescriptors
> >>> m1 = Chem.MolFromSmiles('Cc1ccccc1')
> >>> m2 = Chem.MolFromSmiles('Cc1c(C)1')
> >>>
> rdMolDescriptors.GetMorganFingerprint(m1,1,fromAtoms=[0]).GetNonzeroElements()
> {2246728737: 1, 422715066: 1}
> >>>
> rdMolDescriptors.GetMorganFingerprint(m1,2,fromAtoms=[0]).GetNonzeroElements()
> {2246728737: 1, 422715066: 1, 2218109011: 1}
> >>>
> rdMolDescriptors.GetMorganFingerprint(m2,1,fromAtoms=[0]).GetNonzeroElements()
> {2246728737: 1, 422715066: 1}
> >>>
> rdMolDescriptors.GetMorganFingerprint(m2,2,fromAtoms=[0]).GetNonzeroElements()
> {2246728737: 1, 422715066: 1, 2368203427: 1}
>
> Note that I just fixed a bug that was leading to missing bits in the
> morgan fingerprints generated with a fromAtoms argument.
>
> If you want all bits that the atom is involved in, I would suggest
> using the fromAtoms argument, but also including all the atoms that
> are within the appropriate radius of your atom. You can find these
> atoms using the molecule's distance matrix:
> >>> m1 = Chem.MolFromSmiles('Cc1ccccc1')
> >>> dm=Chem.GetDistanceMatrix(m1)
> >>> dm
> array([[ 0.,  1.,  2.,  3.,  4.,  3.,  2.],
>   [ 1.,  0.,  1.,  2.,  3.,  2.,  1.],
>   [ 2.,  1.,  0.,  1.,  2.,  3.,  2.],
>   [ 3.,  2.,  1.,  0.,  1.,  2.,  3.],
>   [ 4.,  3.,  2.,  1.,  0.,  1.,  2.],
>   [ 3.,  2.,  3.,  2.,  1.,  0.,  1.],
>   [ 2.,  1.,  2.,  3.,  2.,  1.,  0.]])
>
>
> I hope this helps,
> -greg
>
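
The last step Greg describes, picking the atoms within the chosen radius from the distance matrix, can be sketched in plain Python. The matrix below is copied from the quoted output; `atoms_within_radius` is a hypothetical helper name. The resulting indices would then be passed as the fromAtoms argument. Note that this selects every environment centred near the atom, so it may include some bits whose environment does not actually reach it.

```python
def atoms_within_radius(distance_matrix, atom_idx, radius):
    """Indices of atoms at most `radius` bonds from atom_idx (itself included)."""
    return [
        idx
        for idx, dist in enumerate(distance_matrix[atom_idx])
        if dist <= radius
    ]


# Topological distance matrix from the quoted message (7 atoms).
dm = [
    [0, 1, 2, 3, 4, 3, 2],
    [1, 0, 1, 2, 3, 2, 1],
    [2, 1, 0, 1, 2, 3, 2],
    [3, 2, 1, 0, 1, 2, 3],
    [4, 3, 2, 1, 0, 1, 2],
    [3, 2, 3, 2, 1, 0, 1],
    [2, 1, 2, 3, 2, 1, 0],
]

# Atoms within two bonds of atom 0.
neighbours = atoms_within_radius(dm, atom_idx=0, radius=2)
```

The same function works on the NumPy array returned by the distance-matrix call, since it only iterates over one row.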


[Rdkit-discuss] Partial/Rooted/Anchored Morgan Fingerprint

2011-09-29 Thread George Papadatos
Hi all,

I'd like to calculate the **rooted** Morgan fingerprint for a set of
molecules. By rooted I mean the subset of the whole-molecule fingerprint
which contains just the bits which correspond to circular atom layers (up to
N bond lengths) that include a specific atom.
So let's say that there is a single Uranium atom in each molecule. What I
want to calculate is the subset of the Morgan fingerprint (let's say with a
radius of 3) which contains the bits set on by layers including my U atom.
This should include not only the bits where U was the root of the layer, but
also the bits where U was in the layer of neighboring atoms, up to 3 bonds
away.
After checking the super-helpful "Getting Started with the RDKit in Python"
(Q2 2011) tutorial, section 5.4.1, I can see one way of doing this:
calculating the Morgan fp and then enumerating all the sub-molecules (or
layers) that set the corresponding bits on and then checking if U is in any
one of these submolecules. If it is, then the corresponding bit is part of
the rooted Morgan fp.
Is there a more efficient way?

Thanks in advance,

George Papadatos

On 28 September 2011 17:16, Sarah Langdon  wrote:

> Hi there,
>
> I've tried to write a function in Python to generate the Murcko
> Framework of a molecule, then remove a ring from the framework. I want
> to remove a ring based on the atom ID of the atoms of the ring, rather
> than as a substructure so that in the case of a molecule containing
> more than one of the same ring, only one ring remains. Therefore I
> have used RemoveAtoms to remove each ring atom one by one. My code is
> as follows.
>
> def removeRings(mol):
>
> frame = MurckoScaffold.GetScaffoldForMol(mol)
>
> getRings = frame.GetRingInfo()
> rings = getRings.AtomRings()
>
> splitFrames = [x for x in range(len(rings))]
>
> for x in range(len(rings)):
> editMol = Chem.EditableMol(frame)
>
> for atom in rings[x]:
> editMol.RemoveAtom(atom)
>
> splitFrames[x] = editMol.GetMol()
>
> for x in splitFrames:
> print Chem.MolToSmiles(x)
>
> However when I use this function I get the following error:
>
> Range Error
> idx
> Violation occurred on line 143 in file /usr/local/bin/src/
> RDKit_2011_03_2/Code/GraphMol/ROMol.cpp
> Failed Expression: 0 <= 22 <= 20
>
> I assume that this means that the atom I am trying to remove is not in
> the range of the atoms in the framework. I have checked the atom IDs
> for all atoms in the framework and all atoms in the rings and they are
> within the same range, so I do not understand why I am getting this
> error. Can anyone help please?
>
> Many thanks,
>
> Sarah Langdon
> PhD student
> Cancer Research UK  Cancer Therapeutics Unit
> Institute of Cancer Research
> Haddow Laboratories
> 15 Cotswold Road
> Sutton
> Surrey  SM2 5NG
>
> Tel: 0208 722 4139
> Email: sarah.lang...@icr.ac.uk
>
>
> The Institute of Cancer Research: Royal Cancer Hospital, a charitable
> Company Limited by Guarantee, Registered in England under Company No. 534147
> with its Registered Office at 123 Old Brompton Road, London SW7 3RP.
>
> This e-mail message is confidential and for use by the addressee only.  If
> the message is received by anyone other than the addressee, please return
> the message to the sender by replying to it and then delete the message from
> your computer and network.
>
>
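
On the Range Error quoted above, one possible cause (an assumption, not confirmed anywhere in this thread) is that the remaining atoms are renumbered after each removal, so indices taken from the original ring list go stale. The effect is easy to reproduce with a plain Python list, and removing the highest index first avoids it. `remove_by_index` and the toy data are hypothetical.

```python
def remove_by_index(items, indices):
    """Remove the given positions from a copy of `items`, highest index first."""
    remaining = list(items)
    for idx in sorted(set(indices), reverse=True):
        del remaining[idx]
    return remaining


atoms = ["a0", "a1", "a2", "a3", "a4"]
ring = (1, 2, 4)

# Ascending order: after deleting positions 1 and 2 only three items are
# left, so position 4 no longer exists.
naive = list(atoms)
try:
    for idx in ring:
        del naive[idx]
    ascending_failed = False
except IndexError:
    ascending_failed = True

# Descending order removes exactly the intended items.
survivors = remove_by_index(atoms, ring)
```

If the assumption holds, iterating over the ring's atom indices in descending order inside the inner loop would be the equivalent change in the quoted function.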


Re: [Rdkit-discuss] label properties on 2D depiction

2011-07-13 Thread George Papadatos
Thanks Greg, I'll have a go.

Regards,

George


On 13 July 2011 04:11, Greg Landrum  wrote:

> George,
>
> On Tue, Jul 12, 2011 at 6:54 PM, George Papadatos 
> wrote:
> > On a related topic, is it possible to depict an arbitrary string on a
> cairo
> > canvas? I am thinking particularly depicting the name or ID of a molecule
> > below its structure.
>
> It's not there at the moment, but it's definitely possible. The
> process would be similar to what I proposed to Peter. The additional
> piece of information you'd need is that the MolDrawing instance also
> has a boundingBoxes data member that is keyed by the molecule. You
> could do something like this:
>
>from rdkit.Chem.Draw.MolDrawing import Font
>bbox = drawer.boundingBoxes[mol]
>pos = (bbox[2]-bbox[0])/2,bbox[3]
>font=Font(face='sans',size=12)
>canvas.addCanvasText(legend,pos,font)
>
> (you'll almost definitely have to experiment a bit to get the text
> placed correctly)
>
> -greg
>
--
AppSumo Presents a FREE Video for the SourceForge Community by Eric 
Ries, the creator of the Lean Startup Methodology on "Lean Startup 
Secrets Revealed." This video shows you how to validate your ideas, 
optimize your ideas and identify your business strategy.
http://p.sf.net/sfu/appsumosfdev2dev___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] label properties on 2D depiction

2011-07-12 Thread George Papadatos
Hi all,

On a related topic, is it possible to draw an arbitrary string on a cairo
canvas? I am thinking particularly of depicting the name or ID of a molecule
below its structure.

Regards,

George

On 12 July 2011 06:36, Peter Schmidtke  wrote:

> Thanks Greg,
>
> I'll give it a try ;)
>
> ++
>
> Peter
>
> On 07/11/2011 06:47 PM, Greg Landrum wrote:
> > Hi Peter,
> >
> > On Mon, Jul 11, 2011 at 1:35 PM, Peter Schmidtke 
> wrote:
> >
> >> I wondered if it was possible and easy to show some numerical properties
> >> or strings or whatever for each atom on a 2d representation of a
> molecule.
> >>
> >> Is something like that implemented (didn't really see it right now)?
> >>
> > There's nothing like that built in, but pretty much everything you
> > need to be able to annotate the drawing yourself after the molecule
> > has been drawn is already there.
> >
> > Take a look at the code in $RDBASE/rdkit/Chem/Draw/__init__.py:MolToImage
> > After line 70 executes, you have a canvas (either cairo, aggdraw, or
> > sping, depending on which system you have installed) that contains the
> > molecule drawing as well as MolDrawing instance named drawer. drawer
> > has a data element atomPs that can be used to get the position of
> > atoms in canvas coordinates : drawer.atomPs[mol][atomIdx]. The code
> > for the individual canvases shows how to do something with these
> > coordinates.
> >
> >  -greg
> >
>
>
> --
>
> Peter Schmidtke
> PhD Student
> Dept. Physical Chemistry
> Faculty of Pharmacy
> University of Barcelona
>
>
>
> --
> All of the data generated in your IT infrastructure is seriously valuable.
> Why? It contains a definitive record of application performance, security
> threats, fraudulent activity, and more. Splunk takes this data and makes
> sense of it. IT sense. And common sense.
> http://p.sf.net/sfu/splunk-d2d-c2
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>


Re: [Rdkit-discuss] random forest in RDKit

2011-05-02 Thread George Papadatos
Hi guys,

I'd also be interested in some ML examples.

Regards,

George


On 2 May 2011 20:52, Igor Filippov  wrote:

> Hi Greg,
>
> Yes, actually for this project I'm interested in Python specifically!
> Time to learn me some new tricks :)
> I was looking through the docs online but I cannot figure it out :(
>
> Best regards,
> Igor
>
> On Mon, 2011-05-02 at 21:45 +0200, Greg Landrum wrote:
> > Hi Igor,
> >
> > On Mon, May 2, 2011 at 9:08 PM, Igor Filippov 
> wrote:
> > >
> > > Can anybody point me in the right direction (some simple code snippets
> > > would be best) how to use machine learning methods in RDkit? I am
> > > especially interested in RandomForest implementation.
> > >
> >
> > The machine learning code is mostly written in Python. I know you're
> > primarily a C++ user, are you still interested?
> >
> > -greg
>
>
>
>
> --
> WhatsUp Gold - Download Free Network Management Software
> The most intuitive, comprehensive, and cost-effective network
> management toolset available today.  Delivers lowest initial
> acquisition cost and overall TCO of any competing solution.
> http://p.sf.net/sfu/whatsupgold-sd
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>


Re: [Rdkit-discuss] Python 2.7 binaries for win32

2011-04-19 Thread George Papadatos
Hello,

FYI, it seems there is an inconsistency in the RDKit binaries for Windows
Python 2.7, as the dependency walker indicated: The rdBase.pyd looks for a
boost_python-vc-mt-1_44.dll in %RDBASE%/lib whereas the actual name of the
dll is boost_python-vc*90*-mt-1_44.dll
This is probably what caused the problem for me.

Removing the '90' from the 2 dlls in lib folder seems to do the trick.

Regards,

George



On 18 April 2011 08:17, George Papadatos  wrote:

> Hi Uwe,
>
>  Thanks for the reply. Perhaps I did not make it clear but what I meant is
> that I appended %RDBASE%\lib to the PATH variable.
>
> Regards,
>
> George
>
> Sent from my gPhone
>
> On 18 Apr 2011, at 07:49, Uwe Hoffmann  wrote:
>
> > Hi George,
> > Am 17.04.2011 12:03, schrieb George Papadatos:
> >> So...
> >> I've copied the binaries folder to C:\RDKit_2011_03_1
> >> I've added the variables:
> >> RDBASE = C:\RDKit_2011_03_1
> >> PYTHONPATH = %RDBASE%
> >> PATH = %RDBASE%\lib
> > This seems to be problematic because you overwrite the whole PATH
> > environment variable.
> >>
> >> ImportError: DLL load failed: The specified module could not be found.
> >>
> > regards,
> >
> >   Uwe
> >
> >
> >
> --
> > Benefiting from Server Virtualization: Beyond Initial Workload
> > Consolidation -- Increasing the use of server virtualization is a top
> > priority.Virtualization can reduce costs, simplify management, and
> improve
> > application availability and disaster protection. Learn more about
> boosting
> > the value of server virtualization. http://p.sf.net/sfu/vmware-sfdev2dev
> > ___
> > Rdkit-discuss mailing list
> > Rdkit-discuss@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>


Re: [Rdkit-discuss] Python 2.7 binaries for win32

2011-04-18 Thread George Papadatos
Hi Uwe,

 Thanks for the reply. Perhaps I did not make it clear but what I meant is that 
I appended %RDBASE%\lib to the PATH variable. 
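
To make the append-versus-overwrite distinction concrete, here is a small sketch of appending a directory to a PATH-style string. `append_to_path` is a hypothetical helper name; the separator defaults to os.pathsep, which is ";" on Windows.

```python
import os


def append_to_path(path_value, directory, sep=os.pathsep):
    """Return path_value with directory appended, unless it is already listed."""
    entries = [entry for entry in path_value.split(sep) if entry]
    if directory not in entries:
        entries.append(directory)
    return sep.join(entries)
```

Overwriting corresponds to assigning the new directory alone, which drops every existing entry; appending keeps them and adds the RDKit lib folder at the end.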

Regards,

George

Sent from my gPhone

On 18 Apr 2011, at 07:49, Uwe Hoffmann  wrote:

> Hi George,
> Am 17.04.2011 12:03, schrieb George Papadatos:
>> So...
>> I've copied the binaries folder to C:\RDKit_2011_03_1
>> I've added the variables:
>> RDBASE = C:\RDKit_2011_03_1
>> PYTHONPATH = %RDBASE%
>> PATH = %RDBASE%\lib
> This seems to be problematic because you overwrite the whole PATH 
> environment variable.
>> 
>> ImportError: DLL load failed: The specified module could not be found.
>> 
> regards,
> 
>   Uwe
> 
> 


Re: [Rdkit-discuss] Python 2.7 binaries for win32

2011-04-17 Thread George Papadatos
So...
I've copied the binaries folder to C:\RDKit_2011_03_1
I've added the variables:
RDBASE = C:\RDKit_2011_03_1
PYTHONPATH = %RDBASE%
PATH = %RDBASE%\lib

And yet, I still get this error:
In [1]: from rdkit.Chem import AllChem as Chem
---
ImportError   Traceback (most recent call last)

C:\Documents and Settings\yex7845\Desktop\ in ()

C:\RDKit_2011_03_1\rdkit\Chem\__init__.py in ()
 16
 17 """
---> 18 from rdkit import rdBase
 19 from rdkit import RDConfig
 20

ImportError: DLL load failed: The specified module could not be found.


What's wrong?
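
One quick check when this error appears is whether %RDBASE%\lib is really on the PATH seen by the Python process, and whether PATH still holds its other entries. The following is a rough sketch that assumes the variable names used in this message; `check_path` is a hypothetical helper, and it does not identify which DLL failed to load.

```python
import os


def check_path(path_value, wanted_dir, sep=os.pathsep):
    """Report whether wanted_dir is listed in path_value and exists on disk."""
    def norm(p):
        return os.path.normcase(os.path.normpath(p))

    entries = [e for e in path_value.split(sep) if e]
    return {
        "listed": norm(wanted_dir) in {norm(e) for e in entries},
        "exists": os.path.isdir(wanted_dir),
        "entries": len(entries),  # a count of 1 hints that PATH was overwritten
    }


rdbase = os.environ.get("RDBASE", "")
print(check_path(os.environ.get("PATH", ""), os.path.join(rdbase, "lib")))
```

If `listed` is true but `entries` is 1, the rest of PATH has been lost, which can itself prevent dependent DLLs from being found.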

Thanks in advance,

George




On 17 April 2011 10:07, George Papadatos  wrote:

> Cheers, Greg.
>
> George
>
> On 17 April 2011 06:14, Greg Landrum  wrote:
>
>> Dear all,
>>
>> After a couple of requests, I just uploaded a win32 build of the
>> 2011.03 release that supports Python 2.7 to both the google code and
>> sourceforge download sites.
>>
>> Best Regards,
>> -greg


Re: [Rdkit-discuss] Python 2.7 binaries for win32

2011-04-17 Thread George Papadatos
Cheers, Greg.

George

On 17 April 2011 06:14, Greg Landrum  wrote:

> Dear all,
>
> After a couple of requests, I just uploaded a win32 build of the
> 2011.03 release that supports Python 2.7 to both the google code and
> sourceforge download sites.
>
> Best Regards,
> -greg


Re: [Rdkit-discuss] Installation driving me mad (RDKit on Centos 5.4 final)

2011-02-23 Thread George Papadatos
Fair enough, I did not know that! However, according to the same
documentation, these packages are highly recommended for NumPy and required
for SciPy:
http://scipy.org/Installing_SciPy/Linux#head-9cf6f4b7fe9ba63fc228203c4f28554a74970847

In any case, here is a repository for CentOS 5/RHEL 5 with the necessary RPMs
(for those who can't access yum):
http://download.opensuse.org/repositories/home:/ashigabou/

After that, Kirk's walkthrough has been most helpful.

George


On 23 February 2011 11:12, Greg Landrum  wrote:

> Let me elaborate on that... from the numpy installation page
> (http://docs.scipy.org/doc/numpy/user/install.html):
> "NumPy does not require any external linear algebra libraries to be
> installed. However, if these are available, NumPy’s setup script can
> detect them and use them for building. A number of different LAPACK
> library setups can be used, including optimized LAPACK libraries such
> as ATLAS, MKL or the Accelerate/vecLib framework on OS X."
>
> Best,
> -greg
>
>
>
>
> On Wed, Feb 23, 2011 at 12:10 PM, Greg Landrum wrote:
> > I'm not convinced of that. I'm pretty sure that I have built numpy on
> > redhat and ubuntu systems without ever installing lapack.
> >
> > -greg
> >
> >
> > On Wed, Feb 23, 2011 at 12:06 PM, George Papadatos wrote:
> >> ...yet you need them to build Numpy...
> >> George
> >>
> >> On 23 February 2011 11:03, Greg Landrum wrote:
> >>>
> >>> To be very clear: you do not need *any* of these packages to install the
> >>> RDKit.
> >>>
> >>> -greg
> >>>
> >>>
> >>> On Wed, Feb 23, 2011 at 10:53 AM, JP wrote:
> >>> > Great wiki - I wonder how I missed that.
> >>> > But the first instruction
> >>> > sudo yum install atlas, atlas-devel, blas blas-devel lapack lapack-devel
> >>> >
> >>> > Gives me the following error:
> >>> > No package atlas, available.
> >>> > No package atlas-devel, available.
> >>> > No package blas available.
> >>> > No package lapack available.
> >>> > Is there a repos I have to add to /etc/yum.repos.d/ ?
> >>> >
> >>> >
> >>> > On 22 February 2011 18:41, Robert DeLisle wrote:
> >>> >>
> >>> >> What are your environment settings?  You should have at minimum, these:
> >>> >>
> >>> >> $RDBASE = 
> >>> >>
> >>> >>
> >>> >> $LD_LIBRARY_PATH = /usr/local/lib:/$RDBASE/lib
> >>> >>
> >>> >> $PYTHONPATH = $RDBASE
> >>> >>
> >>> >>
> >>> >> At least this worked for me for a CentOS installation, detailed here -
> >>> >> http://code.google.com/p/rdkit/wiki/BuildingOnCentOS
> >>> >>
> >>> >>
> >>> >>
> >>> >> Another possibility is your PATH variable.  Make sure that /usr/local
> >>> >> pathnames precede any /usr options.
> >>> >> This will ensure looking into /usr/local first.
> >>> >>
> >>> >> There also may be options for cmake that will force it into the correct
> >>> >> directory.  I've found in the past that even though
> >>> >> it says in the initial output that it is looking in the correct location
> >>> >> for boost and python, it doesn't necessarily follow its
> >>> >> own advice.
> >>> >>
> >>> >> -Kirk
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >> On Tue, Feb 22, 2011 at 9:44 AM, JP wrote:
> >>> >>>
> >>> >>> I ended up not using yum to install Numpy - I installed it from
> >>> >>> source, which was only slightly painful.
> >>> >>> >>> import platform; print platform.python_version()
> >>> >>> # /usr/local/lib/python2.7/platform.pyc matches
> >>> >>> /usr/local/lib/python2.7/platform.py
> >>> >>> import platform # precompiled from

Re: [Rdkit-discuss] Installation driving me mad (RDKit on Centos 5.4 final)

2011-02-23 Thread George Papadatos
Linking CXX shared library ../../lib/libRDBoost.so
> >>>> > cd /share/apps/RDKit_2010_12_1/build/Code/RDBoost &&
> >>>> > /usr/local/bin/cmake -E cmake_link_script CMakeFiles/RDBoost.dir/link.txt
> >>>> > --verbose=1
> >>>> >
> >>>> > /usr/bin/c++  -fPIC -O3 -DNDEBUG  -shared -Wl,-soname,libRDBoost.so.1
> >>>> > -o ../../lib/libRDBoost.so.1.2010.12.1 CMakeFiles/RDBoost.dir/Wrap.cpp.o
> >>>> > -L/usr/local/lib/libpython2.7.a -L/share/apps/boost_1_45_0/lib
> >>>> > /usr/local/lib/libpython2.7.a
> >>>> > /share/apps/boost_1_45_0/lib/libboost_python.so
> >>>> > -Wl,-rpath,/usr/local/lib/libpython2.7.a:/share/apps/boost_1_45_0/lib:
> >>>> >
> >>>> > /usr/bin/ld: /usr/local/lib/libpython2.7.a(exceptions.o): relocation
> >>>> > R_X86_64_32 against `_Py_NoneStruct' can not be used when making a shared
> >>>> > object; recompile with -fPIC
> >>>> >
> >>>> > /usr/local/lib/libpython2.7.a: could not read symbols: Bad value
> >>>> > collect2: ld returned 1 exit status
> >>>> >
> >>>> > make[2]: *** [lib/libRDBoost.so.1.2010.12.1] Error 1
> >>>> > make[2]: Leaving directory `/share/apps/RDKit_2010_12_1/build'
> >>>> >
> >>>> > make[1]: *** [Code/RDBoost/CMakeFiles/RDBoost.dir/all] Error 2
> >>>> > make[1]: Leaving directory `/share/apps/RDKit_2010_12_1/build'
> >>>> >
> >>>> > make: *** [all] Error 2
> >>>> >
> >>>> >
> >>>> >
> >>>> >
> >>>> > Note that all env variables have been set:
> >>>> >
> >>>> > [jpebe@caio build]$ echo $LD_LIBRARY_PATH
> >>>> > /share/apps/boost_1_45_0/lib:/share/apps/openbabel-2.3.0/lib:/usr/lib64/atlas/sse2:/share/apps/RDKit_2010_12_1/lib:/opt/gridengine/lib/lx26-amd64
> >>>> >
> >>>> >
> >>>> > Any ideas? I spent the whole day on this...
> >>>> >
> >>>> >
> >>>> > And it's freaking me out...
> >>>> >
> >>>
> >>>
> >>> --
> >>>
> >>> Jean-Paul Ebejer
> >>> Early Stage Researcher
> >>> InhibOx Ltd
> >>> Pembroke House
> >>> 36-37 Pembroke Street
> >>> Oxford
> >>> OX1 1BP
> >>> UK
> >>> (+44 / 0) 1865 262 034
> >>>
> >>>
> >>> This email and any files transmitted with it are confidential and
> >>> intended solely for the use of the individual or entity to whom they are
> >>> addressed. Any unauthorised dissemination or copying of this email or its
> >>> attachments, and any use or disclosure of any information contained in them,
> >>> is strictly prohibited and may be illegal.  If you have received this email
> >>> in error please notify the sender and delete all copies from your system.
> >>>
> >>> We and our group companies accept no liability or responsibility for
> >>> personal emails or emails unconnected with our business.
> >>>
> >>> Internet communications including emails and access and use of web sites
> >>> cannot be guaranteed to be secure or error free as information can be
> >>> intercepted, corrupted, lost or arrive late. Furthermore, while we have
> >>> taken steps to control the spread of viruses on our systems, we cannot
> >>> guarantee that this email and any files transmitted with it are virus free.
> >>> No liability is accepted for any errors, omissions, interceptions, corrupted
> >>> mail, lost communications or late delivery arising as a result of receiving
> >>> this message via the Internet or for any virus that may be contained in it.
> >>>
> >>
> >
>
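
The linker message quoted above (relocation R_X86_64_32 against `_Py_NoneStruct', "recompile with -fPIC") typically appears when a static libpython2.7.a that was built without -fPIC is linked into a shared object. A common remedy, stated here as an assumption rather than something confirmed in this thread, is to link against a Python built with --enable-shared. The sketch below only reports how the running interpreter was built; `python_link_info` is a hypothetical helper name.

```python
import sysconfig


def python_link_info():
    """Summarise whether this interpreter provides a shared libpython."""
    # Py_ENABLE_SHARED is 1 for --enable-shared builds, 0 or None otherwise.
    shared = bool(sysconfig.get_config_var("Py_ENABLE_SHARED"))
    return {
        "shared_build": shared,
        "libdir": sysconfig.get_config_var("LIBDIR"),
        # LDLIBRARY names the shared library, LIBRARY the static archive.
        "library": sysconfig.get_config_var("LDLIBRARY" if shared else "LIBRARY"),
    }


print(python_link_info())
```

If `shared_build` is false, the build is linking the static archive, which matches the situation in the log.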

[Rdkit-discuss] KNIME + Java RDKit library problem

2010-11-19 Thread George Papadatos
Hi guys,

I installed the RDKit nodes for KNIME (by copying the plugins folder
manually, as I too had problems with the 'update from file' feature).
Inspired by the source code that was bundled with the nodes, I tried to use
the RDKit libraries in KNIME/Eclipse in order to develop my own nodes based
on the RDKit toolkit.

For example:

import org.RDKit.RDKFuncs;
import org.RDKit.ROMol;

public class RDKitTest {
    public static void main(String[] args) throws Exception {
        // Parse a SMILES string and print the number of atoms.
        ROMol mol = null;
        String smi = "c1c1N";
        mol = RDKFuncs.MolFromSmiles(smi);
        System.out.println(mol.getNumAtoms());
    }
}


However, this script throws the following runtime error:

Exception in thread "main" java.lang.UnsatisfiedLinkError:
org.RDKit.RDKFuncsJNI.MolFromSmiles(Ljava/lang/String;)J
    at org.RDKit.RDKFuncsJNI.MolFromSmiles(Native Method)
    at org.RDKit.RDKFuncs.MolFromSmiles(RDKFuncs.java:65)


In the Eclipse lib folder, I included all the .jar files and the
RDKFuncs.dll.

Any ideas???


Regards,

George Papadatos
--
Beautiful is writing same markup. Internet Explorer 9 supports
standards for HTML5, CSS3, SVG 1.1,  ECMAScript5, and DOM L2 & L3.
Spend less time writing and  rewriting code and more time creating great
experiences on the web. Be a part of the beta today
http://p.sf.net/sfu/msIE9-sfdev2dev
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] Antwort: Installation fails for KNIME nodes

2010-11-19 Thread George Papadatos
Hi Paul,

No worries! :)


Regards,

George


[Rdkit-discuss] KNIME + Java RDKit library

2010-11-19 Thread George Papadatos
Hi guys,

I installed the RDKit nodes for KNIME (by copying the plugins folder
manually, as I too had problems with the 'update from file' feature).
Inspired by the source code that was bundled with the nodes, I tried to use
the RDKit libraries in KNIME/Eclipse in order to develop my own nodes based
on the RDKit toolkit.

For example:

import org.RDKit.RDKFuncs;
import org.RDKit.ROMol;

public class RDKitTest {
    public static void main(String[] args) throws Exception {
        // Parse a SMILES string and print the number of atoms.
        ROMol mol = null;
        String smi = "c1c1N";
        mol = RDKFuncs.MolFromSmiles(smi);
        System.out.println(mol.getNumAtoms());
    }
}


However, this script throws the following runtime error:

Exception in thread "main" java.lang.UnsatisfiedLinkError:
org.RDKit.RDKFuncsJNI.MolFromSmiles(Ljava/lang/String;)J
    at org.RDKit.RDKFuncsJNI.MolFromSmiles(Native Method)
    at org.RDKit.RDKFuncs.MolFromSmiles(RDKFuncs.java:65)


In the Eclipse lib folder, I included all the .jar files and the
RDKFuncs.dll.

Any ideas???


Regards,

George Papadatos


Re: [Rdkit-discuss] KNIME + Java RDKit library problem

2010-11-17 Thread George Papadatos
Hi Thorsten and Greg,

Many thanks for your replies.

On 16 November 2010 20:16, Thorsten Meinl wrote:

> Hi George,
>
> > I installed the RDKit nodes for KNIME (by copying the plugins folder
> > manually, as I too had problems with the 'update from file' feature).
> > Inspired by the source code that was bundled with the nodes,
> Did you have the same problems as Paul, i.e. KNIME complaining about
> some osgi.bundles not being found?


This is the error I get when I use the local update site:
Cannot complete the request.  See the details.
Unsatisfied dependency: [org.rdkit.knime.source.feature.feature.group
0.9.0.0027626] requiredCapability:
org.eclipse.equinox.p2.iu/org.rdkit.knime.bin.macosx.x86_64/[0.9.0.0027589,0.9.0.0027589]
Unsatisfied dependency: [org.rdkit.knime.source.feature.feature.group
0.9.0.0027626] requiredCapability:
org.eclipse.equinox.p2.iu/org.rdkit.knime.bin.linux.x86/[0.9.0.0027561,0.9.0.0027561]
Unsatisfied dependency: [org.rdkit.knime.source.feature.feature.group
0.9.0.0027626] requiredCapability:
org.eclipse.equinox.p2.iu/org.rdkit.knime.bin.linux.x86_64/[1.0.0.0027615,1.0.0.0027615]


> > However, this script throws the following runtime error:
> > Exception in thread "main" java.lang.UnsatisfiedLinkError:
> > org.RDKit.RDKFuncsJNI.MolFromSmiles(Ljava/lang/String;)J
> > at org.RDKit.RDKFuncsJNI.MolFromSmiles(Native Method)
> > at org.RDKit.RDKFuncs.MolFromSmiles(RDKFuncs.java:65)
> >
> > In the Eclipse lib folder, I included all the .jar files and the
> RDKFuncs.dll.
> > Any ideas???
> In order to use code from native libraries Java needs to be told where to
> look for them. This is usually done by defining -Djava.library.path
> appropriately. If the application consists of Eclipse plugins (i.e. not
> just a bunch of jars), then there is some magic that loads the native
> libraries w/o needing to specify them explicitly. This is what happens
> with the KNIME plugins. So you either need to set the Java property or
> put your code in a plugin, which depends on org.rdkit.knime.types, and
> run an Eclipse application.

Thanks to your tip and my colleague Nico Fechner, it is working now.
For those with the same problem, you need this line at the beginning of your
code (System.load expects an absolute path):
System.load("Path/to/the/dll/RDKFuncs.dll");
Alternatively, as you suggested, you need to set the VM arguments
appropriately in Eclipse, i.e. -Djava.library.path=Path/to/the/dll/ and
then add this line in the code: System.loadLibrary("RDKFuncs");

Thanks again,

George