Re: [Rdkit-discuss] Extracting SMILES from text

2016-12-02 Thread Igor Filippov
I could be wrong, but I believe the IBM system had a preprocessing step that
removed all known dictionary words, which would get rid of "submarine" etc.
I also believe this problem has been solved multiple times in the past:
NextMove Software comes to mind, as does ChemicalTagger
(http://chemicaltagger.ch.cam.ac.uk/), etc.
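
The dictionary pre-filter could look roughly like the sketch below. It is only an illustration: the tiny word list and the names KNOWN_WORDS and drop_dictionary_words are made up here, and a real run would load a full dictionary file.

```python
# Hypothetical word list; a real pipeline would load a full dictionary.
KNOWN_WORDS = {"submarine", "the", "was", "found", "in", "sample"}


def drop_dictionary_words(tokens, known_words=KNOWN_WORDS):
    """Return tokens that are not known dictionary words (case-insensitive)."""
    kept = []
    for token in tokens:
        # Strip sentence punctuation from the edges of the token.
        word = token.strip(".,;:!?\"'")
        if word and word.lower() not in known_words:
            kept.append(word)
    return kept


candidates = drop_dictionary_words(
    "The submarine sample was CCO, found in c1ccccc1.".split()
)
print(candidates)
```

Whatever survives this step is only a candidate and would still have to go through a real SMILES parser.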

my 2 cents,
Igor




On Fri, Dec 2, 2016 at 11:46 AM, Brian Kelley  wrote:

> I hacked a version of RDKit's smiles parser to compute heavy atom count,
> perhaps some version of this could be used to check smiles validity without
> making the actual molecule.
>
> From a fun historical perspective:  IBM had an expert system to find IUPAC
> names in documents.  They ended up finding things like "submarine" which
> was amusing.  It turned out that just parsing all words with the IUPAC
> parser was by far the fastest and best solution.  I expect the same will be
> true for finding smiles.
>
> It would be interesting to put the common OCR errors into the parser as
> well (l's and 1's are hard for instance).
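
One way to try this without touching the parser itself is to generate the OCR variants of a token and test each one. A minimal sketch, assuming only the l/1 confusion mentioned above; the names OCR_CONFUSIONS and ocr_variants are hypothetical, and the table can be extended with other confusable pairs.

```python
from itertools import product

# Character confusions to try; only l/1 comes from the thread.
OCR_CONFUSIONS = {"l": "l1", "1": "1l"}


def ocr_variants(token, confusions=OCR_CONFUSIONS, max_variants=64):
    """Yield the token and its variants with confusable characters swapped."""
    choices = [confusions.get(ch, ch) for ch in token]
    for count, combo in enumerate(product(*choices)):
        if count >= max_variants:
            break  # cap the combinatorial blow-up for long tokens
        yield "".join(combo)


variants = list(ocr_variants("CC1"))
print(variants)
```

The original token always comes first, so a caller can stop at the first variant the parser accepts.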
>
>
> On Fri, Dec 2, 2016 at 10:46 AM, Peter Gedeck 
> wrote:
>
>> Hello Alexis,
>>
>> Depending on the size of your document, you could consider limiting the
>> stored, already-tested strings by word length and memoizing only the shorter
>> words. SMILES tend to be longer, so anything above a given number of
>> characters has a higher probability of being a SMILES. Long words probably
>> also include a lot of chemical names. These often contain commas (,), so
>> they are easy to remove quickly.
>>
>> Best,
>>
>> Peter
>>
>>
>> On Fri, Dec 2, 2016 at 5:43 AM Alexis Parenty <
>> alexis.parenty.h...@gmail.com> wrote:
>>
>>> Dear Pavel And Greg,
>>>
>>>
>>>
>>> Thanks Greg for the regexps link. I’ll use that too.
>>>
>>>
>>> Pavel, I need to track which document each SMILES comes from, but I will
>>> indeed make a set of unique words for each document before looping.
>>> Thanks!
>>>
>>> Best,
>>>
>>> Alexis
>>>
>>> On 2 December 2016 at 11:21, Pavel  wrote:
>>>
>>> Hi, Alexis,
>>>
>>>   if you do not need to track which document each SMILES comes from, you
>>> can just combine all words from all documents in a list, keep only the
>>> unique words, and test those. That way you do not have to store and check
>>> valid/non-valid strings. That would reduce the problem complexity as well.
>>>
>>> Pavel.
>>> On 12/02/2016 11:11 AM, Greg Landrum wrote:
>>>
>>> An initial start on some regexps that match SMILES is here:
>>> https://gist.github.com/lsauer/1312860/264ae813c2bd2c27a769d261c8c6b38da34e22fb
>>>
>>> that may also be useful
>>>
>>> On Fri, Dec 2, 2016 at 11:07 AM, Alexis Parenty <
>>> alexis.parenty.h...@gmail.com> wrote:
>>>
>>> Hi Markus,
>>>
>>>
>>> Yes! I might discover novel compounds that way!! It would be interesting
>>> to see what they look like…
>>>
>>>
>>> Good suggestion to also store the words that were correctly identified
>>> as SMILES. I’ll add that to the script.
>>>
>>>
>>> I also like your “distribution of word” idea. I could safely skip any
>>> words that occur more than 1% of the time and could try to play around with
>>> the threshold to find an optimum.
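
The frequency cut-off can be sketched with collections.Counter. This is an illustration only: rare_words and its max_fraction parameter are names invented here, and the 1% default is simply the starting value mentioned above.

```python
from collections import Counter


def rare_words(words, max_fraction=0.01):
    """Return the words whose share of all occurrences is at most max_fraction."""
    counts = Counter(words)
    total = sum(counts.values())
    return {w for w, n in counts.items() if n / total <= max_fraction}


# Toy corpus: "the" dominates, the SMILES-like token occurs once in 200 words.
corpus = ["the"] * 150 + ["and"] * 49 + ["CCO"]
print(rare_words(corpus))
```

Only the rare words would then be handed to the parser, and the threshold can be varied to see how many real SMILES get lost.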
>>>
>>>
>>> I will try every suggestion and time each one to see which is best. I’ll
>>> keep everyone in the loop and will share the script and results.
>>>
>>>
>>> Thanks,
>>>
>>>
>>> Alexis
>>>
>>> On 2 December 2016 at 10:47, Markus Sitzmann 
>>> wrote:
>>>
>>> Hi Alexis,
>>>
>>> you may also find some "novel" compounds with this approach :-).
>>>
>>> Whether your tuple solution improves performance depends strongly on
>>> the content of your text documents and how often they repeat the same
>>> words, but my guess is that it will help. Probably the best way is to
>>> look at the distribution of words before you feed them to RDKit. You
>>> should also "memorize" the ones that successfully generated a structure;
>>> it doesn't make sense to parse them again.
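
Remembering both the accepted and the rejected words comes down to a dictionary in front of the parser. A minimal sketch: make_cached_checker and toy_is_valid are hypothetical names, and the toy test stands in for the real call, which with RDKit could be something like `lambda s: Chem.MolFromSmiles(s) is not None`.

```python
def make_cached_checker(is_valid):
    """Wrap a validity test so each distinct word is parsed at most once."""
    cache = {}  # word -> bool, holds both accepted and rejected words

    def check(word):
        if word not in cache:
            cache[word] = is_valid(word)
        return cache[word]

    return check, cache


calls = []


def toy_is_valid(word):
    # Placeholder for the real parser call; records how often it is invoked.
    calls.append(word)
    return word.isupper()


check, cache = make_cached_checker(toy_is_valid)
hits = [w for w in ["CCO", "the", "CCO", "the", "CCO"] if check(w)]
print(hits, len(calls))
```

Five lookups cost only two parser calls here, which is where the saving comes from when documents repeat the same words.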
>>>
>>> Markus
>>>
>>> On Fri, Dec 2, 2016 at 10:21 AM, Maciek Wójcikowski <
>>> mac...@wojcikowski.pl> wrote:
>>>
>>> Hi Alexis,
>>>
>>> You may want to use a regex to filter out strings containing invalid
>>> characters (i.e. only a small subset of atoms may be written without
>>> brackets). See the "Atoms" section:
>>> http://www.daylight.com/dayhtml/doc/theory/theory.smiles.html
>>>
>>> The set might grow pretty quickly and become inefficient, so I'd parse
>>> all strings that pass the above filter. There will still be some false
>>> positives, like "CC", which may occur in text (emails especially).
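
A character-level filter along these lines might look like the sketch below. The pattern is an approximation written for this example, not the regex from the gist and not a full SMILES grammar; SMILES_TOKEN and looks_like_smiles are made-up names.

```python
import re

# Organic-subset atoms (may be written without brackets), bracket atoms,
# bonds, branches, ring closures and the disconnection dot.
SMILES_TOKEN = re.compile(
    r"(?:Cl|Br|[BCNOPSFI]|[bcnops]|\[[^\[\]\s]+\]|[-=#$:/\\.()]|%\d{2}|\d)+"
)


def looks_like_smiles(word):
    """Cheap syntactic pre-filter; a True result still needs a real parser."""
    return SMILES_TOKEN.fullmatch(word) is not None


for word in ["c1ccccc1", "[Na+].[Cl-]", "submarine", "CC"]:
    print(word, looks_like_smiles(word))
```

As expected, "CC" passes, and so do short lowercase words made only of aromatic symbols (for example "son"), so the parser remains the final judge.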
>>>
>>> 
>>> Pozdrawiam,  |  Best regards,
>>> Maciek Wójcikowski
>>> mac...@wojcikowski.pl
>>>
>>> 2016-12-02 10:11 GMT+01:00 Alexis Parenty:
>>>
>>> Dear all,
>>>
>>>
>>> I am looking for a way to extract SMILES scattered across many text
>>> documents (thousands of documents of several pages each).
>>>
>>> At the moment, I am thinking to scan each words 

Re: [Rdkit-discuss] library name change?

2016-08-19 Thread Igor Filippov
Paul,

Both or neither - for me it's important that RDKit::foo() is working in my
code, not which specific library it is in or
what other libraries the first lib depends upon. So if I need to use foo()
I'd like to know what to include into my Makefile.
That's why I suggested having a monolithic static library as an option or
an easy way to get a list of libs to add to the linker command.

Igor


On Fri, Aug 19, 2016 at 11:22 AM, Paul Emsley <pems...@mrc-lmb.cam.ac.uk>
wrote:

>
> I'd like to pick apart this comment:
>
> On 19/08/2016 15:45, Igor Filippov wrote:
>
> > It is sometimes a bit of a pain to collect the list of the dependencies.
>
> Do you mean that (for example) if you wanted to link with
> libMolChemicalFeatures, you also
> have to add libSubstructureMatch and libSmilesParse - and it isn't readily
> apparent to you
> which additional libraries you need to add when linking?
>
> > Alternatively some easier way to discover what belongs to what library
> would be appreciated...
>
> Do you mean which libraries depend on which libraries (as above) - or
> which functions are in
> which libraries?
>
>
>
> 
> --
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>


Re: [Rdkit-discuss] library name change?

2016-08-19 Thread Igor Filippov
If we are talking about changes to the way the libs are built, is there
a chance of getting a (possibly optional) monolithic static library?
It is sometimes a bit of a pain to collect the list of the dependencies.
Alternatively some easier way to discover what belongs to what library
would be appreciated...

Igor


On Fri, Aug 19, 2016 at 10:40 AM, Greg Landrum 
wrote:

> Ok, here's the issue in github:
> https://github.com/rdkit/rdkit/issues/1036
>
> These are famous last words, but it looks like adding this, and making it
> optional, may be trivial. Cmake is *awesome*.
> Let's move any technical discussion to github
>
> -greg
>
>
> On Fri, Aug 19, 2016 at 2:28 PM, Brian Kelley 
> wrote:
>
>> Perhaps announce at the RDKit meeting and make the full change for the
>> first release of next year?  We could also make it a CMAKE option to
>> use/build the old names, but this would be a bit of work.
>>
>> Cheers,
>>  Brian
>>
>> On Fri, Aug 19, 2016 at 7:52 AM, Greg Landrum 
>> wrote:
>>
>>> Hi Paul,
>>>
>>> Nice suggestion. It seems logical to me, though I would probably go with
>>> RDKit instead of RD as the prefix.
>>> It's not a small change for people who are using the C++ libs without
>>> cmake (I wouldn't change the names of the cmake projects, so if you're
>>> using the RDKit cmake stuff nothing changes), but I suspect there aren't
>>> that many of you guys.
>>>
>>> @Gianluca: what do you think?
>>>
>>> Anyone else have an opinion?
>>>
>>> -greg
>>>
>>>
>>>
>>> On Fri, Aug 19, 2016 at 1:15 PM, Paul Emsley 
>>> wrote:
>>>

 Greg,

 RDKit is becoming increasingly popular and is getting picked up by
 third parties, including
 the Linux distros.  It seems to me that several RDKit library names are
 too generic (and
 hence confusing) for such an environment: I have in mind libs such as
 Alignment, Catalog,
 FileParsers (and others).  I suggest that all RDKit libraries are
 prefixed with RD (like
 RDGeneral and RDInchi). I think that this should be done at
 RDKit-Central rather than by
 patches applied by package maintainers at the distros.

 Yes, this will involve some fiddling for us C++ RDKit users, but worth
 it, I think.

 Paul.

 


Re: [Rdkit-discuss] GetMol and GetMolFrags in C++

2016-04-10 Thread Igor Filippov
Did you want
std::vector<boost::shared_ptr<ROMol> >
RDKit::MolOps::getMolFrags(const ROMol &mol, bool sanitizeFrags = true, ...)
by any chance? This will return a vector of ROMols which correspond to the
contiguous fragments.
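
To make "contiguous fragments" concrete, here is a toy sketch in plain Python. It is not RDKit code: atoms are just integer indices, bonds are index pairs, and a fragment is a connected component of what is left after a bond is deleted. The helper name fragments is made up for the illustration.

```python
def fragments(n_atoms, bonds):
    """Return connected components (sorted atom-index lists) of a bond graph."""
    neighbors = {i: set() for i in range(n_atoms)}
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen, result = set(), []
    for start in range(n_atoms):
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from `start`.
        stack, component = [start], []
        seen.add(start)
        while stack:
            atom = stack.pop()
            component.append(atom)
            for nxt in neighbors[atom] - seen:
                seen.add(nxt)
                stack.append(nxt)
        result.append(sorted(component))
    return result


# A four-atom chain 0-1-2-3; deleting the middle bond leaves two fragments.
chain = [(0, 1), (1, 2), (2, 3)]
remaining = [bond for bond in chain if bond != (1, 2)]
print(fragments(4, remaining))
```

With the real toolkit there is no need to write this by hand; the getMolFrags overload that returns molecules does the equivalent on an ROMol.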

Igor


On Sun, Apr 10, 2016 at 11:04 PM, Yingfeng Wang  wrote:

> Greg,
>
> Thanks. In python, I am using GetMolFrags by the following way,
>
> Chem.GetMolFrags(current_modified_mol, asMols=True, sanitizeFrags=False)
>
> However, none of the versions at the link you mentioned allows me to
> specify asMols. Could you please give me more hints?
>
> Thanks.
> Yingfeng
>
> On Sun, Apr 10, 2016 at 10:49 PM, Greg Landrum 
> wrote:
>
>>
>>
>> On Mon, Apr 11, 2016 at 1:35 AM, Yingfeng Wang 
>> wrote:
>>>
>>>
>>> Thanks. Say, the current RWMol object has 10 bonds, so ids of these
>>> bonds should be from 1-10. Now, I remove one bond. How do I reset all nine
>>> bonds with ids from 1-9 (if one or two atoms are gone with this bond, I
>>> also want to reset the ids of these atoms).
>>>
>>
>> You don't need to do that, it happens automatically
>>
>>
>>> In addition, say, there are two fragments after one bond is removed. How
>>> do I get these two fragments as mols. In python, I can use
>>> Chem.GetMolFrags. But how do I get my job done in C++?
>>>
>>
>> There are a number of different versions of MolOps::getMolFrags(), pick
>> the one that does what you like:
>>
>> http://rdkit.org/docs/cppapi/namespaceRDKit_1_1MolOps.html#ad8100d785d32fb3c173b83949766b87b
>>
>> Since the Python and C++ function/method names are often similar to each
>> other, a good way to find answers like this is to google for the name of
>> the Python function you are looking for and look for the link to the C++
>> documentation in the results.
>>
>> Best,
>> -greg
>>


Re: [Rdkit-discuss] Molecular dis / similarity using fingerprints

2015-05-27 Thread Igor Filippov
JP,

A bit of self-advertisement, if I may: our Diversity Genie, which uses
RDKit in the background by the way, was initially created to
answer this exact question. www.diversitygenie.com - I hope it comes in
useful.

Igor

On Wed, May 27, 2015 at 4:05 AM, JP jeanpaul.ebe...@inhibox.com wrote:

 Thanks all for the lit. references (and for the ever useful TL;DR).  It
 now seems clear that 0.7 is too high a value for ECFP4 (you convinced me).

 Yes George, that was what I was trying to do - make statements like this
 compound library is more diverse than this other, and quantify that
 diversity with a set of numbers.

 -
 Jean-Paul Ebejer
 Early Stage Researcher

 On 26 May 2015 at 12:57, George Papadatos gpapada...@gmail.com wrote:

 Hi JP,

 Aha, so you're looking for a threshold that will exhibit the optimal
 balance between the false positives and false negatives in the
 *biological* *activity* space. This threshold varies depending on the
 fingerprint and the dataset of course.
 See here for some generalised insights:

 (1) Papadatos, G.; Cooper, A. W. J.; Kadirkamanathan, V.; Macdonald, S.
 J. F.; McLay, I. M.; Pickett, S. D.; Pritchard, J. M.; Willett, P.; Gillet,
 V. J. Analysis of Neighborhood Behavior in Lead Optimization and Array
 Design. *J. Chem. Inf. Model.* *2009*, *49*, 195–208.

 especially Figure 17, and

 (2) Muchmore, S. W.; Debe, D. A.; Metz, J. T.; Brown, S. P.; Martin, Y.
 C.; Hajduk, P. J. Application of Belief Theory to Similarity Data Fusion
 for Use in Analog Searching and Lead Hopping. *J. Chem. Inf. Model.*
 *2008*, *48*, 941–948.

 and also Greg's blog post:

 http://rdkit.blogspot.co.uk/2013/10/fingerprint-thresholds.html


 The TL/DR version is that for ECFP_4, this threshold should be around
 0.45-0.55.
 Wrt methodology, are you trying to score/rank the
 intra-diversity/heterogeneity for different structure sets?


 Cheers,

 George



 On 26 May 2015 at 11:59, JP jeanpaul.ebe...@inhibox.com wrote:


 On 25 May 2015 at 22:23, Tim Dudgeon tdudgeon...@gmail.com wrote:

 Maybe a clustering approach would work? Something like sphere exclusion
 clustering, counting the number of clusters at 0.8 - 0.9 similarity?
 With 30K structures it sounds computationally tractable.
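
 A rough sketch of the idea, under some assumptions: fingerprints are plain Python sets of on-bit indices, similarity is Tanimoto (the thread does not name the metric), and the clustering is a simple order-dependent variant in which a compound becomes a new centre only if it lies outside every existing sphere. The function names are invented for this example.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def sphere_exclusion(fingerprints, threshold):
    """Greedy pass over the input; returns the indices of the cluster centres."""
    centres = []
    for idx, fp in enumerate(fingerprints):
        # A compound starts a new cluster only if it is outside every sphere.
        if all(tanimoto(fp, fingerprints[c]) < threshold for c in centres):
            centres.append(idx)
    return centres


fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {7, 8, 9}, {7, 8, 9, 10}]
for cutoff in (0.5, 0.8):
    print(cutoff, len(sphere_exclusion(fps, cutoff)))
```

 The number of centres is the cluster count, and in this toy set it doubles when the cutoff moves from 0.5 to 0.8, so the choice of threshold drives the answer.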


 Thanks Tim for this idea.  I hadn't heard of sphere exclusion.  The
 problem is we still need a distance / similarity function (which using ECFP
 with high similarity 0.8-0.9 would result in very few compounds being
 thrown out).  I think the real issue here is selecting a sensible
 similarity threshold which defines my idea of similarity.  But that is a
 tricky number to get right - too high and you remove nothing, too low and
 you start catching different molecules.  I guess the best thing is try a
 few values (0.5, 0.6, 0.7, 0.8, 0.9) and have a visual look at the
 remaining compounds.

 -
 JP




Re: [Rdkit-discuss] A RDKit/Scikit-learn question

2015-02-20 Thread Igor Filippov
Oops, sorry, I replied in the wrong branch! It looks like I am just
repeating Maciek's answer!

Igor

On Fri, Feb 20, 2015 at 7:50 AM, Igor Filippov igor.v.filip...@gmail.com
wrote:

 Maciek,

 I think scikit-learn is using numpy arrays and not plain Python lists.
 They look very similar, but are not quite the same thing.
 Maybe post a bit more complete code sample for people to play with?

 Igor

 On Fri, Feb 20, 2015 at 4:06 AM, Maciek Wójcikowski mac...@wojcikowski.pl
  wrote:

 Hello,

 If I remember correctly, the coefficients are a NumPy array. You can try
 model.coef_.flatten() to get a flat NumPy array. If you really want a
 Python list, then you should probably wrap it with
 list(model.coef_.flatten()).

 The main reason, why the vector is nested is that you can have many
 output values for one feature vector.

 PS.
 I could also recommend my Open Drug Discovery Toolkit for playing
 around with RDKit and scikit-learn.
 https://github.com/oddt/oddt

 
 Pozdrawiam,  |  Best regards,
 Maciek Wójcikowski
 mac...@wojcikowski.pl

 2015-02-20 7:29 GMT+01:00 Greg Landrum greg.land...@gmail.com:



 On Thu, Feb 19, 2015 at 11:59 PM, Matthew Lardy mla...@gmail.com
 wrote:


 I have been able to build models via scikit-learn with the RDKit python
 wrappers.  That all works beautifully!


 It's a nice combination, isn't it?


 What I am struggling to get are the weights, or scalers, applied to
 each bit position.  For a SVM regression model (SVR) I think that the
 values I seek are in the coef_ (if the model is created via the linear
 kernel).  But, all I get is something like this when I print that out:

 [[-0. -0.87146158 -0.46331996 ...,  0.31076767 -0.
 -0.81882195]]


 I don't really know the SVM regression approach particularly well, but
 it looks like that's a vector of vectors. Is the length of the inner vector
 the same as the length of the fingerprint/descriptor vector you are
 providing?

 -greg





Re: [Rdkit-discuss] A RDKit/Scikit-learn question

2015-02-20 Thread Igor Filippov
Maciek,

I think scikit-learn is using numpy arrays and not plain Python lists. They
look very similar, but are not quite the same thing.
Maybe post a bit more complete code sample for people to play with?
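
To show the difference, here is a small sketch that uses a hand-made array in place of a fitted model. The values are just the ones from the printout, and `coef` stands in for model.coef_, which for a single-output linear model is assumed to have shape (1, n_features).

```python
import numpy as np

# Stand-in for model.coef_ from a linear-kernel fit: shape (1, n_features).
coef = np.array([[-0.0, -0.87146158, -0.46331996, 0.31076767]])

flat = coef.flatten()    # 1-D NumPy array, one weight per feature
weights = flat.tolist()  # plain Python list of floats

print(coef.shape, flat.shape, type(weights).__name__)
```

After flattening, weights[i] lines up with bit position i of the fingerprint, provided the features were passed in that order.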

Igor

On Fri, Feb 20, 2015 at 4:06 AM, Maciek Wójcikowski mac...@wojcikowski.pl
wrote:

 Hello,

 If I remember correctly, the coefficients are a NumPy array. You can try
 model.coef_.flatten() to get a flat NumPy array. If you really want a
 Python list, then you should probably wrap it with
 list(model.coef_.flatten()).

 The main reason, why the vector is nested is that you can have many
 output values for one feature vector.

 PS.
 I could also recommend my Open Drug Discovery Toolkit for playing
 around with RDKit and scikit-learn.
 https://github.com/oddt/oddt

 
 Pozdrawiam,  |  Best regards,
 Maciek Wójcikowski
 mac...@wojcikowski.pl

 2015-02-20 7:29 GMT+01:00 Greg Landrum greg.land...@gmail.com:



 On Thu, Feb 19, 2015 at 11:59 PM, Matthew Lardy mla...@gmail.com wrote:


 I have been able to build models via scikit-learn with the RDKit python
 wrappers.  That all works beautifully!


 It's a nice combination, isn't it?


 What I am struggling to get are the weights, or scalers, applied to each
 bit position.  For a SVM regression model (SVR) I think that the values I
 seek are in the coef_ (if the model is created via the linear kernel).
 But, all I get is something like this when I print that out:

 [[-0. -0.87146158 -0.46331996 ...,  0.31076767 -0.
 -0.81882195]]


 I don't really know the SVM regression approach particularly well, but it
 looks like that's a vector of vectors. Is the length of the inner vector
 the same as the length of the fingerprint/descriptor vector you are
 providing?

 -greg





Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !

2015-02-19 Thread Igor Filippov
Dmitri,

As others before me, I tried to explain to you that the simplistic
definition of a unique molecule
is rather naive and is neither useful nor a reflection of reality. Perhaps my
explanation is woefully inadequate to convey
the meaning I would like to convey, but that is no excuse to reply with
rudeness and condescension.
If you are unable to present your arguments in a civilized manner then
please cease this discussion.

Best regards,
Igor


On Thu, Feb 19, 2015 at 2:53 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu
wrote:

 On 02/19/2015 01:24 PM, Igor Filippov wrote:
  Markus also spelled out for you different variations for context in the
  same exchange.
  Do different tautomers represent different chemicals or the same one?

 Read the thread.

  Do face recognition identifiers even approach the accuracy of InChI
  identifiers?

 At this point it's the other way around: facial recognition success
 rates are anywhere between 25% and 95%. How many of the existing
 molecules can be represented by InChI? (I'll give you a hint: none
 longer than MAX_ATOMS defined in ichisize.h.)

  If you still insist that there could be only one singular definition of
  unique in the universe then I am afraid
  this definition has no meaning and you are alone...ehm... unique.. in
 using
  it.

 When you have a different analytical engine built on different logic,
 then you can have your different definition of unique in the context
 of a computer system. As long as you're using a digital computer you're
 using the same simplistic integers, boolean algebra, and discrete math.
 That's the objective reality, it won't change no matter how much you can
 argue faces in universe.

 Similarly, when you get yourself a different English language, then you
 can have a different definition of unique as an English word. In this
 version of English, go buy a dictionary.
 --
 Dimitri Maziuk
 Programmer/sysadmin
 BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu




Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !

2015-02-19 Thread Igor Filippov
  No. there's only one definition of unique
This is way too simplistic. The definition of unique depends on the
application.
Not only in chemistry but other fields as well. The way you just defined
unique is appropriate for integer numbers,
but not everything is quite so trivial.
Is a human face unique? What about pictures of the same person taken at 5, 15,
25, 45 years of age?
Are they the same picture or completely different? Faces of identical twins?
The uniqueness is defined by what you need to accomplish, not by some
god-given attribute of the object,
otherwise no two things are the same and unique loses all meaning.

Best,
Igor


On Thu, Feb 19, 2015 at 12:54 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu
wrote:

 On 02/19/2015 08:54 AM, Markus Sitzmann wrote:
  A database can have several definitions of unique for anything - a
  structure database can have this, too. If you have a chemical compound
  which can form 10 different tautomers, you can represent the compound
  by 10 chemical structures (it is still the same compound, though). So,
  if you define uniqueness on basis of chemical compound, you have one
  db entry and this one entry has a single (tautomer-sensitive) InChI
  covering 10 chemical structures; if you define uniqueness on basis of
  tautomers/chemical structures (because all are relevant, for instance,
  in NMR spectroscopy) you have (and want) 10 database entries, each with
  a single (tautomer-sensitive) InChI. Two definitions of unique.
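
  The two definitions can be pictured as two different keys over the same rows. A toy sketch with made-up records: the field names "compound" and "structure" and the helper unique_by are hypothetical, and in a real database the keys would be the corresponding identifier strings rather than names.

```python
# Hypothetical records: one compound drawn as two tautomers, plus another compound.
records = [
    {"compound": "2-pyridone", "structure": "O=c1cccc[nH]1"},
    {"compound": "2-pyridone", "structure": "Oc1ccccn1"},
    {"compound": "ethanol", "structure": "CCO"},
]


def unique_by(rows, key):
    """Group rows under the chosen uniqueness key."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row)
    return groups


print(len(unique_by(records, "compound")))   # entries when unique per compound
print(len(unique_by(records, "structure")))  # entries when unique per tautomer
```

  The same three rows give two entries under one key and three under the other, and each key is unique within its own definition.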

 No. there's only one definition of unique: a unique key is a set of
 attributes that is guaranteed to be unique for each entity. The
 relationship between the key and the entity is symmetric: if x is the
 inchi string for compound y then y is the compound for inchi string x.

 It follows that if y is the compound for inchi string x, and z is also
 the compound for inchi string x, then x is not unique.

 What you have is two definitions of chemical compound.

 You can, in your database, define 10 different tautomers as ein
 compound, ein unique key. Your database will be useless for any number
 of applications. You can define 10 different tautomers as 10 different
 compounds with 10 different unique keys. Your database will be too
 heavy for any number of applications. It's your database.

 What you can't do is redefine unique to mean two things at once: it's
 not your discrete math. Sorry.
 --
 Dimitri Maziuk
 Programmer/sysadmin
 BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu





Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !

2015-02-18 Thread Igor Filippov
 update the bug report and work on tracking down the wrong problem

That's how I sometimes do it too... ;)

Igor

On Wed, Feb 18, 2015 at 12:35 PM, Greg Landrum greg.land...@gmail.com
wrote:

 Yep, you guys are right.
 I diagnosed that too quickly.
 Thanks for pointing out the mistake.

 I'll update the bug report and work on tracking down the wrong problem

 -greg


 On Wed, Feb 18, 2015 at 4:58 PM, Markus Sitzmann 
 markus.sitzm...@gmail.com wrote:

 I agree with John, the InChI for mol1 and mol2 should be


 http://cactus.nci.nih.gov/chemical/structure/O=C(NCCc1c1)[C@H]1CC[C@H](Cn2c(O)nc3c3c2=O)CC1/stdinchi


 InChI=1S/C24H27N3O3/c28-22(25-15-14-17-6-2-1-3-7-17)19-12-10-18(11-13-19)16-27-23(29)20-8-4-5-9-21(20)26-24(27)30/h1-9,18-19H,10-16H2,(H,25,28)(H,26,30)/t18-,19-

 So the + at the end should be a -

 Markus

 On Wed, Feb 18, 2015 at 2:53 PM, John M john.wilkinson...@gmail.com
 wrote:
  Hi Greg,
 
  I believe it's an RDKitMol -> InChI issue rather than InChI -> RDKitMol.
  The correct InChI (below) is different from that in the IPython listing.
 
 
 InChI=1S/C24H27N3O3/c28-22(25-15-14-17-6-2-1-3-7-17)19-12-10-18(11-13-19)16-27-23(29)20-8-4-5-9-21(20)26-24(27)30/h1-9,18-19H,10-16H2,(H,25,28)(H,26,30)/t18-,19-
 
  J
 
 
  Regards,
  John W May
  john.wilkinson...@gmail.com
 
  On 18 February 2015 at 04:57, Greg Landrum greg.land...@gmail.com
 wrote:
 
  JP,
 
  Looks like that's a bug in the way ring stereochemistry is handled while
  translating the InChI back into a molecule.
 
  It's reproducible with a small example:
  In [1]: from rdkit import Chem
 
  In [2]: mol1 = Chem.MolFromSmiles('C[C@H]1CC[C@H](O)CC1')
 
  In [3]: Chem.MolToSmiles(mol1,True)
  Out[3]: 'C[C@H]1CC[C@H](O)CC1'
 
  In [4]: inchi = Chem.MolToInchi(mol1)
 
  In [5]: mol2 = Chem.MolFromInchi(inchi)
 
  In [6]: Chem.MolToSmiles(mol2,True)
  Out[6]: 'C[C@H]1CC[C@@H](O)CC1'
 
  Conversion of InChI to molecules is something that's not in general
  guaranteed to work perfectly, but I will go ahead and create a bug
 report.
 
  -greg
 
 
 
  On Tue, Feb 17, 2015 at 2:50 PM, JP jeanpaul.ebe...@inhibox.com
 wrote:
 
  Hi there,
 
  I have a question for the 3D enabled of you (I wish the world looked
 like
  GTA2 !)
 
  I am seeing a case of an RDKit mol -> InChI -> RDKit mol round trip that I
  think is changing the stereochemistry of the molecule.  I have 12
  example pairs where this happens (but all very structurally similar).  I
  don't care much that the last RDKit molecule is a different tautomer than
  the starting one, but if this is the case the stereochemistry should still
  be conserved, no?
 
  I did an ipython notebook (most useful tool of the decade after
 RDKit?)
  gist here:
 
 
 
 http://nbviewer.ipython.org/urls/gist.githubusercontent.com/anonymous/7c158926a0f3bf9a4978/raw/d91cc808ac91eccc8bf0e45d9eacd2af382e5105/gistfile1.txt
 
  I would appreciate it if anyone could shed some light.  I'd just like to
  understand.
 
  Thank you for your time!
 
  -
  JP
 
 
 
  ___
  Rdkit-discuss mailing list
  Rdkit-discuss@lists.sourceforge.net
  https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
 
 
 
 
 
 
 
 
 

Re: [Rdkit-discuss] Tests failing on Windows: more info

2015-02-11 Thread Igor Filippov
Not that I use it or recommend it but a bit of googling brought me this:
http://www.appveyor.com/

Also, perhaps something is possible with the Azure cloud and Visual Studio
Online?

HTH,
Igor


On Tue, Feb 10, 2015 at 11:52 PM, Greg Landrum greg.land...@gmail.com
wrote:

 This particular problem was my fault. It had been a while since I did a
 windows build, so I hadn't caught the problem you point out below. I just
 fixed it and checked in the changes.

 Apologies that you ended up wasting time on this.

 If anyone knows of any online continuous-integration providers (like
 Travis) that support Windows, I would be very, very happy to hear about it.

 -greg


 On Tue, Feb 10, 2015 at 7:14 PM, James Davidson j.david...@vernalis.com
 wrote:

  Hi Paolo, Greg, et al.



 I have also been having some problems recently building (64-bit Windows)
 from recent github versions, but I don’t know if this is related to what
 you see, Paolo…

 My environment is Win 7 64-bit, CMake 3.0.0, boost_1_55_0-msvc-11.0-64,
 MS Visual Studio Express 2012.



 I have done a bit of version rolling-back and forwards to see if I can
 pinpoint the last version that builds with no errors, and this is what I
 have found so far (sorted by revision, not by sequence of attempts!):



 4577   - compiles fine, passes all tests
 *4618* - as above
 4649   - some errors during compile, passes all tests except the
          molDraw2D bits (which are also involved in the errors)
 4651   - as above
 4743   - as above
 *4765* - as above
 4780   - pyGraphMolWrap now fails test
 4826   - this is where significant problems start (for me at least).
          pyGraphMolWrap still fails, but now with a segfault
 4859   - same segfault as above.  Also pymolDraw2D test fails…





 The errors I start to see for molDraw2D are this sort of thing (is this
 expected?):



 Error 49: error C2668: 'boost::tuples::tie' : ambiguous call to overloaded function
   File: C:\RDKit\Code\GraphMol\MolDraw2D\MolDraw2D.cpp, line 341, column 1, project MolDraw2D

 Error 50: error C2668: 'boost::tuples::tie' : ambiguous call to overloaded function
   File: C:\RDKit\Code\GraphMol\MolDraw2D\MolDraw2D.cpp, line 353, column 1, project MolDraw2D

 Error 51: error C2668: 'boost::tuples::tie' : ambiguous call to overloaded function
   File: C:\RDKit\Code\GraphMol\MolDraw2D\MolDraw2D.cpp, line 357, column 1, project MolDraw2D

 Error 61: error C2668: 'boost::tuples::tie' : ambiguous call to overloaded function
   File: C:\RDKit\Code\GraphMol\MolDraw2D\MolDraw2D.cpp, line 544, column 1, project MolDraw2D

 Error 63: error C2668: 'boost::tuples::tie' : ambiguous call to overloaded function
   File: C:\RDKit\Code\GraphMol\MolDraw2D\MolDraw2D.cpp, line 591, column 1, project MolDraw2D

 Error 131: error LNK1181: cannot open input file '..\..\..\lib\Release\MolDraw2D.lib'
   File: C:\RDKit\build\Code\GraphMol\MolDraw2D\LINK, project moldraw2DTest1

 Error 149: error LNK1181: cannot open input file '..\..\..\lib\Release\MolDraw2D.lib'
   File: C:\RDKit\build\Code\GraphMol\Wrap\LINK, project rdmolops





 If I see the above errors when building ‘ALL_BUILD’, I also see the
 following error when building the ‘INSTALL’ section:



 Error 41: error MSB3073: The command "setlocal
 "C:\Program Files (x86)\CMake\bin\cmake.exe" -DBUILD_TYPE=Release -P cmake_install.cmake
 if %errorlevel% neq 0 goto :cmEnd
 :cmEnd
 endlocal & call :cmErrorLevel %errorlevel% & goto :cmDone
 :cmErrorLevel
 exit /b %1
 :cmDone
 if %errorlevel% neq 0 goto :VCEnd
 :VCEnd" exited with code 1.
   File: C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\Microsoft.CppCommon.targets, line 134, column 5, project INSTALL







 Anyway, 4618 is the latest revision that I have tested where I see no
 build errors, and 4765 is the latest revision I’ve found before I start to
 see pyGraphMolWrap tests failing (or segfaults).  For now, I have
 rolled-back my installation to 4618 (but would be very happy if anyone can
 figure-out what causes the problems with later revisions).



 Kind regards



 James
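
 Rolling back and forth between revisions by hand, as described above, can be
 shortened with a binary search. The following is only a sketch: the function
 name and the predicate are hypothetical, and it assumes that once a revision
 stops building cleanly every later revision is broken as well. In practice the
 predicate would check out the revision, build it, and report the result.

```python
def first_bad_revision(revisions, builds_cleanly):
    """Return the earliest revision for which builds_cleanly(rev) is False.

    Assumes `revisions` is sorted ascending, the first entry is good, the
    last entry is bad, and every revision after the first bad one is bad.
    """
    lo, hi = 0, len(revisions) - 1  # invariant: revisions[lo] good, revisions[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if builds_cleanly(revisions[mid]):
            lo = mid
        else:
            hi = mid
    return revisions[hi]


# Revisions tested in the message above. The lambda is a stand-in for a real
# "check out, build, report success" step; 4618 was the last clean build.
tested = [4577, 4618, 4649, 4651, 4743, 4765, 4780, 4826, 4859]
print(first_bad_revision(tested, lambda rev: rev <= 4618))  # prints 4649
```

 With nine candidate revisions this needs three builds instead of up to nine.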


Re: [Rdkit-discuss] Tests failing on Windows

2015-01-23 Thread Igor Filippov
I think this kind of error pops up when the environment variables RDBASE,
PYTHONPATH and LD_LIBRARY_PATH haven't been set up.
Also, make sure you run make install (or its equivalent for MSVC; I have only
used the Linux/MSYS/OSX versions) before running the tests.

Hope this helps,
Igor
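
A quick way to check the first point before running the tests is a few lines
of standard-library Python. This is just a sketch: the helper name is made up,
and the variable list is the one from the message above. On Windows the DLL
search uses PATH rather than LD_LIBRARY_PATH, so the tuple would need adjusting
there.

```python
import os

# Variables named in the message above; adjust for your platform.
REQUIRED_VARS = ("RDBASE", "PYTHONPATH", "LD_LIBRARY_PATH")


def missing_env_vars(environ=None, names=REQUIRED_VARS):
    """Return the names from `names` that are unset or empty in `environ`."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Not set: " + ", ".join(missing))
    else:
        print("All of %s are set" % ", ".join(REQUIRED_VARS))
```

Run it with the same interpreter and from the same shell that runs the tests,
since that is the environment that matters.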

On Fri, Jan 23, 2015 at 9:42 AM, Paolo Tosco paolo.to...@unito.it wrote:

 Hi,

 I have a strong feeling I am doing something silly but I cannot find out
 what it is.
 The current RDKit development version builds fine on Windows using MSVC
 2013, pre-built Python27, pre-built Boost_1_55_0.
 However, it fails a few tests:

 The following tests FAILED:
4 - pyDiscreteValueVect (Failed)
5 - pySparseIntVect (Failed)
   32 - testMolSupplier (Failed)
   48 - pyPartialCharges (Failed)
   69 - pyGraphMolWrap (Failed)
   75 - pyRanker (Failed)
   77 - pyFeatures (Failed)
   78 - pythonTestDbCLI (Failed)
   79 - pythonTestDirML (Failed)
   84 - pythonTestDirChem (Failed)

 More specifically, testMolSupplier.exe crashes (testMolSupplier.exe has
 stopped working), while
 Python tests fail because of Python modules failing to load:

 c:\build\rdkit\Code\ChemicalFeatures\Wrap>c:\Python27\python.exe testFeatures.py
 Testing ChemicalFeatures Wrapper code:
 .E
 ======================================================================
 ERROR: testPickle (__main__.TestCase)
 ----------------------------------------------------------------------
 Traceback (most recent call last):
   File "testFeatures.py", line 81, in testPickle
     ffeat2=cPickle.load(inF, encoding='bytes')
   File "C:\build\rdkit\rdkit\_py2_pickle.py", line 5, in load
     def load(f, **kwargs): return _load(f)
 ImportError: No module named rdChemicalFeatures

 ----------------------------------------------------------------------
 Ran 2 tests in 0.001s

 FAILED (errors=1)

 However, the following works fine:

 c:\build\rdkit\Code\ChemicalFeatures\Wrap>c:\Python27\python.exe
 Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win32
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import rdkit
 >>> from rdkit import Chem
 >>> from rdkit.Chem import rdChemicalFeatures
 >>>


 In the past, I have been able to build the RDKit on Windows without
 problems, using MSVC 2010.
 Can anyone find a solution to these issues?

 Thank you very much in advance, kind regards,
 Paolo






Re: [Rdkit-discuss] ubuntu 14.04 hangs on building SLNAttribs.cpp.o

2014-10-09 Thread Igor Filippov
Pass -DRDK_BUILD_SLN_SUPPORT=OFF to cmake.

There is an additional bonus when getting rid of SLN support: boost can be
used as a headers-only library, so it does not have to be compiled prior to
RDKit compilation.

Igor

On Thu, Oct 9, 2014 at 10:07 AM, Michał Nowotka mmm...@gmail.com wrote:

 Hi,

 I'm trying to prepare my_chembl_19 vagrant distribution.
 I'm using Ubuntu 14.04 as a vagrant base box. Unfortunately the VM hangs
 while building RDKit; this is how it looks in the console:

 ==> default: [ 80%]
 ==> default: Building CXX object Code/ChemicalFeatures/Wrap/CMakeFiles/rdChemicalFeatures.dir/rdChemicalFeatures.cpp.o
 ==> default: [ 81%]
 ==> default: Building CXX object Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/SLNAttribs.cpp.o

 And then it hangs forever...

 I'm using Release_2014_03_1 branch, if you want to see the script
 installing RDKit it's here:

 https://github.com/chembl/mychembl/blob/master/rdkit_install.sh


 In https://code.google.com/p/rdkit/wiki/BuildingWithCmake there is a
 "Frequently Encountered Problems" section suggesting the use of
 -DBoost_USE_STATIC_LIBS=OFF for a problem related to libSLNParse.so.

  - Do you think it can help in my case?
  - is there any way to exclude SLN Parser from RDKit at all - do I
 really need it?

 Kind regards,
 Michał Nowotka





Re: [Rdkit-discuss] Having problems with installing RDKit_2014_03_1 in Ubuntu

2014-07-24 Thread Igor Filippov
Why are you linking with a static libpython in /usr/local/lib? Doesn't Ubuntu
come with a packaged libpython?

I would try either compiling RDKit as a set of static libraries (since
you're linking to a static libpython), or getting a shared libpython.so.
And/or try recompiling with -fPIC, as the error message recommends.

Cheers,
Igor
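
To see which kind of libpython a given interpreter was built with, the
standard sysconfig module can be queried. A minimal sketch follows; the helper
name is made up, and the config variables used (Py_ENABLE_SHARED, LIBDIR,
LDLIBRARY) are the ones recorded by POSIX builds of Python, so they may come
back as None on other platforms.

```python
import sysconfig


def python_link_info():
    """Report how the running interpreter's Python library was built."""
    return {
        # Truthy when Python was configured with --enable-shared
        "shared": bool(sysconfig.get_config_var("Py_ENABLE_SHARED")),
        # Directory and file name of the Python library, if the build recorded them
        "libdir": sysconfig.get_config_var("LIBDIR"),
        "library": sysconfig.get_config_var("LDLIBRARY"),
    }


if __name__ == "__main__":
    print(python_link_info())
```

If "shared" is False and "library" ends in .a, the interpreter found first on
the PATH (here presumably the one under /usr/local) only provides a static
library, which matches the linker error quoted below.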


On Thu, Jul 24, 2014 at 9:40 AM, Jessica Krause jessica.kra...@tu-bs.de
wrote:

 Dear all,

 I tried to install RDKit 2014 on Ubuntu 14.04 but I did
 not succeed!


 While executing the make command in the RDKit_2014_03_1/build directory, I
 received the following error:

 [  0%] Built target inchi_support
 [  1%] Built target RDGeneral
 [  3%] Built target RDGeneral_static
 [  3%] Built target testDict
 Linking CXX shared library ../../lib/libRDBoost.so
 /usr/bin/ld: /usr/local/lib/libpython2.7.a(exceptions.o): relocation
 R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared
 object; recompile with -fPIC
 /usr/local/lib/libpython2.7.a: error adding symbols: Bad value
 collect2: error: ld returned 1 exit status
 make[2]: *** [lib/libRDBoost.so.1.2014.03.1] Error 1
 make[1]: *** [Code/RDBoost/CMakeFiles/RDBoost.dir/all] Error 2
 make: *** [all] Error 2




 the environment variables that I have used are:

 export RDBASE=opt/RDKit_2014_03_1/
 export LD_LIBRARY_PATH=opt/RDKit_2014_03_1/build/lib/:usr/local/src/boost_1_55_0/libs/
 export PYTHONPATH=opt/RDKit_2014_03_1/


 Please help me with this problem.

 Thanks in advance.

 Regards,
 Jessica Krause






Re: [Rdkit-discuss] MaxMin Picker and Python

2014-07-16 Thread Igor Filippov
Matthew,

Two lines of shameless self-promotion:
This is exactly the kind of problem Diversity Genie is meant for -
http://www.diversitygenie.com/
It uses the RDKit library underneath, but wraps it in a simple, easy-to-use
GUI front-end.

Best regards,
Igor


On Wed, Jul 16, 2014 at 6:18 PM, Matthew Lardy mla...@gmail.com wrote:

 Hi all,

 I have been playing with the diversity selection in RDKit.  I am running
 through a set of ~26,000 molecules to pick a set of 200 diverse molecules.
 I saw some examples of how to do this in Python (my variant of their script
 below), but the memory consumption is massive.  I burned through ~15GB of
 memory before I killed it off.  Is this about what others have seen, or
 should I move to doing this in C++ or Java (assuming that others have seen
 a significantly lower level of memory consumption)?

 Here is the script:

 import gzip

 import numpy as np
 from rdkit import Chem
 from rdkit import DataStructs
 from rdkit.Chem import AllChem
 from rdkit.SimDivFilters import rdSimDivPickers

 # Read the molecules, skipping entries that fail to parse
 zims = [x for x in Chem.ForwardSDMolSupplier(gzip.open('a.sdf.gz')) if x is not None]
 zims_fps = [AllChem.GetMorganFingerprintAsBitVect(x, 2) for x in zims]
 nfps = len(zims_fps)

 # Pick() expects the lower triangle of the distance matrix as a flat array,
 # so row i holds only the distances to fingerprints 0..i-1
 dm = []
 for i, fp in enumerate(zims_fps):
     dm.extend(DataStructs.BulkTanimotoSimilarity(fp, zims_fps[:i], returnDistance=True))
 dm = np.array(dm)

 picker = rdSimDivPickers.MaxMinPicker()
 ids = picker.Pick(dm, nfps, 200)
 list(ids)


 Thanks in advance!
 Matt
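
 For a sense of scale: the lower triangle for 26,000 molecules holds
 26000 * 25999 / 2, or about 338 million distances. That is roughly 2.7 GB as
 8-byte doubles and considerably more while it is still a Python list of float
 objects. MaxMin itself does not need the whole matrix at once. The sketch
 below is a plain-Python illustration of the greedy MaxMin idea with distances
 computed on demand; it is not RDKit's implementation, the function and
 argument names are made up, and it always starts from a fixed first item.

```python
def maxmin_pick(pool_size, pick_size, dist, first=0):
    """Greedy MaxMin selection using O(pool_size) memory.

    dist(i, j) returns the distance between items i and j; it is called on
    demand, so no full distance matrix is ever stored.
    """
    if not 0 < pick_size <= pool_size:
        raise ValueError("pick_size must be between 1 and pool_size")
    picked = [first]
    # nearest[i]: distance from item i to its closest picked item (-1 once picked)
    nearest = [dist(first, i) for i in range(pool_size)]
    nearest[first] = -1.0
    while len(picked) < pick_size:
        # Next pick: the item farthest from everything picked so far
        best = max(range(pool_size), key=nearest.__getitem__)
        picked.append(best)
        nearest[best] = -1.0
        for i in range(pool_size):
            if nearest[i] >= 0.0:
                nearest[i] = min(nearest[i], dist(best, i))
    return picked


# Toy example on points along a line; with fingerprints, dist could instead
# return 1 - Tanimoto similarity for the pair (i, j).
points = [0.0, 1.0, 2.0, 10.0, 11.0]
print(maxmin_pick(len(points), 3, lambda i, j: abs(points[i] - points[j])))  # prints [0, 4, 2]
```

 RDKit's MaxMinPicker also has a LazyPick method that takes a distance
 function instead of a precomputed matrix, which is probably worth trying
 before moving to C++ or Java.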






[Rdkit-discuss] boost::regex

2014-02-12 Thread Igor Filippov
Dear Greg et al,

I was wondering if the only binary dependency on boost is boost::regex, used
in the SLN reader/writer?
If that is the case, would it be possible to ifdef this code with a flag? I
have never had a need to use SLN, but compiling the correct version of
boost::regex and carrying around the associated dependencies can sometimes be
quite a burden (I'm looking at OSX).

That is, if I don't need the SLN format I can just do cmake -DNO_SLN and I
won't have to worry about boost::regex, etc.?

Just a question!
Igor


Re: [Rdkit-discuss] boost::regex

2014-02-12 Thread Igor Filippov
Fantastic!
You're ahead of the curve as always!

Igor


On Wed, Feb 12, 2014 at 10:55 AM, Greg Landrum greg.land...@gmail.com wrote:

 Hi Igor,


 On Wed, Feb 12, 2014 at 4:42 PM, Igor Filippov igor.v.filip...@gmail.com wrote:

 Dear Greg et al,

 I was wondering if the only binary dependency on boost is
 boost::regex used in SLN reader/writer?
 If that is the case would it be possible to ifdef this code with a flag -
 I never
 had a need to use SLN but compiling the correct version of boost::regex
 and carrying around
 associated dependencies can sometimes be quite a burden (I'm looking at
 OSX)?

 That is, if I don't need SLN format I can just do cmake  -DNO_SLN and I
 won't have to worry about boost::regex, etc.?


 Correct. The argument (already there) is:
 -DRDK_BUILD_SLN_SUPPORT=OFF

 -greg








[Rdkit-discuss] RDKit+InChI on MinGW64

2014-02-03 Thread Igor Filippov
I ran into a small problem compiling InChI (as part of RDKit) on 64-bit
Windows using MinGW64: _strdup was defined in util.c and it also seems to be
available from some default Windows library.
The attached tiny patch fixed the problem for me; perhaps it will be useful
to someone else as well.

Igor


inchi.patch
Description: Binary data


Re: [Rdkit-discuss] InChI roundtrip

2014-01-30 Thread Igor Filippov
George,

Have you added coordinates to the mols converted from InChI?
It made a huge difference for the examples I've tried.

Igor


On Thu, Jan 30, 2014 at 2:07 PM, George Papadatos gpapada...@gmail.com wrote:

 OK just to add some fuel to this fire: A colleague of mine and I looked at
 the inchi roundtrip using KNIME 2.9 and the latest versions of indigo and
 rdkit nodes. We used ~90,000 inchis from chembl_17, converted them to mols
 (sanitise + remove Hs), removed the ones that fail to convert, and then we
 converted back to inchis (standard ones, no extra parameters). We assessed
 the discrepancies between indigo and rdkit inchis compared to the original
 input inchis that are stored in chembl.
 Rdkit had 10 times more discrepancies with 200 failures as opposed to 21
 from indigo. This rate (~0.2%) was also confirmed using ~1 million inchis.

 I had a closer look to a couple of cases here:
 http://nbviewer.ipython.org/gist/madgpap/8715974

 It seems that there is more than one reason for the failure. I totally
 understand Greg's caution about the inchi2mol conversion, but given the
 difference between rdkit and indigo, there might be room for improvement. Any
 insights would be very much appreciated.

 Btw, the KNIME workflow and full list of fails are available to you.

 Cheers,

 George



 On 30 January 2014 04:11, Greg Landrum greg.land...@gmail.com wrote:

 Yeah, I have been tempted several times to remove the InChI->RDKit
 functionality entirely



 On Thu, Jan 30, 2014 at 5:05 AM, Igor Filippov igor.v.filip...@gmail.com
  wrote:

 Thank you, Greg!
 Very nice explanation and I think this issue has confused people before
 me as well. I am going to have to keep reminding myself about it as the
 subject comes up every now and then.

 Igor
 On Jan 29, 2014 10:59 PM, Greg Landrum greg.land...@gmail.com wrote:

 Hi Igor,

 On Wed, Jan 29, 2014 at 2:04 PM, Igor Filippov 
 igor.v.filip...@gmail.com wrote:

 Greg et al,

 Here is a little script that demonstrates a problem with fingerprints
 after the roundtrip through InChI.
 My input mol file is also attached.
 As you can see the similarity between before and after is not 1 in
 45 out of 100 cases.
 In one case it is as low as 0.29. Could someone take a look and tell
 me what I'm doing wrong?


 Ah! Now I see what you're doing and understand the problem.

 It's really important when using InChI to remember that InChI is
 designed to be an identifier, not an interchange format. The InChI
 algorithm modifies the molecule as part of its canonicalization step. This
 modification includes standardizing tautomers.

 Here's an example of the type of substructure modification that happens
 in your molecules:
 input smiles c1ccccc1C(=O)Nc1ccccc1 on being converted to InChI and
 back yields: OC(=Nc1ccccc1)c1ccccc1

 Basically: If you think you know what your molecules are, you probably
 should be building them from SMILES or CTAB, not InChI.

 Apologies that I didn't think of this before; I was just focusing on
 the stereochemistry.

 -greg






Re: [Rdkit-discuss] InChI roundtrip

2014-01-29 Thread Igor Filippov
Thank you, Greg!
Very nice explanation and I think this issue has confused people before me
as well. I am going to have to keep reminding myself about it as the
subject comes up every now and then.

Igor
On Jan 29, 2014 10:59 PM, Greg Landrum greg.land...@gmail.com wrote:

 Hi Igor,

 On Wed, Jan 29, 2014 at 2:04 PM, Igor Filippov igor.v.filip...@gmail.com wrote:

 Greg et al,

 Here is a little script that demonstrates a problem with fingerprints
 after the roundtrip through InChI.
 My input mol file is also attached.
 As you can see the similarity between before and after is not 1 in 45
 out of 100 cases.
 In one case it is as low as 0.29. Could someone take a look and tell me
 what I'm doing wrong?


 Ah! Now I see what you're doing and understand the problem.

 It's really important when using InChI to remember that InChI is designed
 to be an identifier, not an interchange format. The InChI algorithm
 modifies the molecule as part of its canonicalization step. This
 modification includes standardizing tautomers.

 Here's an example of the type of substructure modification that happens in
 your molecules:
 input smiles c1ccccc1C(=O)Nc1ccccc1 on being converted to InChI and back
 yields: OC(=Nc1ccccc1)c1ccccc1

 Basically: If you think you know what your molecules are, you probably
 should be building them from SMILES or CTAB, not InChI.

 Apologies that I didn't think of this before; I was just focusing on the
 stereochemistry.

 -greg



[Rdkit-discuss] InChI roundtrip

2014-01-27 Thread Igor Filippov
I noticed that if I convert mol to inchi and then back to mol, in quite a
few cases the stereochemistry information gets lost. Is it something that is
handled completely by the InChI library, or is RDKit not reading the mols
produced from InChI correctly?

Igor


Re: [Rdkit-discuss] InChI roundtrip

2014-01-27 Thread Igor Filippov
Indeed - adding 2d coordinates fixed comparison based on InChI, however
Morgan fingerprints still show some differences though I need to check
whether it's for the same 26 examples.
My whole set is larger and it will take a bit of digging to find out.
I guess I'm still missing something...

Thanks!
Igor


On Mon, Jan 27, 2014 at 11:42 PM, Greg Landrum greg.land...@gmail.com wrote:

 hmm, I can't reproduce that:

 In [16]: inls = [x.strip().split() for x in file('/Users/landrgr1/Downloads/t.tab').readlines()]

 In [17]: inls.pop(0)
 Out[17]: ['InChI', 'Original_InChI']

 In [18]: for inch,o_inch in inls:
    ....:     m = Chem.MolFromInchi(o_inch)
    ....:     n_inch = Chem.MolToInchi(m)
    ....:     if n_inch!=o_inch:
    ....:         print o_inch
    ....:         print n_inch
    ....:         print Chem.MolToSmiles(m,True)
    ....:
 [05:37:40] WARNING: Proton(s) added/removed
 [05:37:41] WARNING: Proton(s) added/removed
 [05:37:41] WARNING: Proton(s) added/removed
 [05:37:41] WARNING: Charges were rearranged

 In [19]: from rdkit.Chem import AllChem

 In [20]: for inch,o_inch in inls:
    ....:     m = Chem.MolFromInchi(o_inch)
    ....:     AllChem.Compute2DCoords(m)
    ....:     mb = Chem.MolToMolBlock(m)
    ....:     nm = Chem.MolFromMolBlock(mb)
    ....:     n_inch = Chem.MolToInchi(nm)
    ....:     if n_inch!=o_inch:
    ....:         print o_inch
    ....:         print n_inch
    ....:         print Chem.MolToSmiles(m,True)
    ....:         print Chem.MolToSmiles(nm,True)
    ....:
 [05:39:44] WARNING: Proton(s) added/removed
 [05:39:44] WARNING: Proton(s) added/removed
 [05:39:44] WARNING: Proton(s) added/removed
 [05:39:44] WARNING: Charges were rearranged

 In [21]:

 Could it be that you forgot to add coordinates before you generated the SD
 file?

 -greg



 On Tue, Jan 28, 2014 at 5:22 AM, Igor Filippov igor.v.filip...@gmail.com wrote:

 Here are some examples - original InChI were created from the original
 SD file, then a new SD file was created from those and new InChI
 calculated, called here InChI.
  It's a tab-separated table.
 Thanks for taking time to look at this!

 Igor


 On Mon, Jan 27, 2014 at 10:55 PM, Greg Landrum greg.land...@gmail.com wrote:

 On Tue, Jan 28, 2014 at 2:49 AM, Igor Filippov 
 igor.v.filip...@gmail.com wrote:

 I noticed that if I convert mol to inchi and then back to mol in quite
 a few cases
 the stereochemistry information gets lost. Is it something that is
 handled completely by InChI library
 or is RDKit not reading the mols produced from InChI correctly?


 It could be problems in the RDKit conversion to or from InChI. The
 easiest way to check where the problem is coming from is to see if the
 InChI itself has the correct stereochemistry flags. If not, it's the
 RDKit->InChI process, otherwise it's InChI->RDKit.

 Feel free to send along some example molecules if you want me to take a
 look at them.

 Best,
 -greg
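
 One way to look at the stereochemistry flags is to split the InChI string
 into its layers and read off the stereo ones (/b for double bonds, /t, /m and
 /s for tetrahedral centres). The following is a rough sketch with made-up
 helper names. It assumes a standard InChI, where each layer prefix occurs at
 most once and the formula is the second field; it does not handle fixed-H or
 reconnected layers. The example string is an InChI quoted in another thread
 of this archive.

```python
STEREO_PREFIXES = ("b", "t", "m", "s")


def inchi_layers(inchi):
    """Split a standard InChI into a dict of {layer prefix: layer body}."""
    parts = inchi.strip().split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:
            # The first character of each remaining field names the layer
            layers[part[0]] = part[1:]
    return layers


def stereo_layers(inchi):
    """Return only the stereochemistry layers that are present."""
    layers = inchi_layers(inchi)
    return {p: layers[p] for p in STEREO_PREFIXES if p in layers}


example = ("InChI=1S/C24H27N3O3/c28-22(25-15-14-17-6-2-1-3-7-17)19-12-10-18"
           "(11-13-19)16-27-23(29)20-8-4-5-9-21(20)26-24(27)30"
           "/h1-9,18-19H,10-16H2,(H,25,28)(H,26,30)/t18-,19-")
print(stereo_layers(example))  # prints {'t': '18-,19-'}
```

 If the /t (or /b) layer is missing or marks a centre as undefined with "?",
 the information was already lost on the way into InChI; if it is present and
 correct, the loss happens when the InChI is read back.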






Re: [Rdkit-discuss] Upcoming patch release

2014-01-25 Thread Igor Filippov
Are you planning to include V3000 support in this release, or is it slated
for a later date?

Igor


On Sat, Jan 25, 2014 at 1:42 AM, Greg Landrum greg.land...@gmail.com wrote:

 Dear all,

 I'm just about done with the patch release to fix the PDB bond-order
 perception bug and am looking for any last minute feedback/suggestions.

 The bug fixes that will be included in the release are described here:
 https://github.com/rdkit/rdkit/issues?milestone=6

 And this is the branch with the changes:
 https://github.com/rdkit/rdkit/tree/patch_Release_2013_09_2

 I'll do the new release tomorrow or Monday morning unless I hear otherwise.

 Best,
 -greg







Re: [Rdkit-discuss] Upcoming patch release

2014-01-25 Thread Igor Filippov
Sounds good, thank you!

Igor


On Sat, Jan 25, 2014 at 9:25 AM, Greg Landrum greg.land...@gmail.com wrote:

 Hi Igor,

 The V3000 support is a new feature, so it will wait until the 2014_03_1
 release.

 The V3000 writer support that Jan contributed to is now part of the trunk
 in github, if you want to give it a try now.

 -greg



 On Sat, Jan 25, 2014 at 3:16 PM, Igor Filippov igor.v.filip...@gmail.com wrote:

 Are you planning to include V3000 support in this release, or is it slated
 for a later date?

 Igor


 On Sat, Jan 25, 2014 at 1:42 AM, Greg Landrum greg.land...@gmail.com wrote:

 Dear all,

 I'm just about done with the patch release to fix the PDB bond-order
 perception bug and am looking for any last minute feedback/suggestions.

 The bug fixes that will be included in the release are described here:
 https://github.com/rdkit/rdkit/issues?milestone=6

 And this is the branch with the changes:
 https://github.com/rdkit/rdkit/tree/patch_Release_2013_09_2

 I'll do the new release tomorrow or Monday morning unless I hear
 otherwise.

 Best,
 -greg





[Rdkit-discuss] MaxMin picker

2014-01-18 Thread Igor Filippov
I was wondering if anyone might have a simple example of how to use the
MaxMin picker from C++?
The source code doesn't seem to have a helpful test file for this
functionality, and I am a bit stuck figuring out whether I need to use pick
or lazyPick and how to construct the distance matrix in a form suitable for
RDKit.

Thanks,
Igor
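
For anyone finding this question in the archive, here is a minimal pure-Python sketch of the greedy MaxMin idea. It is not RDKit's implementation and it is not the C++ example asked for. The names tri_index and maxmin_pick are hypothetical. The sketch assumes the distances are stored as a flat lower-triangle array (no diagonal) in which the entry for i > j sits at position i*(i-1)/2 + j; as far as I know this is the layout the eager pick call expects, but please check the RDKit documentation before relying on it. The lazy variant takes a distance function instead of a precomputed array, which avoids building the whole matrix.

```python
def tri_index(i, j):
    """Position of the (i, j) distance in a flat lower-triangle array (no diagonal)."""
    if i == j:
        raise ValueError("the diagonal is not stored")
    if i < j:
        i, j = j, i
    return i * (i - 1) // 2 + j


def maxmin_pick(dist, pool_size, pick_size, first=0):
    """Greedy MaxMin selection: repeatedly add the item whose nearest
    already-picked neighbour is farthest away."""
    if not 0 < pick_size <= pool_size:
        raise ValueError("pick_size must be between 1 and pool_size")
    picked = [first]
    # min_dist[k] = distance from item k to its closest picked item
    min_dist = [0.0 if k == first else dist[tri_index(k, first)]
                for k in range(pool_size)]
    while len(picked) < pick_size:
        # choose the unpicked item with the largest min_dist
        nxt = max((k for k in range(pool_size) if k not in picked),
                  key=lambda k: min_dist[k])
        picked.append(nxt)
        for k in range(pool_size):
            if k != nxt:
                min_dist[k] = min(min_dist[k], dist[tri_index(k, nxt)])
        min_dist[nxt] = 0.0
    return picked


# Four points on a line at x = 0, 1, 2, 10 (toy data, not molecules)
xs = [0.0, 1.0, 2.0, 10.0]
flat = [abs(xs[i] - xs[j]) for i in range(len(xs)) for j in range(i)]
print(maxmin_pick(flat, pool_size=4, pick_size=3))  # [0, 3, 2]
```

The sketch always starts from a fixed first item so that the result is reproducible; a real picker may choose the starting item differently (for example at random from a seed).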


Re: [Rdkit-discuss] structure to IUPAC name made on RDkit?

2014-01-14 Thread Igor Filippov
OPSIN does the reverse: it converts a name to a structure. Perhaps it's
possible to reuse the algorithm?

Igor


On Tue, Jan 14, 2014 at 1:29 PM, David Hall li...@cowsandmilk.net wrote:

 Certainly, RDKit can help, any cheminformatics toolkit with
 SMARTS/substructures to quickly classify something as a ketone, ester,
 ether, bond orders, ring finding, path finding, etc. can help.

 As for how hard it is, I think for drug discovery, most would start with
 the Nomenclature of Organic Chemistry:

 http://www.iupac.org/home/publications/provisional-recommendations/under-review-by-the-authors/under-review-by-the-authors-container/nomenclature-of-organic-chemistry.html
 That's only 1306 pages to go through.

 After that, there's inorganic chemistry, although sections of that may not
 be handled well by RDKit? I'm not sure about how RDKit handles crystals and
 coordination compounds, etc.

 http://www.iupac.org/nc/home/publications/iupac-books/books-db/book-details.html?tx_wfqbe_pi1[bookid]=5

 Since RDKit recently added PDB support, people may be interested in naming
 for peptides and amino acids
 http://www.iupac.org/publications/pac/56/5/0595/
 And cyclic peptides

 http://www.iupac.org/home/publications/provisional-recommendations/under-review-by-the-authors/under-review-by-the-authors-container/nomenclature-of-cyclic-peptides.html

 So there's a decent amount of work. I think it is possible, but not a
 single weekend project.

 I should mention that many commercial vendors have products for this, off
 the top of my head:
 http://www.acdlabs.com/products/draw_nom/nom/name/
 http://www.chemaxon.com/library/iupac-name/
 https://www.cambridgesoft.com/Ensemble_for_Chemistry/ChemDraw/

 http://www.eyesopen.com/docs/lexichem/current/html/mol2nam.html#chapter-mol2nam

 So, at least 4 companies have some sort of algorithm. The only one I've
 used is ChemDraw's, and then, I only have done maybe 5 molecules, so I
 cannot claim that any of these products work.

 -David




 On Tue, Jan 14, 2014 at 11:04 AM, Michał Nowotka mmm...@gmail.com wrote:

 Yes, I know there is no algorithm for that. This is why I'm wondering
 how hard it is to write one and how RDKit can help here.

 On Tue, Jan 14, 2014 at 4:01 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu
 wrote:
  On 1/14/2014 6:09 AM, Michał Nowotka wrote:
  Hi,
 
  Since there is no open source software converting structure to IUPAC
  name (I'm not talking about web services) I was wondering if one can
  be implemented using RDKit? Which parts of RDKit would help in doing
  that? Any pointers, suggestions?
 
  I have a suspicion there's no algorithm for it -- cactus is the only one I
  found that does it, and my impression is they're doing a database
  search... I'd like to be able to generate systematic names without web
  searches, too.
 
  Dima
 
 
 
 

Re: [Rdkit-discuss] Faster RDKit builds when you're not interested in running the tests

2014-01-02 Thread Igor Filippov
But this would disable static libraries as well, right?
Just to make sure. I actually use more static libs than shared.

Igor


On Wed, Jan 1, 2014 at 2:26 AM, Greg Landrum greg.land...@gmail.com wrote:

 Dear all,

 I just checked in a small change that makes building the RDKit much faster
 in cases where you just want the python wrappers and don't care about the
 C++ tests.

 Calling cmake with the flags:
 -DRDK_BUILD_CPP_TESTS=OFF -DRDK_INSTALL_STATIC_LIBS=OFF
 should cut compile times in about half.

 @Eddie: I guess this will help a lot with the homebrew build times.

 -greg





Re: [Rdkit-discuss] mol 3000

2013-12-29 Thread Igor Filippov
Greg,

I just gave it a quick try and it seems to be working fine for me!
Is it allowed to mix V2000 and V3000 formats within the same SDF file?

Igor


On Sat, Dec 28, 2013 at 1:43 PM, Greg Landrum greg.land...@gmail.com wrote:

 Hi Igor,

 It's underway.

 Jan contributed an initial implementation that I've done a bit of further
 work with. It's currently on a branch in git:
 https://github.com/rdkit/rdkit/tree/Issue194_V3000MolWriter
 I hope to have it merged onto the trunk in the next couple weeks after
 we've both had a chance to do more testing.

 More eyes/testers are certainly welcome. There aren't that many examples
 of v3000 mol files I can find in the wild and to this point I have been
 relying solely on Marvin sketch to confirm that things are reasonable.

 -greg


 On Saturday, December 28, 2013, Igor Filippov wrote:

 Greg,

 I remember you mentioned that adding v3000 mol format is a possibility -
 is it still in the plans?
 In any case I am adding my vote to this feature request!

 Best wishes for the holidays,
 Igor




Re: [Rdkit-discuss] error: can't copy 'rdkit/rdBase.so': doesn't exist or not a regular file

2013-12-16 Thread Igor Filippov
 chmod 700 setup.py
 ./setup.py build

How was it ever going to work? I don't think it's in any of the build
instructions.
And the build instructions aren't exactly hard to find these days - unlike
the old procedure with bjam, etc.
RTFM seems like an appropriate response here.


On Mon, Dec 16, 2013 at 12:58 PM, Greg Landrum greg.land...@gmail.com wrote:

 And that I should either remove that setup.py file or figure out how to
 make it work. G.

 -greg

 On Monday, December 16, 2013, David Hall wrote:

 The quick answer is that you should use cmake to build rdkit.
 http://rdkit.org/docs/Install.html#building-the-rdkit

 -David

 On Monday, December 16, 2013 at 9:11 AM, Sören Wacker wrote:

 Hi,

 I would like to install rdkit. However, I ran into a problem that I
 could not find discussed on the mailing list.

 I downloaded the repository by

 git clone https://github.com/rdkit/rdkit
 cd rdkit
 chmod 700 setup.py
  ./setup.py build
 running build
 running build_py

 su
 ./setup.py install #Full output see below.
 ...
 error: can't copy 'rdkit/rdBase.so': doesn't exist or not a regular file

 So, this file seems to be missing in the repository. Why is that so? How
 can I repair that?
 Is there an easy way to remove the failed installation?
 I tried ./setup.py remove but that's not known.


 Hope you can help me.
 Best regards
 Sören




 full output of ./setup.py install
 running install
 running build
 running build_py
 running install_lib
 creating /usr/lib/python2.7/site-packages/rdkit
 copying build/lib/rdkit/TestRunner.py -> /usr/lib/python2.7/site-packages/rdkit
 creating /usr/lib/python2.7/site-packages/rdkit/ML
 copying build/lib/rdkit/ML/BuildComposite.py -> /usr/lib/python2.7/site-packages/rdkit/ML
 copying build/lib/rdkit/ML/EnrichPlot.py -> /usr/lib/python2.7/site-packages/rdkit/ML
 creating /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/UnitTestPrune.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/randomtest.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/__init__.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/TreeUtils.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/PruneTree.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/Forest.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/BuildSigTree.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/UnitTestID3.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/UnitTestSigTree.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/UnitTestQuantTree.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/UnitTestTree.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/test_list.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/TreeVis.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/UnitTestXVal.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/DecTree.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/CrossValidate.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/Tree.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/SigTree.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/ID3.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/QuantTree.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/DecTree/BuildQuantTree.py -> /usr/lib/python2.7/site-packages/rdkit/ML/DecTree
 copying build/lib/rdkit/ML/AnalyzeComposite.py -> /usr/lib/python2.7/site-packages/rdkit/ML
 creating /usr/lib/python2.7/site-packages/rdkit/ML/Descriptors
 copying build/l




Re: [Rdkit-discuss] docker.io - container for fully fledged rdkit installation on linux?

2013-11-27 Thread Igor Filippov
Ah, thank you!
This explanation makes sense.
It is not trivial to figure out from the website - the "Learn More" link
gives some hints of universal packaging, but not a real description.
Also, how is it "learn more" if this is the very first hint as to what the
software actually does?

Igor


On Wed, Nov 27, 2013 at 10:50 AM, Markus Sitzmann markus.sitzm...@gmail.com
 wrote:

 It is basically a VM that can be scripted from the host system. The VM
 client can be preconfigured with anything your software depends on
 (including databases etc.) and can be based on arbitrary Linux
 distributions, independent of the Linux distribution of the host.

 On Wed, Nov 27, 2013 at 4:20 PM, Igor Filippov
 igor.v.filip...@gmail.com wrote:
  Not to criticize or anything, but I've seen this issue quite a few times -
  perhaps the problem is actually with me and everybody else is in the know?
 
  I've spent the last few minutes clicking around the Docker website, and I
  still cannot figure out what it is and what it does.
  I found that it runs on all Linux builds, that the latest release is the
  work of 130 people, that there are Trusted Builds and Docker Hack Days.
  But I still cannot puzzle out what it does!!!
 
  Would it kill the project maintainers to put a few words somewhere at the
  top of the website about what the software is actually for?
 
  Igor
 
  P.S. I finally found some clues under the "Learn More" link. I guess the
  point is that only those who already know, the really persistent ones, or
  the ones with time to spare need bother.
 
 
 
 
 
 
  On Wed, Nov 27, 2013 at 8:09 AM, Samo Turk samo.t...@gmail.com wrote:
 
  Hi rdkitters,
 
  A new release of Docker is available and it brings one very important
  improvement - it runs on any Linux distribution (as long as the kernel is
  3.8 or later). I updated the RDKit Dockerfile so it builds everything on
  top of the Ubuntu 13.10 base image. To build the container do:
  git clone https://gist.github.com/6669650.git .
  mv Dockerfile-rdkit Dockerfile
  sudo docker build -t rdkit .
 
  Run it with:
  sudo docker run -p 127.0.0.1:8889: rdkit
  and IPython notebook will be available on http://127.0.0.1:8889/
 
  Regards,
  Samo
 
 
  On Tue, Sep 24, 2013 at 9:08 AM, paul.czodrow...@merckgroup.com
 wrote:
 
 
  I also highly appreciate your efforts!
 
 
  Cheers,
  Paul
 
 
   Stuff like this that makes it easier for people to access/use the
   RDKit is great!
  
   The more options we have the better.
  
   Many thanks to you guys for looking into this stuff. :-)
  
   -greg
  
  
 
   Interesting stuff, looks promising!
   Got pulled in so I created a Dockerfile that builds an image with
   rdkit, ipython and matplotlib. Once the image is built it runs
   ipython notebook server. You can find the source here:
   https://gist.github.com/samoturk/6669650
   Just follow instructions in the first few lines of the Dockerfile to
   build and run it..
  
   Regards,
   Samo
  
 
 
 
 
 
 
 

Re: [Rdkit-discuss] Announcement: ChemicalToolBox

2013-11-20 Thread Igor Filippov
Looks very interesting. I will definitely check it out!

Igor


On Wed, Nov 20, 2013 at 3:49 AM, Greg Landrum greg.land...@gmail.com wrote:

 Bjoern,

 Congrats on the CTB and thanks for letting us know about it!

 -greg



 On Tue, Nov 19, 2013 at 9:30 PM, bjoern.gruen...@googlemail.com 
 bjoern.gruen...@gmail.com wrote:

 ChemicalToolBox: A Galaxy of cheminformatic tools in a web browser

 I am very happy to finally announce the first release of
 ChemicalToolBox!

 ChemicalToolBox (CTB) is an open source project that aims for
 reproducible, transparent and easily accessible cheminformatic research.
 It is based on the Galaxy [1] framework and can be installed on all
 Unix-like platforms. CTB integrates several cheminformatic toolkits and
 tools, like Open Babel, RDKit, CDK, chemfp, osra, opsin, silicos-it ...
 in one easy to use workbench running in a web browser.

 ChemicalToolBox with installation instructions:
 https://github.com/bgruening/galaxytools/tree/master/chemicaltoolbox


 A few highlights of the CTB and its framework:
 - every step in your analysis is logged with tool version, input dataset
 and used parameters, enabling reproducibility of your analysis at any
 time
 - building entire cheminformatic protocols/workflows via drag & drop by
 chaining small tools together. Abstracting complexity, enabling
 creativity, like assembling Lego bricks.
 - over 40 different tools:
  - conversion
  - filtering (with predefined filtering rules)
  - substructure and similarity search
  -  10 different fingerprints
  - structure recognition
  - structure depiction
  - descriptor calculation
  - property prediction
 - CTB is accessible via any recent web browser, without any shell
 knowledge. You can run a CTB instance for your lab or a whole university,
 like we do here at the University of Freiburg
 - CTB enables easy access to High-Performance-Computing (HPC). Based on
 the Galaxy project, CTB can run on any Cluster/Grid setup, including
 Cloud computing, like EC2
 - by the means of the Galaxy Tool Shed, CTB comes with installation
 routines of all dependencies, like Open Babel, RDKit, osra, opsin ...
 once Galaxy is running you can install CTB with a few clicks
 - CTB can be accessed via a REST API, for example for massively parallel
 execution of workflows
 - tools can be easily integrated, independently from its programming
 language (it needs to be command line accessible)
 - every workflow, dataset, history can be shared with a group of
 researchers or can be made publicly available, enabling transparent and
 peer-reviewed research. For example the ChemicalBox,
 a merged database of many freely available compounds:
 http://galaxy.uni-freiburg.de/u/bgruening/w/preparation-of-a-large-compound-library-by-merging-of-chemical-databases-1
 - every tool and every tool-parameter can have (and hopefully has) an
 exhaustive description to guide the researcher
 - automatic built-in multiprocessing: most of the CTB tools will split
 the input datasets into smaller chunks before execution and merge the
 result files after execution

 I would like to thank the cheminformatic community, for all the great
 tools and libraries, for accepting patches and enlightening discussions!
 I hope CTB will be a useful contribution and can open the cheminformatic
 universe to many more researchers.

 Thanks and happy research!
 Björn

 [1] http://galaxyproject.org





Re: [Rdkit-discuss] AllChem.ReplaceSubstructs

2013-11-07 Thread Igor Filippov
Greg,

Is it available in C++? Also, just to make sure - the argument is a list of
old positions for each new position?

Thanks,
Igor


On Thu, Nov 7, 2013 at 8:41 AM, Greg Landrum greg.land...@gmail.com wrote:

 Dear Michal,


 On Thu, Nov 7, 2013 at 12:46 PM, Michal Krompiec 
 michal.kromp...@gmail.com wrote:

 Hello again,
 I browsed through the sources and I found the answer to my question:
 the atom at index 0 from the replacement is used for the new bond. It
 would be nice to be able to specify the index of this bonding atom as
 a parameter in AllChem.ReplaceSubstructs.

 Is it possible to reorder atoms in a molecule (i.e. to have a chosen
 atom at index 0)?


 Indeed there is, the functionality was added at the last minute to the
 2013.09 release.

 Here's how you use it:

 In [2]: m = Chem.MolFromSmiles('NCO')

 In [3]: print Chem.MolToMolBlock(m)

  RDKit

   3  2  0  0  0  0  0  0  0  0999 V2000
     0.0000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
     0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
     0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   1  2  1  0
   2  3  1  0
 M  END


 In [4]: m2 = Chem.RenumberAtoms(m,(1,2,0))

 In [5]: print Chem.MolToMolBlock(m2)

  RDKit

   3  2  0  0  0  0  0  0  0  0999 V2000
     0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
     0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
     0.0000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   3  1  1  0
   1  2  1  0
 M  END
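
Reading the two mol blocks above, the ordering argument appears to list, for each new position, the old index of the atom that ends up there: (1,2,0) turns N,C,O into C,O,N. The following is a small pure-Python sketch of that permutation logic only. It does not call RDKit, the helper name renumber is hypothetical, and it is meant as a way to reason about the indices, not as a description of how the toolkit implements the call.

```python
def renumber(atoms, bonds, new_order):
    """Reorder atoms so that new position i holds old atom new_order[i].

    atoms: list of element symbols
    bonds: list of (begin, end) pairs of 0-based atom indices
    """
    if sorted(new_order) != list(range(len(atoms))):
        raise ValueError("new_order must be a permutation of the atom indices")
    new_atoms = [atoms[old] for old in new_order]
    # invert the permutation: old index -> new index
    old_to_new = {old: new for new, old in enumerate(new_order)}
    new_bonds = [(old_to_new[a], old_to_new[b]) for a, b in bonds]
    return new_atoms, new_bonds


# N-C-O with bonds N-C and C-O, as in the 'NCO' example above
atoms, bonds = renumber(["N", "C", "O"], [(0, 1), (1, 2)], (1, 2, 0))
print(atoms)  # ['C', 'O', 'N']
print(bonds)  # [(2, 0), (0, 1)]
```

In 1-based mol-file numbering the two bonds come out as 3-1 and 1-2, which agrees with the bond block of the renumbered molecule shown above.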







Re: [Rdkit-discuss] RDkit, OS X 10.9 and clang++

2013-10-25 Thread Igor Filippov
There is no g++ for OSX 10.9 at all?
Would one of these work by any chance?
http://sourceforge.net/projects/hpc/files/hpc/gcc/

Igor


On Fri, Oct 25, 2013 at 12:43 PM, William G. Scott wgsc...@ucsc.edu wrote:

 Dear RDkit community:

 I’ve been maintaining a fink package for RDkit (primarily as a dependency
 for coot).

 cf:  http://tinyurl.com/rdkitfink

 It compiles on OS X 10.6, 10.7 and 10.8, but not 10.9.  (This includes
 the 2013_9 pre-release, FWIW.)

 With 10.9, the migration to clang++ is upon us, and I’m stuck.

 Has anyone succeeded in getting rdkit compiled on 10.9, and if so, how?

 Also, if anyone has feedback or recommendations  for how to improve the
 fink rdkit package, please let me know.

 Thanks in advance.

 Bill Scott





 William G. Scott
 Professor
 Department of Chemistry and Biochemistry
 and The Center for the Molecular Biology of RNA
 228 Sinsheimer Laboratories
 University of California at Santa Cruz
 Santa Cruz, California 95064
 USA




Re: [Rdkit-discuss] Inconsistancy across elements in making Hs explicit

2013-09-27 Thread Igor Filippov
SMILES for carbon and the other common organic elements already assumes
implicit hydrogens.
This is not an RDKit thing; it's the definition of SMILES.
I'm not sure there is such a thing as implicit hydrogens for silicon, even
though it's so similar to carbon.

HTH,
Igor
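
To make the rule concrete, here is a small sketch of how a writer might decide between a bare symbol and a bracket atom for a single isolated atom. It assumes the usual SMILES "organic subset" (B, C, N, O, P, S, F, Cl, Br, I) and, for simplicity, only the lowest normal valence of each element. The table and the function name lone_atom_smiles are illustrative assumptions, not RDKit code.

```python
# Lowest normal valence for the SMILES organic subset (simplified, assumed table).
ORGANIC_SUBSET = {"B": 3, "C": 4, "N": 3, "O": 2, "P": 3, "S": 2,
                  "F": 1, "Cl": 1, "Br": 1, "I": 1}


def lone_atom_smiles(symbol, h_count):
    """SMILES for one neutral atom carrying h_count hydrogens and no other bonds."""
    if ORGANIC_SUBSET.get(symbol) == h_count:
        return symbol  # bare symbol: a reader infers the hydrogens
    if h_count == 0:
        hs = ""
    elif h_count == 1:
        hs = "H"
    else:
        hs = "H%d" % h_count
    # outside the organic subset (or unusual H count): hydrogens must be spelled out
    return "[%s%s]" % (symbol, hs)


for sym, n in [("C", 4), ("Si", 4), ("Fe", 0), ("N", 3)]:
    print(sym, n, lone_atom_smiles(sym, n))
# C 4 C
# Si 4 [SiH4]
# Fe 0 [Fe]
# N 3 N
```

Under this reading, "C" and "[SiH4]" in the question below are the same kind of answer: both atoms carry four hydrogens, but only carbon is allowed to leave them unwritten.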


On Fri, Sep 27, 2013 at 7:14 AM, Toby Wright toby.wri...@inhibox.com wrote:

 Hi,

 I've observed an odd behaviour in RDKit with listing explicit hydrogens in
 SMILES where the original molecules were generated from SD files. As the
 code below shows, if I ask "What is the SMILES for a single C atom?" I get
 "C", but if I ask for silicon I get "[SiH4]". Any reason why this might be?
 I've also observed that in RDKit 2013 Q2 I get "[Fe]" as the SMILES from a
 single iron atom, but in RDKit 2011 Q4 I get "[FeH6]", and I can't see
 anything in the release notes to explain this change. I also have examples
 involving atoms in larger molecules but I thought these provided the
 simplest examples.

 Example files and interactive python snippet:

 >>> from rdkit import Chem
 >>> sup = Chem.SDMolSupplier("SingleSi.sdf")
 >>> sup2 = Chem.SDMolSupplier("SingleC.sdf")
 >>> print(Chem.MolToSmiles(sup[0], canonical=True, isomericSmiles=True))
 [SiH4]
 >>> print(Chem.MolToSmiles(sup2[0], canonical=True, isomericSmiles=True))
 C

 SingleC.sdf:
 SingleC
  RDKit  2D

   1  0  0  0  0  0  0  0  0  0999 V2000
     0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
 M  END
 

 SingleSi.sdf:
 SingleSi
  RDKit  2D

   1  0  0  0  0  0  0  0  0  0999 V2000
     0.0000    0.0000    0.0000 Si  0  0  0  0  0  0  0  0  0  0  0  0
 M  END
 

 Thanks,

 Toby

 --
 InhibOx Ltd




Re: [Rdkit-discuss] problem with RDKit configuration on apache and wsgi

2013-09-10 Thread Igor Filippov
Maybe move the rdkit libs into some folder within apache DocumentRoot?
Did you restart apache after syncing the libraries?


On Tue, Sep 10, 2013 at 3:58 PM, Michał Nowotka mmm...@gmail.com wrote:

 Yes, I mean even RDKit.

 No, the trouble maker has exactly the same system. The only difference is
 that in order to install my app on other machines I was using fabric and on
 the problematic machines I just rsynced directories from working machines.

 The only error in error_log is:

 ImportError: libRDGeneral.so.1: cannot open shared object file: No such file 
 or directory
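
 An ImportError of this kind usually means the dynamic loader cannot find the
 shared library from inside the running process. A minimal sketch for checking
 that, using only the Python standard library, is shown here. The helper names
 can_load and library_search_path are hypothetical; the library name is the
 one from the error message. It would need to be run from within the same
 mod_wsgi application to be informative.

```python
import ctypes
import os


def can_load(libname):
    """Return True if the dynamic loader can open the shared library `libname`."""
    try:
        ctypes.CDLL(libname)
        return True
    except OSError:
        # Raised when the loader cannot find or open the library.
        return False


def library_search_path():
    """Return the directories listed in LD_LIBRARY_PATH for this process."""
    value = os.environ.get("LD_LIBRARY_PATH", "")
    return [d for d in value.split(os.pathsep) if d]
```

 Calling can_load("libRDGeneral.so.1") and printing library_search_path() from
 the WSGI script shows whether the environment seen by Apache differs from the
 one seen in an interactive shell.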




 On Tue, Sep 10, 2013 at 9:47 PM, Markus Sitzmann 
 sitzm...@helix.nih.gov wrote:

 So you mean, you have even RDKit running with this setup on other machines,
 or just the other libraries?

 If other machines vs the trouble maker machine includes a change from RH5
 to RH6 then indeed I would suspect SELinux as Igor suggested. Don't ask me
 for a solution except for switching it off.

 Is there anything (else) in /var/log/httpd/error_log?

 Markus

 On Tue, 10 Sep 2013 15:36:27 -0400, Michał Nowotka mmm...@gmail.com
 wrote:

 Yes, I can. What is more important I successfully managed to run other
 libraries with extensions in C (indigo toolkit, cx_Oracle). What is even
 more important the same configuration (at least it seems to be the same)
 works on other machines. Unfortunately I don't know how to debug this to find
 the source of my problems.


 On Tue, Sep 10, 2013 at 9:32 PM, Markus Sitzmann 
 sitzm...@helix.nih.gov wrote:

 How far is your setup working so far? Can you already run python scripts
 via mod_wsgi without importing rdkit?




 On Tue, 10 Sep 2013 15:29:07 -0400, Michał Nowotka mmm...@gmail.com
 wrote:

 Yes: python, mod_wsgi, virtualenv, RHEL


 On Tue, Sep 10, 2013 at 9:22 PM, Markus Sitzmann sitzm...@helix.nih.gov
  wrote:

 Are you trying to use it with python/mod_wsgi? Your description so far
 is a bit vague :-)

 Markus

 On Tue, 10 Sep 2013 15:00:55 -0400, Michał Nowotka mmm...@gmail.com
 wrote:

 Thanks for the advice. Unfortunately I can't do anything system-wide as I
 don't have administrator privileges on the server. That also means I can't
 switch SELinux off.


 On Tue, Sep 10, 2013 at 7:59 PM, Igor Filippov 
 igor.v.filip...@gmail.com wrote:

 Some kind of an answer seems to be already given at Stackoverflow.
 I would add two comments:
 1) Unless there are some specific problems with this you can add new
 library system-wide through ld.so.conf
 2) You did not mention what your log files say but I would experiment
 with Selinux switched off.



 On Tue, Sep 10, 2013 at 1:51 PM, Michał Nowotka mmm...@gmail.com wrote:

 I have some problems with configuring apache to use rdkit,  this is
 described in SO question:


 http://stackoverflow.com/questions/2550504/setting-ld-library-path-in-apache-passenv-setenv-still-cant-find-library

 Any help would be appreciated!

 Regards,
 Michał Nowotka




Re: [Rdkit-discuss] Can't read SDF data lines when CTAB is in V3000 format

2013-08-21 Thread Igor Filippov
Greg,

Does writing V3000 format work too? How do you trigger V3000 write-out
instead of V2000?
I never knew RDKit could work with V3000!

Thanks,
Igor



On Wed, Aug 21, 2013 at 7:43 AM, Greg Landrum greg.land...@gmail.com wrote:

 Hi Toby,

 On Wed, Aug 21, 2013 at 12:10 PM, Toby Wright toby.wri...@inhibox.com wrote:


 I'm trying to read the data lines from an SD file where the CTAB is in
 V3000 format. If the file v3000propIssue.sdf contains the following:
 testMol


   0  0  0 0  0999 V3000
 M  V30 BEGIN CTAB
 M  V30 COUNTS 1 0 0 0 0
 M  V30 BEGIN ATOM
 M  V30 1 C 0 0 0 0
 M  V30 END ATOM
 M  V30 END CTAB
 M  END
 >  <TestProp>
 42

 

 then when it is read by an SDMolSupplier it loads correctly (as shown by the
 Debug) apart from the data lines, which are not converted to RDKit
 properties, as the following interactive code snippet shows:

 >>> import rdkit
 >>> from rdkit import Chem
 >>> mol = next(Chem.SDMolSupplier("v3000propIssue.sdf"))
 [10:59:33] ERROR: Problems encountered parsing data fields
 [10:59:33] ERROR: moving to the begining of the next molecule
 >>> mol.HasProp("_Name")
 1
 >>> mol.HasProp("TestProp")
 0
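
 In an SD file the data items come after the "M  END" line and do not depend
 on whether the CTAB is V2000 or V3000. The following sketch illustrates that
 with a small pure-Python reader. It is not RDKit's parser; the function name
 parse_sd_data_fields is hypothetical, and it assumes the usual ">  <Name>"
 data header, a blank line after each value, and "$$$$" as record terminator.

```python
def parse_sd_data_fields(record):
    """Collect the data items that follow 'M  END' in one SD record."""
    lines = record.splitlines()

    # Skip the CTAB: everything up to and including the first 'M  END' line.
    # V3000 lines start with 'M  V30', so they are not mistaken for the end.
    start = 0
    for i, line in enumerate(lines):
        if line.startswith("M  END"):
            start = i + 1
            break

    props = {}
    name = None
    for line in lines[start:]:
        if line.startswith("$$$$"):
            break  # end of this record
        if line.startswith(">"):
            # Data header: the field name sits between '<' and '>'.
            lo, hi = line.find("<"), line.rfind(">")
            name = line[lo + 1:hi] if 0 <= lo < hi else None
            if name is not None:
                props[name] = []
        elif name is not None:
            if line.strip() == "":
                name = None  # a blank line ends the current data item
            else:
                props[name].append(line)
    return {key: "\n".join(value) for key, value in props.items()}
```

 Applied to the record above, this would be expected to return a dictionary
 with the single key TestProp and the value "42".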


 That's a bug (and an embarrassing one). Looks like I've never tested the
 use of v3000 mol files in SDFs. They certainly should work though.

 I'll get it fixed.

 -greg





Re: [Rdkit-discuss] Can't read SDF data lines when CTAB is in V3000 format

2013-08-21 Thread Igor Filippov
sounds good!
Thank you,
Igor


On Wed, Aug 21, 2013 at 7:56 AM, Greg Landrum greg.land...@gmail.com wrote:

 Hi Igor,

 On Wed, Aug 21, 2013 at 1:49 PM, Igor Filippov 
 igor.v.filip...@gmail.com wrote:


 Does writing V3000 format work too? How do you trigger V3000 write-out
 instead of V2000?
 I never knew RDKit could work with V3000!


 At the moment there is no V3000 writer. The amount of work to add one is
 not huge, but it's not tiny either. I can add it to the possible features
 list.

 -greg





[Rdkit-discuss] chirality flag

2013-07-08 Thread Igor Filippov
I noticed that RDKit generates MDL MOL files without the chirality flag set
at the top of the mol block (there are stereo atoms with wedge and hash
bonds present in the molecule):

 RDKit  2D

796808  0  0  0  0  0  0  0  0999 V2000

Is there any way to tell MolToMolBlock() to set the chirality flag?

Thank you,
Igor


Re: [Rdkit-discuss] chirality flag

2013-07-08 Thread Igor Filippov
Great  - thank you!
This will work the same way from C++, correct?

Igor


On Mon, Jul 8, 2013 at 11:42 PM, Greg Landrum greg.land...@gmail.com wrote:

 Ok, it's checked in:
 In [2]: m = Chem.MolFromSmiles('C[C@H](F)Cl')

 In [3]: print Chem.MolToMolBlock(m)

  RDKit

   4  3  0  0  0  0  0  0  0  0999 V2000
 0.0.0. C   0  0  0  0  0  0  0  0  0  0  0  0
 0.0.0. C   0  0  0  0  0  0  0  0  0  0  0  0
 0.0.0. F   0  0  0  0  0  0  0  0  0  0  0  0
 0.0.0. Cl  0  0  0  0  0  0  0  0  0  0  0  0
   1  2  1  0
   2  3  1  0
   2  4  1  0
 M  END

 In [5]: m.SetProp("_MolFileChiralFlag","1")

 In [6]: print Chem.MolToMolBlock(m)

  RDKit

   4  3  0  0  1  0  0  0  0  0999 V2000
 0.0.0. C   0  0  0  0  0  0  0  0  0  0  0  0
 0.0.0. C   0  0  0  0  0  0  0  0  0  0  0  0
  0.0.0. F   0  0  0  0  0  0  0  0  0  0  0  0
 0.0.0. Cl  0  0  0  0  0  0  0  0  0  0  0  0
   1  2  1  0
   2  3  1  0
   2  4  1  0
 M  END

 The same flag will be set if the chiral flag is set when the mol block is
 parsed.
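
 The only difference between the two mol blocks above is the fifth field of
 the counts line (0 versus 1). As a rough illustration, that field can be read
 back without RDKit. This is a sketch assuming the fixed-width V2000 counts
 line layout (three characters per field); the helper name chiral_flag is
 hypothetical and is not part of RDKit.

```python
def chiral_flag(mol_block):
    """Return the chiral flag (0 or 1) from the counts line of a V2000 mol block."""
    lines = mol_block.splitlines()
    counts = lines[3]      # three header lines come first, then the counts line
    field = counts[12:15]  # layout aaabbblllfffccc...: ccc is the chiral flag
    return int(field) if field.strip() else 0
```

 For the counts line "  4  3  0  0  1  0  0  0  0  0999 V2000" this reads the
 fifth three-character field, which is the value the _MolFileChiralFlag
 property controls in the output shown above.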

 -greg



 On Tue, Jul 9, 2013 at 5:08 AM, Greg Landrum greg.land...@gmail.com wrote:

 Hi Igor,

 On Tue, Jul 9, 2013 at 1:23 AM, Igor Filippov 
 igor.v.filip...@gmail.com wrote:

 I noticed that RDKit generated MDL MOL file without the chirality flag
 set on the top of the mol
 block (there are stereo atoms with wedge and hash bonds present in the
 molecule):

  RDKit  2D

 796808  0  0  0  0  0  0  0  0999 V2000

 Is there any way to tell MolToMolBlock() to set the chirality flag?


 There currently is no way to do this. I will add support for it (via a
 molecule property) for the next release.

 -greg





Re: [Rdkit-discuss] chirality flag

2013-07-08 Thread Igor Filippov
Fantastic - this is the fastest feature implementation ever!

Igor


On Mon, Jul 8, 2013 at 11:45 PM, Greg Landrum greg.land...@gmail.com wrote:

 yeah. There you can/should set the flag using an unsigned int.


 On Tue, Jul 9, 2013 at 5:44 AM, Igor Filippov 
 igor.v.filip...@gmail.com wrote:

 Great  - thank you!
 This will work the same way from C++, correct?

 Igor




[Rdkit-discuss] aromatic nitrogens

2013-06-25 Thread Igor Filippov
Dear All,

I think this question has been discussed before
but now that I ran into this problem I can't seem to find a solution.
Is there a SMILES string for Histidine that RDKit would be happy with?

It seems that no matter what I try, I can't use SmilesToMol() and
sanitizeMol() without them throwing. I'm working from C++, by the way.

Igor


Re: [Rdkit-discuss] aromatic nitrogens

2013-06-25 Thread Igor Filippov
I'm getting an exception at sanitizeMol - can't kekulize with this SMILES
(and many many others) :(

Thank you,
Igor


On Tue, Jun 25, 2013 at 12:14 PM, JP jeanpaul.ebe...@inhibox.com wrote:


 On 25 June 2013 17:00, Igor Filippov igor.v.filip...@gmail.com wrote:

 Histidine


 How about: N[C@@H](Cc1c[nH]cn1)C(O)=O

 >>> Chem.MolFromSmiles('N[C@@H](Cc1c[nH]cn1)C(O)=O')
 <rdkit.Chem.rdchem.Mol object at 0x27ef0c0>





Re: [Rdkit-discuss] aromatic nitrogens

2013-06-25 Thread Igor Filippov
Thanks, JP - trying to make the smallest example led me to find the actual
problem, which was not with the SMILES per se
but the way I copied the mol object from one place to another.

Igor


On Tue, Jun 25, 2013 at 1:29 PM, JP jeanpaul.ebe...@inhibox.com wrote:

 That is interesting, my RDKit 2012_12_1 works - which version of RDKit are
 you using?

 Also, it is always better to paste minimal code snippets if you need help.
  It is hard to figure out what is wrong otherwise.

 What happens when you copy this line in python (if you have access to a
 python interpreter)?  MolFromSmiles does sanitization - so this would fail
 if molecule is not valid.

 >>> m = Chem.MolFromSmiles('N[C@@H](Cc1c[nH]cn1)C(O)=O')
 >>> print(m)
 <rdkit.Chem.rdchem.Mol object at 0x27ef1a0>

 And a non-working mol:

 >>> Chem.MolFromSmiles('O=C(O)[C@@H](N)Cc1cncn1')
 [17:09:48] Can't kekulize mol
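
 The difference between the two SMILES is the [nH]. A toy sketch of why that
 matters for a single aromatic ring follows. This is not RDKit's kekulization
 code; can_kekulize_ring is a hypothetical helper that only asks whether every
 ring atom that needs a ring double bond can be paired with a neighbour that
 also needs one.

```python
def can_kekulize_ring(needs_double):
    """Toy check for one aromatic ring.

    needs_double[i] is True when ring atom i must take part in exactly one
    ring double bond (aromatic c, or n without an H), and False when it must
    not (for example [nH]).
    """
    n = len(needs_double)

    def solve(matched):
        pending = [i for i in range(n) if needs_double[i] and i not in matched]
        if not pending:
            return True  # every atom that needs a double bond has a partner
        i = pending[0]
        # Try to place a double bond between atom i and either ring neighbour.
        for j in ((i + 1) % n, (i - 1) % n):
            if j != i and needs_double[j] and j not in matched:
                if solve(matched | {i, j}):
                    return True
        return False

    return solve(frozenset())


# Ring atoms of the imidazole in the two SMILES above, in ring order.
with_nh = [True, True, False, True, True]    # c1 c [nH] c n1
without_nh = [True, True, True, True, True]  # c1 c n c n1
```

 With the [nH] the four remaining atoms pair up into two double bonds; without
 it there are five atoms to pair, which cannot work, and that is consistent
 with the "Can't kekulize mol" message.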






[Rdkit-discuss] publication using RDKit

2013-04-17 Thread Igor Filippov [Contr]
Dear Greg et al.,

You might get this a lot - I hope you do! - but our work would not be
possible without RDKit. Our modeling results obtained with RDKit got
recently accepted for publication - many thanks to you and everyone else
who contributed to the toolkit!

http://www.sciencedirect.com/science/article/pii/S0968089613002460

Best regards,
Igor



Re: [Rdkit-discuss] implementation of tautomer enumeration/canonicalization

2013-04-15 Thread Igor Filippov [Contr]

 Note: another method for tautomer canonicalization (but not
 enumeration) is to convert to inchi and back. This is similar to
 Noel's canonical smiles using inchi idea. The approach may be
 somewhat fragile (I'm not convinced that the RDKit's inchi-molecule
 implementation is the best), but is worth considering.

I believe Markus and Wolf-Dietrich's tautomer canonicalization scheme
takes into account more types of tautomerism than what InChI is
currently handling. This is the impression I got  - I think Markus is
reading this list and will correct me if I'm wrong.

Igor



Re: [Rdkit-discuss] retrosynthesis

2012-07-25 Thread Igor Filippov
I'm going to attend the upcoming ACS meeting in Philadelphia, if anyone
is interested in discussing reaction recognition perhaps this would be a
good opportunity to meet? I was thinking about doing OSRA 1.4.0 release
with reaction recognition enabled just before the meeting.

Igor



On Tue, 2012-07-24 at 12:28 -0400, Geoffrey Hutchison wrote:
 On Jul 23, 2012, at 9:32 AM, Greg Landrum wrote:
 
  I'm not aware of anything. The RDKit has many of the pieces necessary
  to start to build such a system, but a library of reactions to use in
  the retrosynthesis is missing. As I commented on your feature request
 
 Indeed, several people have approached me (through Open Babel) with a similar 
 request. As Greg said, there is no existing open database of reactions.
 
 I've tried to catalyze the issue by asking Igor Filippov (of OSRA) to do 
 reaction recognition (now in beta) and we've added reaction support for 
 ChemDraw CDX files. This would help people compile such a reaction database 
 from existing files and papers.
 
 I also know that Abe Heifets at Toronto has been working on the code side of 
 things, but he currently uses a commercial reaction database:
 http://www.cs.toronto.edu/~aheifets/ChemicalPlanning/
 
 In short -- it's a key problem, but hopefully can be solved through a bit of 
 common work.
 
 -Geoff


Re: [Rdkit-discuss] building models using descriptors

2012-05-07 Thread Igor Filippov
Thank you, Greg!

As always, right on the mark!

If I may bother you just a bit more :)


 Here's the example output:
 
 
 *** Vote Results ***
 misclassified: 93/242 (%38.43)  93/242 (%38.43)
 
Why is the same set of numbers printed twice?

 average correct confidence:0.8520
 average incorrect confidence:  0.7673
 
 Results Table:
 
   72  61  |  68.57
   32  77  |  55.40
  --- ---
69.23   55.80
 

If I try to compute the percentages myself I get, for example,
72/(72+61) = 54.1%, not 68.57% or any other percentage I see there.
However, 72/(72+32) = 69.23%, just as it should be...
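
The arithmetic above can be checked with a few lines of plain Python. This is
only a sketch of the by-hand calculation, not what ScreenComposite does
internally; the function name table_summary is hypothetical, and "row" and
"column" refer to the layout of the printed table, without assuming which one
is the true class and which the predicted class.

```python
def table_summary(table):
    """Summarise a square results table given as a list of rows."""
    size = len(table)
    total = sum(sum(row) for row in table)
    correct = sum(table[i][i] for i in range(size))
    # Diagonal entry as a percentage of its row total.
    row_pct = [100.0 * table[i][i] / sum(table[i]) for i in range(size)]
    # Diagonal entry as a percentage of its column total.
    col_sums = [sum(row[j] for row in table) for j in range(size)]
    col_pct = [100.0 * table[j][j] / col_sums[j] for j in range(size)]
    return {
        "total": total,
        "misclassified": total - correct,
        "row_pct": row_pct,
        "col_pct": col_pct,
    }


summary = table_summary([[72, 61], [32, 77]])
```

For this table the off-diagonal entries sum to 93 out of 242, and the column
percentages round to 69.23 and 55.80, matching the bottom line of the output.
The row percentages computed this way round to 54.14 and 70.64, which do not
match the printed 68.57 and 55.40, so those two numbers remain unexplained by
this calculation.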

Best regards,
Igor






[Rdkit-discuss] building models using descriptors

2012-05-02 Thread Igor Filippov
Dear Colleagues,

I am following the tutorials at
http://code.google.com/p/rdkit/wiki/BuildingModelsUsingDescriptors1
and
http://code.google.com/p/rdkit/wiki/BuildingModelsUsingFingerprints1
to use RDKit to build a random forest model with floating point type
descriptors. Perhaps someone can advise me on the following three points:

1) There doesn't seem to be a real-valued analogue to SigTreeBuilder,
e.g. QuantTreeBuilder, so what we have to use is QuantTreeBoot?

2) The parameter needsQuantization is set to False in both cases -
binary fingerprint or real-valued descriptor, naively I thought it
should be True in the latter case?

3) What precisely is the print out coming out of
ScreenComposite.ShowVoteResults? I'm guessing it's the confusion matrix
with something extra thrown in but I cannot find explanation of all the
different numbers there...

Thanks in advance,
Igor




Re: [Rdkit-discuss] help prioritize feature additions to the RDKit

2012-02-13 Thread Igor Filippov
Things like new feature proposals are not going to happen overnight,
I would suggest keeping the ideatorrent (or a different feature request
system) running continuously.

As a matter of fact I just thought of something that is nagging me every
time I compile RDKit so I put together a feature request/solution
proposal - it's awaiting moderator approval right now :)

Igor

On Mon, 2012-02-13 at 22:56 -0500, Greg Landrum wrote:
 On Sun, Feb 5, 2012 at 5:46 PM, Greg Landrum greg.land...@gmail.com wrote:
 
  I'm trying a small experiment to try and prioritize new features for
  the next version of the RDKit.
 
  Sourceforge provides a tool called IdeaTorrent for tracking feature
  requests and allowing the community to vote on them. I've set one of
  these up with many of the current outstanding feature requests:
  https://sourceforge.net/apps/ideatorrent/rdkit/
 
 At this point it looks like this experiment demonstrates that the
 IdeaTorrent method for prioritizing feature requests isn't the right
 one: after being up for more than a week there have been a total of
 three votes. So either the RDKit is perfect or there's a problem with
 using IdeaTorrent. As much as I'd like to believe the first, I'm
 guessing it's the second. ;-)
 
 I could imagine that this has to do with the fact that an sf.net
 account is required to vote on things in IdeaTorrent. I hadn't
 realized this when I set it up and there unfortunately doesn't seem to
 be any way to change it (realistically, I guess it's probably a bad
 idea to change it). I'm going to go ahead and shut the IdeaTorrent
 site down sometime in the next day or so.
 
 I still would like to have some easy-to-use method to allow people to
 vote for RDKit feature requests. At this point the only other thing I
 can really think of is to use the google code page for this, either
 using the tracker there or the wiki. That also requires an account,
 but I guess way, way more people have google accounts than have
 sourceforge accounts.
 
 Best,
 -greg
 


Re: [Rdkit-discuss] maximum common substructure

2011-08-15 Thread Igor Filippov
I submitted C++ code for MCES and MCIS to Open Babel;
it should be possible in principle to modify it to work with RDKit as
well.

Igor

On Sun, 2011-08-14 at 23:03 -0400, Greg Landrum wrote:
 Dear TJ,
 
 On Mon, Aug 15, 2011 at 1:03 AM, TJ O'Donnell t...@acm.org wrote:
  Is there a module in rdkit to find the maximum common substructure for
  a set of input molecules?
 
 I'm afraid not.
 
 -greg
 


Re: [Rdkit-discuss] depict R-group property

2011-06-14 Thread Igor Filippov
I don't know about reading but openbabel can certainly write out SD
files with R-groups as atomic aliases. Those SD files are then displayed
correctly in most molecular editors I've tested (Symyx Draw, MolSketch,
ChemDraw).

Igor



On Tue, 2011-06-14 at 16:22 -0400, Donald Keidel wrote:
 Greg,
 
 I know all about teething.  My 8 month old is pushing out teeth almost 
 every 2 weeks or so.  I will continue to test this portion of the code 
 since it seems to be the only open source package wrapped in python that 
 does it mostly correctly.  Open Babel won't even read these SD files with 
 R-groups.
 
 What I did to hack this was to add R1-R9 entries in 
 /rdkit/trunk/Code/GraphMol/atomic_data.cpp with atomic numbers 105-112.  
 I then use the set atomic number functionality in the python wrapped 
 code and change it to one of these series of numbers and then depict.  
 It is not a solid solution, but it works right now.
 
 Thanks.
 
 Don
 
 On 06/14/2011 12:55 PM, Greg Landrum wrote:
  Hi Don,
 
  On Mon, Jun 13, 2011 at 8:58 PM, Donald Keidel donald.kei...@gmail.com
  wrote:
  Have the following mol file:
 
RDKit  2D
 
 7  6  0  0  0  0  0  0  0  0999 V2000
   7.9102   -2.42080. N   0  0  0  0  0  0  0  0  0  0  0  0
   6.4770 -2.42080. C   0  0  0  0  0  0  0  0  0  0  0  0
   5.0437 -2.42080. O   0  0  0  0  0  0  0  0  0  0  0  0
   7.1916   -2.0137 0. C   0  0  0  0  0  0  0  0  0  0  0  0
   5.7583   -2.0137 0. C   0  0  0  0  0  0  0  0  0  0  0  0
   4.3292   -2.0137 0. C   0  0  0  0  0  0  0  0  0  0  0  0
   7.9148   -3.24580. R#  0  0  0  0  0  0  0  0  0  0  0  0
 2  4  1  0
 3  5  1  0
 4  1  1  0
 5  2  1  0
 6  3  1  0
 1  7  1  0
  M  RGP  1   7   1
  V7 *
  M  END
 
 
  When I use RDKit to depict the molecule I get a * in the image where the
  R-group is located.
 
  Is there a way to define what letter or letter with number combination
  is printed for the R-group?  A dictionary perhaps?  Ideally I would like
  the ability to depict R1 thru R9.
  Here's what's going on currently:
  By default the rendering code uses atom.GetSymbol() to determine what
  should show up in the drawing.
  atom.GetSymbol() uses the atomic number, unless the atom has the
  property dummyLabel set. If that property is set, it's used. It
  should also be checking for the property _MolFileRLabel.
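
  The R-group numbers needed for labels like R1 to R9 are already in the mol
  file, on the "M  RGP" line. A small sketch of pulling them out without
  RDKit follows. The helper names parse_rgp_line and r_labels are
  hypothetical, and the whitespace split assumes the fields are separated by
  spaces, as they are in the example above.

```python
def parse_rgp_line(line):
    """Map atom index -> R-group number from one 'M  RGP' property line."""
    if not line.startswith("M  RGP"):
        raise ValueError("not an RGP line: %r" % line)
    fields = line[6:].split()
    count = int(fields[0])            # number of (atom, R-group) pairs
    pairs = fields[1:1 + 2 * count]
    return {int(pairs[i]): int(pairs[i + 1]) for i in range(0, len(pairs), 2)}


def r_labels(line):
    """Turn the pairs into display labels such as 'R1'."""
    return {atom: "R%d" % num for atom, num in parse_rgp_line(line).items()}
```

  For the line "M  RGP  1   7   1" this maps atom 7 to the label R1, which is
  the kind of text a depiction could show instead of "*".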
 
  In looking at this I also discovered another problem : there's an
  error if you call the depiction code using a molecule with R groups
  and the kekulize argument is True.
 
  I've entered a bug about each of these things. They'll be fixed in the
  next release (planned for end of the month).
 
  It's great (for the RDKit) that someone is really using the R group
  stuff in mol files and reporting the problems; this is an
  under-utilized/tested piece of the code, so I really appreciate it.
  Apologies that you're having to suffer through the teething problems.
 
  Best,
  -greg
 


Re: [Rdkit-discuss] Antwort: Re: random forest in RDKit - ctd.

2011-05-10 Thread Igor Filippov
Paul,


  nPossible = [0]+[2]*ndescrs+[3]
 
  Then it should work.
 
 
 Where does ndescrs come from?
 
 Using the MorganFingerprint example from the Wiki:
 # build fingerprints:
 fps = [AllChem.GetMorganFingerprintAsBitVect(x,2,2048) for x in ms]
 nPossible = [0]+[2]*fps+[3]
 
I'm new to this too, but I believe after [2] there should be the length
of your fingerprint - in this case it's 2048:
nPossible = [0]+[2]*2048+[3]

Also, nPossible then goes into Grow() as an argument.
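
A short sketch of building that list in plain Python follows. The function
name make_n_possible is hypothetical. It assumes the layout is one leading
entry, one entry of 2 per fingerprint bit, and one trailing entry; what the
leading 0 and trailing 3 stand for is not spelled out in this thread, so the
parameter name n_classes is a guess.

```python
def make_n_possible(n_bits, n_classes):
    """Build [0, 2, 2, ..., 2, n_classes] with one 2 per fingerprint bit."""
    return [0] + [2] * n_bits + [n_classes]


n_possible = make_n_possible(2048, 3)

# Note: [2] * fps fails when fps is a list of fingerprints, because a list
# can only be repeated by an integer; Python raises TypeError in that case.
```

For a 2048-bit fingerprint the resulting list has 2050 entries.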

Best regards,
Igor




Re: [Rdkit-discuss] random forest in RDKit

2011-05-03 Thread Igor Filippov
Greg,

This is great! Sorry for making you work.
The new help page is very informative.
I am sure this will prove useful not just to me.

Questions:

1) How can I use real-valued descriptors, such as MOE-like descriptors
for such modeling? Do I need to pick descriptors one-by-one or is there
something like AllDescriptors which computes all of them in one pass?

2) Can you briefly say what each of the parameters to Grow() is?
Some are mentioned in the text but I would like to make sure I know what
everything is.


Best regards,
Igor

On Tue, 2011-05-03 at 06:48 +0200, Greg Landrum wrote:
 Hi Igor,
 
 On Mon, May 2, 2011 at 9:52 PM, Igor Filippov igor.v.filip...@gmail.com 
 wrote:
 
  Yes, actually for this project I'm interested in Python specifically!
  Time to learn me some new tricks :)
 
 Sad... I was kind of hoping to get the response: "oh? only in python?
 never mind!" because then I wouldn't have to write the docs. ;-)
 
  I was looking through the docs online but I cannot figure it out :(
 
 Yeah... that's the problem.
 
 The existing ML code is actually pretty easy to use (at least it used
 to be... I haven't used it this way in a while) if you have a database
 structured exactly the way it expects. When this is true, there are a
 couple of command line tools that automate everything. However, I
 don't expect people are actually going to be interested in doing this.
 So there ought to be at least some docs.
 
 I will start a series of howto examples on the wiki and label them with ML:
 http://code.google.com/p/rdkit/w/list?q=label:ML
 the first is here:
 http://code.google.com/p/rdkit/wiki/BuildingModelsUsingFingerprints1
 
 I'm going to focus on tree predictors (i.e. bags of decision trees and
 random forests), because that's the area where the RDKit is the
 strongest.
 
 If you're interested in more flexibility in terms of type of model,
 it's probably best to use the RDKit in combination with R (via rpy2, I
 haven't done this), or knime.
 
 -greg





[Rdkit-discuss] random forest in RDKit

2011-05-02 Thread Igor Filippov
Greg et al,

Can anybody point me in the right direction (some simple code snippets
would be best) on how to use machine learning methods in RDKit? I am
especially interested in the RandomForest implementation.

Thank you in advance,
Igor




Re: [Rdkit-discuss] random forest in RDKit

2011-05-02 Thread Igor Filippov
Hi Greg,

Yes, actually for this project I'm interested in Python specifically!
Time to learn me some new tricks :)
I was looking through the docs online but I cannot figure it out :(

Best regards,
Igor

On Mon, 2011-05-02 at 21:45 +0200, Greg Landrum wrote:
 Hi Igor,
 
 On Mon, May 2, 2011 at 9:08 PM, Igor Filippov igor.v.filip...@gmail.com 
 wrote:
 
  Can anybody point me in the right direction (some simple code snippets
  would be best) on how to use machine learning methods in RDKit? I am
  especially interested in the RandomForest implementation.
 
 
 The machine learning code is mostly written in Python. I know you're
 primarily a C++ user, are you still interested?
 
 -greg





Re: [Rdkit-discuss] RDKit on CentOs 5

2011-01-06 Thread Igor Filippov [Contr]
***Failed 
 62/ 76 Testing pyTestConformerWrap          ***Failed
 63/ 76 Testing testQuery                    ***Failed
 64/ 76 Testing testMatCalc                  ***Failed
 65/ 76 Testing pyMatCalc                    ***Failed
 66/ 76 Testing pyCMIM                       ***Failed
 67/ 76 Testing pyRanker                     ***Failed
 68/ 76 Testing testChemicalFeatures         ***Failed
 69/ 76 Testing pyFeatures                   ***Failed
 70/ 76 Testing pythonTestDbCLI              ***Failed
 71/ 76 Testing pythonTestDirML              ***Failed
 72/ 76 Testing pythonTestDirDataStructs     Passed
 73/ 76 Testing pythonTestDirDbase           Passed
 74/ 76 Testing pythonTestDirSimDivFilters   Passed
 75/ 76 Testing pythonTestDirVLib            Passed
 76/ 76 Testing pythonTestDirChem            ***Failed

5% tests passed, 72 tests failed out of 76

The following tests FAILED:
  1 - testDict (Failed)
  2 - testDataStructs (Failed)
  3 - pyBV (Failed)
  4 - pyDiscreteValueVect (Failed)
  5 - pySparseIntVect (Failed)
  6 - testTransforms (Failed)
  7 - testGrid (Failed)
  8 - testPyGeometry (Failed)
  9 - testMatrices (Failed)
 10 - testAlignment (Failed)
 11 - pyAlignment (Failed)
 12 - testOptimizer (Failed)
 13 - testForceField (Failed)
 14 - testDistGeom (Failed)
 15 - pyDistGeom (Failed)
 16 - graphmolTest1 (Failed)
 17 - graphmolcpTest (Failed)
 18 - graphmolqueryTest (Failed)
 19 - graphmolMolOpsTest (Failed)
 20 - graphmoltestCanon (Failed)
 21 - graphmoltestChirality (Failed)
 22 - graphmoltestPickler (Failed)
 23 - graphmolIterTest (Failed)
 24 - testDepictor (Failed)
 25 - pyDepictor (Failed)
 26 - smiTest1 (Failed)
 27 - smaTest1 (Failed)
 28 - fileParsersTest1 (Failed)
 29 - testMolSupplier (Failed)
 30 - testMolWriter (Failed)
 31 - testTplParser (Failed)
 32 - testMol2ToMol (Failed)
 33 - testSubstructMatch (Failed)
 34 - testReaction (Failed)
 35 - pyChemReactions (Failed)
 36 - testChemTransforms (Failed)
 37 - testSubgraphs1 (Failed)
 38 - testSubgraphs2 (Failed)
 39 - testFragCatalog (Failed)
 40 - pyFragCatalog (Failed)
 41 - testDescriptors (Failed)
 42 - pyMolDescriptors (Failed)
 43 - testFingerprints (Failed)
 44 - pyPartialCharges (Failed)
 45 - testMolTransforms (Failed)
 46 - pyMolTransforms (Failed)
 47 - testForceFieldHelpers (Failed)
 48 - pyForceFieldHelpers (Failed)
 49 - testDistGeomHelpers (Failed)
 50 - pyDistGeom (Failed)
 51 - testMolAlign (Failed)
 52 - pyMolAlign (Failed)
 53 - testFeatures (Failed)
 54 - pyChemicalFeatures (Failed)
 55 - testShapeHelpers (Failed)
 56 - pyShapeHelpers (Failed)
 57 - testMolCatalog (Failed)
 58 - pyMolCatalog (Failed)
 59 - testSLNParse (Failed)
 60 - pySLNParse (Failed)
 61 - pyGraphMolWrap (Failed)
 62 - pyTestConformerWrap (Failed)
 63 - testQuery (Failed)
 64 - testMatCalc (Failed)
 65 - pyMatCalc (Failed)
 66 - pyCMIM (Failed)
 67 - pyRanker (Failed)
 68 - testChemicalFeatures (Failed)
 69 - pyFeatures (Failed)
 70 - pythonTestDbCLI (Failed)
 71 - pythonTestDirML (Failed)
 76 - pythonTestDirChem (Failed)
Errors while running CTest
[r...@mgcct25 build]# 


On Thu, 2011-01-06 at 12:26 -0500, rkdeli...@gmail.com wrote:
 Igor,
 
 I am very happy to hear that the script is helpful. And, yes,
 installation on CentOS 5.5 is a pain. The problem is that the major
 CentOS and RHEL releases are already dated by the time they are
 released. GCC is my biggest complaint, as the standard version
 on the current CentOS is known to have a major bug that causes the
 Boost compile and therefore RDKit compile to fail. I believe that the
 new RHEL 6 is somewhat better - based on Fedora 12 and 13 - and the
 upcoming CentOS 6 will obviously be better as well. Unfortunately,
 this is what we have to work with. Alternatively, I have also
 installed RDKit on Fedora 14 with no need for any updates to any other
 package. One step and I was done. The good news is that future
 releases of RDKit should go somewhat painlessly. I was able to install
 the current beta package by simply compiling the RDKit code. After 2
 minutes of hands-on time and 15 minutes of waiting, RDKit was up and
 running.
 
 If you run into any problems, please post them so that we can
 (hopefully) help and others can benefit in the future.
 
 -Kirk
 
 
 
 On Jan 6, 2011 10:11am, Igor Filippov [Contr] ig...@helix.nih.gov
 wrote:
  Dear Kirk,
  
  
  
  Thank you so

[Rdkit-discuss] RDKit on CentOs 5

2011-01-05 Thread Igor Filippov [Contr]
Dear All,

Has anyone successfully compiled RDKit on CentOs 5? I'm running into the
following error message:
[ 15%] Building CXX object Code/Numerics/Alignment/Wrap/CMakeFiles/rdAlignment.dir/rdAlignment.cpp.o
/root/RDKit_2010_09_1/Code/Numerics/Alignment/Wrap/rdAlignment.cpp:14:31: error: numpy/arrayobject.h: No such file or directory

On CentOs 5 arrayobject.h is part of python-numeric package and it's
located in:
/usr/include/python2.4/Numeric/arrayobject.h

I'm attempting to compile RDKit_2010_09_1, using boost version 1.39.0,
x86_64 system.

Regards,
Igor







Re: [Rdkit-discuss] RPM packages for Fedora

2010-11-19 Thread Igor Filippov
This is very nice work! Can I have a fedora 13 64-bit package?

Best regards,
Igor

On Fri, 2010-11-19 at 16:57 -0500, gia...@gmail.com wrote:
 For those of you running Fedora I am happy to announce the
 availability of RPM packages so you can use rdkit without compiling
 stuff on your own.
 
 For now, I have packages for Fedora 14 64 bit available at:
 
 http://giallu.fedorapeople.org/rdkit-f14-x86_64/
 
 The plan is to improve the packaging to the point I will be able to
 submit a review request for eventual inclusion in the official Fedora
 repos; in the meantime, if anyone needs packages for Fedora 13 and/or
 32-bit machines, feel free to ping me and I'll make them
 available.
 
 Cheers
 
 G.
 
 
 





Re: [Rdkit-discuss] BEGINWEDGE and BEGINDASH

2009-06-10 Thread Igor Filippov [Contr]

 Exactly. If you look at the chemical literature, or at what other
 drawing programs do, I think you'll find that only one bond is
 normally wedged: this conveys all the information required in a
 sketch.

Hmm, I'm not an expert but I see cases like this on a daily basis.

A look at a random journal issue could serve as an example - 
http://pubs.acs.org/toc/orlef7/11/1


Igor

-- 
Igor Filippov [Contr] ig...@helix.nih.gov




Re: [Rdkit-discuss] BEGINWEDGE and BEGINDASH

2009-06-10 Thread Igor Filippov

 Igor, as you said, you spend a lot of time thinking about this stuff:
 what would you say the general rule is?
 
 -greg

Oh, I'm certainly not getting into a flame war about the correct way to
depict stereochemistry!!! :) I've seen people far more knowledgeable
than me still arguing about this to this day at the InChI
steering committee meeting. This is probably an argument that has been
going on for decades with no resolution.

I'm interested in the subject only insofar as it relates to my Optical
Structure Recognition Application (OSRA) - and I've certainly seen all
kinds of drawing styles. While some of them are quite likely
unconventional or even incorrect, OSRA should be able to recognize them
and should be allowed to produce output SDF or SMILES that match the
input drawing as closely as possible. So for me the best resolution
would be to allow the programmer (in python or c++) to create the
molecular object the way he sees fit and not to force him to adhere to
an arbitrary limitation.

Igor




Re: [Rdkit-discuss] GUI

2009-05-19 Thread Igor Filippov
George,

I believe the comment from Greg was that the GUI part is badly outdated
and was of limited usability even in its heyday.

Best,
Igor

On Tue, 2009-05-19 at 15:25 +, George Oakman wrote:
 Dear all,
  
 I was wondering if there is a GUI part of the RDKit. From the various
 documents it looks like there is, but I don't seem to be able to find
 it within the Code folder. Is this a commercial component or under a
 different licensing scheme?
 
 Many thanks,
  
 George.
  
 
 




Re: [Rdkit-discuss] Compiling on Red Hat linux

2009-03-27 Thread Igor Filippov
I don't think it's a question of upgrading; it's a question of actually
installing the LAPACK libs.
Simply running
yum install lapack lapack-devel blas blas-devel
should take care of things.
I have compiled RDKit on CentOS 4, CentOS 5 and Fedora 8 and 9.

Cheers,
Igor


On Fri, 2009-03-27 at 17:52 +, George Oakman wrote:
 Hi,
  
 Thanks. I guess the easiest would be to upgrade my Red Hat
 distribution, which would come with a more recent gcc and BLAS
 library. Unfortunately I can't do that as it is my current 'minimum
 requirement' build machine.
  
 I'm on Red Hat Enterprise Linux 4 (Nahant, 2005) with Linux 2.6.9 and
 gcc 3.4.6
  
 Has anyone managed to compile the RDKit on a similar configuration? I
 am trying to compile the Q4-2008 release by the way.
  
 Many thanks,
  
 George.
  
  
  Date: Fri, 27 Mar 2009 05:45:05 +0100
  Subject: Re: [Rdkit-discuss] Compiling on Red Hat linux
  From: greg.land...@gmail.com
  To: oakm...@hotmail.com
  CC: rdkit-discuss@lists.sourceforge.net
  
  On Thu, Mar 26, 2009 at 6:28 PM, George Oakman oakm...@hotmail.com
 wrote:
   Hi Greg,
  
   Thanks.
  
   Do you know which version is required?
  
  
  As long as it's compatible with your c++ compiler, I don't think it
  should make much difference.
  
  -greg
 
  
 
 __
 From: oakm...@hotmail.com
 To: greg.land...@gmail.com
 Date: Thu, 26 Mar 2009 17:28:06 +
 CC: rdkit-discuss@lists.sourceforge.net
 Subject: Re: [Rdkit-discuss] Compiling on Red Hat linux
 
 Hi Greg,
 
 Thanks. 
  
 Do you know which version is required?
 
  
  Date: Thu, 26 Mar 2009 18:12:27 +0100
  Subject: Re: [Rdkit-discuss] Compiling on Red Hat linux
  From: greg.land...@gmail.com
  To: oakm...@hotmail.com
  CC: rdkit-discuss@lists.sourceforge.net
  
  Dear George,
  
  I'd suggest you find whatever package exists for Red Hat that
 includes
  a pre-built BLAS and LAPACK. I wouldn't recommend building them
  yourself.
  
  -greg
  
  On Thu, Mar 26, 2009 at 4:11 PM, George Oakman oakm...@hotmail.com
 wrote:
   Hi all,
  
   I decided to take a vacation from Windows for a while and I'm
 trying to
   install the RDKit on a Linux platform (Red Hat).
  
   I'm hitting a problem trying to compile libGraphMol.so
  
   This is the error coming out of bjam:
  
   /usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../libblas.so when searching for -lblas
   /usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../libblas.a when searching for -lblas
   /usr/bin/ld: skipping incompatible /usr/lib/libblas.so when searching for -lblas
   /usr/bin/ld: skipping incompatible /usr/lib/libblas.a when searching for -lblas
  
   It looks like my version of libblas is incompatible. How can I
 recompile a
   compatible version?
  
   Can I use the files in $RDKit/External?
  
   It looks like $RDKit/External/Lapack only has the win32 library.
  
   Any help would be greatly appreciated.
  
   Thank you.
  
   George.
  
  
  
   
  
  
 
 
 




Re: [Rdkit-discuss] Structure Search Engine for All Major RDBMSs

2009-02-26 Thread Igor Filippov [Contr]
Interesting, but 90,000 structures is a tiny database by today's
standards. For CSLS - http://cactus.nci.nih.gov/cgi-bin/lookup/search - we
have to deal with 46 million unique structures, so in-memory fingerprints
might run out of memory. Another point worth
mentioning is that similarity search is not the same as substructure
search, and it seems like Duan is equating the two.
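
To put rough numbers on the memory concern, here is a back-of-the-envelope sketch. The 1024-bit fingerprint length is an assumption of mine (the post does not state one), the function name is made up, and any per-object overhead of a real index is ignored.

```python
def fingerprint_bytes(n_structures, n_bits):
    """Bytes needed for n_structures dense fingerprints of n_bits each."""
    return n_structures * n_bits // 8


# 1024 bits per fingerprint is an assumed size, not a figure from the post.
small = fingerprint_bytes(90_000, 1024)
large = fingerprint_bytes(46_000_000, 1024)

print(round(small / 2**20, 1), "MiB for 90,000 structures")
print(round(large / 2**30, 1), "GiB for 46 million structures")
```

Under these assumptions the raw bit data alone grows from about eleven megabytes to several gigabytes, before any bookkeeping the index needs.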

Igor

On Thu, 2009-02-26 at 15:47 +, baoilleach wrote:
 Database searching using fingerprints seems to be a topic of interest
 at the moment...
 
  
  
 Sent to you by baoilleach via Google Reader:
  
  
 Structure Search Engine for All Major RDBMSs
 via ChemHack by Duan Lian on 2/26/09
 
  
 
 FingerPrint
 
 Many methods of doing substructure search directly in SQL have been
 reported recently: Adel Golovin and Kim Henrick’s Chemical
 Substructure in SQL, Rich Apodaca’s fingerprint-based substructure
 search in MySQL, and Charlie Zhu’s Microsoft SQL Server based
 substructure search with SMARTS support.
 
 Doing this in an RDBMS does have a number of advantages, “including
 platform independency, simplicity, flexibility, integrity, robustness
 and single point of failure”, as Adel and Kim describe. But some
 lightweight RDBMSs such as MySQL and PostgreSQL, the most widely used
 open source ones, provide very limited SQL programming functions, so a
 pure SQL-based solution may be impossible.
 
 Plugins have been developed to enhance the functionality. For MySQL,
 there’s an open source project called mychem. For PostgreSQL, there’s
 pgchem:tigress, which is also open source. Both of them are based on
 OpenBabel, a C++ chemoinformatics library.
 
 On the Oracle platform, there are the CambridgeSoft Oracle Cartridge,
 Symyx Direct, JChem Cartridge, etc. As Oracle is a commercial platform,
 none of the above is free.
 
 When I was developing chemsoso.com, a Chinese chemical supplier
 database, structure search was an important problem to be
 solved. The database contains 90,000 different chemicals in total and
 is still growing, so performance needs to be dealt with carefully.
 
 In terms of speed, fingerprints are obviously the best choice.
 It takes time to generate fingerprints, but in the search stage, bit
 operations are much cheaper than graph matching. My initial idea
 was to generate fingerprints in Java and do the bit operations in MySQL.
 Unfortunately, MySQL has restrictions on bit operations: it limits the
 maximum range to 64 bits. In Rich’s solution, the fingerprint is split
 into multiple fields to satisfy MySQL’s requirement. Substructure search
 is possible with this method. But similarity search, where the Tanimoto
 coefficient needs to be calculated, is still impossible, as further bit
 operation functions are missing in MySQL.
 
 In my final solution, an in-memory fingerprint index is created outside
 MySQL. Molecule structure information (SMILES or mol file) is stored
 in MySQL, and my search engine synchronizes data between the in-memory
 index and the MySQL table. Structure searching is performed directly on
 the in-memory index, which guarantees the performance. On a MacBook with
 a 1.83GHz CPU, it takes only about 50ms to do a substructure search on
 chemsoso.com’s 90,000 structures. A similar-structure search
 takes about 300ms; a full structure search takes less than
 10ms. If a search boundary is set according to the similarity requirement,
 we can get another 4X to 100X performance improvement, depending on the
 complexity of the query molecule structure.
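
 The bit operations described above can be sketched in a few lines. This is an illustration only, not the chemsoso.com engine: fingerprints are held as plain Python integers, all function names are made up, and the 64-bit word size simply mirrors the MySQL limit mentioned in the post. Note that a fingerprint screen only rules candidates out; a molecule that passes would normally still need graph matching to confirm a substructure hit.

```python
def passes_screen(query_fp, target_fp):
    """True if every bit set in the query is also set in the target."""
    return (query_fp & target_fp) == query_fp


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints stored as Python ints."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0
    return bin(a & b).count("1") / union


def split_words(fp, n_bits, word=64):
    """Split a fingerprint into word-sized chunks, lowest bits first."""
    mask = (1 << word) - 1
    return [(fp >> shift) & mask for shift in range(0, n_bits, word)]


query = 0b0110
target = 0b1110
print(passes_screen(query, target))     # True
print(tanimoto(query, target))          # 2 shared bits / 3 bits in the union
print(split_words((1 << 64) | 1, 128))  # [1, 1]
```

 Splitting into 64-bit words is what makes the screen expressible as one AND-and-compare per column, while the Tanimoto coefficient additionally needs bit counts over every word.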
 
 Several days ago, Charlie Zhu talked to me, wondering how
 chemsoso.com’s structure search engine works. As it was
 mainly built from open source libraries, I decided to make the search
 engine open source to ease the work of building chemistry-related
 databases. With the search engine released, developers can focus on
 their database’s own functionality instead of dealing with structure
 search.
 
 Engine Structure
 
 Before I can release the search engine, I have to find a way to plug it
 into users’ systems. Including the source code directly
 in a user’s project may be the fastest way to add structure search
 functionality, but my code being in Java does not mean everyone’s
 project is in Java. I want the search engine to work not only with all
 major RDBMSs, but also with all major OSs and all major programming
 languages. Besides a Java API, a command API and an HTTP API will also be
 provided to make sure the search engine works with multiple programming
 languages and in network environments where server clusters exist.
 
 
 
  
  

Re: [Rdkit-discuss] Developing on Visual C++

2009-02-21 Thread Igor Filippov [Contr]
George,

Sure, here it is (I named your file t.cpp):
g++ -c -I ../rdkit-svn/Code -I ../rdkit-svn/External/vflib-2.0/include/ \
    -I ../boost_1_37_0 t.cpp

export LD_LIBRARY_PATH=../rdkit-svn/bin/

g++ -o t t.o -L ../rdkit-svn/bin/ -lRDGeneral -lGraphMol -lRDGeometry 

 ./t
[12:37:07]  Hello RDKit 

Hope this helps,
Igor


On Sat, 2009-02-21 at 09:13 +, George Oakman wrote:
 Hi,
  
 Thanks for trying with gcc. 
  
 When I compile and link with bjam and toolset=msvc it works for me
 too, but it is when I try to compile using the build process
 integrated with Visual C++ Express that I get the runtime error.
  
 It would be very kind if you could let me have your command line for
 compiling and linking with gcc; this might help me turn on the right
 options in Visual C++.
  
 Thanks a lot.
  
 George.
 
  
  Subject: Re: [Rdkit-discuss] Developing on Visual C++
  From: ig...@helix.nih.gov
  To: oakm...@hotmail.com
  CC: rdkit-discuss@lists.sourceforge.net
  Date: Fri, 20 Feb 2009 12:07:17 -0500
  
  George,
  
  I compiled your example with gcc, everything works fine there.
  
  
  Igor
  
  On Fri, 2009-02-20 at 16:25 +, George Oakman wrote:
   Hi all,
   
   I am trying to write a piece of C++ code with the RDKit C++
 library
   (I'm using Visual C++ Express edition as the development
 environment).
   
   Thank you very much for the GettingStarted example in C++, that
 works
   fine. I can compile the GettingStarted example using bjam very
 well,
   so I guess this is good news.
   
   I am now trying to create a proper Visual C++ project (WIN32
 console
   app) and compile via the Build process on Visual C++. So far so
 good,
   I have a mini program that compiles and outputs a Hello RDKit
 using
   BOOST_LOG:
   
   #include <stdio.h>
   #include <GraphMol/RDKitBase.h>
   #include <RDGeneral/RDLog.h>
   using namespace RDKit;
   int main(int argc, char *argv[])
   {
     RDLog::InitLogs();
     BOOST_LOG(rdInfoLog) << "Hello RDKit" << std::endl;
     return 0;
   }
   
   This is a Win32 Console Application, that I link with
 libRDGeneral.lib
   libGraphMol.lib libRDGeometry.lib
   
   The piece of code above executes fine (although I receive the
   following warning at link time: warning LNK4098: defaultlib
 'MSVCRT'
   conflicts with use of other libs; use /NODEFAULTLIB:library).
   
   Things start breaking when I try to create a molecule object with
   RWMol *mol=new RWMol(); 
   
   #include <stdio.h>
   #include <GraphMol/RDKitBase.h>
   #include <RDGeneral/RDLog.h>
   using namespace RDKit;
   int main(int argc, char *argv[])
   {
     RDLog::InitLogs();
     RWMol *mol=new RWMol();
     BOOST_LOG(rdInfoLog) << "Hello RDKit" << std::endl;
     return 0;
   }
   
   This piece of code compiles and links well (same warning as
 before)
   but, at runtime, I receive a 'buffer overflow' on line 28 in
 RWMol.h:
   
   RWMol() { d_partialBonds.clear(); }
   
   
   Am I missing something obvious?
   
   Sorry, I know you are not really supporting VC++ and prefer the
   boost.build/bjam framework, but maybe someone else is developing
 using
   Visual C++ projects and could help me.
   
   Thanks for your help,
   
   George.
   
   
   
   
   
   
   
   
   
   
  
-- 
Igor Filippov [Contr] ig...@helix.nih.gov




Re: [Rdkit-discuss] Developing on Visual C++

2009-02-20 Thread Igor Filippov [Contr]
George,

I compiled your example with gcc, everything works fine there.


Igor

On Fri, 2009-02-20 at 16:25 +, George Oakman wrote:
 Hi all,
  
 I am trying to write a piece of C++ code with the RDKit C++ library
 (I'm using Visual C++ Express edition as the development environment).
  
 Thank you very much for the GettingStarted example in C++, that works
 fine. I can compile the GettingStarted example using bjam very well,
 so I guess this is good news.
  
 I am now trying to create a proper Visual C++ project (WIN32 console
 app) and compile via the Build process on Visual C++. So far so good,
 I have a mini program that compiles and outputs a Hello RDKit using
 BOOST_LOG:
  
#include <stdio.h>
#include <GraphMol/RDKitBase.h>
#include <RDGeneral/RDLog.h>
using namespace RDKit;
int main(int argc, char *argv[])
{
  RDLog::InitLogs();
  BOOST_LOG(rdInfoLog) << "Hello RDKit" << std::endl;
  return 0;
}
  
 This is a Win32 Console Application, that I link with libRDGeneral.lib
 libGraphMol.lib libRDGeometry.lib
  
 The piece of code above executes fine (although I receive the
 following warning at link time: warning LNK4098: defaultlib 'MSVCRT'
 conflicts with use of other libs; use /NODEFAULTLIB:library).
  
 Things start breaking when I try to create a molecule object with
 RWMol *mol=new RWMol(); 
  
#include <stdio.h>
#include <GraphMol/RDKitBase.h>
#include <RDGeneral/RDLog.h>
using namespace RDKit;
int main(int argc, char *argv[])
{
  RDLog::InitLogs();
  RWMol *mol=new RWMol();
  BOOST_LOG(rdInfoLog) << "Hello RDKit" << std::endl;
  return 0;
}
  
 This piece of code compiles and links well (same warning as before)
 but, at runtime, I receive a 'buffer overflow' on line 28 in RWMol.h:
  
RWMol() { d_partialBonds.clear(); }
  
  
 Am I missing something obvious?
 
 Sorry, I know you are not really supporting VC++ and prefer the
 boost.build/bjam framework, but maybe someone else is developing using
 Visual C++ projects and could help me.
  
 Thanks for your help,
  
 George.
  
  
  
  
  
  
  
  
 
 




Re: [Rdkit-discuss] ring stereochemistry changes

2008-12-26 Thread Igor Filippov [Contr]
Dear Greg,

I was wondering if you could give a hint as to how to accomplish the
following: I am trying to remove from a molecule all the disconnected
fragments smaller than a certain threshold number of atoms. The following
piece of code does not seem to work:

std::vector<ROMOL_SPTR> frag;
  frag = MolOps::getMolFrags(*mol1);
  for (unsigned i=0;i<frag.size();i++)
    if (frag[i]->getNumAtoms()<MIN_A_COUNT)
      for(RWMol::AtomIterator atomIt=frag[i]->beginAtoms();
          atomIt!=frag[i]->endAtoms();atomIt++)
        {
          RWMol *mol2=(static_cast<RWMol *>(frag[i]));
          mol2->removeAtom(atomIt);
        }

Any recommendations would be welcome!
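
For reference, the general approach can be sketched on a plain adjacency list, with no RDKit calls at all. Two guesses are built into it, neither confirmed in this thread: that deleting atoms while iterating over them is part of the trouble, and that the fragments may be copies rather than views onto mol1. The sketch therefore collects atom indices from the small fragments first and then deletes them from the original, highest index first, so earlier deletions do not shift later indices. All names here are hypothetical.

```python
def components(n_atoms, bonds):
    """Return connected components as sorted lists of atom indices."""
    neighbors = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    seen, result = set(), []
    for start in range(n_atoms):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            atom = stack.pop()
            comp.append(atom)
            for nbr in neighbors[atom]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        result.append(sorted(comp))
    return result


def atoms_to_remove(n_atoms, bonds, min_a_count):
    """Indices of atoms in fragments below min_a_count, highest first."""
    doomed = [i for comp in components(n_atoms, bonds)
              if len(comp) < min_a_count for i in comp]
    return sorted(doomed, reverse=True)


# Toy input: a 4-atom chain (0-1-2-3), a 2-atom fragment (4-5), a lone atom 6.
bonds = [(0, 1), (1, 2), (2, 3), (4, 5)]
atoms = ["C", "C", "C", "O", "Na", "Cl", "O"]
for idx in atoms_to_remove(len(atoms), bonds, min_a_count=3):
    del atoms[idx]  # stand-in for removing the atom from the editable molecule
print(atoms)
```

Whether the same pattern maps directly onto RWMol depends on how atom indices behave after a removal there, which I have not checked.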

Igor

P.S. Merry holidays :)

On Sat, 2008-12-06 at 08:00 +0100, Greg Landrum wrote:
 Dear all,
 
 When I made the changes to the handling of stereochemistry in the Q3
 2008 release of the RDKit, I attempted to get the canonicalization of
 ring stereochemistry specifications (i.e. things like the molecule
 encoded by the SMILES c...@h]1cc[c@@H](C)CC1) right. I knew that it
 was unlikely that I had actually done so, so I added a warning to the
 code that you probably have seen:
Warning: ring stereochemistry detected. This may not be handled correctly.
 Based on the tests I did before the release, I believed that the
 current status of the code was correct a large enough fraction of the
 time that this warning was sufficient.
 
 After some large-scale testing done by a colleague, it turns out that
 this assumption was very bad: the handling of ring stereochemistry was
 incorrect much, much too frequently. To be explicit: the SMILES that
 were being generated for molecules that fell into this class did not
 have the correct stereochemistry; i.e.  the molecules were wrong. This
 is far from acceptable: it's ok to generate output that's
 non-canonical, but not output that's incorrect.
 
 Here's what I have done in order to address this:
  - The stereochemistry specifications on ring atoms are no longer 
 canonicalized.
  - The warning about ring stereochemistry has been moved into the
 SMILES writer and now indicates clearly that the output is not
 canonical
  - Detection of cases that can have ring stereochemistry has been
 improved; spiro centers, symmetrically substituted centers, and
 bridged systems no longer trigger the warning.
 
 These changes are all in svn; I encourage you to update if you are
 building from source. I plan to do the Q4 release sometime in the next
 couple of weeks; the changes will also be in that release.
 
 Once I've assembled a sufficiently large set of test molecules I will
 revisit the canonicalization of these systems sometime next year and
 try again to get this right.
 
 This is definitely the fun part of an open-source project: everyone
 sees when you screw up. :-S
 
 -greg
 
-- 
Igor Filippov [Contr] ig...@helix.nih.gov




Re: [Rdkit-discuss] New stuff in subversion

2008-11-26 Thread Igor Filippov [Contr]
I tried that, and also tried adding --prefix=/home... to both the boost and
RDKit bjam run options, but I still get the same error. I wonder where it
gets the c:/ prefix from...

Igor

On Wed, 2008-11-26 at 20:37 +0100, Greg Landrum wrote:
 On Wed, Nov 26, 2008 at 8:11 PM, Igor Filippov [Contr]
 ig...@helix.nih.gov wrote:
  I have followed the process, everything except for the very last step
  seems to have worked out Ok.
  I'm attaching the error log from
  bjam --toolset=gcc --without-python all-libraries
  (done in rdkit-svn/Code)
 
  I'm still concerned about environment variables that have C:.. (or n:)
  in them - when using the shell window from the MSYS installation there
  is no need for those. I think this mix of unix and windows-style
  directory names is the root of the problem.
 
 It has worked for me, but I am using clean installs of recent versions
 of msys and mingw. What happens if you try using unix-style paths in
 your env vars?
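
 To make the suggestion concrete, here is a minimal sketch of rewriting a
 Windows-style path into the unix-style form that the MSYS shell understands.
 It assumes the default MSYS convention that drive X: is visible as /x; the
 helper name to_msys_path is made up for this illustration and is not part of
 MSYS, boost or the RDKit.

```python
import re


def to_msys_path(path: str) -> str:
    """Convert a Windows-style path (C:\\foo or C:/foo) to MSYS form (/c/foo).

    Assumption: MSYS exposes drive X: as /x (its default behaviour).
    """
    # Normalise backslashes first so mixed separators are handled.
    path = path.replace("\\", "/")
    match = re.match(r"^([A-Za-z]):/?(.*)$", path)
    if match is None:
        return path  # no drive letter: already unix-style, leave untouched
    drive, rest = match.groups()
    return f"/{drive.lower()}/{rest}" if rest else f"/{drive.lower()}"


# Example: the boost location that shows up in the bjam error log.
print(to_msys_path("C:/msys/1.0/home/Administrator/boost_1_37_0"))
```

 Any environment variable that currently holds a C:... or n:... value could be
 passed through a helper like this before bjam is started from the MSYS shell.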
 
 -greg




[Rdkit-discuss] RDKit and MinGW

2008-11-12 Thread Igor Filippov [Contr]
Dear all,

Has anyone reported a successful build in MinGW environment?
I'm getting some obscure error messages, the log file is attached.

Igor




[Rdkit-discuss] RDKit and MinGW

2008-11-12 Thread Igor Filippov [Contr]
And now the log file is really attached.

Igor
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\property.jam:613:
 in find-replace from module object(property-map)@1
error: Ambiguous key
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\property.jam:590:
 in object(property-map)@1.find from module object(property-map)@1
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\type.jam:335: 
in generated-target-ps-real from module type
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\type.jam:359: 
in generated-target-ps from module type
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\type.jam:270: 
in type.generated-target-suffix from module type
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\virtual-target.jam:501:
 in virtual-target.add-prefix-and-suffix from module virtual-target
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\virtual-target.jam:460:
 in _adjust-name from module object(file-target)@520
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\virtual-target.jam:243:
 in abstract-file-target.__init__ from module object(file-target)@520
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\virtual-target.jam:553:
 in object(file-target)@520.__init__ from module object(file-target)@520
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/kernel\class.jam:88: 
in class.new from module class
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\generators.jam:523:
 in generator.generated-targets from module object(gcc-linking-generator)@32
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/tools\builtin.jam:885:
 in linking-generator.generated-targets from module 
object(gcc-linking-generator)@32
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/tools\unix.jam:67: 
in generated-targets from module object(gcc-linking-generator)@32
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\generators.jam:444:
 in construct-result from module object(gcc-linking-generator)@32
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\generators.jam:397:
 in run-really from module object(gcc-linking-generator)@32
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\generators.jam:372:
 in generator.run from module object(gcc-linking-generator)@32
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/tools\builtin.jam:800:
 in linking-generator.run from module object(gcc-linking-generator)@32
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/tools\unix.jam:41: 
in unix-linking-generator.run from module object(gcc-linking-generator)@32
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/tools\gcc.jam:497: 
in object(gcc-linking-generator)@32.run from module 
object(gcc-linking-generator)@32
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\generators.jam:958:
 in try-one-generator-really from module generators
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\generators.jam:1020:
 in try-one-generator from module generators
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\generators.jam:1232:
 in construct-really from module generators
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\generators.jam:1304:
 in generators.construct from module generators
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\targets.jam:1431:
 in construct from module object(typed-target)@457
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\targets.jam:1244:
 in object(typed-target)@457.generate from module object(typed-target)@457
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\targets.jam:767:
 in generate-really from module object(main-target)@464
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\targets.jam:739:
 in object(main-target)@464.generate from module object(main-target)@464
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\targets.jam:257:
 in object(project-target)@451.generate from module object(project-target)@451
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\targets.jam:883:
 in targets.generate-from-reference from module targets
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\targets.jam:1168:
 in generate-dependencies from module object(typed-target)@441
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\targets.jam:1218:
 in object(typed-target)@441.generate from module object(typed-target)@441
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\targets.jam:767:
 in generate-really from module object(main-target)@445
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\targets.jam:739:
 in object(main-target)@445.generate from module object(main-target)@445
C:/msys/1.0/home/Administrator/boost_1_37_0/tools/build/v2/build\targets.jam:257:
 in object(project-target)@440.generate from module 

[Rdkit-discuss] NO2 bug/feature?

2008-05-24 Thread Igor Filippov [Contr]
Dear Greg,

I noticed the following peculiarities - not sure if it's a bug or a
feature. 
1) When assembling a fragment like this CN(=O)=O (a nitro group),
  mol->addAtom(new Atom(7));      // atom 0: N
  mol->addAtom(new Atom(8));      // atom 1: O
  mol->addAtom(new Atom(8));      // atom 2: O
  mol->addAtom(new Atom(6));      // atom 3: C
  mol->addBond(0,1,Bond::DOUBLE); // N=O
  mol->addBond(0,2,Bond::DOUBLE); // N=O
  mol->addBond(0,3,Bond::SINGLE); // N-C

it automatically gets converted to C[N+](=O)[O-]
Not exactly what I have entered, though equivalent?
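
To show why the two forms are equivalent, here is a small pure-Python sketch
of the rewrite. It is only an illustration of the charge-separation idea on a
plain atom/bond table; it does not use the RDKit and is not a copy of what the
RDKit does internally. The function name and the data layout are assumptions
made for this example.

```python
def charge_separate_nitro(atoms, bonds):
    """Rewrite neutral 'pentavalent' N(=O)=O as [N+](=O)[O-], in place.

    atoms: list of {'z': atomic number, 'charge': formal charge}
    bonds: list of [begin, end, order] with integer bond orders
    (Hypothetical data layout, chosen only for this illustration.)
    """
    for idx, atom in enumerate(atoms):
        if atom["z"] != 7 or atom["charge"] != 0:
            continue
        attached = [b for b in bonds if idx in (b[0], b[1])]
        if sum(b[2] for b in attached) != 5:
            continue  # only a neutral nitrogen with total bond order 5
        for bond in attached:
            other = bond[1] if bond[0] == idx else bond[0]
            if bond[2] == 2 and atoms[other]["z"] == 8 and atoms[other]["charge"] == 0:
                bond[2] = 1                  # N=O becomes N-O
                atom["charge"] = 1           # N becomes N+
                atoms[other]["charge"] = -1  # O becomes O-
                break
    return atoms, bonds


# Same atom and bond order as in the snippet above: N, O, O, C.
atoms = [{"z": 7, "charge": 0}, {"z": 8, "charge": 0},
         {"z": 8, "charge": 0}, {"z": 6, "charge": 0}]
bonds = [[0, 1, 2], [0, 2, 2], [0, 3, 1]]
charge_separate_nitro(atoms, bonds)
```

The heavy atoms and their connectivity are unchanged; only one N=O bond order
and two formal charges differ, which is why both SMILES describe the same
nitro group.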

2) If I designate a bond as aromatic and it's not in a ring, the
sanitization procedure throws an exception. This is not the behavior I
want: I would like a chance to clear the AROMATIC flag from non-ring
bonds (if it gets there by mistake), but I cannot perceive ring bonds
before sanitization. So it's something of a chicken-and-egg problem.
[02:42:01] Kekulization somehow did not convert bond 2
terminate called after throwing an instance of
'RDKit::MolSanitizeException'
  what():  N5RDKit20MolSanitizeExceptionE
Aborted
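
One way around the chicken-and-egg problem is to note that ring membership
depends only on connectivity: a bond is in a ring exactly when its two end
atoms are still connected after the bond is removed. Below is a minimal
pure-Python sketch of that check on a plain bond list. It does not use the
RDKit; the function names and the (begin, end, aromatic) tuple layout are
assumptions for the illustration. As far as I know the RDKit's own ring
perception (MolOps::findSSSR) can also be run without full sanitization,
but I have not tested that against this case.

```python
from collections import defaultdict


def bond_in_ring(bonds, bond_index):
    """True if the bond's end atoms stay connected when the bond is removed."""
    start, goal = bonds[bond_index][0], bonds[bond_index][1]
    neighbours = defaultdict(list)
    for i, (a, b, _aromatic) in enumerate(bonds):
        if i != bond_index:  # build the graph without the bond under test
            neighbours[a].append(b)
            neighbours[b].append(a)
    seen, stack = {start}, [start]
    while stack:  # depth-first search from one end atom to the other
        atom = stack.pop()
        if atom == goal:
            return True
        for nxt in neighbours[atom]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def clear_stray_aromatic_flags(bonds):
    """Return a copy of bonds with the aromatic flag cleared on non-ring bonds."""
    return [
        (a, b, aromatic and bond_in_ring(bonds, i))
        for i, (a, b, aromatic) in enumerate(bonds)
    ]


# Six-membered aromatic ring (atoms 0-5) plus a substituent bond 0-6 that was
# flagged aromatic by mistake.
bonds = [(i, (i + 1) % 6, True) for i in range(6)] + [(0, 6, True)]
cleaned = clear_stray_aromatic_flags(bonds)
```

After this pass the six ring bonds keep their flag and the 0-6 bond no longer
carries it, so the molecule could then be handed to sanitization.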



Other than that I'm happy to inform you that I have added support
for RDKit to OSRA, and the upcoming release will give users a choice of
whether to compile with OpenBabel or RDKit as the molecular back-end.

Best regards,
Igor


-- 
Igor Filippov [Contr] ig...@helix.nih.gov




Re: [Rdkit-discuss] c++ example

2008-05-16 Thread Igor Filippov [Contr]
Dear Greg,


I was wondering if you could advise me how to obtain the following
counts with RDKit:
- The number of aromatic rings
- The number of 3,4,5,6,7-member rings

I can't find an API function that returns those numbers. Is it possible
to do this with some simple SMARTS patterns?
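
To be concrete about what I am after, here is a pure-Python sketch of the
counting itself, assuming the rings are already available as tuples of bond
indices (the kind of list the RDKit's RingInfo object exposes through
bondRings()) together with the set of bonds flagged aromatic. The function
name and the input layout are made up for this example, and "aromatic ring"
is taken here to mean a ring in which every bond is aromatic.

```python
from collections import Counter


def ring_statistics(bond_rings, aromatic_bonds, sizes=(3, 4, 5, 6, 7)):
    """Count aromatic rings and rings of the requested sizes.

    bond_rings: iterable of tuples of bond indices, one tuple per ring
    aromatic_bonds: set of bond indices flagged aromatic
    Returns (number of aromatic rings, {ring size: count}).
    """
    bond_rings = list(bond_rings)
    # A ring with n bonds also has n atoms, so len(ring) is the ring size.
    size_counts = Counter(len(ring) for ring in bond_rings)
    n_aromatic = sum(
        1 for ring in bond_rings if all(b in aromatic_bonds for b in ring)
    )
    return n_aromatic, {size: size_counts.get(size, 0) for size in sizes}


# Indane-like example: an aromatic six-membered ring (bonds 0-5) fused through
# bond 5 to a five-membered ring whose other bonds (6-9) are not aromatic.
rings = [(0, 1, 2, 3, 4, 5), (5, 6, 7, 8, 9)]
n_aromatic, by_size = ring_statistics(rings, aromatic_bonds={0, 1, 2, 3, 4, 5})
```

For the example this gives one aromatic ring, one five-membered ring and one
six-membered ring.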

Sincerely,
Igor

On Fri, 2008-04-04 at 16:19 +0200, Greg Landrum wrote:
 Dear Igor,
 
 On Fri, Apr 4, 2008 at 1:41 PM, Igor Filippov [Contr]
 ig...@helix.nih.gov wrote:
   Did you have a chance to take a look at compiling RDKit on 64-bit Linux?
   Unfortunately OSRA has dependencies that can only be dynamically linked
   (e.g. ImageMagick), so using a 32-bit executable on a 64-bit system can be
   somewhat complicated.
 
 I did take a look. Unfortunately things didn't work out of the box,
 and I'm not overly familiar with this new 64 bit world, so I don't
 have an easy answer.
 
 One big sticking point at the moment is a problem that causes
 boost.build to not pass along some command line arguments (specifically
 -fPIC) when it builds static libraries. This causes problems with the
 logging library. The problem is fixed in the subversion boost.build
 and in the 1.35.0 release of the boost libraries, but I haven't had
 time to test and switch to 1.35.0 yet.
 
 I will keep plugging away at this; I would like to have a 64bit build
 available for testing.
 
 -greg
-- 
Igor Filippov [Contr] ig...@helix.nih.gov




Re: [Rdkit-discuss] c++ example

2008-03-26 Thread Igor Filippov [Contr]
Greg,

Thanks, it works now.
Note for the other users - I had to add the following to LD_LIBRARY_PATH:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/igor/RDKit_Jan2008_1/Code/GraphMol/SmilesParse/bin/gcc-4.1.2/release/threading-multi/:/home/igor/RDKit_Jan2008_1/Code/GraphMol/bin/gcc-4.1.2/release/threading-multi/:/home/igor/RDKit_Jan2008_1/Code/GraphMol/FileParsers/bin/gcc-4.1.2/release/threading-multi/

for the compiled executable to be linked to libSmilesParse.so,
libGraphMol.so, and libFileParsers.so
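
Since the exact directories depend on the compiler version and build variant,
here is a minimal sketch that searches a build tree for the three libraries
and assembles the LD_LIBRARY_PATH value from whatever it finds. The function
names are made up for this example, and the only assumption about the layout
is that the .so files sit somewhere below the directory you pass in.

```python
import os

# The three shared libraries the executable links against (from the note above).
WANTED = ("libSmilesParse.so", "libGraphMol.so", "libFileParsers.so")


def library_dirs(build_root, wanted=WANTED):
    """Return the sorted directories under build_root containing a wanted library."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(build_root):
        if any(name in filenames for name in wanted):
            found.add(dirpath)
    return sorted(found)


def ld_library_path(build_root, existing=""):
    """Build a value for LD_LIBRARY_PATH, keeping any existing entries first."""
    parts = [p for p in existing.split(":") if p] + library_dirs(build_root)
    return ":".join(parts)
```

Usage would be along the lines of
ld_library_path("/home/igor/RDKit_Jan2008_1/Code", os.environ.get("LD_LIBRARY_PATH", ""))
with the result exported in the shell before running the executable.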

Regards,
Igor

On Wed, 2008-03-26 at 05:51 +0100, Greg Landrum wrote:
 Igor,
 
 
 On Wed, Mar 26, 2008 at 4:54 AM, Igor Filippov [Contr]
 ig...@helix.nih.gov wrote:
 
   Thank you, very fast response and exactly the code snippet I was looking
   for!
 
 Glad to hear it.
 
   I'm trying to compile it now and it looks like it cannot find libblas:
   ~/boost-jam-3.1.16-1-linuxx86/bjam > out 2>&1
   (see attached out file).
 
   I checked my installation of RDKit and it has libblas++ but no libblas.
   find RDKit_Jan2008_1/ -name libblas*
   RDKit_Jan2008_1/External/Lapack++/bin/gcc-4.1.2/release/link-static/threading-multi/libblas++.a
 
 Correct, you need to have libblas (and liblapack, that will probably
 be the next link error you get) installed on your machine. You can get
 these precompiled for most linux distributions.
 
 I will put a complete list of requirements for building in the
 build instructions on my ToDo list so that there's a way to RTFM for
 this problem.
 
 
   The installation of RDKit yesterday seemed to have progressed without
   errors, though I am completely unfamiliar with bjam and the output
   looked plenty confusing :) What would be the check that things have
 
 Yes, the output of bjam can take some getting used to.
 
   compiled as they should (or not)? It generated some executables but
   other folders have only source files and no binaries etc.
 
 Here are some instructions for testing a build from the bottom of this
 wiki page:
 http://code.google.com/p/rdkit/wiki/BuildingOnLinux
 * cd to $RDBASE/Code and do: python $RDBASE/Python/TestRunner.py test_list.py
 * cd to $RDBASE/Code/GraphMol and do: python $RDBASE/Python/TestRunner.py test_list.py
 * create the databases used by the Python tests (requires sqlite3 to be installed):
   o sqlite3 $RDBASE/Data/RDTests.sqlt < $RDBASE/Python/Dbase/testData/RDTests.sqlite
   o sqlite3 $RDBASE/Data/RDData.sqlt < $RDBASE/Python/Dbase/testData/RDData.sqlite
 * cd to $RDBASE/Python and do: find . -name 'test_list.py' -exec python $RDBASE/Python/TestRunner.py \{\} > pytests.out 2>&1 \;
 
 The first two steps should be enough to see if the build worked. The
 last two steps are for more exhaustive testing.
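
 If it helps, the two quick checks can be scripted. The sketch below only
 wraps the commands listed above; the function names are made up for this
 example, and treating a zero exit status from TestRunner.py as "the build
 worked" is an assumption, so read the output as well.

```python
import os
import subprocess


def quick_check_commands(rdbase):
    """Return (working directory, argv) pairs for the two quick build checks."""
    runner = os.path.join(rdbase, "Python", "TestRunner.py")
    return [
        (os.path.join(rdbase, "Code"), ["python", runner, "test_list.py"]),
        (os.path.join(rdbase, "Code", "GraphMol"), ["python", runner, "test_list.py"]),
    ]


def run_quick_checks(rdbase):
    """Run both checks in order; stop at the first one that fails.

    Assumption: a non-zero exit status means the check failed.
    """
    return all(
        subprocess.run(argv, cwd=cwd).returncode == 0
        for cwd, argv in quick_check_commands(rdbase)
    )
```

 Calling run_quick_checks(os.environ["RDBASE"]) from a shell where RDBASE is
 set runs the same two steps as doing them by hand.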
 
   Am I doing something completely wrong?
 
 No, it looks like you're on track aside from the blas/lapack problem.
 
 -greg
-- 
Igor Filippov [Contr] ig...@helix.nih.gov