Re: [Rdkit-discuss] want advice for good teaching data set

2018-08-29 Thread TJ O'Donnell
Hi Andrew
ChEMBL 24 has compound properties in the table compound_properties.  I
think the alogp column is computed using (Crippen) atom types and the
acd_logp column uses ACD Labs methods.
TJ
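
A minimal sketch of pulling those values out of the SQLite release with Python's sqlite3 module. It assumes the compound_properties columns alogp and acd_logp mentioned above, plus a join on molregno to compound_structures.canonical_smiles. The toy in-memory database and the file name chembl_24.db are stand-ins for illustration, and the numbers in it are made up, not ChEMBL values.

```python
import sqlite3


def fetch_logp_rows(conn, limit=5):
    """Return (canonical_smiles, alogp, acd_logp) rows that have an alogp value."""
    query = """
        SELECT cs.canonical_smiles, cp.alogp, cp.acd_logp
        FROM compound_properties AS cp
        JOIN compound_structures AS cs ON cs.molregno = cp.molregno
        WHERE cp.alogp IS NOT NULL
        ORDER BY cp.molregno
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()


def make_toy_db():
    """Build an in-memory stand-in holding only the columns used above.

    The property values are placeholders, not real ChEMBL data.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE compound_structures (
            molregno INTEGER PRIMARY KEY, canonical_smiles TEXT);
        CREATE TABLE compound_properties (
            molregno INTEGER PRIMARY KEY, alogp REAL, acd_logp REAL);
        INSERT INTO compound_structures VALUES (1, 'CCO'), (2, 'c1ccccc1'), (3, 'C');
        INSERT INTO compound_properties VALUES (1, 0.5, 0.6), (2, 2.0, 2.1), (3, NULL, NULL);
    """)
    return conn


if __name__ == "__main__":
    # With the real release, something like sqlite3.connect("chembl_24.db")
    # would replace the toy database (the file name is an assumption).
    conn = make_toy_db()
    for row in fetch_logp_rows(conn):
        print(row)
```

The SMILES column can then be fed to RDKit to compute descriptors, with alogp or acd_logp as the target value.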

On Wed, Aug 29, 2018 at 5:52 AM Andrew Dalke 
wrote:

> Hi all,
>
>   I am starting to put together materials for the Python/RDKit training
> course I'm giving just before the RDKit UGM next month.
>
> I would like to structure part of it around the SQLite release of the
> ChEMBL data set. More specifically, I plan to include examples of machine
> learning with scikit-learn, using RDKit descriptors and values from ChEMBL
> 24 (and making sure to use the new schema).
>
> Two problems. First, I'm not a computational chemist and I don't know what
> would constitute a good example to use. "Good" in this case means one whose
> outlines are well-known to likely students. Second, I don't have much
> experience with the ChEMBL data.
>
> My thought is to make a logP model. The easiest would be to base it on
> atom types. For this option, can anyone suggest where I can find logP data
> from ChEMBL?
>
> Another possibility is to use a pre-existing model, like the notebook
> George Papadatos did for Ligand-based Target Prediction at
> http://nbviewer.jupyter.org/gist/madgpap/10457778 .
>
> Perhaps someone here could point me to other existing resources along
> similar lines?
>
> Best regards,
>
> Andrew
> da...@dalkescientific.com
>
>
>
>
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>


Re: [Rdkit-discuss] SMARTS for an amide in an aromatic ring

2017-09-18 Thread TJ O'Donnell
Try either of these:

[N,n](C)-,:[C,c](=[O])

C[N,n]-,:[C,c](=[O])

TJ O'Donnell

On Mon, Sep 18, 2017 at 4:26 PM, James T. Metz via Rdkit-discuss <
rdkit-discuss@lists.sourceforge.net> wrote:

> Hello,
>
> Given the following aromatic structure
>
> m = Chem.MolFromSmiles("CN1C=CC(N)=NC1=O")
>
> I would like to construct a SMARTS pattern to
> recognize the aromatic amide (nitrogen attached to
> the exocyclic methyl group) and not recognize the other
> NCO group of atoms.
>
>
> I have tried
>
> pattern = Chem.MolFromSmarts('[N,n]-,:[C,c](=[O])')
>
> but, this matches *both* NCO groups of atoms which
> I do not want.
>
>
> The completely "aliphatic version"
>
> pattern = Chem.MolFromSmarts('[N]-[C](=[O])')
>
> does not match either NCO group of atoms.
>
> I am stumped.  I have also tried several recursive
> SMARTS expressions, but I can't get the syntax
> right.
>
> I would appreciate any suggestions.  Thank you.
>
>
> Regards,
> Jim Metz
>
>
>
>
> 


Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-13 Thread TJ O'Donnell
Let the database do the work for you.  Create a canonical SMILES column
and/or an InChI column and declare each of them to be unique.  As you insert
new rows, postgres will let you know if there is already a row with the same
SMILES or InChI.
Here's some help on how to handle that.
https://www.postgresql.org/docs/9.5/static/sql-insert.html#SQL-ON-CONFLICT

TJ O'Donnell
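
A minimal sketch of the unique-column idea. To keep it runnable without a server it uses Python's built-in sqlite3 instead of postgres, so it is a stand-in rather than the postgres setup itself. The table and column names are made up for illustration, and the strings passed in are assumed to be canonical SMILES already (for example from Chem.MolToSmiles).

```python
import sqlite3


def make_db():
    """Create an in-memory table whose SMILES column is declared UNIQUE."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE molecule ("
        " id INTEGER PRIMARY KEY,"
        " canonical_smiles TEXT NOT NULL UNIQUE)"
    )
    return conn


def insert_unique(conn, canonical_smiles):
    """Insert one row; return True if stored, False if it was a duplicate."""
    try:
        conn.execute(
            "INSERT INTO molecule (canonical_smiles) VALUES (?)",
            (canonical_smiles,),
        )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint is how the database "lets you know".
        return False


if __name__ == "__main__":
    conn = make_db()
    for smiles in ["CCO", "c1ccccc1", "CCO"]:
        print(smiles, insert_unique(conn, smiles))
```

In postgres the same effect comes from the ON CONFLICT clause described at the link above, which skips the duplicate row instead of raising an error. Either way the database index does the comparison, so no pairwise fingerprint comparison is needed just to detect identical molecules.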

On Wed, Sep 13, 2017 at 3:13 AM, Wandré  wrote:

> Hi,
>
> My name is Wandré and I'm from Brazil.
> I'm trying to build a large database of molecules, and I want to eliminate
> all redundant molecules before inserting them into the database.
> I want to know the best method for identifying a molecule in RDKit.
> Is it SMILES ("Chem.MolToSmiles(mol,isomericSmiles=True)"), or will I need
> to compare all molecules, one by one, before inserting them into the
> database (using Tanimoto)?
> This could be hard because my database will hold many millions of
> molecules. Is comparing one by one before each insert the only answer?
> Checking whether a SMILES has already been inserted is easy (a text
> comparison), but comparing molecule fingerprints...
>
> If I really need to compare molecule fingerprints, how can I store this
> data in PostgreSQL without using the cartridge? I would generate the
> fingerprint (Atompair, for example), store it in the database, and compare
> against all stored fingerprints, one by one, before inserting a new
> molecule. This fingerprint (Atompair) has a lot of features, so storing it
> in a relational database is expensive.
> Is it possible?
>
> Thanks!
>
> --
> Wandré Nunes de Pinho Veloso
> Assistant Professor - Unifei - Campus Avançado de Itabira-MG
> PhD candidate in Bioinformatics - Universidade Federal de Minas Gerais - UFMG
> Researcher at INSILICO - Grupo Interdisciplinar em Simulação e
> Inteligência Computacional - UNIFEI
> Member of the FIOCRUZ research group Assinaturas Biológicas
> Member of the UFMG research group Bioinformática Estrutural
> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG
>
> 


Re: [Rdkit-discuss] Fwd: Need SMARTS to distinguish 6-ring vs macrocyclic ether oxygens

2017-09-06 Thread TJ O'Donnell
I verified that r6 does the trick.  Using my rdchord cartridge, I get

tjo=> select
rd.list_matches(rd.rdmol('OCC1OC2OC3C(CO)OC(OC4C(CO)OC(OC5C(CO)OC(OC6C(CO)OC(OC7C(CO)OC(OC1C(O)C2O)C(O)C7O)C(O)C6O)C(O)C5O)C(O)C4O)C(O)C3O'),
'[O;H0;D2;r6]',0,1);
  list_matches

 {{4},{11},{18},{25},{32},{39}}
(1 row)

tjo=> select
rd.list_matches(rd.rdmol('OCC1OC2OC3C(CO)OC(OC4C(CO)OC(OC5C(CO)OC(OC6C(CO)OC(OC7C(CO)OC(OC1C(O)C2O)C(O)C7O)C(O)C6O)C(O)C5O)C(O)C4O)C(O)C3O'),
'[O;H0;D2;!r6]',0,1);
  list_matches

 {{6},{13},{20},{27},{34},{41}}
(1 row)

Here's an image showing the atom numbers corresponding to the list_matches
output.

TJ

[image: Inline image 2]

On Wed, Sep 6, 2017 at 6:04 PM, TJ O'Donnell  wrote:

> Try using [O;H0;D2;r6] lower-case r.  Sorry I'm not at a computer to
> check this.
> R6 means in 6 rings.
> r6 means in ring of size 6.
>
> http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html
>
> TJ O'Donnell
>
> On Wed, Sep 6, 2017 at 4:34 PM, James T. Metz via Rdkit-discuss <
> rdkit-discuss@lists.sourceforge.net> wrote:
>
>> Hello,
>>
>> Given the following SMILES for a macrocyclic hexaose
>>
>> OCC1OC2OC3C(CO)OC(OC4C(CO)OC(OC5C(CO)OC(OC6C(CO)OC(OC7C(CO)OC(OC1C(O)C2O)C(O)C7O)C(O)C6O)C(O)C5O)C(O)C4O)C(O)C3O
>>
>> can anyone suggest a SMARTS pattern that will distinguish ether oxygens
>> in the smaller 6-membered rings versus the ethers in the larger
>> macrocyclic structure?
>>
>> For example, using RDkit, I have tried (e.g., pattern =
>> Chem.MolFromSmarts('[O;H0;D2]') )
>>
>> [O;H0;D2]  ===>  gives 12 matches (all ether oxygens)
>>
>> [O;H0;D2;R]  ===>  gives 12 matches (all ether oxygens)
>>
>> [O;H0;D2;!R]  ===>  gives 0 matches
>>
>> [O;H0;D2;R6]  ===>  gives 0 matches
>>
>>
>> I am stumped.  Any ideas?
>>
>> If it is necessary to write more complicated PYTHON/RDkit/SMARTS
>> code, I am certainly willing to try that.
>>
>> Thanks!
>>
>> Regards,
>> Jim Metz
>> Northwestern University
>>
>>
>> 


Re: [Rdkit-discuss] Fwd: Need SMARTS to distinguish 6-ring vs macrocyclic ether oxygens

2017-09-06 Thread TJ O'Donnell
Try using [O;H0;D2;r6] lower-case r.  Sorry I'm not at a computer to check
this.
R6 means in 6 rings.
r6 means in ring of size 6.

http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html

TJ O'Donnell

On Wed, Sep 6, 2017 at 4:34 PM, James T. Metz via Rdkit-discuss <
rdkit-discuss@lists.sourceforge.net> wrote:

> Hello,
>
> Given the following SMILES for a macrocyclic hexaose
>
> OCC1OC2OC3C(CO)OC(OC4C(CO)OC(OC5C(CO)OC(OC6C(CO)OC(OC7C(CO)OC(OC1C(O)C2O)C(O)C7O)C(O)C6O)C(O)C5O)C(O)C4O)C(O)C3O
>
> can anyone suggest a SMARTS pattern that will distinguish ether oxygens
> in the smaller 6-membered rings versus the ethers in the larger macrocyclic
> structure?
>
> For example, using RDkit, I have tried (e.g., pattern =
> Chem.MolFromSmarts('[O;H0;D2]') )
>
> [O;H0;D2]  ===>  gives 12 matches (all ether oxygens)
>
> [O;H0;D2;R]  ===>  gives 12 matches (all ether oxygens)
>
> [O;H0;D2;!R]  ===>  gives 0 matches
>
> [O;H0;D2;R6]  ===>  gives 0 matches
>
>
> I am stumped.  Any ideas?
>
> If it is necessary to write more complicated PYTHON/RDkit/SMARTS code,
> I am certainly willing to try that.
>
> Thanks!
>
> Regards,
> Jim Metz
> Northwestern University
>
>
> 


Re: [Rdkit-discuss] connecting to postgres in rdkit environment

2017-02-25 Thread TJ O'Donnell
The server itself must be told to allow remote connections.
Check these two things:

1.  Edit the postgresql.conf file (not sure where that is on your
    system).
    https://www.postgresql.org/docs/9.2/static/runtime-config-connection.html
    Uncomment or add the line listen_addresses='*'. You can
    tailor that to be more specific, but try this first.

2.  The file pg_hba.conf also controls access.  Look at this:
    https://www.postgresql.org/docs/9.3/static/auth-pg-hba-conf.html

Be sure to restart the server after you make changes to these files.

Hope this helps,
TJ O'Donnell
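
After restarting, a quick way to see whether the port is reachable at all, before involving the jdbc driver, is a plain TCP connection test. This is only a sketch: it checks that something accepts connections on the port, not that pg_hba.conf will let you log in. The function name is made up, and 127.0.0.1 is a placeholder to be replaced by the server's address when running from the client.

```python
import socket


def can_connect(host, port=5432, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


if __name__ == "__main__":
    # Replace 127.0.0.1 with the server's address when testing from the client.
    print(can_connect("127.0.0.1"))
```

If this returns False from the client but True on the server itself, the likely suspects are listen_addresses or the firewall. If it returns True and the driver still fails, pg_hba.conf is the next place to look.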


On Sat, Feb 25, 2017 at 12:34 PM,  wrote:

> Hi,
> I've installed rdkit on a CentOS machine using anaconda python and set up
> a postgresql compound database in the rdkit environment. It works great on
> the machine's console.
> I now want to access it remotely and I'm trying to set up a jdbc postgres
> driver to access it from a windows client but this is not working. If I
> test the driver on the server it tells me that the connection is refused
> and I should check that the machine is accepting TCP requests.
>
> I have opened the standard port that postgres uses
> -A INPUT -m state --state NEW -m tcp -p tcp --dport 5432 -j ACCEPT
>
> iptables -L returns
> ACCEPT     tcp  --  anywhere             anywhere             state NEW tcp dpt:postgres
>
> this is where I don't know what to check next. A few things that might be
> relevant. If I "ps -eaf | grep post" I see four postgres processes running
> under my username (not postgres), so I think there is a server working.
> There is also a "system" postgresql (version 9.2) which I have connected to
> previously a long time ago. This connection no longer works either and I
> don't really care about that but could be an interfering factor.
>
> If anyone has suggestions about what to check next or solve this I'd be
> grateful
>
> thanks,
> Neil
>
>
>
> 


Re: [Rdkit-discuss] Struggling with apache + rdkit + django

2016-06-21 Thread TJ O'Donnell
I would suggest setting PYTHONPATH in the config or ini files for
Apache, Django, or uWSGI.
I'm not sure which one is required.
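
Since the 500 error leaves nothing in the logs, it may help to have the server process report what it can actually import. The sketch below uses only the standard library. The function name is made up, and calling it from the project's wsgi.py is just one possible place, not something I have tested with Apache.

```python
import importlib.util
import sys


def describe_import(module_name):
    """Report whether module_name is importable and where it would load from."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return f"{module_name}: NOT importable; sys.path = {sys.path}"
    return f"{module_name}: found at {spec.origin}"


if __name__ == "__main__":
    # Under a web server, stderr usually ends up in the server's error log.
    print(describe_import("rdkit"), file=sys.stderr)
```

Comparing the output under the development server with the output under Apache should show whether PYTHONPATH is the difference between the two environments.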

On Tue, Jun 21, 2016 at 11:15 AM, Téletchéa Stéphane <
stephane.teletc...@univ-nantes.fr> wrote:

> On 21/06/2016 20:05, Bennion, Brian wrote:
> > What is the actual problem that is occurring?  You have listed what you
> > have tried to do to fix a problem.
> >
> > Brian
>
> Dear Brian,
>
> I get a 500 error, meaning something is not working properly, but there is
> no trace in the logs (either apache or django),
> so I can only "assume" it comes from there since in "developer"
> mode there is no problem (everything works as expected).
>
> Sorry for the confusion,
>
> Stéphane
>
> --
> Assistant Professor in BioInformatics, UFIP, UMR 6286 CNRS, Team Protein
> Design In Silico
> UFR Sciences et Techniques, 2, rue de la Houssinière, Bât. 25, 44322
> Nantes cedex 03, France
> Tél : +33 251 125 636 / Fax : +33 251 125 632
> http://www.ufip.univ-nantes.fr/ - http://www.steletch.org
>
>
>


Re: [Rdkit-discuss] molecule standardization in cartridge search

2015-09-25 Thread TJ O'Donnell
Tim,

I have a set of postgres python (PL/Python) functions using rdkit.
It is available at
https://github.com/tjod/rdchord
and some docs at
https://github.com/tjod/rdchord/wiki

TJ O'Donnell

On Fri, Sep 25, 2015 at 6:54 AM, Tim Dudgeon  wrote:

> Jan,
>
> thanks for that. I'll give it a try.
> Are there any examples of writing RDKit functions and procedures for
> postgres in python?
> I see this general postgres docs:
> http://www.postgresql.org/docs/9.4/static/plpython.html
> but wondered if there are any RDKit specific examples anywhere?
>
> Tim
>
> On 25/09/2015 08:30, Jan Holst Jensen wrote:
> > On 2015-09-24 16:22, Tim Dudgeon wrote:
> >> I'm trying to get to grips with using the RDKit cartridge, and so far
> >> it's going well.
> >> One thing I'm concerned about is molecule standardization, along the
> >> lines of the ChemAxon Standardizer that allows substructure searches to
> >> be done in a way that is largely independent of the quirks of structure
> >> representation. The classic example would be how nitro groups are
> >> represented, so that it didn't matter which nitro representation was in
> >> the query or target structures, because both were converted to a
> >> canonical form.
> >>
> >> My initial thoughts are that this would be done by:
> >> 1. loading the "raw" structures into a source column that would never be
> >> changed
> >> 2. defining a function that performed the necessary transform to
> >> generate the canonical form of a molecule.
> >> 3. generating a "canonical" structure column that was the result of
> >> passing the raw structures through that function
> >> 4. building the SSS index on that canonical column
> >> 5. executing queries using that function to canonicalize the query
> >> structure
> >>
> >> The problem I'm finding is that there do not seem to be postgres
> >> functions defined for doing molecular transforms (essentially a reaction
> >> transform) and doing things like removing explicit hydrogens. At least
> >> not in the functions listed on this page:
> >> http://rdkit.org/docs/Cartridge.html#functions
> >>
> >> Am I missing something here, or might I be barking up completely the
> >> wrong tree?
> >>
> >> Tim
> >
> > Hi Tim,
> >
> > We have about the same situation and we're adding standardization
> > (beyond what RDKit implicitly does when it sanitizes the molecule)
> > through Python stored procedures. You will need to build and maintain
> > a normal Python-enabled RDKit installation in parallel to the
> > cartridge. The Python stored procedures can access the normal RDKit
> > installation and then run whatever Python code is necessary to do
> > additional molecule cleanup.
> >
> > You will need to tweak your Postgres environment so the Python stored
> > procedures can load RDKit. This is what I have defined in an
> > environment file on CentOS:
> >
> > RDBASE=/opt/rdkit
> > LD_LIBRARY_PATH=/opt/rdkit/lib
> > PYTHONPATH=/opt/rdkit
> >
> > On Ubuntu this would go into /etc/postgresql/9.x/main/environment (in
> > a slightly different format where the values have to be single-quoted).
> >
> > Cheers
> > -- Jan, Biochemfusion
>
>
>


Re: [Rdkit-discuss] https://sourceforge.net

2015-03-23 Thread TJ O'Donnell
How about including a link on sourceforge to this:
https://help.github.com/articles/support-for-subversion-clients/
so that folks without git clients can get started.

TJ

On Fri, Mar 20, 2015 at 9:48 PM, Greg Landrum 
wrote:

> The mailing lists and one form of the downloads are hosted there. It's a
> very good point that having the trackers still active on sourceforge is
> confusing. I just deleted them.
>
> We should also do something about the svn repo that's there, just to make
> clear that it's no longer active.
>
> Does anyone see a problem with me doing a commit there that removes all
> the code and just leaves a "look in github" readme?
>
>
> On Fri, Mar 20, 2015 at 7:44 PM, Soren Wacker  wrote:
>
>> Hi,
>>
>> rdkit has moved to github, but there is still the repository on
>> sourceforge.net.
>> However, if you google 'rdkit bugs' the sourceforge page comes up first.
>> I find that confusing. Is there a reason to keep the sourceforge.net
>> stuff?
>> If not, why don't you remove the sourceforge repository?
>>
>> kind regards
>> Soren
>>
>>
>
>
>


Re: [Rdkit-discuss] Oracle, pypl and rdkit

2015-03-12 Thread TJ O'Donnell
I've implemented a suite of rdkit functions
for postgres using plpython
https://github.com/tjod/rdchord
and the overhead is minimal
since most of the heavy lifting of substructure searching
is done by rdkit.

I think the same would be true of oracle.
-------
TJ O'Donnell

On Thu, Mar 12, 2015 at 4:24 PM, Michal Krompiec 
wrote:

> Hello, has anybody tried to implement substructure searching in an Oracle
> database using PYPL and RDKit? Is it just a matter of writing a wrapper
> function for molecule.HasSubstructMatch(pattern) or is the overhead of
> calling pypl each time too costly timewise? Do consecutive pypl calls
> always share the same interpreter?
> Best wishes,
> Michal
>
>


Re: [Rdkit-discuss] autodock vina pdbqt file to mol2

2014-05-09 Thread TJ O'Donnell
Open Babel can read and write both pdbqt and mol2 files. I'm not sure how the
atom ordering might be handled, though.

TJ
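
A rough sketch of driving that conversion from Python. It assumes the Open Babel command-line tool is installed as obabel and uses its -ipdbqt, -omol2, -O and -h (add hydrogens) options. The helper names are made up, and I have not checked what this does to atom order or to multi-pose vina output.

```python
import shutil
import subprocess


def build_obabel_command(pdbqt_path, mol2_path, add_hydrogens=False):
    """Return the obabel argument list for a pdbqt -> mol2 conversion."""
    cmd = ["obabel", "-ipdbqt", pdbqt_path, "-omol2", "-O", mol2_path]
    if add_hydrogens:
        cmd.append("-h")  # ask Open Babel to add hydrogens
    return cmd


def convert(pdbqt_path, mol2_path, add_hydrogens=False):
    """Run the conversion; raises if obabel is missing or exits with an error."""
    if shutil.which("obabel") is None:
        raise RuntimeError("obabel executable not found on PATH")
    subprocess.run(
        build_obabel_command(pdbqt_path, mol2_path, add_hydrogens),
        check=True,
    )


if __name__ == "__main__":
    # File names are placeholders.
    print(" ".join(build_obabel_command("ligand_out.pdbqt", "ligand_out.mol2", True)))
```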
On May 9, 2014 2:43 PM, "Jan Domanski"  wrote:

> Thanks for the quick reply Christos!
>
> I found the pdbqt_to_pdb script that you mentioned, but a Google search for
> a pdbqt to mol2 converter yields nothing (other than this thread). The
> pdbqt_to_pdb converter is very crude: it retains only the best pose from
> _out.pdbqt and it basically just strips the BRANCH and ROOT tags deposited
> by autodock (which I was doing anyway with the sed).
>
> The main problems remaining are atom order (I can fix that) and missing
> hydrogens (can't fix that). There is a mode where I can prevent the
> prepare_ligand4.py from removing the hydrogens – but the output poses then
> have really weird geometry.
>
> But let's refocus a little bit: this is not an autodock vina question
> (although many folks here are knowledgeable enough to help me). This is a
> question on a mol2 file to which it should be possible to add Hs with rdkit
> and it's somehow not happening (at least not in my hands). My mol2 could be
> somehow malformatted.
>
>
>
>
>
> On 9 May 2014 20:57, Christos Kannas  wrote:
>
>> Hi Jan,
>>
>> AutoDock has a set of tools (MGLTools) that can convert pdb to
>> pdbqt and vice versa.
>> If I recall correctly, it can also convert pdbqt to mol2. See this discussion:
>> http://autodock.1369657.n2.nabble.com/ADL-pdbqt-to-mol2-td6755769.html
>>
>> Best,
>>
>> Christos
>>
>> Christos Kannas
>>
>> Researcher
>> Ph.D Student
>>
>> Mob (UK): +44 (0) 7447700937
>> Mob (Cyprus): +357 99530608
>>
>>
>>
>> On 9 May 2014 20:17, Jan Domanski  wrote:
>>
>>>  Hi guys,
>>>
>>> I'm really stuck here: I have some output from autodock vina in a rather
>>> obscure pdbqt format. It's a little bit like pdb but not quite. I'm trying
>>> to get back a mol2 file.
>>>
>>> The autodock pdbqt file has only the polar hydrogens in it – part of the
>>> trick is to re-add the hydrogens.
>>>
>>> Example autodock vina output is attached (it's a conformer of the ACE
>>> native ligand DUDE).
>>>
>>> First of all, I convert that to a PDB file by doing a simple sed,
>>> sed -e '/ROOT/d' -e '/BRANCH/d'
>>> Then I reorder the atoms to match those of the original
>>> crystal_ligand.mol2 (because autodock re-orders the atoms duh).
>>>
>>> Finally, I save a mol2 file out (attached) ordered as the original
>>> crystal_ligand and with polar hydrogens (for each pose of a conformer).
>>>
>>> Let's go to rdkit and try to add hydrogens:
>>>
>>> mol = Chem.MolFromMol2File(output, removeHs=False)
>>> mol2 = AllChem.AddHs(mol, addCoords=True)
>>> print mol.GetNumAtoms(), mol2.GetNumAtoms()
>>> 44 44
>>>
>>> So, only the implicit hydrogens are present. Calling AddHs doesn't raise
>>> an error and it doesn't really change the number of hydrogens...
>>>
>>> Now this may not be the best way of doing things: what I care for is to
>>> get a mol2 from autodock vina that I can compare to the original mol2 from
>>> DUD (same atom order, same number of atoms). Maybe there are other ways to
>>> achieve this: one idea would be to inject the docked pose coordinates into
>>> the original mol2 atoms (heavy and polar hydrogens) and somehow "adjust"
>>> the non-polar hydrogens.
>>>
>>> Thanks,
>>>
>>> - Jan
>>>
>>>
>>>
>>>
>>>
>>
>
>

Re: [Rdkit-discuss] RDKit cartridge - opposite of mol_from_ctab() would be nice.

2014-02-24 Thread TJ O'Donnell
Hi All

I would like to announce the availability of a somewhat different rdkit-based
postgresql extension.  This uses rdkit for all the basic cheminformatics
functions (canonical smiles, molfile handling, smarts matching, fingerprints,
etc.) but is based on the use of postgres' plpython language.
This does not use the existing rdkit postgres cartridge, although I have
demonstrated that the two can be used side-by-side (via the use of
rdkit pickled mol objects).

I hope this use of python might make it easier to extend postgres even
further with additional functions based on rdkit.  The code can be checked
out from sourceforge using this:

svn checkout svn://svn.code.sf.net/p/sci3d/code/trunk/openchord/src/rdkitchord

This is a work in progress, so I would appreciate any feedback.  There are
still some wrinkles that need to be ironed out.  I plan to document
the installation and usage better, probably using github.

TJ O'Donnell



On Sat, Feb 22, 2014 at 10:53 PM, Greg Landrum wrote:

>
> On Fri, Feb 21, 2014 at 5:45 PM, Jan Holst Jensen 
> wrote:
>
>>  Hi Greg,
>>
>> It would be great to gain the experience. I am working on a registration
>> project where we will likely need to surface additional functions in the
>> cartridge, just to try them out. So, knowing how to do that in a way where
>> things that turn out useful can be contributed back cleanly would be great.
>>
>>
> Sounds good.
>
>
>>
>> > if structures don't have conformers
>>
>> Ah, yes; good question. Decisions, decisions... I'll dodge the question
>> :-) and say it sounds like a perfect fit for an optional parameter, e.g.
>>
>> mol_to_ctab(m mol, add_depiction_if_missing bool default true)
>>
>> I would go for default true because I believe that is the general
>> preference.
>>
>>
> Having the optional argument that defaults to true make sense to me.
>
> Here's an attempt to briefly summarize what needs to be changed in order
> to add the new functionality:
>
> - Add mol_to_ctab to rdkit_io.c
> - Add molToCtabText (or some such thing) to adapter.cpp and rdkit.h
> - Add mol_to_ctab() definitions to rdkit.sql91.in and, if you want to
> support older versions of postgres, rdkit.sql.in
> - Update link dependencies in Makefile if necessary (will be necessary if
> you add depictions)
> - Add tests to one of the files in sql/ (the most logical place is
> probably rdkit-91.sql and rdkit-pre91.sql if you are supporting older
> versions) and the corresponding output file in expected/
>
>
> I think that's it.
>
> -greg
>
>
>
>>  Cheers
>> -- Jan
>>
>>
>> On 2014-02-21 16:47, Greg Landrum wrote:
>>
>> Hi Jan,
>>
>>  Great idea. I'd be happy to add it, but I can also "talk" you through
>> it if you want to gain the experience.
>>
>>  One important question: if structures don't have conformers (if they
>> are loaded from SMILES, for example), should ctabs with all zero
>> coordinates be generated or should depictions be generated?
>>
>>  -greg
>>
>>
>>
>> On Fri, Feb 21, 2014 at 2:23 PM, Jan Holst Jensen 
>> wrote:
>>
>>>  Hi Greg,
>>>
>>> Are there any plans for a mol_to_ctab() function in the PG cartridge?
>>> Would make SD file export from the database a bit easier.
>>>
>>> If there are no immediate plans, I can take a stab at adding it myself.
>>>
>>> * Looks like rdkit_io.c is the place to add it ?
>>> * Should I manually define the new SQL function in rdkit.sql.in, or is
>>> there some higher-level place I should add it instead ?
>>>
>>> Cheers
>>> -- Jan
>>>
>>>
>>>
>>>
>>
>>
>
>

Re: [Rdkit-discuss] PDB reader and bond perception

2014-01-13 Thread TJ O'Donnell
Hi JP

I use this file from PDB Europe:
ftp://ftp.ebi.ac.uk/pub/databases/msd/pdbechem/files/pdb.tar.gz
Other useful links can be followed from
http://www.ebi.ac.uk/pdbe-srv/pdbechem/

The pdb.tar.gz file has the standard residues and LOTS of others
with specific CONECT records.

TJ
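
A note for readers following up on this: CONECT records list bonded atom serial numbers in fixed-width columns, so the connectivity in such a dictionary file can be pulled out with a few lines of plain Python. The sketch below assumes the standard PDB column layout (five-character serial-number fields starting at column 7). The function name read_conect and the two example records are made up for illustration. Standard CONECT records carry connectivity rather than explicit bond orders, so this alone does not recover double bonds.

```python
def read_conect(pdb_text):
    """Collect bonded serial-number pairs from the CONECT records in pdb_text."""
    bonds = set()
    for line in pdb_text.splitlines():
        if not line.startswith("CONECT"):
            continue
        # Columns 7-11 hold the central atom, columns 12-31 up to four partners.
        fields = [line[i:i + 5].strip() for i in range(6, 31, 5)]
        serials = [int(f) for f in fields if f]
        if len(serials) < 2:
            continue
        first = serials[0]
        for other in serials[1:]:
            # Store each bond once, lower serial number first.
            bonds.add((min(first, other), max(first, other)))
    return sorted(bonds)


# Hypothetical two-record example, built with explicit field widths.
example = "CONECT%5d%5d\nCONECT%5d%5d%5d%5d\n" % (1, 2, 2, 1, 3, 14)
print(read_conect(example))  # [(1, 2), (2, 3), (2, 14)]
```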



On Mon, Jan 13, 2014 at 9:54 AM, JP  wrote:

> RDKitters!
>
> Finally back on the mailing list!
>
> I am sure we've been through this at the UGM (my mind must have wandered
> off!), but a quick question about the PDB reader and bond perception.  Is
> this supported with the current PDB reader?  I remember that someone
> (PaulE, perhaps?) was saying bond perception was painful, but there was
> some dictionary for PDB ligands which helps (any idea the name of this
> dictionary?).
>
> To the technical details.
>
> I am reading in the following PDB file with a simple MolFromPDBFile() call:
>
> HETATM    1  O1P 84T A1862     -27.016   9.387 -72.564  1.00 20.81           O
> HETATM    2  P   84T A1862     -27.282   9.818 -73.968  1.00 19.65           P
> HETATM    3  O2P 84T A1862     -27.881  11.176 -74.182  1.00 21.49           O
> HETATM    4  N   84T A1862     -25.869   9.583 -74.813  1.00 19.78           N
> HETATM    5  C   84T A1862     -25.759  10.010 -76.075  1.00 19.97           C
> HETATM    6  CA  84T A1862     -24.493   9.748 -76.807  1.00 19.75           C
> HETATM    7  CB  84T A1862     -24.794   8.678 -77.847  1.00 19.73           C
> HETATM    8  CG  84T A1862     -23.571   8.324 -78.681  1.00 19.70           C
> HETATM    9  CD2 84T A1862     -23.309   9.519 -79.611  1.00 18.49           C
> HETATM   10  CD1 84T A1862     -23.863   6.932 -79.305  1.00 18.60           C
> HETATM   11  OHB 84T A1862     -25.210   7.467 -77.223  1.00 19.17           O
> HETATM   12  OH  84T A1862     -23.549   9.127 -75.984  1.00 20.33           O
> HETATM   13  O   84T A1862     -26.672  10.517 -76.692  1.00 20.26           O
> HETATM   14  O5' 84T A1862     -28.377   8.861 -74.619  1.00 19.39           O
> HETATM   15  C5' 84T A1862     -28.002   7.536 -74.954  1.00 18.47           C
> HETATM   16  C4' 84T A1862     -28.909   7.000 -76.012  1.00 18.24           C
> HETATM   17  C3' 84T A1862     -28.901   7.826 -77.298  1.00 18.28           C
> HETATM   18  C2' 84T A1862     -30.318   7.610 -77.768  1.00 18.69           C
> HETATM   19  O2' 84T A1862     -30.789   8.641 -78.581  1.00 19.64           O
> HETATM   20  O4' 84T A1862     -30.262   6.951 -75.529  1.00 18.80           O
> HETATM   21  C1' 84T A1862     -31.152   7.470 -76.521  1.00 19.01           C
> HETATM   22  N9  84T A1862     -31.753   8.732 -76.009  1.00 20.08           N
> HETATM   23  C4  84T A1862     -33.033   9.013 -76.158  1.00 21.10           C
> HETATM   24  N3  84T A1862     -34.018   8.339 -76.786  1.00 21.58           N
> HETATM   25  C2  84T A1862     -35.263   8.846 -76.830  1.00 21.95           C
> HETATM   26  C8  84T A1862     -31.223   9.701 -75.291  1.00 20.27           C
> HETATM   27  N7  84T A1862     -32.173  10.618 -75.019  1.00 21.28           N
> HETATM   28  C5  84T A1862     -33.315  10.213 -75.563  1.00 21.81           C
> HETATM   29  C6  84T A1862     -34.624  10.702 -75.627  1.00 22.85           C
> HETATM   30  N1  84T A1862     -35.550  10.010 -76.285  1.00 22.44           N
> HETATM   31  N6  84T A1862     -35.008  11.862 -75.052  1.00 23.86           N
> TER
> END
>
> But I am losing all the double bond (and aromatic) information:
>
> m = Chem.MolFromPDBFile(sys.argv[1])
> print Chem.MolToSmiles(m)
>
> Gives me:
>
> CC(C)C(O)C(O)C(O)NP(O)(O)OCC1CC(O)C(N2CNC3C2NCNC3N)O1
>
> As usual, many thanks for your time,
>
> -
> Jean-Paul Ebejer
> Early Stage Researcher
>
>
>
>


Re: [Rdkit-discuss] incorrect stereochemistry

2012-11-02 Thread TJ O'Donnell
Hi Greg

Your latest fix works great!!  I tested it on the troublesome 512
already canonicalized isomeric smiles
and every one of them had cansmiles(input_smiles) ==
cansmiles(cansmiles(input_smiles))
Thanks so much for all your hard work on rdkit and persistence in
getting things working 100%

TJ
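
The check described above, cansmiles(x) == cansmiles(cansmiles(x)), can be wrapped in a small helper that takes the canonicalizer as an argument. This is only a sketch: find_unstable and toy_flip are hypothetical names, and toy_flip is a stand-in that deliberately swaps @ and @@ to mimic the oscillation discussed in this thread. It is not RDKit code.

```python
def find_unstable(smiles_list, canon):
    """Return the inputs for which canon(x) != canon(canon(x))."""
    unstable = []
    for smi in smiles_list:
        once = canon(smi)
        twice = canon(once)
        if once != twice:
            unstable.append(smi)
    return unstable


def toy_flip(smi):
    # Stand-in canonicalizer (NOT RDKit): swaps @ and @@ on every call,
    # so any SMILES with a chiral tag oscillates between two forms.
    return smi.replace("@@", "\0").replace("@", "@@").replace("\0", "@")


print(find_unstable(["CCO", "C[C@H](N)O"], toy_flip))  # ['C[C@H](N)O']
```

With RDKit installed, Chem.CanonSmiles (used earlier in this thread) could be passed as canon in place of toy_flip.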

On Thu, Nov 1, 2012 at 11:56 PM, Greg Landrum  wrote:
> Hi TJ,
>
> I *believe* that I have fixed this. All the current RDKit tests,
> including a new one that includes the samples you sent, now pass.
> Before I celebrate too much (this bug,
> https://sourceforge.net/p/rdkit/bugs/40/, has been open since Feb
> 2008), I'm going to run through a set of torture tests, but things
> look good.
>
> If you are willing to give the svn version of the RDKit a try on your
> test molecules and let me know if you encounter further problems, I'd
> be happy to hear about them.
>
> Thanks again for the bug report and the kick to get thinking about
> this problem again.
>
> -greg
>
>
> On Fri, Oct 26, 2012 at 7:44 PM, TJ O'Donnell  wrote:
>> Hi Greg
>>
>> On Thu, Oct 25, 2012 at 10:22 PM, Greg Landrum  
>> wrote:
>>> Dear TJ,
>>>
>>> On Fri, Oct 26, 2012 at 12:10 AM, TJ O'Donnell  wrote:
>>>>
>>>> In a recent list of about 100,000 smiles, I ran into 512 that caused
>>>> some problems.
>>>> Basically, the stereochemistry of the canonicalized
>>>> (isomericSmiles=True) smiles gets reversed.  I saw some discussion of
>>>> this topic a while back, but it seems it had not been resolved.
>>>> [15:07:50] Warning: ring stereochemistry detected. The output SMILES
>>>> is not canonical.
>>>>  Any help or input on this?
>>>
>>> From looking at your output, I believe this is a known
>>> canonicalization problem (thus the warning above), not one of
>>> correctness.
>>>
>>> Here's a demonstration using your first example:
>>>
>>> In [5]: 
>>> Chem.CanonSmiles('N#Cc1ccc2[nH]cc([C@H]3CC[C@@H](N4CCN(c56nccnc65)CC4)CC3)c2c1')
>>> [06:45:48] Warning: ring stereochemistry detected. The output SMILES
>>> is not canonical.
>>> [06:45:48] Warning: ring stereochemistry detected. The output SMILES
>>> is not canonical.
>>> Out[5]: 'N#Cc1ccc2[nH]cc([C@@H]3CC[C@H](N4CCN(c56nccnc65)CC4)CC3)c2c1'
>>>
>>> In [6]: Chem.CanonSmiles(_2)
>>> [06:45:52] Warning: ring stereochemistry detected. The output SMILES
>>> is not canonical.
>>> [06:45:52] Warning: ring stereochemistry detected. The output SMILES
>>> is not canonical.
>>> Out[6]: 'N#Cc1ccc2[nH]cc([C@H]3CC[C@@H](N4CCN(c56nccnc65)CC4)CC3)c2c1'
>>>
>>> This shows the known problem with "oscillating" specification of
>>> stereochemistry. I believe, however, that the results are correct. In
>>> these molecules what matters is the relative stereochemistry of the
>>> carbons at the 1 and 4 positions, not their absolute stereochemistry.
>>> If that's incorrect, I would love to hear about it.
>>
>> Indeed, these smiles all oscillate between two values.  The canonical
>> ordering is always(?) the same, so that part is not incorrect.  I was hoping
>> that the stereochemistry of an input smiles would somehow be preserved
>> so that it could be reproduced on output.  I believe that the relative
>> stereochemistry of all centers is also preserved and that the oscillation
>> is between complete enantiomers.  What matters (to me) is that I
>> can detect when two structures are identical.  Canonical smiles is
>> usually good for that, but in the case of oscillating smiles, not so much.
>> Is there an amol == bmol python capability?  Should I expect to
>> be able to recognize that two oscillating smiles are the same?
>> ARE they the same?
>> Maybe it is too much to expect that
>> a 2D representation such as smiles (maybe 2.5D with C@H) can be
>> completely understood as a 3D structure.
>>
>>>
>>> The last time I looked at stereochemistry canonicalization, I was
>>> unable to devise a scheme that handled these systems correctly while
>>> still reliably canonicalizing things. It's worth revisiting this at
>>> some point, but this is probably one of those "requires a long block
>>> of uninterrupted concentration" things that are difficult for me to
>>> schedule.
>>>
>> If it comes down to a choice, I think it is more important to preserve
>> the canonical orderin

[Rdkit-discuss] incorrect stereochemistry

2012-10-25 Thread TJ O'Donnell
Hi All

In a recent list of about 100,000 smiles, I ran into 512 that caused
some problems.
Basically, the stereochemistry of the canonicalized (isomericSmiles=True) smiles
gets reversed.  I saw some discussion of this topic a while back, but it seems
it had not been resolved.
[15:07:50] Warning: ring stereochemistry detected. The output SMILES
is not canonical.
 Any help or input on this?
Some offending smiles are below along with the code
I used to test this.  I can provide a file of 512 if you'd like.
I'm using 2012.09.1, freshly compiled from svn
and passing all tests

TJ O'Donnell

---
from rdkit import Chem
import sys

# Read one SMILES per line from stdin and print input/output pairs.
for line in sys.stdin:
  smi = line.split(None,1)[0]
  mol = Chem.MolFromSmiles(smi)
  if mol:
    print(smi)
    print(Chem.MolToSmiles(mol, isomericSmiles=True))
  else:
    print("can't parse smiles")
-- my truncated input --
CC1(c2cc(C(F)(F)F)cc(C(F)(F)F)c2)CCN([C@@]2(c3c3)CC[C@H](N3CCN(c4c4Cl)C(=O)C3)CC2)C1=O
Fc1ccc2[nH]cc([C@H]3CC[C@H](N4CCN(c56c5OCCO6)CC4)CC3)c2c1
Fc1ccc2[nH]cc([C@H]3CC[C@H](N4CCN(c56c5OCCO6)CC4)CC3)c2c1
Fc1ccc2[nH]cc([C@H]3CC[C@@H](N4CCN(c56c5OCCO6)CC4)CC3)c2c1
Fc1ccc2[nH]cc([C@H]3CC[C@@H](N4CCN(c56c5OCCO6)CC4)CC3)c2c1
c1ccc(CCN[C@H]2CC[C@H](Nc34cnccc43)CC2)cc1
c1ccc(CCN[C@H]2CC[C@H](Nc34cnccc43)CC2)cc1
c1ccc(CCN[C@@H]2CC[C@H](Nc34cnccc43)CC2)cc1
c1ccc(CCN[C@@H]2CC[C@H](Nc34cnccc43)CC2)cc1
CCCn1c2[nH]c(C3CCC(NC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
CCCn1c2[nH]c(C3CCC(NC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
CCCn1c2[nH]c([C@@H]3CC[C@H](NC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
CCCn1c2[nH]c([C@@H]3CC[C@H](NC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
CCCn1c2[nH]c([C@H]3CC[C@H](NC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
CCCn1c2[nH]c([C@H]3CC[C@H](NC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
O=C(O)[C@H]1CC[C@H](Oc2(Sc3ccc(/C=C/C(=O)N4CCOCC4)c(C(F)(F)F)c3C(F)(F)F)c2)CC1
O=C(O)[C@H]1CC[C@H](Oc2(Sc3ccc(/C=C/C(=O)N4CCOCC4)c(C(F)(F)F)c3C(F)(F)F)c2)CC1
O=C(O)[C@@H]1CC[C@H](Oc2(Sc3ccc(/C=C/C(=O)N4CCOCC4)c(C(F)(F)F)c3C(F)(F)F)c2)CC1
O=C(O)[C@@H]1CC[C@H](Oc2(Sc3ccc(/C=C/C(=O)N4CCOCC4)c(C(F)(F)F)c3C(F)(F)F)c2)CC1
N#Cc1ccc2[nH]cc([C@@H]3CC[C@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@@H]3CC[C@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@@H]3CC[C@@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@@H]3CC[C@@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
O=C(NC1CCC(CCN2CCN(c3(Cl)c3Cl)CC2)CC1)c1cccs1
O=C(N[C@@H]1CC[C@@H](CCN2CCN(c3(Cl)c3Cl)CC2)CC1)c1cccs1
Cn1ccc2ccc3c4[nH]c5c(5CCN[C@H]5CC[C@H](O)CC5)c4c4c(c3c21)C(=O)NC4=O
N=C(N)Nc1ccc(CNC(=O)N2CCN(C(=O)O[C@@H]3CCC[C@H](OC(=O)N4CCN(C(=O)n5ccnc5)CC4)CCC3)CC2)cc1
CC(C)c1cc(C(C)C)c(S(=O)(=O)NC[C@H]2CC[C@H](C(=O)NNC(=O)c3cc4c4s3)CC2)c(C(C)C)c1
O=C(CCC[C@@H]1OO[C@H]((=O)c2c2)OO1)c1c1
-- my truncated output ; input smiles/output smiles pairs of lines --
N#Cc1ccc2[nH]cc([C@H]3CC[C@@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@@H]3CC[C@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@H]3CC[C@@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@@H]3CC[C@@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@H]3CC[C@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@@H]3CC[C@@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@H]3CC[C@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
O=C(NC1CCC(CCN2CCN(c3(Cl)c3Cl)CC2)CC1)c1cccs1
O=C(NC1CCC(CCN2CCN(c3(Cl)c3Cl)CC2)CC1)c1cccs1
O=C(N[C@@H]1CC[C@@H](CCN2CCN(c3(Cl)c3Cl)CC2)CC1)c1cccs1
O=C(N[C@H]1CC[C@H](CCN2CCN(c3(Cl)c3Cl)CC2)CC1)c1cccs1
Cn1ccc2ccc3c4[nH]c5c(5CCN[C@H]5CC[C@H](O)CC5)c4c4c(c3c21)C(=O)NC4=O
Cn1ccc2ccc3c4[nH]c5c(5CCN[C@@H]5CC[C@@H](O)CC5)c4c4c(c3c21)C(=O)NC4=O
N=C(N)Nc1ccc(CNC(=O)N2CCN(C(=O)O[C@@H]3CCC[C@H](OC(=O)N4CCN(C(=O)n5ccnc5)CC4)CCC3)CC2)cc1
N=C(N)Nc1ccc(CNC(=O)N2CCN(C(=O)O[C@H]3CCC[C@@H](OC(=O)N4CCN(C(=O)n5ccnc5)CC4)CCC3)CC2)cc1
CC(C)c1cc(C(C)C)c(S(=O)(=O)NC[C@H]2CC[C@H](C(=O)NNC(=O)c3cc4c4s3)CC2)c(C(C)C)c1
CC(C)c1cc(C(C)C)c(S(=O)(=O)NC[C@@H]2CC[C@@H](C(=O)NNC(=O)c3cc4c4s3)CC2)c(C(C)C)c1
O=C(CCC[C@@H]1OO[C@H]((=O)c2c2)OO1)c1c1
O=C(CCC[C@@H]1OO[C@H]((=O)c2c2)OO1)c1c1
O=C(CCC[C@@H]1OO[C@@H]((=O)c2c2)OO1)c1c1
O=C(CCC[C@H]1OO[C@H]((=O)c2c2)OO1)c1c1
CCCn1c2[nH]c([C@@H]3CC[C@@H](CNC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
CCCn1c2[nH]c([C@H]3CC[C@H](CNC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
CCCn1c2[nH]c(C3CCC(CNC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
CCCn1c2[nH]c(C3CCC(CNC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
c1cc2c(2N2CCN([C@H]3CC[C@@H](c4c[nH]c5c54)CC3)CC2)[nH]1
c1cc2c(2N2CCN([C@@H]3CC[C@H](c4c[nH]c5c54)CC3)CC2)[nH]1
c1cc2c(2N2CCN([C@H]3CC[C@@H](c4c[nH]c5c54)CC3)CC2)[nH]1
c1cc2c(2N2CCN([C@@H]3CC[C@H](c4c[nH]c5c54)CC3)CC2)[nH]1
c1cc2c(2N2CCN([C@@H]3CC[C@@H](c4c[nH]c5c54)CC3)CC2)[nH]1
c1cc2c(2N2CCN([C@H]3CC[C@H](c4c[nH]c5c54)CC3)CC2)[nH]1
c1cc2c(2N2CCN([C@@H]3CC[C@@H](c4c[nH]c5c54)CC3)CC2)[nH]1
c1cc2c(2N2CCN([C@H]3CC[C@H](c4c[nH]c5c54)CC3)CC2)[nH

[Rdkit-discuss] maximum common substructure

2011-08-14 Thread TJ O'Donnell
Is there a module in rdkit to find the maximum common substructure for
a set of input molecules?

Thanks,
TJ



Re: [Rdkit-discuss] postgresql

2011-06-01 Thread TJ O'Donnell
There are binaries available at http://www.postgresql.org/download/
and a nice wiki at http://wiki.postgresql.org/wiki/Main_Page
The postgres community is great - check out the mailing lists
at http://www.postgresql.org/community/

TJ
-
TJ O'Donnell, Ph.D.
President, gNova Inc.
t...@gnova.com


On Wed, Jun 1, 2011 at 6:33 AM, Peter Schmidtke  wrote:
> Hey Paul,
>
> hope you are fine ;)
>
> What system/architecture are you using?
>
> ++
>
> Peter
>
>
> On 01/06/2011, at 15:31, paul.czodrow...@merck.de wrote:
>
>>
>> dear rdkitters,
>>
>> i would like to install postgresql/sqlite. could anyone point to a good
>> tutorial on how to set-up such a system? i know how to use google, but
>> maybe you guys are faster... :)
>>
>>
>> paul
>>
>
> Peter Schmidtke
>
> -
> PhD Student
> Department of Physical Chemistry
> School of Pharmacy
> University of Barcelona
> Barcelona, Spain
>
>
>
>



Re: [Rdkit-discuss] Contractors working with the RDKit?

2011-03-20 Thread TJ O'Donnell
I am willing and able to do consulting and contract programming using RDKit,
using either Python or C++

http://gnova.com

TJ

TJ O'Donnell, Ph.D.
President, gNova, Inc.
t...@gnova.com

On Sat, Mar 19, 2011 at 10:46 PM, Greg Landrum  wrote:
> Dear all,
>
> I was recently asked if there was anyone out there who was able to do
> contract development work with or on the RDKit. It's a good question,
> but I didn't have a good answer handy. So I'm asking here.
>
> If you currently do, or are willing to do, contract development work
> either extending the RDKit or developing new tools based on the RDKit,
> please reply to this thread. It would be helpful if you indicate your
> comfort level on both the C++ or Python sides. If there's sufficient
> interest/response, I'd be happy to include a section either on
> rdkit.org or on the wiki with names/links.
>
> Thanks,
> -greg
>
>



Re: [Rdkit-discuss] can't kekulize smiles generated by Chem.MolToSmiles

2011-01-10 Thread TJ O'Donnell
Hi Greg,

As usual, thanks for your quick response.  Yes, these were big molecules.
Let me know if you'd like me to try out any changes.  I can recompile
changes from subversion easily now.  I discovered these four examples
using 1/10 of the chembl database and can try any new code changes
on the entire set of 600K molecules.

TJ


On Sun, Jan 9, 2011 at 10:51 PM, Greg Landrum  wrote:
> Hi TJ,
>
> On Mon, Jan 10, 2011 at 2:37 AM, TJ O'Donnell  wrote:
>> Thanks Greg.  I compiled in the changes and that molfile works fine
>> now, but.
>> Here are four new examples of molfiles that convert to mol and smiles just
>> fine, but the resulting smiles won't parse properly back to a mol.
>>
>> Can you take a look?
>
> Thanks for finding another good bug. The problem here is caused, as
> you probably guessed, by the size of the molecules (specifically by
> the fact that more than 50 rings were open at one point during the
> generation of the SMILES). I will get it fixed for the release.
>
> Best Regards,
> -greg
>
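
To make "rings open at one point" concrete: in a SMILES string a ring-closure label is open between its first and second appearance, and labels above 9 are written as %nn. The sketch below counts the largest number of labels open at the same time. It is a simplified scanner written for illustration; the name max_open_rings is hypothetical, it only skips bracket atoms, and it is not RDKit's parser.

```python
def max_open_rings(smiles):
    """Return the peak number of simultaneously open ring-closure labels."""
    open_labels = set()
    peak = 0
    in_bracket = False
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif not in_bracket and (ch.isdigit() or ch == "%"):
            if ch == "%":
                # Two-digit label written as %nn.
                label = smiles[i + 1:i + 3]
                i += 2
            else:
                label = ch
            if label in open_labels:
                open_labels.remove(label)   # second appearance closes the ring
            else:
                open_labels.add(label)      # first appearance opens it
                peak = max(peak, len(open_labels))
        i += 1
    return peak


print(max_open_rings("c1ccccc1"))      # 1
print(max_open_rings("C1CC2CCC1CC2"))  # 2
```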



[Rdkit-discuss] can't kekulize smiles generated by Chem.MolToSmiles

2011-01-06 Thread TJ O'Donnell
I've stumbled onto a molfile which is read properly (MolFromMolBlock) and
produces a proper smiles (MolToSmiles).  But the smiles generated fails
on Chem.MolFromSmiles.  Can you help figure this one out?
I've attached the molfile in question.

Here is a simple script I used to show this issue.

from rdkit import rdBase
from rdkit import Chem
import sys
print rdBase.boostVersion
print rdBase.rdkitVersion
mb = sys.stdin.read()
mol = Chem.MolFromMolBlock(mb)
if mol:
  smi = Chem.MolToSmiles(mol, isomericSmiles=True)
  print smi
  newmol = Chem.MolFromSmiles(smi)

and the result I get
python rdmol.py <254080.mol
1_40
2010.12.1
[Cl-].CC(C)(C)c1[Te+]c(C(C)(C)C)cc(/C=C/C=C2C=C(C(C)(C)C)OC(C(C)(C)C)=C2)c1
[18:13:52] Can't kekulize mol

I just rebuilt from subversion source - not sure why this version
shows as 2010.12.1
URL: https://rdkit.svn.sourceforge.net/svnroot/rdkit/trunk
Repository Root: https://rdkit.svn.sourceforge.net/svnroot/rdkit
Repository UUID: 19320e9b-7711-0410-929e-f4fff3a11e9f
Revision: 1611
Node Kind: directory
Schedule: normal
Last Changed Author: glandrum
Last Changed Rev: 1611
Last Changed Date: 2011-01-05 00:45:35 -0800 (Wed, 05 Jan 2011)


Thanks,
TJ O'Donnell


254080.mol
Description: Binary data


[Rdkit-discuss] reading tag data from string, not file

2010-12-29 Thread TJ O'Donnell
I can see how to read an sd file using SDMolSupplier and using mol.GetProp()
to get the tag data from the file.
But I have each molblock (the chunk of lines between $$$$ delimiters in an
sdf file) in a separate string.  I don't see a way to get properties from that
molblock string or, even better, from the
mol=Chem.MolFromMolBlock(molblock).
E.g. mol.GetPropNames() returns a null array (or just the private and
computed props if mol.GetPropNames(True,True)).
Can you give me some hints on how I might get the property tag data
from a string molblock?

TJ O'Donnell
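
One way to approach the question without any toolkit support is to parse the data items directly: in an SD record they follow the "M  END" line, each introduced by a header such as "> <TAG>" and ended by a blank line. The sketch below does this in plain Python. The name parse_sd_tags and the sample record are made up for illustration, and the parser is deliberately simple. RDKit's SDMolSupplier may also accept text instead of a file name (a SetData method, if your version has it), which would be the more robust route.

```python
def parse_sd_tags(record):
    """Return {tag: value} for the data items in one SD-file record string."""
    lines = record.splitlines()
    # Data items start after the "M  END" line that closes the molblock.
    start = 0
    for idx, ln in enumerate(lines):
        if ln.startswith("M  END"):
            start = idx + 1
            break
    tags = {}
    name, value = None, []
    for ln in lines[start:]:
        if ln.startswith("$$$$"):
            break
        if ln.startswith(">"):
            # Header line: the tag name sits between "<" and the last ">".
            if name is not None:
                tags[name] = "\n".join(value)
            lo, hi = ln.find("<"), ln.rfind(">")
            name = ln[lo + 1:hi] if 0 <= lo < hi else None
            value = []
        elif ln.strip() == "":
            # A blank line ends the current data item.
            if name is not None:
                tags[name] = "\n".join(value)
            name, value = None, []
        elif name is not None:
            value.append(ln)
    if name is not None:
        tags[name] = "\n".join(value)
    return tags


# Hypothetical record with an empty molblock and two data items.
sample = ("name\n  prog\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n"
          "> <ID>\n42\n\n> <NAME>\nfoo\nbar\n\n$$$$\n")
print(parse_sd_tags(sample))  # {'ID': '42', 'NAME': 'foo\nbar'}
```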



Re: [Rdkit-discuss] Question: modifying default parameters for the RDKit fingerprint?

2010-12-29 Thread TJ O'Donnell
Hi Greg-

No objection here.  I've been using 1024 with 2 bits here.
Are you still using 2048 for the default size?

TJ O'Donnell


On Tue, Dec 28, 2010 at 11:33 PM, Greg Landrum  wrote:
> Dear all,
>
> As I mentioned in an earlier message
> (http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg01430.html),
> the default parameters for the RDKit fingerprint end up setting far
> too many bits for drug-like molecules. The result of this is
> similarity values that are in general too high and more frequent
> occurrences of molecules that are similar to each other only due to
> bit collisions.
>
> The easy solution to this problem is to decrease the number of bits
> set per path found (the nBitsPerHash parameter) from 4 to 2. I propose
> doing this for the Q4 2010 release of the RDKit. The downside is that
> the fingerprints generated with that release will not be compatible
> with fingerprints from earlier releases unless you specify
> nBitsPerHash=4 on your own. The upside is a much more useful
> similarity fingerprint.
>
> Any objections to me making this change?
>
> -greg
>
>
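
The effect of nBitsPerHash on bit density can be estimated with a back-of-the-envelope model. Assume, as a simplification and not as a description of RDKit's actual hashing, that every bit set lands uniformly at random in a fingerprint of fp_size bits. Then the expected fraction of bits set after n_paths paths is 1 - (1 - 1/fp_size)^(n_paths * bits_per_hash). The function name and the path count of 500 are made-up illustration values.

```python
def expected_density(n_paths, bits_per_hash, fp_size):
    """Expected fraction of bits set under a uniform-random hashing model."""
    n_sets = n_paths * bits_per_hash
    return 1.0 - (1.0 - 1.0 / fp_size) ** n_sets


# Compare 4 and 2 bits per path for a hypothetical molecule with 500 paths.
for bits in (4, 2):
    print(bits, round(expected_density(500, bits, 2048), 3))
```

Under this model, halving the bits per path lowers the density substantially, which is consistent with the motivation given in the quoted proposal.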



[Rdkit-discuss] topological fingerprints

2010-12-27 Thread TJ O'Donnell
I was surprised that, using topological fingerprints, the tanimoto
similarity between benzene and toluene is 0.32
Examining the fp bits, I can see why.  But I don't understand why so
many paths are repeated for toluene.
To my way of thinking, paths that trace the same types of atoms should
not be considered different, and therefore
set new bits. Am I missing something?

Here is my sample code:
from rdkit import Chem
from rdkit.Chem import RDKFingerprint
from rdkit import DataStructs
import sys
smiles = ['c1ccccc1', 'Cc1ccccc1']  # benzene, toluene
fps = list()
for smi in smiles:
  mol = Chem.MolFromSmiles(smi)
  fps.append(RDKFingerprint(mol))
  #fps.append(RDKFingerprint(mol, 1, 7, 1024, 3, True, 0.0, 1024))

for fp in fps:
  #print(fp.ToBitString())
  # collect the (1-based) positions of the bits that are set
  i = 0
  bitlist = list()
  for bit in fp:
    i += 1
    if bit: bitlist.append(i)
  print(bitlist)

print(DataStructs.FingerprintSimilarity(fps[0], fps[1]))

and the output I get is:
[12, 18, 57, 72, 180, 199, 558, 590, 712, 858, 990, 999, 1221, 1277,
1446, 1582, 1639, 1787, 1829, 1879, 1914, 1952, 1986, 2021]
[12, 18, 57, 72, 123, 180, 199, 215, 242, 255, 301, 324, 361, 447,
518, 526, 558, 570, 590, 595, 610, 693, 703, 712, 745, 778, 857, 858,
891, 896, 927, 933, 961, 964, 968, 990, 999, 1012, 1022, 1047, 1065,
1090, 1100, 1108, 1134, 1172, 1188, 1221, 1228, 1243, 1268, 1277,
1287, 1297, 1306, 1345, 1446, 1503, 1514, 1538, 1582, 1593, 1622,
1626, 1639, 1665, 1691, 1787, 1829, 1873, 1879, 1914, 1952, 1986,
2021]
0.32
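
The reported value can be reproduced from the two bit lists alone. Tanimoto similarity is the number of bits set in both fingerprints divided by the number set in either one. Here every benzene bit also appears in the toluene list, so the score reduces to the ratio of the list lengths. A short check in plain Python (the helper name tanimoto is only for illustration):

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two collections of on-bit positions."""
    a, b = set(bits_a), set(bits_b)
    return len(a & b) / float(len(a | b))


# Bit lists copied from the output above.
benzene = [12, 18, 57, 72, 180, 199, 558, 590, 712, 858, 990, 999, 1221, 1277,
           1446, 1582, 1639, 1787, 1829, 1879, 1914, 1952, 1986, 2021]
toluene = [12, 18, 57, 72, 123, 180, 199, 215, 242, 255, 301, 324, 361, 447,
           518, 526, 558, 570, 590, 595, 610, 693, 703, 712, 745, 778, 857, 858,
           891, 896, 927, 933, 961, 964, 968, 990, 999, 1012, 1022, 1047, 1065,
           1090, 1100, 1108, 1134, 1172, 1188, 1221, 1228, 1243, 1268, 1277,
           1287, 1297, 1306, 1345, 1446, 1503, 1514, 1538, 1582, 1593, 1622,
           1626, 1639, 1665, 1691, 1787, 1829, 1873, 1879, 1914, 1952, 1986,
           2021]

# 24 shared bits out of 75 distinct bits gives 0.32.
print(len(benzene), len(toluene), round(tanimoto(benzene, toluene), 2))
```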



Re: [Rdkit-discuss] MolFromMolBlock never returns

2010-12-27 Thread TJ O'Donnell
Hi Greg

Thanks for the quick reply.  Sure enough, the latest version of rdkit
fixes the problem I was having.
I should have tried that first!  Now that I have the build issues
worked out, a svn update
and make install is pretty quick.

TJ

On Sun, Dec 26, 2010 at 8:31 PM, Greg Landrum  wrote:
> Hi TJ,
>
> 2010/12/24 TJ O'Donnell :
>> I have a mol file that causes MolFromMolBlock to get stuck.
>> I reproduced this problem with this simple python script (below).
>> I've attached the problem input molfile.  I got the file
>> from the chembl08 download.  Another large molfile finishes
>> in seconds, but I stopped this one after about 1 minute.
>> Can you see what might be the problem?
>>
>> I'm afraid I am not using the most recent version, but
>> one I built last July.
>
> There have been some fixes related to handling of large molecules
> since July. Certainly the current state of the code from svn (and
> probably the last release, though I haven't tried this) handles your
> SD file without problems or huge delays (less than half a second on my
> machine).
>
> Best Regards,
> -greg
>



[Rdkit-discuss] MolFromMolBlock never returns

2010-12-26 Thread TJ O'Donnell

I have a mol file that causes MolFromMolBlock to get stuck.
I reproduced this problem with this simple python script (below).
I've attached the problem input molfile.  I got the file
from the chembl08 download.  Another large molfile finishes
in seconds, but I stopped this one after about 1 minute.
Can you see what might be the problem?

I'm afraid I am not using the most recent version, but
one I built last July.
From subversion:
URL: https://rdkit.svn.sourceforge.net/svnroot/rdkit/trunk
Repository Root: https://rdkit.svn.sourceforge.net/svnroot/rdkit
Repository UUID: 19320e9b-7711-0410-929e-f4fff3a11e9f
Revision: 1450
Node Kind: directory
Schedule: normal
Last Changed Author: glandrum
Last Changed Rev: 1450
Last Changed Date: 2010-07-08 21:15:09 -0700 (Thu, 08 Jul 2010)

Thanks,
TJ O'Donnell, Ph.D.
President, gNova, Inc.

from rdkit import Chem
import sys
mb = sys.stdin.read()
mol = Chem.MolFromMolBlock(mb)
if mol:
  print len(mb),mol
  print Chem.MolToSmiles(mol, isomericSmiles=False)
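
A quick way to see how large a molfile is before handing it to the parser: in a V2000 molfile the fourth line (the counts line) stores the atom and bond counts in two three-character fields. A minimal sketch in plain Python, with parse_counts_line as a made-up helper name, applied to the counts line of the molfile attached to this message:

```python
def parse_counts_line(line):
    """Return (n_atoms, n_bonds, version) from a V2000 molfile counts line."""
    n_atoms = int(line[0:3])       # columns 1-3
    n_bonds = int(line[3:6])       # columns 4-6
    version = line[33:39].strip()  # columns 34-39
    return n_atoms, n_bonds, version


# Counts line copied from the molfile in this message.
print(parse_counts_line("469504  0  0  0  0  0  0  0  0999 V2000"))
# (469, 504, 'V2000')
```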
 
  CDK3/26/10,13:38

469504  0  0  0  0  0  0  0  0999 V2000
   10.8216  -24.66950. N   0  0  0  0  0  0  0  0  0  0  0  0
   10.8142  -30.38920. C   0  0  0  0  0  0  0  0  0  0  0  0
   11.2281  -29.67280. C   0  0  0  0  0  0  0  0  0  0  0  0
   12.0536  -29.67410. C   0  0  0  0  0  0  0  0  0  0  0  0
   12.4664  -30.39150. C   0  0  0  0  0  0  0  0  0  0  0  0
   12.0474  -31.10860. C   0  0  0  0  0  0  0  0  0  0  0  0
   11.2233  -31.10370. C   0  0  0  0  0  0  0  0  0  0  0  0
   10.8074  -31.81630. C   0  0  0  0  0  0  0  0  0  0  0  0
9.9824  -31.81230. N   0  0  0  0  0  0  0  0  0  0  0  0
9.5730  -31.09550. C   0  0  0  0  0  0  0  0  0  0  0  0
8.7480  -31.09160. C   0  0  0  0  0  0  0  0  0  0  0  0
9.9889  -30.38300. O   0  0  0  0  0  0  0  0  0  0  0  0
8.3321  -31.80410. C   0  0  0  0  0  0  0  0  0  0  0  0
8.3389  -30.37520. N   0  0  0  0  0  0  0  0  0  0  0  0
7.5139  -30.37130. C   0  0  0  0  0  0  0  0  0  0  0  0
7.1048  -29.65480. C   0  0  0  0  0  0  0  0  0  0  0  0
7.0980  -31.08380. O   0  0  0  0  0  0  0  0  0  0  0  0
8.7415  -32.52090. C   0  0  0  0  0  0  0  0  0  0  0  0
9.7280  -33.41670. N   0  0  0  0  0  0  0  0  0  0  0  0
9.0115  -33.82580. C   0  0  0  0  0  0  0  0  0  0  0  0
8.4011  -33.27080. N   0  0  0  0  0  0  0  0  0  0  0  0
9.5665  -32.52490. C   0  0  0  0  0  0  0  0  0  0  0  0
   10.4016  -32.51210. C   0  0  0  0  0  0  0  0  0  0  0  0
   11.2308  -32.50580. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.2914  -30.39420. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.7019  -29.67400. N   0  0  0  0  0  0  0  0  0  0  0  0
   14.1221  -30.37920. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.7017  -28.85570. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.7238  -31.10640. C   0  0  0  0  0  0  0  0  0  0  0  0
   11.6317  -31.78480. C   0  0  0  0  0  0  0  0  0  0  0  0
   12.8988  -31.11770. C   0  0  0  0  0  0  0  0  0  0  0  0
   12.4962  -31.83780. N   0  0  0  0  0  0  0  0  0  0  0  0
   12.8916  -32.56190. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.7164  -32.58150. C   0  0  0  0  0  0  0  0  0  0  0  0
   12.4622  -33.26630. O   0  0  0  0  0  0  0  0  0  0  0  0
   14.1457  -31.87700. N   0  0  0  0  0  0  0  0  0  0  0  0
   14.1117  -33.30560. C   0  0  0  0  0  0  0  0  0  0  0  0
   14.9705  -31.89670. C   0  0  0  0  0  0  0  0  0  0  0  0
   15.3999  -31.19220. C   0  0  0  0  0  0  0  0  0  0  0  0
   15.3659  -32.62070. O   0  0  0  0  0  0  0  0  0  0  0  0
   13.6824  -34.01000. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.9991  -34.77120. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.3731  -35.30850. N   0  0  0  0  0  0  0  0  0  0  0  0
   12.6686  -34.87910. C   0  0  0  0  0  0  0  0  0  0  0  0
   12.8593  -34.07650. N   0  0  0  0  0  0  0  0  0  0  0  0
   12.9851  -28.44690. O   0  0  0  0  0  0  0  0  0  0  0  0
   14.4140  -28.43950. C   0  0  0  0  0  0  0  0  0  0  0  0
   14.4097  -27.61450. N   0  0  0  0  0  0  0  0  0  0  0  0
   15.1309  -28.84870. C   0  0  0  0  0  0  0  0  0  0  0  0
   15.8432  -28.43250. C   0  0  0  0  0  0  0  0  0  0  0  0
   15.9284  -27.61510. N   0  0  0  0  0  0  0  0  0  0  0  0
   16.7345  -27.43940. C   0  0  0  0  0  0  0  0  0  0  0  0
   17.1510  -28.15220. N   0  0  0  0  0  0  0  0  0  0  0  0
   16.6021  -28.76800. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.6931  -27.20570. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.6888  -26.38070. C   0  0  0  0  0  0  0

[Rdkit-discuss] compile issues

2010-07-08 Thread TJ O'Donnell
Hi Greg

I'm trying to build rdkit on a 64-bit redhat system.

g++ (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46)

I built boost 1.43 and the latest flex, and got this far
building rdkit:

[ 82%] Building CXX object Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/lex.yysln.cpp.o
Linking CXX shared library libSLNParse.so
/usr/bin/ld: /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64/libboost_regex.a(cpp_regex_traits.o):
relocation R_X86_64_32S against `std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Rep::_S_empty_rep_storage'
can not be used when making a shared object; recompile with -fPIC
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64/libboost_regex.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make[2]: *** [Code/GraphMol/SLNParse/libSLNParse.so] Error 1
make[1]: *** [Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/all] Error 2
make: *** [all] Error 2

Can you help?

Thanks,
TJ O'Donnell
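
For readers hitting the same error: the linker message means that
cpp_regex_traits.o inside the static libboost_regex.a was compiled
without -fPIC, so it cannot be linked into the shared libSLNParse.so.
The usual remedies are to rebuild Boost with -fPIC or to link against
shared Boost libraries; which of the two suits this system is an
assumption here, not something the log settles. The helper below is
only a log-reading sketch (the name non_pic_archives is hypothetical,
not part of RDKit or Boost) that pulls the offending archive and object
out of a hard-wrapped build log.

```python
import re

# ld names the culprit as "<archive>.a(<member>.o): relocation ..."
_PATTERN = re.compile(r"(\S+\.a)\(([^)\s]+)\):\s*relocation")


def non_pic_archives(log_text):
    """Return (archive, object) pairs that ld rejected for lacking -fPIC."""
    # Mail clients hard-wrap long lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    if "recompile with -fPIC" not in flat:
        return []
    pairs = []
    for archive, member in _PATTERN.findall(flat):
        if (archive, member) not in pairs:
            pairs.append((archive, member))
    return pairs


# Excerpt of the log above; the symbol name is abbreviated to `...'.
sample_log = """/usr/bin/ld:
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64/libboost_regex.a(cpp_regex_traits.o):

relocation R_X86_64_32S against `...' can not be used when making a shared
object; recompile with -fPIC
"""

for archive, member in non_pic_archives(sample_log):
    print(archive, member)
```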



Re: [Rdkit-discuss] Canonical SMILES

2009-02-13 Thread TJ O'Donnell

Hi George

Yes, InChI is unique across different packages.  This is because
there is one definitive source for the code and algorithm.  This was
a design goal of InChI.

TJ O'Donnell

George Oakman wrote:

Hi,
 
Thanks a lot for the speedy response.
 
Yes, this is what I was suspecting - slightly different conventions (in 
this case probably to do with which branch to deal with first) will lead 
to different results.
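
The branch-ordering point can be illustrated with a toy sketch. This is
not the Daylight, RDKit or OpenBabel algorithm; it is a made-up
canonicalizer (toy_canonical and branch_string are hypothetical names)
that only handles acyclic molecules. It writes each atom followed by
its branches in sorted order, and the single "convention" is whether
branches are sorted ascending or descending. Each convention is
self-consistent under atom renumbering, yet the two disagree with each
other for the same molecule (acetic acid here).

```python
def branch_string(atoms, bonds, atom, parent, reverse):
    """Write the subtree hanging off `atom` as a SMILES-like string."""
    parts = []
    for nbr, order in bonds[atom]:
        if nbr == parent:
            continue
        prefix = "=" if order == 2 else ""
        parts.append(prefix + branch_string(atoms, bonds, nbr, atom, reverse))
    # The "convention": which branch is written first.
    parts.sort(reverse=reverse)
    if not parts:
        return atoms[atom]
    # All branches but the last go in parentheses.
    side = "".join("(" + p + ")" for p in parts[:-1])
    return atoms[atom] + side + parts[-1]


def toy_canonical(atoms, bond_list, reverse=False):
    """Canonical string for an acyclic molecule under one convention."""
    bonds = {i: [] for i in range(len(atoms))}
    for a, b, order in bond_list:
        bonds[a].append((b, order))
        bonds[b].append((a, order))
    # Try every atom as the starting point and keep the smallest string,
    # so the result does not depend on the input atom numbering.
    return min(branch_string(atoms, bonds, root, None, reverse)
               for root in range(len(atoms)))


# Acetic acid, written with two different atom numberings.
atoms_a = ["C", "C", "O", "O"]
bonds_a = [(0, 1, 1), (1, 2, 2), (1, 3, 1)]
atoms_b = ["O", "C", "C", "O"]
bonds_b = [(1, 0, 2), (1, 2, 1), (3, 1, 1)]

print(toy_canonical(atoms_a, bonds_a), toy_canonical(atoms_b, bonds_b))
print(toy_canonical(atoms_a, bonds_a, reverse=True))
```

Within one convention the two numberings give the same string; across
conventions the strings differ, which is the scope of 'uniqueness'
being discussed.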
 
The book I was referring to is An Introduction to Chemoinformatics from 
A.R. Leach and V.J. Gillet. Yes, they refer to the CANGEN algorithm and 
to the Weininger paper you mentioned.
 
It doesn't matter, as long as I'm aware of the scope of 'uniqueness'.
 
Just out of interest, is the InChI representation more 'unique' across 
different packages than canonical SMILES?
 
Thanks again,
 
George.
 


 > From: da...@dalkescientific.com
 > Date: Fri, 13 Feb 2009 18:38:21 +0100
 > To: rdkit-discuss@lists.sourceforge.net
 > Subject: Re: [Rdkit-discuss] Canonical SMILES
 >
 > On Feb 13, 2009, at 6:20 PM, George Oakman wrote:
 > > One of the first examples I have been playing with is the canonical
 > > SMILES for Aspirin.
 > ..
 > >
 > > This gave me the following result:
 > >
 > > CC(Oc1ccccc1C(O)=O)=O
 > >
 > > But I was expecting
 > >
 > > CC(=O)Oc1ccccc1C(=O)O
 >
 > The canonical SMILES is canonical only in the context of an
 > algorithm. The Daylight algorithm is different than the RDKit one is
 > different from the OpenBabel one is different ... . In fact, the
 > Daylight algorithm has changed over time to fix various problems.
 >
 > When that happens, the molecules need to be re-canonicalized.
 >
 > Even if you go back to the original Weininger paper, there are
 > ambiguities in the description which make the result implementation-
 > specific.
 >
 > Is the book you're using "Molecular Design" by Gisbert Schneider and
 > Karl-Heinz Baringhaus? That came up when I searched for "canonical
 > SMILES" and I see it has example of aspirin with your expected SMILES.
 >
 >
 > Andrew
 > da...@dalkescientific.com



[Rdkit-discuss] python2.5 and Numeric

2008-05-22 Thread TJ O'Donnell
I'm new to RDKit and have had good success so far
building it under linux and installing it on windows.
Some of the tests fail (only tested under linux).
I can try to get that resolved later.

The issue I am concerned about is Numeric.
It is not supported under python 2.5.
It was relatively easy to find Numeric for python2.5
with Ubuntu (the Synaptic package manager found it)
but hard to find for windows.
Are there any plans to deliver Numeric for python 2.5
with RDKit, or to move away from dependency on it?
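
One way to see which array package an installation actually offers is
to probe for it before importing. The sketch below is only an
illustration and assumes a current Python 3 interpreter rather than
python 2.5; first_available is a hypothetical helper, not an RDKit
function. It returns the first importable top-level module from a
preference list, for example NumPy (the successor to Numeric) ahead of
Numeric itself.

```python
import importlib.util


def first_available(candidates):
    """Return the first top-level module name that can be imported, else None."""
    for name in candidates:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


# Prefer numpy, fall back to the older Numeric if that is all there is.
backend = first_available(["numpy", "Numeric"])
print("array backend:", backend)
```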

RDKit looks great so far and I'm anxious to try more
with it.

TJ O'Donnell