Re: [Rdkit-discuss] how to output multiple Kekule structures

James T. Metz via Rdkit-discuss Mon, 11 Sep 2017 13:54:51 -0700

Paolo,

    Exactly what I was looking for.  Very helpful.  Thank you.

    Regards,
    Jim Metz

-----Original Message-----
From: Paolo Tosco <paolo.to...@unito.it>
To: James T. Metz <jamestm...@aol.com>; greg.landrum <greg.land...@gmail.com>; rdkit-discuss <rdkit-discuss@lists.sourceforge.net>
Sent: Mon, Sep 11, 2017 2:53 pm
Subject: Re: [Rdkit-discuss] how to output multiple Kekule structures


Hi Jim,

you can indeed enumerate all Kekulé structures for a molecule within the
RDKit using Chem.ResonanceMolSupplier():

from rdkit import Chem

mol = Chem.MolFromSmiles('c1ccccc1')

suppl = Chem.ResonanceMolSupplier(mol, Chem.KEKULE_ALL)

len(suppl)

2

for i in range(len(suppl)):
    print(Chem.MolToSmiles(suppl[i], kekuleSmiles=True))

C1C=CC=CC=1
C1=CC=CC=C1

Best,
Paolo
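
The enumeration in Paolo's message can be wrapped in a small helper. What follows is only a minimal sketch, assuming an RDKit build that provides Chem.ResonanceMolSupplier and the Chem.KEKULE_ALL flag exactly as used in that message; the function name kekule_smiles is hypothetical and not part of RDKit.

```python
from rdkit import Chem


def kekule_smiles(smiles):
    """Return one Kekule SMILES per structure enumerated with KEKULE_ALL.

    Hypothetical helper (not an RDKit function); it only repackages the
    calls shown in the message above.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    suppl = Chem.ResonanceMolSupplier(mol, Chem.KEKULE_ALL)
    # Index the supplier the same way the original example does.
    return [
        Chem.MolToSmiles(suppl[i], kekuleSmiles=True)
        for i in range(len(suppl))
    ]


if __name__ == "__main__":
    for smi in kekule_smiles("c1ccccc1"):
        print(smi)
```

For benzene the list should have two entries, matching the len(suppl) value of 2 reported above.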
    
    
On 09/11/2017 05:22 PM, James T. Metz via Rdkit-discuss wrote:

Greg,

    Thanks!  Yes, very helpful.  I will need to digest the detailed information
you have provided.  I am somewhat familiar with recursive SMARTS.  Thanks
again.

    Regards,
    Jim Metz
          
-----Original Message-----
From: Greg Landrum <greg.land...@gmail.com>
To: James T. Metz <jamestm...@aol.com>
Cc: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Sent: Mon, Sep 11, 2017 11:15 am
Subject: Re: [Rdkit-discuss] how to output multiple Kekule structures
                    
On Mon, Sep 11, 2017 at 5:55 PM, James T. Metz <jamestm...@aol.com> wrote:

Greg,

    I need to be able to use SMARTS patterns to identify substructures in molecules
that can be aromatic, and I need to be able to handle cases where there can be
differences in the way that the molecule was entered or drawn by a user.
                        
That particular problem is a big part of the reason that we tend to use the
aromatic representation of things.
                          
    For example, consider the following alkenyl-substituted pyridine; there
are two possible Kekule structures:

    m1 = 'C=CC1=NC=CC=C1'
    m2 = 'C=CC1N=CC=CC1'
                      
Fixing what I assume is a typo for m2, I can do the following:

In [11]: m1 = Chem.MolFromSmiles('C=CC1=NC=CC=C1')

In [12]: m2 = Chem.MolFromSmiles('C=CC1N=CC=CC=1')

In [13]: q1 = Chem.MolFromSmarts('cccc')

In [14]: q2 = Chem.MolFromSmarts('cccn')

In [15]: list(m1.GetSubstructMatch(q1))
Out[15]: [2, 7, 6, 5]

In [16]: list(m1.GetSubstructMatch(q2))
Out[16]: [6, 5, 4, 3]

In [17]: list(m2.GetSubstructMatch(q1))
Out[17]: [2, 7, 6, 5]

In [18]: list(m2.GetSubstructMatch(q2))
Out[18]: [6, 5, 4, 3]
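
The identical matches for m1 and m2 follow from aromaticity perception when the SMILES is parsed. A quick check of that, sketched under the assumption of default sanitization in Chem.MolFromSmiles; the variable names smi1, smi2 and kekule_query are illustrative only:

```python
from rdkit import Chem

# The two Kekule drawings of the same vinylpyridine from the thread
# (m2 with the typo fixed, as in the session above).
m1 = Chem.MolFromSmiles('C=CC1=NC=CC=C1')
m2 = Chem.MolFromSmiles('C=CC1N=CC=CC=1')

# With default sanitization both inputs are converted to the aromatic
# representation, so their canonical SMILES should be identical.
smi1 = Chem.MolToSmiles(m1)
smi2 = Chem.MolToSmiles(m2)
print(smi1, smi2, smi1 == smi2)

# A Kekule-style query asks for aliphatic atoms joined by explicit double
# bonds, so it cannot reach into the perceived aromatic ring of either input.
kekule_query = Chem.MolFromSmarts('[C]=[C]-[C]=[N]')
print(m1.HasSubstructMatch(kekule_query), m2.HasSubstructMatch(kekule_query))
```

Because both inputs end up as the same aromatic molecule, a query written against the aromatic form gives the same answer regardless of how the ring was drawn.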
                      
 
                      

                      
                      
Those particular queries were going for the aromatic species and will only
match inside the ring, but if you want to be more generic you could tune
your queries like this:

In [28]: q3 = Chem.MolFromSmarts('[#6;$([#6]=,:[*])]-,=,:[#6;$([#6]=,:[*])]-,=,:[#6;$([#6]=,:[*])]-,=,:[#6;$([#6]=,:[*])]')

In [29]: q4 = Chem.MolFromSmarts('[#6;$([#6]=,:[*])]-,=,:[#6;$([#6]=,:[*])]-,=,:[#6;$([#6]=,:[*])]-,=,:[#7;$([#7]=,:[*])]')

In [30]: list(m1.GetSubstructMatch(q3))
Out[30]: [0, 1, 2, 7]

In [31]: list(m1.GetSubstructMatch(q4))
Out[31]: [0, 1, 2, 3]

In [32]: list(m2.GetSubstructMatch(q3))
Out[32]: [0, 1, 2, 7]

In [33]: list(m2.GetSubstructMatch(q4))
Out[33]: [0, 1, 2, 3]
                      
If you aren't familiar with recursive SMARTS, this construct:
"[#6;$([#6]=,:[*])]" means "a carbon that has either a double bond or an
aromatic bond to another atom".  So you can interpret q3 as "four carbons
that each have either a double or aromatic bond and that are connected to
each other by single, double, or aromatic bonds".
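
To see what the recursive construct selects on its own, it can be run as a one-atom query. This is a small sketch, assuming default sanitization on parsing; the names atom_query, matched and ethanol are illustrative and do not come from the thread.

```python
from rdkit import Chem

# One atom of q3: a carbon with a double or aromatic bond to any atom.
atom_query = Chem.MolFromSmarts('[#6;$([#6]=,:[*])]')

# m1 from the thread: atoms 0-1 are the vinyl carbons, atom 3 is the
# ring nitrogen, and the remaining atoms are aromatic ring carbons.
mol = Chem.MolFromSmiles('C=CC1=NC=CC=C1')
matched = sorted(idx for (idx,) in mol.GetSubstructMatches(atom_query))
print(matched)  # [0, 1, 2, 4, 5, 6, 7]

# Ethanol has only single bonds, so no carbon satisfies the recursive part.
ethanol = Chem.MolFromSmiles('CCO')
print(len(ethanol.GetSubstructMatches(atom_query)))  # 0
```

Every carbon of m1 is selected and the nitrogen is skipped, which is why q3 can start on the vinyl group and continue into the ring.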
                      

                      
                      
Is this starting to approximate what you're looking for?

-greg
                          
    Now consider two SMARTS

    pattern1 = '[C]=[C]-[C]=[C]'
    pattern2 = '[C]=[C]-[C]=[N]'

    I need to be able to detect the existence of each pattern in the molecule.

    If m1 is the only available generated Kekule structure, then pattern2 will be recognized.
    If m2 is the only available generated Kekule structure, then pattern1 will be recognized.

    Hence, I am getting different answers for the same input molecule just because
it was drawn in different Kekule structures.

    Regards,
    Jim Metz
                                
-----Original Message-----
From: Greg Landrum <greg.land...@gmail.com>
To: James T. Metz <jamestm...@aol.com>
Cc: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Sent: Mon, Sep 11, 2017 10:31 am
Subject: Re: [Rdkit-discuss] how to output multiple Kekule structures
                                        
Hi Jim,

The code currently has no way to enumerate Kekule structures. I don't recall
this coming up in the past and, to be honest, it doesn't seem all that
generally useful.

Perhaps there's an alternate way to solve the problem; what are you trying
to do?

-greg
                                        
On Mon, Sep 11, 2017 at 5:04 PM, James T. Metz via Rdkit-discuss <rdkit-discuss@lists.sourceforge.net> wrote:

Hello,

    Suppose I read in an aromatic SMILES e.g., for benzene

    c1ccccc1

    I would like to generate the major canonical resonance forms
and save the results as two separate molecules.  Essentially
I am trying to generate

    m1 = 'C1=CC=CC=C1'
    m2 = 'C1C=CC=CC=1'

    Can this be done in RDKit?  I have found a KEKULE_ALL
option in the detailed documentation which seems to be what I
am trying to do, but I don't understand how this option is to be used,
or the proper syntax.

    If it is necessary to somehow renumber the atoms and re-generate
Kekule structures, that is OK.  Thank you.

    Regards,
    Jim Metz
                                            
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss