RDKit Discussion Group,



    I have written a SMARTS pattern to detect vicinal chlorine groups
using RDKit.  A vicinal chlorine group involves 4 atoms: two adjacent
carbons, each bearing a chlorine.


SMARTS = '[Cl]-[C,c]-,=,:[C,c]-[Cl]'


    I am trying to count the number of ("unique") occurrences of this
pattern.


    For some molecules with symmetry, this results in over-counting.


    For the molecule smiles1 below, I want to obtain a count of 1,
i.e., 1 tuple of 4 atoms.


    smiles1 = 'ClC(Cl)CCl'

    

    However, using the SMARTS above, I obtain 2 tuples of 4 atoms.
Beginning with a MOL file representation of smiles1, I get:


    ((1,2,4,3), (0,2,4,3))



    One possible solution is to somehow merge the two tuples according
to a "rule."  One rule that works is "if 3 of the atom indices are the
same, then combine into one tuple."
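
    A minimal sketch of that rule in plain Python, operating only on the
match tuples.  It assumes one reading of the rule: a match is kept only if
it shares fewer than 3 atom indices with every match already kept.  The
function name merge_near_duplicates and its shared parameter are
hypothetical, not part of RDKit, and the result can depend on the order of
the matches.

```python
def merge_near_duplicates(matches, shared=3):
    """Keep a match only if it shares fewer than `shared` atoms with each kept match."""
    kept = []
    for match in matches:
        # A match overlapping a kept one in `shared` or more atoms is treated
        # as the same occurrence and dropped.
        if all(len(set(match) & set(k)) < shared for k in kept):
            kept.append(match)
    return kept


matches = ((1, 2, 4, 3), (0, 2, 4, 3))  # the two tuples reported for smiles1
merged = merge_near_duplicates(matches)
print(len(merged), merged)  # 1 [(1, 2, 4, 3)]
```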


    However, the rule needs a bit of modification for more complicated
cases (higher symmetry).


    Consider



    smiles2 = 'ClC(Cl)C(Cl)(Cl)Cl'



    My goal is to get 2 tuples of 4 atoms for smiles2.



    smiles2 is somewhat tricky because the 6 matching tuples fall into
either 2 groups of 3 (4 atom) tuples or 3 groups of 2 (4 atom) tuples,
depending on which 3 atom indices you hold fixed.


    Again, if my goal is to get 2 tuples, then I need to somehow pick
the grouping with the largest groups, i.e., the 2 groups of 3 tuples, and
merge within each group, which leaves the 2 tuples I want.
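
    One way to sketch "pick the grouping with the largest groups" in
plain Python: for each central C-C bond, collect the distinct chlorines
seen at each end and take the smaller count, since grouping on that end
gives the fewest (and therefore largest) groups.  Everything here is an
assumption for illustration: the name count_vicinal_groups is
hypothetical, the smiles2 tuples are enumerated by hand for
pentachloroethane written as ClC(Cl)C(Cl)(Cl)Cl with atoms numbered in
SMILES order (not actual RDKit output), and counts are simply summed per
C-C bond, so a chlorine on a carbon shared by two bonds would contribute
to both.

```python
from collections import defaultdict


def count_vicinal_groups(matches):
    """Sum, over central C-C bonds, the smaller number of distinct chlorines."""
    left = defaultdict(set)   # (c1, c2) -> chlorines seen on c1
    right = defaultdict(set)  # (c1, c2) -> chlorines seen on c2
    for match in matches:
        # Orient each match so the lower carbon index comes first.
        cl_a, c1, c2, cl_b = match if match[1] < match[2] else match[::-1]
        left[(c1, c2)].add(cl_a)
        right[(c1, c2)].add(cl_b)
    # Grouping on one end's chlorine yields len(that side) groups; the
    # grouping with the largest groups is the one with the fewest groups.
    return sum(min(len(left[bond]), len(right[bond])) for bond in left)


smiles1_matches = ((1, 2, 4, 3), (0, 2, 4, 3))
# Hand-enumerated: chlorines 0 and 2 on carbon 1; chlorines 4, 5, 6 on carbon 3.
smiles2_matches = tuple((a, 1, 3, b) for a in (0, 2) for b in (4, 5, 6))
print(count_vicinal_groups(smiles1_matches))  # 1
print(count_vicinal_groups(smiles2_matches))  # 2
```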


    I have already checked Stack Overflow and a few other places for
Python code to do the necessary merging, but I could not find anything
specific and appropriate.


    I would be most grateful if anyone has ideas on how to do this.  I
suspect the answer is a few lines of well-written Python code, and not
a modification of the SMARTS (I could be mistaken!).


    Thank you.



    Regards,

    Jim Metz




_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
