On Dec 2, 2016, at 5:46 PM, Brian Kelley wrote:
> I hacked a version of RDKit's smiles parser to compute heavy atom count,
> perhaps some version of this could be used to check smiles validity without
> making the actual molecule.
FWIW, here's my regex code for it. It assumes that only "[H]" and anything
with a "*" are not heavy.
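import re

# Match one atom token: an organic-subset symbol or a whole bracket atom.
_atom_pat = re.compile(r"""
(
 Cl? |              # C or Cl
 Br? |              # B or Br
 [NOSPFIbcnosp] |   # remaining organic-subset atoms, including aromatic ones
 \[[^]]*\]          # bracket atom, everything up to the closing bracket
)
""", re.X)

def get_num_heavies(smiles):
    num_atoms = 0
    for m in _atom_pat.finditer(smiles):
        text = m.group()
        # Skip explicit hydrogens and wildcard atoms.
        if text == "[H]" or "*" in text:
            continue
        num_atoms += 1
    return num_atoms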
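
A minimal usage sketch of the function above, repeated here so the block runs
on its own. The example SMILES strings are illustrative picks, not taken from
the thread. The last case shows a consequence of the stated assumption: an
isotope-labelled hydrogen such as "[2H]" is not the literal text "[H]", so it
is counted as heavy.

```python
import re

# Same pattern and function as in the message above.
_atom_pat = re.compile(r"""
(
 Cl? |
 Br? |
 [NOSPFIbcnosp] |
 \[[^]]*\]
)
""", re.X)

def get_num_heavies(smiles):
    num_atoms = 0
    for m in _atom_pat.finditer(smiles):
        text = m.group()
        if text == "[H]" or "*" in text:
            continue
        num_atoms += 1
    return num_atoms

# Illustrative inputs (assumed examples, not from the original post).
examples = [
    "CCO",               # ethanol -> 3
    "c1ccccc1",          # benzene -> 6
    "ClCCBr",            # two-letter symbols count once each -> 4
    "[H]O[H]",           # explicit hydrogens are skipped -> 1
    "[*]C",              # wildcard atom is skipped -> 1
    "C[C@@H](N)C(=O)O",  # alanine, chirality stays inside the bracket -> 6
    "[2H]O[2H]",         # deuterium is NOT skipped by this heuristic -> 3
]

for smi in examples:
    print(smi, get_num_heavies(smi))
```

Note that this only counts atom tokens; it does not check that the SMILES
string is syntactically valid.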
This turns out to be quite a handy piece of functionality.
Andrew
[email protected]