On Oct 20, 7:07 am, Peter Otten <[EMAIL PROTECTED]> wrote:
> Alfons Nonell-Canals wrote:
> > I have a problem and I don't know how to solve it. I am working with
> > molecules and each molecule has a number of atoms. I obtain each atom
> > by splitting the molecule.
>
> > Ok. It is fine and I have no problem with it.
>
> > The problem is when I have to work with these atoms. An atom is usually
> > only a letter but sometimes it also contains one or more numbers.
> > If an atom contains a number I have to manipulate it separately.
>
> > If the number were always the same I would know how to identify it, for
> > example, 1:
>
> > atom = 'C1'
>
> > if '1' in atom:
> >     print 'kk'
>
> > But how can I identify all possibilities from 1-9 instead of only '1'? I
> > tried:
>
> > if '[1-9]', \d,...
>
> > Any comments, please?
>
> http://mail.python.org/pipermail/tutor/1999-March/000083.html
>
> Peter

Wow, that sure is a lot of code.  And I'm not sure the OP wants to
delve into re's just to solve this problem.  Here is the pyparsing
rendition (although it does not handle the recursive computation of
submolecules given in parens, as the Tim Peters link above does):
http://pyparsing.wikispaces.com/file/view/chemicalFormulas.py
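
For the narrower question of spotting a digit in a single atom label, no
parser or re is needed. A minimal sketch in plain Python follows; the helper
names has_digit and split_atom are made up for illustration, and it assumes a
label is just letters followed by digits, as in 'C1':

```python
def has_digit(atom):
    """Return True if the atom label contains at least one digit."""
    return any(ch.isdigit() for ch in atom)


def split_atom(atom):
    """Split a label such as 'C12' into its letters and its digits."""
    letters = "".join(ch for ch in atom if ch.isalpha())
    number = "".join(ch for ch in atom if ch.isdigit())
    return letters, number


for atom in ["C", "C1", "Cl", "H12"]:
    # Labels without a digit come back with an empty number string.
    print(atom, has_digit(atom), split_atom(atom))
```

This only covers one label at a time; splitting a whole formula into its
elements is where a grammar starts to pay off.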

The pyparsing version defines chemical symbols and their coefficients
using the following code:

from pyparsing import Word, Optional, Group, OneOrMore

caps = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
lowers = caps.lower()
digits = "0123456789"

element = Word( caps, lowers )
integer = Word( digits )
elementRef = Group( element + Optional( integer, default="1" ) )
chemicalFormula = OneOrMore( elementRef )


Then to parse a formula like C6H5OH, there is no need to code up a
tokenizer, just call parseString:

elements = chemicalFormula.parseString("C6H5OH")

The URL above links to a better-annotated example, including two more
extended versions that show how to use the resulting parsed data to
compute the molecular weight of the chemical.
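
As a rough idea of what that last step can look like, here is a sketch that
is not taken from the linked file. It assumes the parsed data has been turned
into a list of [symbol, count] pairs with the count as a string, which is the
shape the Group( element + Optional( integer, default="1" ) ) expression
produces. The ATOMIC_WEIGHTS table and the molecular_weight function are
names made up for this example, and the table only holds the three elements
needed here:

```python
# Standard atomic weights, rounded to three decimals (illustrative subset).
ATOMIC_WEIGHTS = {"H": 1.008, "C": 12.011, "O": 15.999}


def molecular_weight(element_refs):
    """Sum weight * count over [symbol, count] pairs; counts are strings."""
    total = 0.0
    for symbol, count in element_refs:
        total += ATOMIC_WEIGHTS[symbol] * int(count)
    return total


# C6H5OH written out as the pairs described above.
phenol = [["C", "6"], ["H", "5"], ["O", "1"], ["H", "1"]]
print(round(molecular_weight(phenol), 3))
```

An element missing from the table raises KeyError, which is probably what
you want while the table is incomplete.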

-- Paul
