Dharmjeet Kumar wrote:
>
> *Objective: * Extract Currency Value (not Date) from given file.
>
>  ...
>
> *Problem with output: * I am getting date in currency value which it
> should not. What alteration should I do in my regex to complete???

This is a tricky proposition.  Your problem statement doesn't address a
number of other special cases.  For example, what about ++123,456.00 and
--123?  Should it match the +123,456.00 part and the -123 part?  Or
should those be completely ignored?  What about "money+123,456"?  What
about "QR345.00"?

If you think about this in words, you want to change your regular
expression so that it doesn't match a sequence if the character
immediately before could possibly have been part of a number, or if the
character immediately following is a + or -.  For that, you need a
"negative lookbehind assertion" and a "negative lookahead assertion."

So, if you add (?<![-+,\d]) to the beginning and (?![+-]) to the end,
it seems to do what you want.  The - is listed first in the character
class so that it is taken literally rather than as a range operator.
It will reject "++123,456.00" and "--123", but it matches the number in
"money+123,456".  You might decide that "a money amount must always be
preceded by a space", which would skip that last one as well.
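
Here is a rough sketch of how the two assertions behave.  Your original
regex isn't quoted here, so AMOUNT below is a made-up stand-in, and
find_amounts is just an illustrative name.  With this particular
stand-in I also had to put "." in the lookbehind class, otherwise the
"00" after the decimal point would match on its own.  Treat it as a
starting point, not as the answer:

```python
import re

# Hypothetical base pattern for an amount such as 1,234.50 or -123.
# The regex from the original post is not shown, so this is an assumption.
AMOUNT = r"[+-]?\d+(?:,\d{3})*(?:\.\d+)?"

# Negative lookbehind: the previous character must not be a sign, a comma,
# a digit, or (an addition needed for this base pattern) a decimal point.
# Negative lookahead: the next character must not be a sign.
GUARDED = re.compile(r"(?<![-+,.\d])" + AMOUNT + r"(?![+-])")


def find_amounts(text):
    """Return every guarded amount found in text."""
    return GUARDED.findall(text)


for sample in ("++123,456.00", "--123", "money+123,456", "QR 345.00"):
    print(repr(sample), find_amounts(sample))
```

The first two samples produce no matches, while "money+123,456" still
yields "+123,456", which is the case you would have to decide about.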

In your example, the money amounts are always preceded by "QR ".  If that
is part of the requirement, that's easier to handle.  You can change the
"negative lookbehind assertion" at the front to a normal lookbehind
assertion: (?<=QR )
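
A minimal sketch of that variant, again assuming a made-up AMOUNT
pattern and an invented sample line (the date format in it is only a
guess at what your file looks like):

```python
import re

# Hypothetical base pattern; the original regex is not shown in this thread.
AMOUNT = r"[+-]?\d+(?:,\d{3})*(?:\.\d+)?"

# Positive lookbehind: only match an amount that directly follows "QR ".
QR_AMOUNT = re.compile(r"(?<=QR )" + AMOUNT)


def find_qr_amounts(text):
    """Return the amounts that are immediately preceded by 'QR '."""
    return QR_AMOUNT.findall(text)


print(find_qr_amounts("Paid QR 1,250.00 on 12/05/2011"))
print(find_qr_amounts("QR345.00"))
```

Because nothing in the date is preceded by "QR ", the date digits are
never considered, and "QR345.00" (no space) is skipped too.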

By the way, you have several style issues in this code.

        for i in range(l2.__len__()):

            newstr2  = l2[i]

ANY time you have code that says "for i in range(len(...))", it is
better to just iterate over the sequence itself.  You don't need
"i" in the loop:

        for newstr2 in l2:


Then, in the inner loop, you have:

            for i in range(len(val_currency)):

                val2 =  val_currency[i]

Problem #1 is that you have re-used the loop variable here.  Python
allows it, but the inner loop overwrites the outer loop's "i", so any
use of "i" in the outer body after the inner loop sees the wrong value.
However, once again you don't need the "i" at all:

            for val2 in val_currency:       
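
Put together, the two loops might look something like this.  The names
l2, newstr2, val_currency and val2 are from your code; currency_re, the
sample lines and the "found" list are invented for illustration, and I
am assuming val_currency is the list returned by findall:

```python
import re

# Hypothetical pattern and data; the original script is only partly quoted.
currency_re = re.compile(r"(?<=QR )\d+(?:,\d{3})*(?:\.\d+)?")
l2 = ["Paid QR 1,250.00 on 12/05/2011", "no amount here", "QR 75 and QR 3.50"]

found = []
for newstr2 in l2:                               # outer loop: one line at a time
    val_currency = currency_re.findall(newstr2)  # assumed: a list of matches
    for val2 in val_currency:                    # inner loop: no index needed
        found.append(val2)

print(found)
```

With no index variables there is nothing for the inner loop to clobber.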

-- 
Tim Roberts, t...@probo.com
Providenza & Boekelheide, Inc.

_______________________________________________
python-win32 mailing list
python-win32@python.org
http://mail.python.org/mailman/listinfo/python-win32
