In article <[EMAIL PROTECTED]>,
Steve <[EMAIL PROTECTED]> wrote:
>Hi All,
>I'm having a tough time converting the following regex.compile patterns
>into the new re.compile format.  There are also differences between
>regsub.sub() and re.sub().
>Could anyone lend a hand?
>import regsub
>import regex
>import re    # << need conversion to this module
>    """Convert perl style format symbology to printf tokens.
>    Take a string and substitute computed printf tokens for perl style
>    format symbology.
>    For example:
>        ###.##    yields %6.2f
>        ########  yields %8d
>        <<<<<     yields %-5s
>    """

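Perhaps not optimal, but this processes things as requested. Note that
every float field has to be replaced before any integer pattern is,
because the integer pattern would otherwise match the trailing '#'
characters of a float's fractional part.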
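
A quick check of that ordering constraint, as a sketch: the pattern is the
same integer pattern used in the code below, while the names
integer_pattern, sample and broken are only illustrative. Run against a
string that still contains a float field, the integer pattern grabs the
last '#' of the fraction.

```python
import re

# Same integer pattern as in the posted code (the name is illustrative).
integer_pattern = re.compile(r'(?<![\\.])(#+)(?![.#])')

sample = "total ####.##"

# The lookbehind only rejects a run of '#' that starts right after '.',
# so a run starting one character later still matches.
match = integer_pattern.search(sample)
print("%r at offset %d" % (match.group(1), match.start()))

# Replacing at this point would corrupt the float field.
broken = integer_pattern.sub("%1d", sample, count=1)
print(broken)
```

So as long as a float field is still present, the integer substitution
cannot safely run.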


import re

"""Convert perl style format symbology to printf tokens.
Take a string and substitute computed printf tokens for perl style
format symbology.

For example:

###.##    yields %6.2f
########  yields %8d
<<<<<     yields %-5s
"""

# handle cases where there's no integer or no fractional chars
floatPattern = re.compile(r'(?<!\\)(#+\.(#*)|\.(#+))')
integerPattern = re.compile(r'(?<![\\.])(#+)(?![.#])')
leftJustifiedStringPattern = re.compile(r'(?<!\\)(<+)')
rightJustifiedStringPattern = re.compile(r'(?<!\\)(>+)')

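def float_sub(matchobj):
    # groups()[0] is the whole field, so its length is the total width.
    # fractional part may be in either groups()[1] or groups()[2]
    if matchobj.groups()[1] is not None:
        return "%%%d.%df" % (len(matchobj.groups()[0]),
                             len(matchobj.groups()[1]))
    else:
        return "%%%d.%df" % (len(matchobj.groups()[0]),
                             len(matchobj.groups()[2]))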

def unperl_format(s):
    changed_things = 1
    while changed_things:
        # lather, rinse and repeat until nothing new happens
        changed_things = 0

        mat_obj = re.search(leftJustifiedStringPattern, s)
        if mat_obj:
            s = re.sub(leftJustifiedStringPattern, "%%-%ds" %
                       len(mat_obj.groups()[0]), s, 1)
            changed_things = 1

        mat_obj = re.search(rightJustifiedStringPattern, s)
        if mat_obj:
            s = re.sub(rightJustifiedStringPattern, "%%%ds" %
                       len(mat_obj.groups()[0]), s, 1)
            changed_things = 1

        # must do all floats before ints
        mat_obj = re.search(floatPattern, s)
        if mat_obj:
            s = re.sub(floatPattern, float_sub, s, 1)
            changed_things = 1
            # don't fall through to the int code
            continue

        mat_obj = re.search(integerPattern, s)
        if mat_obj:
            s = re.sub(integerPattern, "%%%dd" % len(mat_obj.groups()[0]),
                       s, 1)
            changed_things = 1
    return s

if __name__ == '__main__':
   testarray = ["integer: ####, integer # integer at end #",
                "float ####.## no decimals ###. no int .### at end ###.",
                "Left string  <<<<<< short left string  <",
                "right string >>>>>> short right string >",
                "escaped chars \\#### \\####.## \\<\\<<<< \\>\\><<<"]

   for s in testarray:
       print("Testing: %s" % s)
       print("Result: %s" % unperl_format(s))

Running this gives

Testing: integer: ####, integer # integer at end #
Result: integer: %4d, integer %1d integer at end %1d

Testing: float ####.## no decimals ###. no int .### at end ###.
Result: float %7.2f no decimals %4.0f no int %4.3f at end %4.0f

Testing: Left string  <<<<<< short left string  <
Result: Left string  %-6s short left string  %-1s

Testing: right string >>>>>> short right string >
Result: right string %6s short right string %1s

Testing: escaped chars \#### \####.## \<\<<<< \>\><<<
Result: escaped chars \#%3d \#%6.2f \<\<%-3s \>\>%-3s
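
Since I said this was perhaps not optimal: the same conversion can probably
be done in one pass with a single alternation and a replacement function.
The following is only a sketch, not the code that produced the output
above. The names token_pattern, to_printf and unperl_format_single_pass
are made up for illustration, and I have only checked it against the five
test strings, so odd escaped input may well come out differently.

```python
import re

# Hypothetical single-pass variant. The float alternatives are listed
# before the integer one, and re.sub() scans left to right without
# overlapping, so the integer branch never sees the tail of a float.
token_pattern = re.compile(
    r'(?<!\\)(?:'
    r'(?P<float>#+\.(?P<frac1>#*)|\.(?P<frac2>#+))'
    r'|(?P<left><+)'
    r'|(?P<right>>+)'
    r'|(?P<int>#+)'
    r')'
)


def to_printf(match):
    # The whole matched field gives the printf width.
    width = len(match.group(0))
    if match.group('float') is not None:
        # The fractional part is in frac1 or frac2, depending on the branch.
        frac = match.group('frac1')
        if frac is None:
            frac = match.group('frac2')
        return "%%%d.%df" % (width, len(frac))
    if match.group('left') is not None:
        return "%%-%ds" % width
    if match.group('right') is not None:
        return "%%%ds" % width
    return "%%%dd" % width


def unperl_format_single_pass(s):
    # A function replacement is inserted literally, with no escape handling.
    return token_pattern.sub(to_printf, s)


fmt = unperl_format_single_pass("total ####.## items ###")
print(fmt)                  # total %7.2f items %3d
print(fmt % (3.14159, 42))
```

The resulting string is an ordinary format string, so it can be used
directly with the % operator, as in the last line.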


Jim Segrave           ([EMAIL PROTECTED])

