In article <[EMAIL PROTECTED]>,
Paul McGuire <[EMAIL PROTECTED]> wrote:
>"Jim Segrave" <[EMAIL PROTECTED]> wrote in message
>news:[EMAIL PROTECTED]
>> In article <[EMAIL PROTECTED]>,
>> Paul McGuire <[EMAIL PROTECTED]> wrote:
>>
>> >Not an re solution, but pyparsing makes for an easy-to-follow program.
>> >TransformString only needs to scan through the string once - the
>> >"reals-before-ints" testing is factored into the definition of the
>> >formatters variable.
>> >
>> >Pyparsing's project wiki is at http://pyparsing.wikispaces.com.
>>
>> If fails for floats specified as ###. or .###, it outputs an integer
>> format and the decimal point separately. It also ignores \# which
>> should prevent the '#' from being included in a format.
>>
>Ah! This may be making some sense to me now. Here are the OP's original
>re's for matching.
>
>exponentPattern = regex.compile('\(^\|[^\\#]\)\(#+\.#+\*\*\*\*\)')
>floatPattern = regex.compile('\(^\|[^\\#]\)\(#+\.#+\)')
>integerPattern = regex.compile('\(^\|[^\\#]\)\(##+\)')
>leftJustifiedStringPattern = regex.compile('\(^\|[^\\<]\)\(<<+\)')
>rightJustifiedStringPattern = regex.compile('\(^\|[^\\>]\)\(>>+\)')
>
>Each re seems to have two parts to it. The leading parts appear to be
>guards against escaped #, <, or > characters, yes? The second part of each
>re shows the actual pattern to be matched. If so:
>
>It seems that we *don't* want "###." or ".###" to be recognized as floats,
>floatPattern requires at least one "#" character on either side of the ".".
>Also note that single #, <, and > characters don't seem to be desired, but
>at least two or more are required for matching. Pyparsing's Word class
>accepts an optional min=2 constructor argument if this really is the case.
>And it also seems that the pattern is supposed to be enclosed in ()'s. This
>seems especially odd to me, since one of the main points of this funky
>format seems to be to set up formatting that preserves column alignment of
>text, as if creating a tabular output - enclosing ()'s just junks this up.
>
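A side note on the quoted patterns: as far as I can tell, the old regex
module defaulted to Emacs-style syntax, in which \( and \) delimit a group
and \| is alternation, so the ()'s would be grouping rather than literal
parentheses. Under that assumption, a rough translation of two of the
patterns to the re module might look like the sketch below. The names
float_pattern, integer_pattern and find_fields are mine, for illustration
only, and are not the OP's code.

```python
import re

# Assumption: hand-translated from the Emacs-style originals quoted above.
# Group 1 is the guard (start of string, or a character that is neither a
# backslash nor '#'); group 2 is the field itself.
float_pattern = re.compile(r'(^|[^\\#])(#+\.#+)')
integer_pattern = re.compile(r'(^|[^\\#])(##+)')


def find_fields(pattern, text):
    """Return the field text (group 2) of every match of pattern in text."""
    return [m.group(2) for m in pattern.finditer(text)]


if __name__ == "__main__":
    print(find_fields(float_pattern, "total: ###.## units"))
    print(find_fields(float_pattern, "###. and .###"))
    print(find_fields(integer_pattern, "###. and .###"))
    print(find_fields(integer_pattern, r"\### x"))
```

With these translations, "###." and ".###" are not matched by the float
pattern but do yield integer fields, and a run of '#' preceded by a
backslash is skipped entirely.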
The poster was excluding characters escaped with a '\', but I've just
looked up the Perl format statement, and in fact fields always begin with
a '@'. And yes, having no digits on one side of the decimal point is legal.

Strings can be left justified ('@<<<<'), right justified ('@>>>>'), or
centred ('@||||'). Numerics begin with an '@', contain '#', and may contain
a decimal point. Fields beginning with '^' instead of '@' are omitted if
the format is a numeric ('#' with or without a decimal point).

I assumed from the poster's original patterns that one has to worry about
'@', but that's incorrect: the '@' has to be present for something to be a
format as opposed to ordinary text, and there appears to be no way to
embed a '@' in a format.

It's worth noting that Perl does implicit float to int coercion, so it
treats @### the same for ints and floats (no decimal point printed).

For the grisly details:
http://perl.com/doc/manual/html/pod/perlform.html

-- 
Jim Segrave ([EMAIL PROTECTED])
-- 
http://mail.python.org/mailman/listinfo/python-list
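P.S. A rough Python sketch of the field rules described above, in case it
helps. It only locates and classifies fields in a picture line; it does not
format anything. The names FIELD and classify are hypothetical, and the
regular expression reflects my reading of perlform (fields start with '@'
or '^'; numerics may have no '#' on one side of the decimal point), so
treat it as an assumption rather than a faithful reimplementation.

```python
import re

# Assumption: one alternative per field kind, each introduced by '@' or '^'.
# Group 1: left justified, 2: right justified, 3: centred, 4: numeric.
FIELD = re.compile(r'[@^](?:(<+)|(>+)|(\|+)|(#+(?:\.#*)?|\.#+))')

KINDS = ('left', 'right', 'centre', 'numeric')


def classify(picture):
    """Return (kind, span) for each field found in a picture line.

    span is the number of characters the field occupies in the picture
    line, counting the leading '@' or '^'.
    """
    fields = []
    for m in FIELD.finditer(picture):
        # Exactly one of the four groups matches; lastindex identifies it.
        fields.append((KINDS[m.lastindex - 1], len(m.group(0))))
    return fields


if __name__ == "__main__":
    print(classify('@<<<< @>>> @|| @##.## @###. @.##'))
    print(classify('### is ordinary text without the @'))
```

Note that a run of '#' with no leading '@' or '^' is treated as ordinary
text, which is the point made above about the '@' being required.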