On Wed, 8 Jun 2005 09:49:51 -0600, "Ara.T.Howard" <[EMAIL PROTECTED]> wrote:
>
>hi-
>
>i know nada about python so please forgive me if this is way off base. i'm
>trying to fix a bug in MoinMoin whereby
>
>   WordsWithTwoCapsInARowLike
>                     ^^
>                     ^^
>                     ^^
>
>do not become WikiNames. this is because the the wikiname pattern is
>basically
>
>   /([A-Z][a-z]+){2,}/
>
>but should be (IMHO)
>
>   /([A-Z]+[a-z]+){2,}/

That would take care of the example above, but does it change an official
spec?

>
>however, the way the patterns are constructed like
>
>   word_rule =
>    ur'(?:(?<![%(l)s])|^)%(parent)s(?:%(subpages)s(?:[%(u)s][%(l)s]+){2,})+(?![%(u)s%(l)s]+)'
>    % {
>        'u': config.chars_upper,
>        'l': config.chars_lower,
>        'subpages': config.allow_subpages and (wikiutil.CHILD_PREFIX + '?') or
>        '',
>        'parent': config.allow_subpages and (ur'(?:%s)?' %
>        re.escape(PARENT_PREFIX)) or '',
>    }
>
>
>and i'm not that familiar with python syntax. to me this looks like a map
>used to bind variables into the regex - or is it binding into a string then
>compiling that string into a regex - regexs don't seem to be literal objects
>in pythong AFAIK... i'm thinking i need something like
>
>   word_rule =
>    ur'(?:(?<![%(l)s])|^)%(parent)s(?:%(subpages)s(?:[%(u)s]+[%(l)s]+){2,})+(?![%(u)s%(l)s]+)'
>    % {
>                                                            ^
>                                                            ^
>                                                            ^
>        'u': config.chars_upper,
>        'l': config.chars_lower,
>        'subpages': config.allow_subpages and (wikiutil.CHILD_PREFIX + '?') or
>        '',
>        'parent': config.allow_subpages and (ur'(?:%s)?' %
>        re.escape(PARENT_PREFIX)) or '',
>    }
>
>and this seems to work - but i'm wondering what the 's' in '%(u)s' implies?
>obviously the u is the char range (unicode?)... but what's the 's'?

'u' doesn't stand for unicode here. It is the key used to look up
config.chars_upper in the dict. That value could be unicode, and probably
is. The 's' is the final part of a formatting spec, which says how to
convert the data that was looked up. 's' is for string, which doesn't
change string data (unless, and UIAM, a conversion to unicode is required).
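
To make the lookup concrete, here is a minimal sketch (written in Python 3
syntax, so plain r'' instead of ur''). The dict named chars and its ASCII
ranges are my stand-ins for config.chars_upper and config.chars_lower, whose
real values I haven't checked, and the pattern is cut down to the inner group
only. It also shows that the % operation just produces an ordinary string,
which re.compile then turns into a pattern object:

```python
import re

# Hypothetical stand-ins for config.chars_upper / config.chars_lower;
# the real MoinMoin values may well include non-ASCII letters.
chars = {'u': 'A-Z', 'l': 'a-z'}

# %(u)s looks up chars['u'] and inserts it as a string ('s' conversion).
old_rule = r'(?:[%(u)s][%(l)s]+){2,}' % chars   # current behaviour
new_rule = r'(?:[%(u)s]+[%(l)s]+){2,}' % chars  # proposed fix

print(old_rule)  # (?:[A-Z][a-z]+){2,}
print(new_rule)  # (?:[A-Z]+[a-z]+){2,}

name = 'WordsWithTwoCapsInARowLike'
old_match = re.compile(old_rule).fullmatch(name)
new_match = re.compile(new_rule).fullmatch(name)

# The old rule stops at "ARow" (two capitals in a row); the new one does not.
print(old_match is not None, new_match is not None)  # False True
```

This only exercises the simplified inner group, not the lookbehind, subpage,
or parent-prefix parts of the real word_rule.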

All of the above makes use of the % operator of strings, as in the
expression

    fmt % data

where fmt is a string containing ordinary characters and formatting specs,
which are substrings introduced by a leading '%' character. The formatting
specs take two basic alternative forms: %<spec> or %(name)<spec>.

If a '%' is followed by a parenthesized name, as in '%(u)s', the data to be
formatted is retrieved from data['u'], so data must be a mapping such as a
dict. If there is no parenthesized name, the data is retrieved from data[i],
where data must be a tuple and i is the positional count of format specs in
fmt. In some cases where there is no ambiguity, and there is only one datum,
data[0] may be written as the non-tuple value expression, e.g., instead of
(123,) that data could be written as (123,)[0] or plain 123.

In the word_rule above, %(u)s uses 'u' as a key to get data from the
dictionary { 'u': config.chars_upper, ...} to substitute in the [%(u)s] as a
string (that's what the 's' specifies). So config.chars_upper will
presumably have had a string value such as u'ABC..Z', and that will then be
inserted in place of the %(u)s to get u'...[ABC..Z]...' (if fmt is unicode,
the resulting string will be unicode, UIAM).

>
>i'm looking at
>
>   http://docs.python.org/lib/re-syntax.html
>   http://www.amk.ca/python/howto/regex/
>

See also http://www.python.org/doc/current/lib/typesseq-strings.html
(which IMO should be easier to find, but if you click on the index square at
the top right of any library reference page, you can see a "%formatting"
link).

>and coming up dry. sorry i don't have more time to rtfm - just want to
>implement this simple fix and get on to fcgi configuration! ;-)
>
>cheers.
>
>-a
>--
>===============================================================================
>| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
>| phone :: 303.497.6469
>| My religion is very simple. My religion is kindness.
>| --Tenzin Gyatso
>===============================================================================
>

Regards,
Bengt Richter
--
http://mail.python.org/mailman/listinfo/python-list