On 3/12/2011 8:47 PM, Glenn Linderman wrote:
On 3/12/2011 2:09 PM, Terry Reedy wrote:
I believe that if the integer field were padded with leading blanks as
needed so that all are the same length, then no key would be needed.
Did you mean that "if the integer field were" converted to string and
"padded with leading blanks..."?
Guido presented a use case of a list of strings, each of form '%s,%d',
where the %s part is a 'word'. 'Integer field' refers to the part of
each string after the comma.
Otherwise I'm not sure how to pad an integer with leading blanks.
The integers are already in string form. The *existing* key function his
colleague used converted that part to an int as the second part of a
tuple. I presume the integer field was separated by split(','), so the
code was something like
def sikey(s):
    s,i = s.split(',')
    return s,int(i)
longlist.sort(key=sikey)
It does not matter if the splitting method is more complicated, because
it is already part of the problem spec. I proposed instead
def sirep(s):
    s,i = s.split(',') # or whatever current key func does
    return '%s,%#s' % (s,i)
    # where appropriate value of # is known from application
longlist = list(map(sirep, longlist))  # list() needed in 3.x, where map returns an iterator
longlist.sort()
# or assuming that a simple split is correct
longlist = ['%s,%#s' % tuple(s.split(',')) for s in longlist]
longlist.sort()
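
A minimal sketch of the two approaches side by side. The field width of 5 and the sample strings are assumptions made up for illustration; the real width would come from the application. Here '%*s' supplies the width that '#' stands for above.

```python
WIDTH = 5  # assumed maximum number of digits in the integer field


def sikey(s):
    # Existing approach: split into (word, int) and sort by that tuple.
    word, num = s.split(',')
    return word, int(num)


def sirep(s):
    # Proposed approach: right-justify the integer field with blanks
    # so that plain string comparison orders it numerically.
    word, num = s.split(',')
    return '%s,%*s' % (word, WIDTH, num)


# Hypothetical sample data, including a word ('ab') that extends another ('a').
sample = ['b,2', 'a,1000', 'ab,7', 'a,997', 'a,5']

by_key = sorted(sample, key=sikey)          # sort with a key function
padded = sorted(sirep(s) for s in sample)   # sort padded strings, no key

for original, rewritten in zip(by_key, padded):
    print(repr(original), repr(rewritten))
```

Both sorts put the entries in the same order, so the key function is only needed once, when the strings are rewritten, and not on every later sort.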
Also, what appears to be your revised data structure, strval + ',' +
'%5.5d' % intval , assumes the strval is fixed length, also.
No it does not, and need not. ',' precedes all letters in ascii order.
(Ok, I assumed that the 'word' field does not include any of
!"#$%&'()*+. If that is not true, replace comma with space or even a
control char such as '\a' which even precedes \t and \n.) Given the
context of Google, I assumed that 'word' meant word, as in a web
document, while the int might be a position or doc number (or both). The
important point is that the separator causes all word-int pairs with the
same word to string-sort before all word-int pairs with the same word +
a suffix. My example intentionally tested that.
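
The separator argument can be checked with a short sketch. The helper names, the width of 4, and the sample pairs (including the word 'a!', which breaks the comma assumption) are all hypothetical and chosen only to show the effect.

```python
def padded_sort(pairs, sep, width=4):
    # Build 'word<sep>   int' strings and sort them with no key function.
    return sorted('%s%s%*d' % (word, sep, width, num) for word, num in pairs)


def as_pairs(strings, sep):
    # Parse the padded strings back into (word, int) tuples.
    result = []
    for s in strings:
        word, num = s.rsplit(sep, 1)
        result.append((word, int(num)))
    return result


# Hypothetical data: 'a!' contains a character that sorts before ','.
pairs = [('ab', 1), ('a', 20), ('a!', 3), ('a', 5)]
expected = sorted(pairs)  # the order a (word, int) tuple key would give

with_comma = as_pairs(padded_sort(pairs, ','), ',')
with_bell = as_pairs(padded_sort(pairs, '\a'), '\a')

print(with_comma == expected)  # False: '!' sorts before ','
print(with_bell == expected)   # True: '\a' sorts before every printable char
```

With a comma, 'a!' sorts ahead of 'a' because '!' precedes ',' in ASCII; with '\a' as the separator the string sort agrees with the tuple sort for these words.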
Consider the following strval, intval pairs, using your syntax:
['a,997, 1','a, 1000']
Nothing says the strval wouldn't contain data that look like your
structure...
The problem as presented does. 'a,997' is not a word. In any case, as I said
before, the method of correctly parsing the strings into two fields is
already specified. I am only suggesting a change in how to proceed
thereafter.
--
Terry Jan Reedy