On 3/12/2011 8:47 PM, Glenn Linderman wrote:
On 3/12/2011 2:09 PM, Terry Reedy wrote:
I believe that if the integer field were padded with leading blanks as
needed so that all are the same length, then no key would be needed.

Did you mean that "if the integer field were" converted to string and
"padded with leading blanks..."?

Guido presented a use case of a list of strings, each of the form '%s,%d', where the %s part is a 'word'. 'Integer field' refers to the part of each string after the comma.

Otherwise I'm not sure how to pad an integer with leading blanks.

The integers are already in string form. The *existing* key function his colleague used converted that part to an int as the second part of a tuple. I presume the integer field was separated by split(','), so the code was something like

def sikey(s):
  s,i = s.split(',')
  return s,int(i)

longlist.sort(key=sikey)
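
To make the starting point concrete, here is a small sketch of what such a key function buys over a plain string sort. The sample strings are made up for illustration and are not from the original use case; only the shape of sikey follows the code above.

```python
def sikey(s):
    # Split 'word,int' into a (word, int) tuple so that the integer
    # part compares numerically rather than character by character.
    word, num = s.split(',')
    return word, int(num)

# Hypothetical sample data, for illustration only.
sample = ['b,2', 'a,10', 'a,9', 'ab,1']

plain = sorted(sample)             # pure string comparison
keyed = sorted(sample, key=sikey)  # word first, then numeric value

print(plain)  # 'a,10' lands before 'a,9' because '1' < '9'
print(keyed)  # 'a,9' lands before 'a,10'
```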

It does not matter if the splitting method is more complicated, because it is already part of the problem spec. I proposed instead

def sirep(s):
  s,i = s.split(',') # or whatever current key func does
  return '%s,%#s' % (s,i)
  # where '#' stands for the field width; its appropriate value is known from the application

longlist = list(map(sirep, longlist)) # list() so that .sort() also works in 3.x
longlist.sort()

# or assuming that a simple split is correct

longlist = ['%s,%#s' % tuple(s.split(',')) for s in longlist]
longlist.sort()
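
A runnable sketch of the same idea, under a few assumptions of mine: a width of 5 is enough for every integer field, a simple split(',') is the correct parse, and the words contain no blanks. WIDTH and the sample data are hypothetical. It uses '%*s', which takes the field width from the argument list, in place of the '#' placeholder.

```python
WIDTH = 5  # assumed maximum number of digits; application-specific

def sikey(s):
    # Tuple key: word, then the integer field as an int.
    word, num = s.split(',')
    return word, int(num)

def sirep(s, width=WIDTH):
    # Rewrite 'word,int' with the integer field right-justified,
    # i.e. padded with leading blanks to a fixed width.
    word, num = s.split(',')
    return '%s,%*s' % (word, width, num)

# Hypothetical sample data, for illustration only.
sample = ['b,2', 'a,10', 'a,9', 'ab,1', 'a,1000']

padded = sorted(sirep(s) for s in sample)  # no key function needed
keyed = sorted(sample, key=sikey)          # tuple-key approach

# Removing the padding again shows that both orders agree.
unpadded = [p.replace(' ', '') for p in padded]
print(padded)
print(unpadded == keyed)
```

If some integer field is longer than the chosen width, nothing is truncated but the padded strings no longer have equal length, and the string order can then differ from the numeric order.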

Also, what appears to be your revised data structure, strval + ',' +
'%5.5d' % intval , assumes the strval is fixed length, also.

No it does not, and need not. ',' precedes all letters in ASCII order. (OK, I assumed that the 'word' field does not include any of !"#$%&'()*+. If that is not true, replace the comma with a space or even a control char such as '\a', which precedes even \t and \n.) Given the context of Google, I assumed that 'word' meant word, as in a web document, while the int might be a position or doc number (or both). The important point is that the separator causes all word-int pairs with the same word to string-sort before all word-int pairs with the same word + a suffix. My example intentionally tested that.
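
The ordering claims are easy to check. This is only a sketch; the pairs at the end are hypothetical, already padded to an assumed width of 5.

```python
import string

# ',' sorts before every ASCII letter and digit.
comma_first = all(',' < c for c in string.ascii_letters + string.digits)

# The printable characters that sort after space but before ','.
before_comma = ''.join(chr(n) for n in range(ord(' ') + 1, ord(',')))

# '\a' (BEL, code 7) precedes tab (code 9) and newline (code 10).
bel_first = '\a' < '\t' < '\n'

# Pairs with the same word sort before pairs whose word has a suffix.
pairs = ['ab,    1', 'a, 1000', 'a,    9']
ordered = sorted(pairs)

print(comma_first, bel_first)
print(before_comma)
print(ordered)
```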

Consider the following strval, intval pairs, using your syntax:

['a,997, 1','a, 1000']

Nothing says the strval wouldn't contain data that look like your
structure...

The problem as presented does. 'a,997' is not a word. In any case, as I said before, the method of correctly parsing the strings into two fields is already specified. I am only suggesting a change in how to proceed thereafter.

--
Terry Jan Reedy
