There are four scripts attached to this email and one compressed-format LyX document:

sortable.py, sortable_help.py -- a script for sorting tables in LyX, & its help file
sortlist.py, sortlist_help.py -- a script for sorting lists in LyX, & its help file
pLyXSorting(compressed).lyx -- an explanatory document with examples

The LyX document is in compressed LyX format. For the examples to work it must be saved in uncompressed format. (The pLyX system depends on LyX files being *text* files.) It contains some example tables and a substantial multi-level list to play with.
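
If you would rather convert the file outside LyX, something like the following might do. It is only a sketch: it assumes a compressed LyX file is an ordinary gzip stream, and the function name `uncompress_lyx` is made up for illustration.

```python
import gzip
import shutil

def uncompress_lyx(src, dst):
    """Copy src to dst, gunzipping it if it starts with the gzip magic bytes.

    Returns True if src was compressed, False if it was copied unchanged.
    """
    with open(src, 'rb') as f:
        is_gzip = f.read(2) == b'\x1f\x8b'
    # Assumption: a compressed .lyx file is plain gzip, nothing more.
    opener = gzip.open if is_gzip else open
    with opener(src, 'rb') as fin, open(dst, 'wb') as fout:
        shutil.copyfileobj(fin, fout)
    return is_gzip
```

For instance, `uncompress_lyx('pLyXSorting(compressed).lyx', 'pLyXSorting.lyx')`, where the output name is just an example.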

The table sorting script reworks for the pLyX system a script aired on this list back in September. On the developers' list I've noticed recent work to swap columns or swap rows in tables. I imagine once that is done, the next step will be hard to resist: go the whole way and sort the table. This might be a short-lived script.

The list sorting script is new and was harder to write because of the recursive calls to the sorting routine required by the (possible) presence of sub-lists. My underlying interest in sorting lists arose from a wish to sort indexes. (With their headings, subheadings, sub-subheadings, ... the logic is the same.)
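
The recursion can be illustrated apart from LyX's file format. The sketch below is not the attached script: it assumes the list has already been parsed into (text, children) pairs, and the names `sort_nested` and `reverse_by_depth` are invented for the example.

```python
def sort_nested(items, reverse_by_depth, depth=0):
    """Sort items at this level after sorting each item's sub-list.

    items is a list of (text, children) pairs, children having the same
    shape. reverse_by_depth[d] is True for a descending sort at depth d.
    """
    if not items:
        return []
    # Recurse first, so every sub-list is in order before its parent moves.
    with_sorted_children = [
        (text, sort_nested(children, reverse_by_depth, depth + 1))
        for text, children in items
    ]
    return sorted(with_sorted_children,
                  key=lambda item: item[0].lower(),
                  reverse=reverse_by_depth[depth])

index = [
    ('tables', [('columns', []), ('sorting', [])]),
    ('lists', [('items', []), ('sub-lists', [])]),
]
# Top level ascending, sub-level descending.
result = sort_nested(index, [False, True])
```

An index with headings and subheadings has the same shape, which is why the same routine would serve for both.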

Andrew


# sortable_help.py
#
def helpnote(hv):
    if hv > 1:
        return header + version
    else:
        return header + tail

header = r'''\begin_layout LyX-Code
\family roman
\series bold
.sort table
\end_layout
'''
version = r'''\begin_layout LyX-Code
\family roman
Version 1.0 (15 December 2012) Columns can be sorted more than once for
 inter-filed mixed-case alphabetical sorting.
\end_layout
\begin_layout LyX-Code
\family roman
Version 0.4 (1 November 2012) First version for pLyX system.
\end_layout
\begin_layout LyX-Code
\family roman
Version 0.3 (17 September 2012) Use of custom insets; yellow notes option,
 hrules and vrules preserved.
\end_layout
\begin_layout LyX-Code
\family roman
Version 0.2 (13 September 2012) Script now ignores ERT insets.
\end_layout
\begin_layout LyX-Code
\family roman
Version 0.1 (12 September 2012) Table sorting script posted to users' list.
\end_layout
'''
tail = r'''\begin_layout LyX-Code
\family roman
Sort the
\emph on
 rows
\emph default
 by the values in specified columns.
\end_layout
\begin_layout LyX-Code
\end_layout
\begin_layout LyX-Code
\family roman
\series bold
Global options
\end_layout
\begin_layout LyX-Code
\family roman
\series bold
-h  --help
\series default
      show this help note.
\end_layout
\begin_layout LyX-Code
\family roman
\series bold
-v --version
\series default
  show version information.
\end_layout
\begin_layout LyX-Code
\family roman
\series bold
-n  --notes
\series default
    make LyX's (yellow) notes sortable; notes are
 sort-neutral by default.
\end_layout
\begin_layout LyX-Code
\end_layout
\begin_layout LyX-Code
\family roman
\series bold
Local options
\end_layout
\begin_layout LyX-Code
\family roman
A sort specification is a sequence like 
\series bold
2a3A1+
\series default
 (or 
\series bold
2a 3A 1+
\series default
 or 
\series bold
2a
\series default
,
\series bold
 3A
\series default
,
\series bold
 1+
\series default
, etc.) where the number indicates the column and the qualifying letter or
 sign indicates the kind of sort. A specification may involve from one to
 all columns in the table and a column may appear in the spec. more than once.
 The primary sort is by the first column specified, the secondary sort by
 the second column, etc. 
\end_layout
\begin_layout Itemize
\family roman
\series bold
a, A, +
\series default
 indicate ascending sorts; 
\end_layout
\begin_layout Itemize
\family roman
\series bold
z, Z, -
\series default
 indicate descending sorts; 
\end_layout
\begin_layout Itemize
\family roman
letters indicate alphabetical sorts, uppercase indicating case sensitivity;
\end_layout
\begin_layout Itemize
\family roman
\series bold
+, -
\series default
 indicate numerical sorts.
\end_layout
\begin_layout LyX-Code
\family roman
The 
\emph on
next
\emph default
 and subsequent rows of a table following a sort specification are sorted.
 For neat alphabetical sorts involving inter-filed mixed case, specify columns
 twice, e.g. 
\series bold
1a1A
\series default
 for an AaBbCc ... sort, or
\series bold
1a1Z
\series default
 for an aAbBcC ... sort.
\end_layout
'''
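
The sort specification described in the help text above can be tried out on plain Python lists. This is a sketch under assumptions, not the table script itself: rows are taken to be lists of strings, and the helper names (`parse_spec`, `make_key`, `sort_rows`) are hypothetical. It relies on Python's sort being stable, so the least significant sort is applied first.

```python
import re

def parse_spec(spec):
    """Split a spec such as '2a3A1+' or '2a, 3A, 1+' into (column, kind) pairs."""
    return [(int(col), kind)
            for col, kind in re.findall(r'(\d+)([-+AZaz])', spec)]

def make_key(value, kind):
    """Build the sort key for one cell according to the kind of sort."""
    if kind in '+-':
        return float(value)      # numerical sort
    if kind in 'az':
        return value.lower()     # case-insensitive alphabetical sort
    return value                 # 'A' or 'Z': case-sensitive

def sort_rows(rows, spec):
    """Return rows sorted by spec; columns are numbered from 1."""
    result = list(rows)
    # Last-specified column first; the primary sort is done last.
    for col, kind in reversed(parse_spec(spec)):
        result.sort(key=lambda row: make_key(row[col - 1], kind),
                    reverse=kind in 'zZ-')
    return result
```

With single-column rows `b`, `A`, `a`, `B`, the spec `1a1A` gives the order A a B b and `1a1Z` gives a A b B.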




# sortlist_help.py
#
def helpnote(hv):
    if hv > 1:
        return header + version
    else:
        return header + tail

header = r'''\begin_layout LyX-Code
\family roman
\series bold
.sort list
\end_layout
'''
version = r'''\begin_layout LyX-Code
\family roman
Version 1.0 (16 December 2012) Allow secondary, tertiary, etc. sorts.
\end_layout
\begin_layout LyX-Code
\family roman
Version 0.2 (13 December 2012) Include -a and -n options.
\end_layout
\begin_layout LyX-Code
\family roman
Version 0.1 (8 November 2012)
\end_layout
'''
tail = r'''\begin_layout LyX-Code
\family roman
Sort one or more lists and sub-lists.
\end_layout
\begin_layout LyX-Code

\end_layout
\begin_layout LyX-Code
\family roman
\series bold
Global options
\end_layout
\begin_layout LyX-Code
\family roman
\series bold
-h --help  
\series default
     show this help message.
\end_layout
\begin_layout LyX-Code
\family roman
\series bold
-v --version 
\series default
 show version information.
\end_layout
\begin_layout LyX-Code
\family roman
\series bold
-a  
\series default
                sort across changes of list-type.
\end_layout
\begin_layout LyX-Code
\family roman
\series bold
-c 
\emph on
n
\emph default
\series default
              
\emph on
n
\emph default
 = number of characters in sort-key
 strings; default is 20 characters.
\end_layout
\begin_layout LyX-Code
\family roman
\series bold
-n  --notes
\series default
    make LyX's (yellow) notes sortable; notes are sort-neutral by default.
\end_layout
\begin_layout LyX-Code
 
\end_layout
\begin_layout LyX-Code
\family roman
A change of list type at the top level (e.g.
 from 
\family sans
Itemize
\family roman
 to 
\family sans
Description
\family roman
) usually halts a sort; the 
\series bold
-a
\series default
 option ensures the sort continues across the list-type boundary.
 (The result is generally a mess, but a sorted one.) Sorting always
 continues across list-type boundaries in sub-lists (but, again,
 generates a sorted mess).
\end_layout
\begin_layout LyX-Code

\end_layout
\begin_layout LyX-Code
\family roman
\series bold
Local options
\end_layout
\begin_layout LyX-Code
\family roman
A sort specification is a sequence like 
\series bold
1a2a3+
\series default
 where the number indicates the list level (1 = first or top level, 2 =
 sub-list, etc.) and the qualifying letter or sign indicates the kind
 of sort. A specification may involve from one to (potentially) all six list
 levels; if desired, only sub-levels may be sorted. For secondary, tertiary ...
 sorts, separate the sort specs. by a solidus, e.g.
\series bold
1a2a3+/1A2A
\series default
.
\end_layout
\begin_layout Itemize
\family roman
\series bold
a, A, +
\series default
 indicate ascending sorts; 
\end_layout
\begin_layout Itemize
\family roman
\series bold
z, Z, -
\series default
 indicate descending sorts; 
\end_layout
\begin_layout Itemize
\family roman
letters indicate alphabetical sorts, uppercase indicating case sensitivity;
\end_layout
\begin_layout Itemize
\family roman
\series bold
+, -
\series default
 indicate numerical sorts.
\end_layout
\begin_layout LyX-Code
\family roman
The specification is placed either 
\emph on
before
\emph default
 or within a 
\emph on
top-level
\emph default
 item in a list. The 
\emph on
next
\emph default
 and subsequent items in a list following a sort specification are sorted. If
 the 
\emph on
 kind
\emph default
 of list changes (say from
\family sans
 Labeling
\family roman
 to
\family sans
 Description
\family roman
), the sort stops at the change, unless the 
\series bold
-a
\series default
 option has been invoked.
\end_layout
'''
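
To make the solidus notation concrete, here is one way such a specification might be split into passes. It is a simplified sketch, not the parsing done in sortlist.py: the name `parse_list_spec` is hypothetical, and a level not mentioned in a pass is simply marked `None` (unsorted) for that pass.

```python
import re

MAX_LEVELS = 6  # the help text allows for up to six list levels

def parse_list_spec(spec):
    """Turn '1a2a3+/1A2A' into one list of per-level sort kinds per pass.

    Element [p][n] is the kind of sort for list level n + 1 in pass p,
    or None if that level is not mentioned in the pass.
    """
    passes = []
    for part in spec.split('/'):
        kinds = [None] * MAX_LEVELS
        for level, kind in re.findall(r'([1-6])\s*([-+AZaz])', part):
            kinds[int(level) - 1] = kind
        passes.append(kinds)
    return passes
```

For `1a2a3+/1A2A` this yields two passes: the first sorts levels 1 to 3, the second only levels 1 and 2.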




# Sort the rows of a table according to values in specified
# columns. Part of the pLyX system; not an independent script.
#
# sortable.py
#
# Andrew Parsloe (apars...@clear.net.nz)
#
import re, argparse
from operator import itemgetter

def main(infl, outfl, opts, guff):
    '''Sort table rows according to the values in specified columns'''
    
    # no change to LyX header material required
    outfl.write(guff)
    guff = ''
    
    # parse options; this is unnecessarily complicated but it allows
    # for additions to the option list
    # (the 'version' keyword is not accepted by Python 3's ArgumentParser)
    parser = argparse.ArgumentParser(description = 'Sort notes?')

    parser.add_argument('-n', '--notes', dest='n', action ='store_true', \
                    default = False, help="Makes LyX's (e.g. yellow) notes sortable")
    sort_notes = parser.parse_args(opts).n

    flex_sortable = r'\begin_inset Flex .sort table'
    row_start = r'<row'
    cell_start = r'<cell'
    cell_end = r'</cell>'
    row_end = r'</row>'
    table_end = r'</lyxtabular>'
    re_lyxcmds = re.compile(r'(\\\w+)|(status (open|collapsed))')
    begin_inset = r'\begin_inset'
    end_inset = r'\end_inset'
    begin_text = r'\begin_inset Text'
    bottom = r'bottom'
    botline = r'bottomline="true" '
    topline = r'topline="true" '

    # table_status
    # -1 = read & write lines unaltered; 0 = no marker yet; 1 = get sort spec;
    # 2 = find next row; 3 = this row; 4 = this cell; 5 = ERT|Note inset
    table_status = 0
    current_row = ''
    rows = []
    
    for line in infl:
        if table_status == 0:   #no marker yet
            outfl.write(line)
            if flex_sortable in line:
                table_status += 1
                sort_spec = ''
                insets = 1
        
        elif table_status == 1: #get sort spec
            outfl.write(line)
            if begin_inset in line:
                insets += 1
            elif end_inset in line:
                insets -= 1
                if insets == 0:     # the sort-spec inset itself has closed
                    columns = re.findall(r'\d+',sort_spec)
                    colsorts = re.findall(r'[-+AZaz]',sort_spec)
                    numsorts =  len(columns)
                    qeys = ['' for i in range(numsorts)]
                    updn = [False for i in range(numsorts)]
                    # reverse order since primary sort is last one done
                    cols = [int(columns[numsorts-i-1]) for i in range(numsorts)]
                    sort_type = [colsorts[numsorts-i-1] for i in range(numsorts)]
                    table_status += 1
            elif re_lyxcmds.search(line):
                continue
            else:
                sort_spec += line.strip()
            
        elif table_status == 2: # find next row
            if row_start == line[:4]:
                current_row = line
                colnum = 0
                table_status += 1
            # end of table
            elif table_end in line:
                for i in range(numsorts):
                    rows.sort(key=itemgetter(i), reverse = updn[i])
                numrows = len(rows)
                r = 0
                for item in rows:
                    r += 1
                    if r == numrows:
                        # restore the bottom hrule
                        item[numsorts] = item[numsorts].replace(topline, topline + botline)
                    outfl.write(item[numsorts])
                outfl.write(line)
                table_status = 0
                current_row = ''
                rows = []
            else:
                outfl.write(line)
            
        elif table_status == 3: # this row
            if cell_start == line[:5]:
                colnum += 1
                if bottom in line:
                    line = line.replace(botline, '') # del bottom hrule
                current_row += line
                if colnum in cols:
                    table_status += 1
                    qey = ''
            elif row_end == line[:6]:
                current_row += line
                table_status -= 1
                rows.append(qeys + [current_row])
            else:
                current_row += line

        elif table_status == 4:  # this cell
            current_row += line
            if cell_end == line[:7]:
                c = -1
                for i in range(cols.count(colnum)):
                    # in case same col. sorted more than once
                    c = cols.index(colnum, c + 1)
                    if sort_type[c] in 'az':
                        qeys[c] = qey.lower()
                    elif  sort_type[c] in 'AZ':
                        qeys[c]= qey
                    elif sort_type[c] in '+-':
                        qeys[c]= float(qey)
                    if  sort_type[c] in 'zZ-':
                        updn[c] = True # reverse order
                table_status -= 1
            elif begin_inset in line:
                if 'Text' in line:
                    continue
                elif sort_notes and 'Note Note' in line:
                    continue
                else:
                    table_status += 1
                    insets = 1
            elif re_lyxcmds.match(line):
                continue
            else:
                qey += line.strip()
                
        elif table_status == 5: # ert inset
            current_row += line
            if begin_inset in line:
                insets += 1
            elif end_inset in line:
                insets -= 1
                if insets == 0:
                    table_status -= 1

        else:
            outfl.write(line)
            
    return 1
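
The way a sort key is gathered from a cell can be shown in isolation. The sketch below reuses the regular expression from sortable.py but is otherwise an assumption-laden simplification: the function name `cell_key` is invented, and ERT or Note insets nested in the cell are not treated specially as they are in the script.

```python
import re

# Same pattern as re_lyxcmds in the script above.
RE_LYXCMDS = re.compile(r'(\\\w+)|(status (open|collapsed))')

def cell_key(lines):
    """Join the text lines of a cell, skipping lines that start with a LyX command."""
    return ''.join(line.strip() for line in lines
                   if not RE_LYXCMDS.match(line))
```

Fed the lines of a cell whose only text is "Apple" in bold, it returns 'Apple'; the `\series bold` and layout lines contribute nothing to the key.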


# Sort a list & sub-lists according to a sort specification.
# Part of the pLyX system; not an independent script.
#
# sortlist.py
#
# Andrew Parsloe (apars...@clear.net.nz)
#
import re, argparse, tempfile
from operator import itemgetter

re_lyxcmds = re.compile(r'(\\\w+)|(status (open|collapsed))')
re_itemkind = re.compile(r'\\begin_layout (Labeling|Enumerate|Description|Itemize)')

begin_inset = r'\begin_inset'
end_inset = r'\end_inset'
begin_layout = r'\begin_layout'
end_layout = r'\end_layout'
flex_list = r'\begin_inset Flex .sort list'
item_desc = r'\begin_layout Description'
item_enum = r'\begin_layout Enumerate'
item_label = r'\begin_layout Labeling'
item_ize = r'\begin_layout Itemize'
begin_deeper = r'\begin_deeper'
end_deeper = r'\end_deeper'
end_body = r'\end_body'

def main(infl, outfl, opts, guff):
    '''Sort (nested) lists.'''

    parser = argparse.ArgumentParser(description = 'Sort notes?')

    parser.add_argument('-n', '--notes', dest='n', action ='store_true', \
                    default = False, help="Makes LyX's (e.g. yellow) notes sortable")
    parser.add_argument('-a', action ='store_true', \
                    default = False, help="Sort across list-type boundaries")
    parser.add_argument('-c', action ='store', type = int, default = 20, \
                    help="No. of chars in the sort key")

    # parse the options once and reuse the result
    args = parser.parse_args(opts)
    sort_notes = args.n
    sort_all = args.a
    nchars = args.c
    
    def get_list_kind(infile, outfile):
        '''Itemize? Labeling? Enumerate? Description?'''
        for line in infile:
            # find item kind
            if re_itemkind.match(line):
                temp = r'\begin_layout ' + \
                            re_itemkind.match(line).expand(r'\1')
                return temp
            else:
                outfile.write(line)

    def sort_list(infile, outfile, dpth, sorter, updn, list_kind):
        '''Do the sort.'''
        
        def include_inset(notes):
            '''Include insets without sorting unless notes is True.'''
            count = 1
            temp = nqey = ''
            for line in infile:
                temp += line
                if begin_inset in line:
                    count += 1
                elif end_inset in line:
                    count -= 1
                    if count == 0:
                        return temp, nqey
                elif notes:
                    if re_lyxcmds.match(line):
                        continue
                    else:
                        nqey += line.strip()
                        
        #######################################
        list_type_change = False
        layouts = 1
        items = []
        qey = ''
        if list_kind is None:
            list_kind = ''
        current_item = list_kind + '\n'
        for line in infile:
            # get item & sort key
            if begin_deeper in line:
                current_item += line
                lkind = get_list_kind(infile, outfile)
                current_item += sort_list(infile, outfile, dpth + 1, sorter, updn, lkind)
                items[-1] = (qey, current_item)
            elif end_deeper in line:
                items.sort(key=itemgetter(0), reverse=updn[dpth])
                cur_items = ''
                for i in range(len(items)):
                    cur_items += items[i][1]
                cur_items += line
                return cur_items
            elif begin_inset in line:
                current_item += line
                if sort_notes and 'Note Note' in line:
                    tmp1, tmp2 = include_inset(sort_notes)
                else:
                    tmp1, tmp2 = include_inset(False)
                current_item += tmp1
                qey += tmp2
            elif begin_layout in line:
                # list-type change; continue sorting?
                if list_kind  in line or \
                   (re_itemkind.match(line) and (dpth > 0 or sort_all)):
                    current_item = line
                    layouts = 1
                    qey = ''
                else:
                    items.sort(key=itemgetter(0), reverse=updn[dpth])
                    cur_items = ''
                    for i in range(len(items)):
                        cur_items += items[i][1]
                    cur_items += line
                    return cur_items
            elif end_body in line:
                items.sort(key=itemgetter(0), reverse=updn[0])
                cur_items = ''
                for i in range(len(items)):
                    cur_items += items[i][1]
                cur_items += line
                return cur_items
            elif end_layout in line:
                current_item += line
                layouts -= 1
                if layouts == 0:
                    if sorter[dpth] in 'az':
                        qey = qey.lower()[:nchars]
                    elif sorter[dpth] in 'AZ':
                        qey = qey[:nchars]
                    elif sorter[dpth] in 'nN':
                        qey = '0'
                    elif sorter[dpth] in 'PM':
                        qey = float(qey)
                    items.append((qey, current_item))
            elif re_lyxcmds.search(line):
                current_item += line
            else:
                current_item += line
                qey += line.strip()

    

    ###############################################
    # write the prelims to file
    outfl.write(guff)
        
    list_status = 0
    for line in infl:
        # find sort spec. inset
        if list_status == 0:   
            outfl.write(line)
            if flex_list in line:
                list_status += 1
                sort_spec = ''
                insets = 1
        
        #get sort spec.
        elif list_status == 1:
            outfl.write(line)
            if begin_inset in line:
                insets += 1
            elif end_inset in line:
                insets -= 1
                if insets == 0:
                    # groom sort_spec to std form
                    sp = sort_spec.replace('+', 'P')
                    sp = sp.replace('-', 'M')
                    specs = sp.split('/')
                    repeats = len(specs)
                    for s in range(repeats):
                        specs[s] = re.sub(r'\W', '', specs[s])
                    levels = [[] for i in range(repeats)]
                    sort_type = [[] for i in range(repeats)]
                    for i in range(repeats):
                        levels[i] = [int(n) for n in re.findall(r'(\d)', specs[i])]
                        sort_type[i] = re.findall(r'\d(\w)', specs[i])
                    sorting = ['N' for j in range(6)]
                    
                    ftemp = ['' for i in range(repeats+1)]
                    ftemp[0] = infl
                    ftemp[repeats] = outfl
                    
                    for i in range(repeats):
                        # do lesser sorts before primary
                        for j in levels[repeats - i - 1]:
                            k = levels[repeats - i - 1].index(j)
                            sorting[j - 1] = sort_type[repeats - i - 1][k]
                        if i < repeats - 1:
                            ftemp[i+1] = tempfile.TemporaryFile(mode = 'w+t')
                        updn = [s in '-Zz' for s in sorting] 
                        depth = 0
                        list_kind = get_list_kind(ftemp[i], ftemp[i+1])
                        ftemp[i+1].write(sort_list(ftemp[i], ftemp[i+1], depth, sorting, updn, list_kind))
                        if i > 0:
                            ftemp[i].close()
                        if i < repeats - 1:
                            ftemp[i+1].seek(0)
                        list_status = 0
            elif re_lyxcmds.search(line):
                continue
            else:
                sort_spec += line.strip()
        else:
            outfl.write(line)
            
    return 1 




Attachment: pLyXSorting(compressed).lyx
Description: Binary data
