Redirected from Digest (see below)

On 23/07/2020 11:59, Stavros Macrakis wrote:
> Mousedancer, thanks!

Yes, I even look like a (younger) Kevin Costner!
(you believe me - right!?)


> As a finger exercise, I thought I'd try implementing print-level and print-length as an object-to-object transformer (rather than a pretty printer). I know that has a bunch of limitations, but I thought I might learn something by trying.
>
> Here's a simple function that will copy nested lists while limiting their depth and length. When it encounters a non-iterable object, it treats it as atomic:
>
>     scalartypes = list(map(type,(1,1.0,1j,True,'x',b'x',None)))
>
>     def limit(obj,length=-2,depth=-2):
>          if type(obj) in scalartypes:
>              return obj
>          if depth==0:
>              return 'XXX'
>          lencnt = length
>          try:
>              new = type(obj).__new__(type(obj)) # empty object of same type
>              for i in obj:
>                  lencnt = lencnt - 1
>                  if lencnt == -1:
>                      new.append('...')          # too long
>                      break
>                  else:
>                      new.append(limit(i,length,depth-1))
>              return new
>          except:                                # which exceptions?
>              return obj                         # not iterable/appendable
>
>     limit( [1,2,[31,[321,[3221, 3222],323,324],33],4,5,6], 3,3)
>
>             => [1, 2, [31, [321, 'XXX', 323, '...'], 33], '...']
>
>
>
> This works fine for lists, but not for tuples (they're immutable, so there's no *append*) or dictionaries (you have to iterate over *obj.items()*, and there's no *append*). There must be some way to handle this generically so I don't have to special-case them, but I'm stuck. Should I accumulate results in a list and then make the list into a tuple or dictionary or whatever at the end? But how do I do that?
>
> It's not clear how I could handle /arbitrary/ objects... but let's start with the standard ones.

This looks like fun!
BTW why are we doing it: is it some sort of 'homework assignment' or are you a dev 'scratching an itch'?


May I suggest a review of the first few pages/chapters in the PSL docs (Python Standard Library): Built-in Functions, -Constants, -Types, and -Exceptions. Also, try typing into the REPL:

    from pprint import pprint as pp
    pp( __builtins__.__dict__ )

(you will recognise the dict keys from the docs). These may give you a more authoritative basis for "scalartypes", etc.

If you're not already familiar with isinstance() and type() then these are (also) most definitely useful tools, and thus worth a read...
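
To illustrate the difference, here is a minimal sketch of how isinstance() might stand in for the type-lookup. The names SCALAR_TYPES, is_scalar and MyInt are invented for this example, and the choice of which types count as "scalar" is only an assumption:

```python
# Hypothetical names, for illustration only.
SCALAR_TYPES = (int, float, complex, bool, str, bytes, type(None))

def is_scalar(obj):
    """True if obj should be treated as atomic (not walked into)."""
    # isinstance() also accepts sub-classes, which 'type(obj) in ...' rejects
    return isinstance(obj, SCALAR_TYPES)

class MyInt(int):
    """A trivial sub-class of int, to show the difference."""

print(type(MyInt(3)) in SCALAR_TYPES)   # False: the exact-type test misses it
print(is_scalar(MyInt(3)))              # True: isinstance() catches it
print(is_scalar([1, 2]))                # False: a list is a container
```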


With bottom-up prototyping it is wise to start with the 'standard' cases! (and to 'extend' one-bite at a time)
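
On the accumulate-then-convert question: yes, that can work. What follows is only a sketch, assuming that plain lists, tuples and dicts are the containers of interest, and that a negative limit means "no limit". The 'XXX' and '...' markers are taken from the original; everything else is an assumption. Note that type(obj)(acc) will not suit every sub-class (a namedtuple, for example, expects separate arguments):

```python
SCALARS = (int, float, complex, bool, str, bytes, type(None))

def limit(obj, length=-1, depth=-1):
    """Copy obj, truncating nested containers (negative = no limit)."""
    if isinstance(obj, SCALARS):
        return obj
    if depth == 0:
        return 'XXX'                          # too deep
    if isinstance(obj, dict):
        acc = []                              # accumulate (key, value) pairs
        for count, (key, value) in enumerate(obj.items()):
            if count == length:
                acc.append(('...', '...'))    # too long
                break
            acc.append((key, limit(value, length, depth - 1)))
        return dict(acc)                      # convert at the end
    if isinstance(obj, (list, tuple)):
        acc = []                              # always accumulate in a list
        for count, item in enumerate(obj):
            if count == length:
                acc.append('...')             # too long
                break
            acc.append(limit(item, length, depth - 1))
        return type(obj)(acc)                 # list(acc) or tuple(acc)
    return obj                                # anything else: leave untouched

print(limit([1, 2, [31, [321, [3221, 3222], 323, 324], 33], 4, 5, 6], 3, 3))
# [1, 2, [31, [321, 'XXX', 323, '...'], 33], '...']
print(limit((1, 2, 3, 4), 2))
# (1, 2, '...')
```

The explicit isinstance() tests also remove the need for the bare except.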


Rather than handling objects (today's expansion on the previous), might I refer you back to the objective, which (I assume) requires the output of a 'screen-ready' string. Accordingly, as the data-structure/network is parsed/walked, each recognised-component could be recorded as a string, rather than kept/maintained/reproduced in its native form.

Thus:
- find a scalar, stringify it
- find a list, the string is "["
- find a dict, the string is "{"
- find a tuple, the string is "("
etc

The result then, is a series of strings.

a) These could be accumulated, ready for output as a single string. This would make it easy to have a further control which limits the number of output characters.

b) If the accumulator is a list, then

    accumulator.append( stringified_element )

works happily. Plus, the return statement can use str.join() to turn the accumulator-list into a single string. (trouble is, if the values should be comma-separated, you don't want a comma separating a bracket (eg a list's open/close) from the list-contents!) So, maybe the join should be done at each layer of nesting?
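
A rough sketch of that "join at each layer" idea, assuming only lists, tuples and dicts are walked and everything else falls back on repr(). The names show and BRACKETS are invented for the illustration, and the output is for display only (a one-element tuple appears as "(1)", for instance):

```python
SCALARS = (int, float, complex, bool, str, bytes, type(None))
BRACKETS = {list: '[]', tuple: '()', dict: '{}'}   # open/close per type

def show(obj, length=-1, depth=-1):
    """Return a display-string for obj, joined one layer at a time."""
    if isinstance(obj, SCALARS):
        return repr(obj)
    brackets = BRACKETS.get(type(obj))
    if brackets is None:
        return repr(obj)                 # unknown type: use its own repr
    if depth == 0:
        return brackets[0] + '...' + brackets[1]
    is_dict = type(obj) is dict
    parts = []                           # the accumulator, this layer only
    for count, item in enumerate(obj.items() if is_dict else obj):
        if count == length:
            parts.append('...')          # too long
            break
        if is_dict:
            key, value = item
            parts.append(repr(key) + ': ' + show(value, length, depth - 1))
        else:
            parts.append(show(item, length, depth - 1))
    # commas go between the parts; the brackets are added afterwards
    return brackets[0] + ', '.join(parts) + brackets[1]

print(show([1, 2, [31, [321, [3221, 3222], 323, 324], 33], 4, 5, 6], 3, 3))
# [1, 2, [31, [321, [...], 323, ...], 33], ...]
```

Because each layer joins its own parts, the commas never end up next to the wrong bracket.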

Can you spell FSM?
(Finite State Machine)


Next set of thoughts: I'm wondering if you mightn't glean a few ideas from reviewing the pprint source-code? (on my (Fedora-Linux) machine it is stored as /usr/lib64/python3.7/pprint.py)

Indeed, with imperial ambitions of 'embrace and extend', might you be able to sub-class the pprint class and bend it to your will?
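
Before sub-classing anything, it may be worth noting that pprint's documented depth parameter already covers the print-level half of the problem (length is the part it lacks). A quick check, re-using the sample data from this thread:

```python
import pprint

source_data = [1, 2, [31, [321, [3221, 3222], 323, 324], 33], 4, 5, 6]

# depth=2: anything nested more than two levels down is shown as [...]
print(pprint.pformat(source_data, depth=2))
# [1, 2, [31, [...], 33], 4, 5, 6]
```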


Lastly, (and contrasting with your next comment) I became a little intrigued, so yesterday, whilst waiting for an on-line meeting's (rather rude, IMHO) aside to finish (and thus move-on to topics which involved me!), I had a little 'play' with the idea of a post-processor (per previous msg).

What I built gives the impression that "quick and dirty" is a thoroughly-considered and well-designed methodology, but the prototype successfully shortens pprint-output to a requisite number of elements. Thus:

    source_data = [1,2,[31,[321,[3221, 3222],323,324],33],4,5,6]
    limit( source_data, 3 )

where the second argument (3) is the element-count/-limit; results in:

    [1,2,[31

ie the first three elements extracted from nested lists (tuples, sets, scalars, etc).
(recall my earlier query about what constitutes an "element"?)
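
The prototype itself is not reproduced here. Purely as an illustration of the post-processing idea, and assuming that an "element" means a scalar and that no string in the data contains brackets, commas, colons or spaces, something along these lines gives the same result (shorten, TOKEN and PUNCTUATION are invented names):

```python
import re
from pprint import pformat

PUNCTUATION = set("[](){},:")
# a token is either one piece of punctuation or a run of anything else
TOKEN = re.compile(r"[\[\](){},:]|[^\[\](){},:\s]+")

def shorten(obj, max_elements):
    """Cut pformat()'s output after max_elements scalars (a crude sketch)."""
    out = []
    seen = 0
    for token in TOKEN.findall(pformat(obj)):
        out.append(token)
        if token not in PUNCTUATION:     # not punctuation, so a scalar
            seen += 1
            if seen == max_elements:
                break
    return "".join(out)

source_data = [1, 2, [31, [321, [3221, 3222], 323, 324], 33], 4, 5, 6]
print(shorten(source_data, 3))
# [1,2,[31
```

Tokenising the printed text like this is fragile (any string containing punctuation will confuse it), which is rather the point of the "quick and dirty" remark.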


> Sorry for the very basic questions!

No such thing - what is "basic" to you, might seem 'advanced' to someone else, and v-v. Plus, you never know how many 'lurkers' (see below) might be quietly-benefiting from their observation of any discussion!


PS on which subject, List Etiquette:

There are many people who 'lurk' on the list - which is fine. Presumably they are able to read contributions and learn from what seems interesting. This behavior is (to me) a major justification for the digest service - not being 'bombarded' by many email msgs is how some voice their concerns/preference.

However, once one asks a question, one's involvement is no longer passive ('lurking'). Hence:

>     When replying, please edit your Subject line so it is more specific
>     than "Re: Contents of Python-list digest..."
...

>        16. Re: How to limit *length* of PrettyPrinter (dn)
...

Further, many of us manage our email 'bombardment' through 'organisation' rather than 'limitation' (or 'condensation'?); and thus "threading" is important - most competent mail-clients offer this, as do GMail and many web-mail services. From a list perspective, this collects and maintains all parts of a conversation - your contributions and mine, in the 'same place'. Sadly, switching between the list-digest and single-messages breaks threading! Also, no-one (including the archiving software) looking at the archive (or the digest) would be able to detect any link between an earlier conversation called "How to limit *length* of PrettyPrinter" and one entitled "...Digest..."!
--
Regards =dn

--
https://mail.python.org/mailman/listinfo/python-list
