Re: How to limit length of PrettyPrinter

dn via Python-list Sat, 25 Jul 2020 16:47:07 -0700

Let me preface this reply with the concern that my level of competence, in this area, is insufficient. However, there are a number of folk 'here' who are 'into' Python's internals, and will (hopefully) jump-in...

Also, whilst we appear to be concentrating on understanding the content of a data-structure, have we adequately defined "length"?

(per msg title)

- total number of o/p lines (per paper.ref)
- total number of characters 'printed'
- total number of elements l-r (of any/all embedded data-structures)
- the number of elements in each embedded data-structure
- the number of characters displayed from each embedded d-s
- depth of data-structure t-d
- something else?
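
For what it is worth, two of these readings already seem to have some support in the standard library. The sketch below uses pprint's own depth argument for the "depth" reading, and reprlib.Repr for the "number of elements in each embedded data-structure" reading. The sample data and variable names are made up for illustration, and this is only one possible approach, not necessarily what the original question had in mind.

```python
import pprint
import reprlib

# Made-up sample data: one very long list and some deep nesting.
sample = {"values": list(range(100)), "nested": [1, [2, [3, [4]]]]}

# Depth (top-down): pprint replaces anything nested deeper than `depth`
# with "...".
depth_limited = pprint.pformat(sample["nested"], depth=2)
print(depth_limited)

# Elements per embedded data-structure: reprlib.Repr caps each container type.
limiter = reprlib.Repr()
limiter.maxlist = 3    # at most 3 items shown per list
limiter.maxdict = 2    # at most 2 key-value pairs shown per dict
limiter.maxlevel = 3   # nesting depth before "..." is substituted
element_limited = limiter.repr(sample)
print(element_limited)
```

Note that reprlib produces a single-line string rather than pprint's multi-line layout, so it addresses "how much is shown" but not "how it is laid out".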


On 25/07/2020 10:52, Stavros Macrakis wrote:

dn, Thanks again.
For background, I come from C and Lisp hacking (one of the MIT developers of Macsyma <https://en.wikipedia.org/wiki/Macsyma>/Maxima <https://sourceforge.net/p/maxima/wiki/Home/>) and also play with R, though I haven't been a professional developer for many years. I know better than to Reply to a Digest -- sorry about that, I was just being sloppy.

Us 'silver-surfers' have to stick-together! Also in the seventies I decided Lisp was not for me...

The reason I wanted print-length limitation was that I wanted to get an overview of an object I'd created, which contains some very long lists. I expected that this was standard functionality that I simply couldn't find in the docs.
I'm familiar with writing pretty-printer ("grind") functions with string output (from way back: see section II.I, p. 12 <http://bitsavers.trailing-edge.com/pdf/mit/ai/aim/AIM-279.pdf>), but I'm not at all familiar with Python's type/class system, which is why I'm trying to understand it by playing with it.

I accept, one might say 'on faith', that in Python "everything is an object", and proceed from there. Sorry!

Similarly, I've merely accepted the limitations of pprint() - and probably use it less-and-less, as I become more-and-more oriented towards TDD...

I did try looking at the Python Standard Library docs, but I don't see where it mentions the superclasses of the numerics or of the collection types or the equivalent of *numberp*. If I use *type(4).__bases__*, I get just *(<class 'object'>,)*, which isn't very helpful. I suspect that that isn't the correct way of finding a class's superclasses -- what is?


If you haven't already, try:
- The Python Language Reference Manual (see Python docs)
        in particular "Data Model"

- PSL: Data Types = "types -- Dynamic type creation and names for built-in types"

- PSL: collections
- PSL: collections.abc

Another source of 'useful background' are PEPs (Python Enhancement Proposals). Note that some have been accepted and are part of the current-language - so the "proposal" part has become an historic record. In comparison: some have been rejected, and others are still under-discussion...


- PEP 0: an index
- PEP 3119 -- Introducing Abstract Base Classes
- PEP 3141 -- A Type Hierarchy for Numbers

- and no-doubt many more, which will keep you happily entertained, and save the members of your local flock of sheep from thinking that to you they are a mere number...
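
As a small illustration of what PEP 3119 and PEP 3141 put in place, the sketch below shows why *type(4).__bases__* looks so bare: the numeric hierarchy is attached to the built-in types by registration rather than by inheritance, so it is visible to isinstance() and issubclass() but not in __bases__ or __mro__. The helper name numberp is hypothetical, borrowed from the Lisp predicate mentioned above.

```python
import numbers

# Every class records its method resolution order; for int it is short.
print(int.__mro__)     # (<class 'int'>, <class 'object'>)
print(bool.__mro__)    # bool really does inherit from int

# The numeric tower is registered, not inherited, so it never appears here.
print(numbers.Number in int.__mro__)    # False


def numberp(x):
    """Rough analogue of Lisp's numberp (hypothetical helper)."""
    return isinstance(x, numbers.Number)


for candidate in (4, 4.0, 1j, True, "4", [4]):
    print(repr(candidate), numberp(candidate))
```

One caveat: because bool is a subclass of int, numberp(True) is True, which may or may not be what is wanted.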

BTW, where do I look to understand the difference between *dir(type(4))* (which does not include *__bases__*) and *type(4).__dir__(type(4))* (which does)? According to Martelli (2017, p. 127), *dir(x)* just calls *x.__dir__()*; but *type(4).__dir__()* => ERR for me. Has this changed since 3.5, or is Martelli just wrong?


I don't know - and I'm not about to question Alex!
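
One reading that seems consistent with the behaviour reported above, offered tentatively: like other special methods, __dir__ is looked up on the type of the argument rather than on the argument itself. For an ordinary instance the two descriptions coincide, but when x is itself a class (as type(4) is) they diverge. A sketch:

```python
# dir(x) behaves like sorted(type(x).__dir__(x)): the special method is
# looked up on the type of x, not on x itself.
via_dir = dir(int)              # uses type.__dir__(int): int treated as a class
via_object = int.__dir__(int)   # object.__dir__(int): int treated as an
                                # instance of type, so type's attributes show

print("__bases__" in via_dir)      # False, as reported
print("__bases__" in via_object)   # True, as reported

# Fetched from the class, __dir__ is unbound, so it needs an explicit argument.
try:
    int.__dir__()
except TypeError as exc:
    print("TypeError:", exc)
```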

There's nothing else obvious in dir(0) or in dir(type(0)). After some looking around, I find that the base classes are not built-in, but need to be added with the *numbers* and *collections.abc* modules? That's a surprise!


Yes, to me there is much mystery in this (hence "faith", earlier).

Everything is a sub-class of object. When I need to differentiate, eg between a list and a dict; I either resort to isinstance() or back to the helpful table/taxonomy in collections.abc and hasattr() - thus a tuple is a Collection and a Sequence, but not a MutableSequence like a list. A set looks like a list until it comes to duplicate values or behaving as a Sequence. A dict is a MutableMapping, but as you say (below), when considered a Collection will only behave as a list of keys. So, we then chase the *View-s...
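
The taxonomy just described can be checked directly with isinstance(). The helper abc_names and the tuple ABCS are hypothetical names introduced only for this sketch; the abstract base classes themselves come from collections.abc.

```python
from collections.abc import (
    Collection, ItemsView, MutableMapping, MutableSequence, Sequence, Set,
)

# A hand-picked subset of the collections.abc table, in a fixed order.
ABCS = (Collection, Sequence, MutableSequence, Set, MutableMapping)


def abc_names(obj):
    """List which of the chosen ABCs obj satisfies (hypothetical helper)."""
    return [abc.__name__ for abc in ABCS if isinstance(obj, abc)]


for sample in ((1, 2), [1, 2], {1, 2}, {"a": 1}):
    print(type(sample).__name__, abc_names(sample))

# Iterating a dict as a plain Collection yields only its keys;
# the key-value pairs live in a view object.
pairs = {"a": 1}.items()
print(list({"a": 1}), isinstance(pairs, ItemsView))
```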

You suggested I try *pp.__builtins__.__dict__()* . I couldn't figure out what you meant by *pp* here (the module name *pprint*? the class *pprint.PrettyPrint*? the configured function *pprint.PrettyPrinter(width=20,indent=3).pprint*? none worked...). I finally figured out that you must have meant something like *pp=pprint.PrettyPrinter(width=80).pprint; pp(__builtins__.__dict__)*. Still not sure which attributes could be useful.

With apologies: "pp" is indeed pprint. The code-example should have been prefaced with:


    from pprint import pprint as pp

This is (?my) short-hand, whenever I'm using pprint within a module. (You will find our number-crunching friends referring to "np" rather than the full: "numpy", and similar...)

    With bottom-up prototyping it is wise to start with the 'standard'
    cases! (and to 'extend' one-bite at a time)
Agreed! I started with lists, but couldn't figure out how to extend that to tuples and sets. I was thinking I could build a list then convert it to a tuple or set. The core recursive step looks something like this:
CONVERT( map( lambda i: limit(i, length, depth-1) , obj[0:length]) + ( [] if len(obj) <= length else ['...'] ) )
... since map returns an iterator, not a collection of the same type as its input -- so how do I convert to the right result type (CONVERT)?
After discovering that typ.__new__(typ,obj) doesn't work for mutables, and thrashing for a while, I tried this:
    def convert(typ,obj):
        newobj = typ.__new__(typ,obj)
        newobj.__init__(obj)
        return newobj
which is pretty ugly, because the *__new__* initializer is magically ignored for a mutable (with no exception) and the *__init__* setter is magically ignored for an immutable (with no exception). But it seems to work....
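
A possibly simpler route, assuming the target type's constructor accepts an iterable (true of list, tuple, set, frozenset and, for key-value pairs, dict): calling the type itself runs __new__ and __init__ in the right order, so neither has to be invoked by hand. The name convert mirrors the function above; the sample values are made up.

```python
def convert(typ, items):
    """Build a new `typ` from an iterable by calling the type itself."""
    return typ(items)


original = (10, 20, 30, 40)

# type(original) recovers the class, so the result has the input's type.
shortened = convert(type(original), original[0:2])
print(shortened)    # (10, 20)

# The same call works for a mutable type and for an iterator such as map().
as_list = convert(list, map(str, original))
print(as_list)      # ['10', '20', '30', '40']
```

This would not cover types whose constructors take something other than a single iterable (a namedtuple, for instance), so it is a convenience for the built-in collections rather than a general answer.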
Now, on to dictionaries! Bizarrely, the *list()* of a dictionary, and its iterator, return only the keys, not the key-value pairs. No problem! We'll create yet another special case, and use */set/.items()* (which for some reason doesn't exist for other collection types). And /mirabile dictu/, *convert* works correctly for that!:
    dicttype = type({'a':1})
    test = {'a':1,'b':2}
    convert(dicttype,test.items()) => {'a':1,'b':2}
So we're almost done. Now all we have to do is slice the result to the desired length:
    convert(dicttype,test.items()[0:1])      # ERR
But */dict/.items()* is not sliceable. However, it /is/ iterable... but we need another count variable (or is there a better way?):
    c = 0
    convert(dicttype, [ i for i in test.items() if (c:=c+1)<2 ])
Phew! That was a lot of work, and I'm left with a bunch of special cases, but it works. Now I need to understand from a Python guru what the Pythonic way of doing this is which /doesn't/ require all this ugliness.
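
On the narrower question of slicing something that is iterable but not sliceable, itertools.islice is probably the usual tool; it avoids the separate counter. A minimal sketch, reusing the test dictionary from above:

```python
from itertools import islice

test = {'a': 1, 'b': 2}

# islice(iterable, n) yields at most the first n items of any iterable.
first_pair = dict(islice(test.items(), 1))
print(first_pair)    # {'a': 1}
```

This relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7.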
(This doesn't really work for the original problem, because there's no way of putting "..." at the end of a dictionary object, but I still think I learned something about Python.)
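
Putting the pieces of this experiment together, one possible shape for the whole function is sketched below. It is an illustration under stated assumptions, not the Pythonic answer being asked for: the names limit, MARK and CONTAINERS are hypothetical, only the built-in containers are handled, and the dictionary case sidesteps the "..." problem by adding a marker key, which is a design choice rather than anything pprint provides.

```python
from itertools import islice
from pprint import pprint

MARK = "..."                                    # hypothetical truncation marker
CONTAINERS = (list, tuple, set, frozenset, dict)


def limit(obj, length=3, depth=2):
    """Return a truncated copy of obj, suitable for handing to pprint."""
    if not isinstance(obj, CONTAINERS):
        return obj                              # scalars, strings: unchanged
    if depth <= 0:
        return MARK                             # too deep: collapse entirely
    if isinstance(obj, dict):
        pairs = [(key, limit(value, length, depth - 1))
                 for key, value in islice(obj.items(), length)]
        if len(obj) > length:
            pairs.append((MARK, MARK))          # marker entry for the rest
        return dict(pairs)
    items = [limit(item, length, depth - 1) for item in islice(obj, length)]
    if len(obj) > length:
        items.append(MARK)
    return type(obj)(items)                     # same container type as input


big = {"values": list(range(100)), "point": (1, 2, 3, 4, 5)}
pprint(limit(big))
```

Subclasses whose constructors do not take a single iterable would need their own branch, and a set comes back with an arbitrary selection of elements since it has no order.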
I did take a look at the pprint source code, and could no doubt modify it to handle print-length, but at this point, I'm still trying to understand how Python code can be written generically. So I was disappointed to see that *_pprint_list*, *_pprint_tuple*, and *_pprint_set* are not written generically, but as three separate functions. I also wonder what the '({' case is supposed to cover.
A lot of questions -- probably based on a lot of misunderstandings!

...and any response severely limited by my competence in these topics, eg my assumption that the data would be presented with different brackets, ie the "({", according to type.

Recently, I offered a "Friday Finking" to the list, relating a Junior Programmer's wrestling with the challenge of expanding an existing API from scalar values to a choice between scalars and a tuple. (or was it a list - or does it really matter?) There seemed to be no suggestion beyond isinstance().

In this case, there will be a 'ladder' of if...elif...else clauses, and quite possibly needed in two places - parsing and printing. (The ref.paper talked of two passes, so...) PS there is talk of a case/switch which will handle class-distinction, but alas, not currently available (PEP 622? Python 3.10).
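
One way to avoid writing the ladder twice, without waiting for a case/switch, might be functools.singledispatch, which selects an implementation from the type of the first argument. The function name brackets and its return values are hypothetical, chosen to echo the "different brackets according to type" idea above.

```python
from collections.abc import Mapping
from functools import singledispatch


@singledispatch
def brackets(obj):
    """Fallback for scalars and anything not registered below."""
    return ("", "")


@brackets.register(list)
def _(obj):
    return ("[", "]")


@brackets.register(tuple)
def _(obj):
    return ("(", ")")


@brackets.register(Mapping)     # abstract base classes can be registered too
def _(obj):
    return ("{", "}")


for sample in ([1, 2], (1, 2), {"a": 1}, 42):
    print(type(sample).__name__, brackets(sample))
```

Each new type then becomes one more registered function rather than one more elif in every ladder.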

Is the challenge one of attempting to retain and represent the values within the data-structure? There is an implicit issue here, that a first-approach may be essentially replication (ie storage-expensive).

Nevertheless, proceeding in this fashion, remember that a Python list is not the "array" of other languages! Content-elements need not be homogeneous. So, an 'accumulator' list could contain a dict, a set, sundry scalars, and/or inner-lists, in perfect happiness.

Through zip(), Python has a very handy (IMHO) way of linking two lists, without any effort on my part. (I work very hard at being this lazy!) So, during 'parsing', it is possible to (say) build one list recording 'type', eg list, dict, set, ... and a parallel list containing the values (or k-v pairs, or i,j, or...). Yet more, if/when a 'length' metric can be computed... Thereafter when it comes to presentation, the assembled lists can be zip-ped together in a for-loop to yield the final-presentation.
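
A rough sketch of that two-pass idea, with made-up data and a deliberately naive 'length' metric (len() where available, otherwise 1); the variable names are hypothetical:

```python
data = [[1, 2, 3], {"a": 1}, (4, 5), 42]

# 'Parsing' pass: three parallel lists, one entry per element of data.
kinds = [type(item).__name__ for item in data]
values = list(data)
sizes = [len(item) if hasattr(item, "__len__") else 1 for item in data]

# 'Presentation' pass: zip() walks the parallel lists in step.
lines = [f"{kind} ({size}): {value!r}"
         for kind, value, size in zip(kinds, values, sizes)]
print("\n".join(lines))
```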

Apologies, the above seems to be 'fluff around the edges' rather than addressing the central needs of the problem. I'm also a little concerned about your expertise level and the likelihood that I may be 'talking down'.

--
Regards =dn
--
https://mail.python.org/mailman/listinfo/python-list
