[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Steven D'Aprano
On Fri, Jan 01, 2021 at 02:53:33PM -0500, Eric V. Smith wrote:
> On 1/1/2021 2:00 PM, Jonathan Fine wrote:
> >
> >By the way, this surprised me. Would anyone like to explain this?
> >    >>> id(f()), id(f())
> >    (139927013695536, 139927013695536)
> >
> id's are only unique while an object exists. Here the object returned by 
> the first call is discarded after its id is taken. Then the second 
> object is created and reuses the memory address that was used by the 
> first object. Since in CPython id() returns object's memory address, the 
> id's are the same.

What Eric said, except I would put more emphasis on two things:

1. I would emphasize more strongly that IDs are abstract identity 
numbers, not memory addresses except as an accident of implementation. 
For example in Jython and IronPython, the IDs are sequential numbers 
and won't be reused.

In general, if the Python implementation is using a compacting garbage 
collector where objects can move in memory, IDs cannot be memory 
addresses.


2. And that it's a matter of happenstance, not design, that the same 
memory address happens to be used.

I can't produce a demonstration right now, but I've seen senarios where 
every *second* temporary object reuses the previous IDs, so printing a 
sequence of IDs goes something like this:

1012, 1042, 1012, 1042, 1012, 1042 ...

Presumably that could happen if the garbage collector takes longer to 
reclaim each temporary object than it takes the interpreter to generate 
and process the next one.



-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PALNX6J4BLGXTHPH5NPOS3UTFOETU3HK/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Steven D'Aprano
On Fri, Jan 01, 2021 at 11:42:12AM -0800, Brendan Barnwell wrote:
> On 2020-12-31 21:34, Steven D'Aprano wrote:
> >Is there something we can do to make it easier for people to deal with
> >recursive data structures, like the list repr and deepcopy code does? Is
> >there a pattern we can abstract out and provide as a tool or recipe for
> >people to use in their own code?
> 
>   It sounds promising but so far a bit too vague for me to understand 
> what you're really suggesting.

It is vague to me too, and I don't really have any concrete suggestions. 
I was hoping to start a discussion that might lead to some concrete 
suggestions.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/JTBDO22RYJI5WPNGNDDKRENNNXSPUCD6/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Richard Damon
It might not be that surprising to use a list if you are already using
that list to keep track of your position in the hierarchy. You can
probably assume that most of the time the depth is fairly low so the
cost of scanning is low enough that the cost of building and updating
the set costs enough that there aren't significant savings.

On 1/1/21 7:38 PM, Samuel Freilich via Python-ideas wrote:
> Interesting to look at the code for some of this.
>
> list.repr and dict.repr seem to be taking the "construct a set of seen
> items" approach. Except it's not using a set of ids, it's doing a
> linear pass over a list of visited PyObjects each time (which seems
> somewhat surprising to me, though it's only O(n*d) where d is the
> nesting depth, not O(n^2)):
> https://github.com/python/cpython/blob/3bf05327c2b25d42b92795d9d280288c22a0963d/Objects/object.c#L1974
> 
>
> copy.deepcopy keeps track with a dict:
> https://github.com/python/cpython/blob/3bf05327c2b25d42b92795d9d280288c22a0963d/Lib/copy.py#L138
> 
>
> (That uses a sentinel object as the default for the lookup so it can
> tell if that's explicitly set to None, though that seems convoluted
> compared to just checking whether the value is None or not. Is that
> considering that "copy.deepcopy(x) is None and x is not None" might be
> true?)
>
> Peace,
> -Sam
>
> On Fri, Jan 1, 2021 at 7:07 PM Greg Ewing  > wrote:
>
> On 2/01/21 6:14 am, Jeff Allen wrote:
> > we may as
> > well say that the required result takes the form of an iterable
> of the
> > nodes, which may subsequently be iterated by a consumer.
>
> In general you need more than that, I think. If you're printing
> a representation of the graph, you want to know about the structure,
> not just get a flat list of nodes.
>
> > But we must have to do something
> > very similar to pickle arbitrary (potentially cyclic)
> structures, which
> > is likewise dependent on special help from the particular
> built-in type.
> > Can something be founded on |||__getstate__| ?
>
> I don't think so. All the logic for dealing with cycles is buried
> inside pickle -- __getstate__ just gets info about one object.
>
> -- 
> Greg
>

-- 
Richard Damon
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/M55CUGTDN56E34WJANQ4CZZPKPBDMP74/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Samuel Freilich via Python-ideas
Interesting to look at the code for some of this.

list.repr and dict.repr seem to be taking the "construct a set of seen
items" approach. Except it's not using a set of ids, it's doing a linear
pass over a list of visited PyObjects each time (which seems somewhat
surprising to me, though it's only O(n*d) where d is the nesting depth, not
O(n^2)):
https://github.com/python/cpython/blob/3bf05327c2b25d42b92795d9d280288c22a0963d/Objects/object.c#L1974

copy.deepcopy keeps track with a dict:
https://github.com/python/cpython/blob/3bf05327c2b25d42b92795d9d280288c22a0963d/Lib/copy.py#L138

(That uses a sentinel object as the default for the lookup so it can tell
if that's explicitly set to None, though that seems convoluted compared to
just checking whether the value is None or not. Is that considering that
"copy.deepcopy(x) is None and x is not None" might be true?)

Peace,
-Sam

On Fri, Jan 1, 2021 at 7:07 PM Greg Ewing 
wrote:

> On 2/01/21 6:14 am, Jeff Allen wrote:
> > we may as
> > well say that the required result takes the form of an iterable of the
> > nodes, which may subsequently be iterated by a consumer.
>
> In general you need more than that, I think. If you're printing
> a representation of the graph, you want to know about the structure,
> not just get a flat list of nodes.
>
> > But we must have to do something
> > very similar to pickle arbitrary (potentially cyclic) structures, which
> > is likewise dependent on special help from the particular built-in type.
> > Can something be founded on |||__getstate__| ?
>
> I don't think so. All the logic for dealing with cycles is buried
> inside pickle -- __getstate__ just gets info about one object.
>
> --
> Greg
>
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/3FMDYC76POOMNJ4OZUOJ2U3KCEMTGPII/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TVXHQCSAE3V54V6MTKFGXEXSFD3RTBTT/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Eric V. Smith

On 1/1/2021 2:00 PM, Jonathan Fine wrote:


By the way, this surprised me. Would anyone like to explain this?
    >>> id(f()), id(f())
    (139927013695536, 139927013695536)

id's are only unique while an object exists. Here the object returned by 
the first call is discarded after its id is taken. Then the second 
object is created and reuses the memory address that was used by the 
first object. Since in CPython id() returns object's memory address, the 
id's are the same.


Eric

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/YHNHGZT5BNPXQLQ2BDT6ARNDJCHGE62M/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Greg Ewing

On 2/01/21 6:14 am, Jeff Allen wrote:
we may as 
well say that the required result takes the form of an iterable of the 
nodes, which may subsequently be iterated by a consumer.


In general you need more than that, I think. If you're printing
a representation of the graph, you want to know about the structure,
not just get a flat list of nodes.

But we must have to do something 
very similar to pickle arbitrary (potentially cyclic) structures, which 
is likewise dependent on special help from the particular built-in type. 
Can something be founded on |||__getstate__| ?


I don't think so. All the logic for dealing with cycles is buried
inside pickle -- __getstate__ just gets info about one object.

--
Greg

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/3FMDYC76POOMNJ4OZUOJ2U3KCEMTGPII/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Jeff Allen

On 01/01/2021 19:36, Richard Damon wrote:

On 1/1/21 2:00 PM, Jonathan Fine wrote:

Hi Richard

You wrote

 I believe that one solution to detecting the cycles is to create a set
 of the object IDs you have visited and started to print. If you come
 across an ID you have already seen, this point is in a cycle. Sets are
 fairly compact and intentionally fast to search for an item.


Indeed. But I see a flaw. The problem is that we want to know about
EQUALITY of nodes, not equality of the ID (memory address in disguise)
at which the node is stored.


As to the first, just because to points have equal values doesn't mean
that we have recursion.

That is why I put the id()s in the set, the id()s are by definition
hashable, and an object always has the same id() and no two different
objects can have the same id().

Simple example:

list1 = [ ['one', 'two'], ['one', 'two'] ]

This makes me realise that the spanning tree I brought into the 
discussion is very likely a red herring. The print (or more generally 
iteration/visitation) of the structure needs to be curtailed not at an 
object you already visited, but only at an object that is between you 
and the root. (In the case of printing or serialisation, that is an 
object you are part-way through processing.)


Here, you definitely want to visit a three times when printing b:

>>> a = [1, 2, 3]
>>> b = [a, a, a]
>>> b
   [[1, 2, 3], [1, 2, 3], [1, 2, 3]]

But after:

>>> c = [4, a, b]
>>> a.append(c)

revisiting a and b is to be avoided during the visit to c

>>> b
   [[1, 2, 3, [4, [...], [...]]], [1, 2, 3, [4, [...], [...]]], [1, 2,
   3, [4, [...], [...

However, I'm definitely with Richard that it is the identity of the node 
that matters, not its value (hashable or not). Of course, I'm assuming I 
know what kind of iteration Steven wanted, based on the reference to 
printing (repr).


The question remains whether this can be done as a general capability, 
and I still see the problem as the inscrutability of built-in types (and 
their repr, etc.).


Jeff

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/42YGZUPUGDMVQ5V7OXSDGGROENQ6P3PC/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Cameron Simpson
On 01Jan2021 16:34, Steven D'Aprano  wrote:
>This isn't so much an idea for Python, as a request for ideas to solve a
>problem in Python.
>
>Back in the early days of Python, printing recursive lists could crash
>the interpreter:
>
>a = [1, 2, 3]
>a.append(a)
>print(a)
>
>If you try that today, you get a nice display:
>
>[1, 2, 3, [...]]
[...]
>Is there something we can do to make it easier for people to deal with
>recursive data structures, like the list repr and deepcopy code does? Is
>there a pattern we can abstract out and provide as a tool or recipe for
>people to use in their own code?

I don't know about a tool - that might be tricky to generalise - but my 
standard pattern for this is:

def recurse(o, seen=None):
if seen is None:
seen = set()
if id(o) in seen:
return
seen.add(id(o))
walk o and call recurse(sub_object, seen) ...

The key to store in seen isn't always id(o), and for some tasks I 
discard the key on my way back out of the recursion.

I can imagine writing a decorator for this. I'd probably like it to look 
like this:

@no_recurse(keyfunc=None,popkey=False):
def recurse(o, *, seen):
...
recurse(sub_object, seen)

maybe (untested):

@decorator
def no_recurse(func, keyfunc=None, popkey=False):
if keyfunc is None:
keyfunc = id
def no_recurse_wrapper(o, *a, seen=None, **kw):
if seen is None:
seen = set()
k = keyfunc(o)
if k in seen:
return None
seen.add(k)
try:
return func(o, *a, seen=seen, **kw)
finally:
if popkey:
seen.discard(k)

The "@decorator" decorator is one of my own which decorates a decorator 
which may accept parameter of not, allowing both:

@no_recurse
def recurse(o, blah..., *, seen, other_kwargs...):

and also:

@no_recurse(keyfunc=lambda o: o.idkey())
def recurse(o, blah..., *, seen, other_kwargs...):

for some nondefault decoration.

There are some missing features there: what to return when recusion is 
encoutered (rather than None above), some mode to make the occurrence of 
recursion trivially noticeable by the primary function to do something 
better than "return None", etc etc.

Cheers,
Cameron Simpson 
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CFZNLRQJENSDQLJ5F456F7G6ZHINSGMV/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Brendan Barnwell

On 2020-12-31 21:34, Steven D'Aprano wrote:

Is there something we can do to make it easier for people to deal with
recursive data structures, like the list repr and deepcopy code does? Is
there a pattern we can abstract out and provide as a tool or recipe for
people to use in their own code?


	It sounds promising but so far a bit too vague for me to understand 
what you're really suggesting.  What form would this "something" take? 
Are you envisioning, say, a function safe_recursive_iter so that you 
could do `for item in safe_recursive_iter(obj)` and it would cleanly 
raise an exception if it wound up recursing through the same object more 
than once, rather than getting stuck in an infinite loop?


--
Brendan Barnwell
"Do not follow where the path may lead.  Go, instead, where there is no 
path, and leave a trail."

   --author unknown
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ASG6PHMOETWAO7E6GESBMEJZLPH5XF53/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Richard Damon
On 1/1/21 2:00 PM, Jonathan Fine wrote:
> Hi Richard
>
> You wrote
>
> I believe that one solution to detecting the cycles is to create a set
> of the object IDs you have visited and started to print. If you come
> across an ID you have already seen, this point is in a cycle. Sets are
> fairly compact and intentionally fast to search for an item.
>
>
> Indeed. But I see a flaw. The problem is that we want to know about
> EQUALITY of nodes, not equality of the ID (memory address in disguise)
> at which the node is stored. 
>
> In other words, as stated earlier, things are easier and quicker if
> the nodes are hashable. Only hashable objects can be stored in a set,
> and equality of NODE doesn't imply equality of ID. (The converse is
> trivially true.)
>
> Here's an example
>     >>> def f(): return 10**1000
>     >>> list(map(id, (f(), f(
>     [139927013695536, 139927013696008]
>
> By the way, this surprised me. Would anyone like to explain this?
>     >>> id(f()), id(f())
>     (139927013695536, 139927013695536)
>
> -- 
> Jonathan
>
Second problem first, Two numbers that are equal may or might not be
stored in the same object. It is unspecified.

As to the first, just because to points have equal values doesn't mean
that we have recursion.

That is why I put the id()s in the set, the id()s are by definition
hashable, and an object always has the same id() and no two different
objects can have the same id().

Simple example:

list1 = [ ['one', 'two'], ['one', 'two'] ]

The first and second member of the above list are equal, but not the
same object. Thus there is no reason to want to try to trim the second
from an output.


list1 = ['one', 'two']

list2 = [list1, list1]

now the first and second member of the list list2 are both equal and the
same object. Perhaps we might want to indicate this in a dump, but maybe
not.

list1 = ['one', 'two']

list1.append(list11)

Now the third element of the list list1 IS the same object as list1, so
we definitely want to not try and show it when we display the value of
list1, but instead indicate that we have recursion.

Even further:

list1 = ['one', 'two']

list1.append(list1)

list2 = ['one', 'two', list1]

Now list2 will be equal to list1 but is a distinct object. Now the
expansion of list2 will be:

['one', 'two', ['one', 'two', [...]]]

We don't want to test for EQUAL nodes, we want to test for SAME nodes
(same object, same id())

-- 
Richard Damon
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/QVRB4IT3WT6NS3L64CNL2Z2QLCH5DO2H/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Jonathan Fine
Hi Richard

You wrote

I believe that one solution to detecting the cycles is to create a set
> of the object IDs you have visited and started to print. If you come
> across an ID you have already seen, this point is in a cycle. Sets are
> fairly compact and intentionally fast to search for an item.
>

Indeed. But I see a flaw. The problem is that we want to know about
EQUALITY of nodes, not equality of the ID (memory address in disguise) at
which the node is stored.

In other words, as stated earlier, things are easier and quicker if the
nodes are hashable. Only hashable objects can be stored in a set, and
equality of NODE doesn't imply equality of ID. (The converse is trivially
true.)

Here's an example
>>> def f(): return 10**1000
>>> list(map(id, (f(), f(
[139927013695536, 139927013696008]

By the way, this surprised me. Would anyone like to explain this?
>>> id(f()), id(f())
(139927013695536, 139927013695536)

-- 
Jonathan
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LTHN7GZYSEXDEMQLXL7UZNHY6Y7XDY5K/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Richard Damon
On 1/1/21 12:14 PM, Jeff Allen wrote:
> On 01/01/2021 14:50, Jonathan Fine wrote:
>> ...
>> To summarise, it's loops / cycles in the representation of the data
>> that's the problem. ... .
>>
>> Final idea. Perhaps a tool to detect cycles in data might be a good idea.
>>
> Detecting cycles will involve a lot of bikeshedding. (Sorry, couldn't
> resist.)
>
> But you're right, that's a better word for discussing the problem.
> Steven's problem data structures are cyclic graphs. I don't agree that
> such structures are a sign one is doing something wrong (outside the
> naive approach to printing them out).
>
> If memory serves me right, what we're talking about is finding and
> walking a "spanning tree" of the graph. There is a twist in our case
> that we would like to insert a special object where a link was elided
> to break the cycle: an object that maybe contains the original
> reference but would reproduce as "..." in print.
>
> Since it appears necessary to create a data structure with as many
> nodes as there are in the tree, just to remember where we've been, we
> may as well say that the required result takes the form of an iterable
> of the nodes, which may subsequently be iterated by a consumer.
> Actually, I can think of two styles of iterator: one where "..."
> objects are skipped by a "processing" iterator, and one where they are
> delivered by a "print" iterator. It is valid to do this with objects
> that can't be hashed, so the structure is perhaps a dict keyed by id.
>
> Would this be possible as a general capability? I think only if the
> references were in the __dict__ of each instance. With a built-in, one
> is at the mercy of the implementor. But we must have to do something
> very similar to pickle arbitrary (potentially cyclic) structures,
> which is likewise dependent on special help from the particular
> built-in type. Can something be founded on |||__getstate__| ?
>
> Happy New Year
>
> Jeff Allen
>
I believe that one solution to detecting the cycles is to create a set
of the object IDs you have visited and started to print. If you come
across an ID you have already seen, this point is in a cycle. Sets are
fairly compact and intentionally fast to search for an item.

-- 
Richard Damon
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/OXZ4GP27ZWDEONUZQ3MCBOHPQYAM2FZJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Richard Damon
On 1/1/21 12:12 PM, Jonathan Fine wrote:
> Hi Richard
>
> You suggested
>
> A very simple and in my mind reasonable use for this is to build a
> representation of a graph, where each node is represented by a
> list (or
> some other collection), and the connections are denoted by adding to
> that collection the nodes that are adjacent (or maybe a tuple of the
> node and the distance). This naturally creates a recursive structure
> unless connections are unidirectional and the graph is acyclical
> (i.e. a
> tree).
>
> For example, a 3 node fully connected graph might be:
>
> a = []
> b = []
> c = []
>
> a.append(b)
> a.append(c)
> b.append(a)
> b.append(c)
> c.append(a)
> c.append(b)
>
>
>  According to https://en.wikipedia.org/wiki/Graph_theory#Graph
> , a graph is an
> ordered pair
> >>> G = (V, E)
> where V is a set of vertices, and E is a set of edges, where an edge
> is an unordered pair of vertices.
>
> You've given the complete graph on 3 vertices (although without
> explicitly stating the vertex set V). 
> https://en.wikipedia.org/wiki/Complete_graph
> 
>
> If we use the wikipedia definition (with vertices 'a', 'b' and 'c') we
> have
> >>> V = {'a', 'b', 'c'}
> >>> E = {{'a', 'b'}, {'a', 'c'}, {'b', 'c'}}
>
> I've shown how to represent graphs, by directly translating the
> mathematical definition into Python (without introducing any
> additional classes). You've produced a different way to represent graphs.
>
> Mathematically, your representation is this. There is a set V of
> vertices. Further, for each v in V there is associated a subset V_v of
> V. Further, if w in V_v then v in V_w.
>
> I wouldn't call what you've written down the Python representation of
> this mathematical definition. To explain this, I'll write down what I
> think it should be.
>
> Suppose G is a graph with vertices {'a', 'b', 'c', 'd'}. Using a dict
> to represent the map that v goes to V_v, we can write G as
> {
>     'a': {...},
>     'b': {...},
>     # etc
>     'e': {...},
> }
>
> My initial post suggests that, purely on the basis of the loop in the
> data structure, there's a code smell in the representation you
> provided. I'd say that not agreeing with its own mathematical
> definition is a code smell (or worse).
>
> I'll put it another way. Suppose our vertices are called (or
> represent) Alice, Bob, Carol, Dave and Eve. Now let G be the graph
> whose edges represent the existence of cryptographically secure
> communication between X and Y. I claim that your code isn't able to
> represent such a graph.
>
> Thank you Richard for forcing me to think things through and find what
> I consider to be a flaw in the code you supplied.
>
> -- 
> Jonathan
>
This may be the practical difference betwen Theory and Practice (which
in Theory doesn't exist). There are times when processing a system that
can be represented as a graph, it is cleanest to think first of node,
and then by their connections. Your notation makes it inconvenient that
given you are at node A to know what nodes it connects to,  you need to
search through the E structure to find all entries with the starting
node listed. You also get the 'name' of the node, not the node itself,
so you need another dictionary to map names to the actual nodes them
selves (where more information about the node might be stored).

Yes, maybe in pure graph theory all this isn't important, but if I am
USING a grahph for actual work, it might be.

Also, your example assumes that NAMES are the important piece, how would
you build your structure if names do not exist or are not unique, Once
you make the node themselves important, then making the links by the
node itself instead of by some arbitrary name makes sense.

Also, if the graph is directed, like Alice has a way to drop a message
into Bobs secure storage box, but Alice doesn't have access to one of
her own, so Bob can't use that method to send back to Alice, then it can
make sense to make the edge list not a total separate structure, but
each node includes the edges it can talk to itself.

It can be noted that this is very much like the model of the internet as
it exists today. Every machine (a node), keeps track of who it can
directly talk to, and a list of who it is next to and guess for who it
should try to send messages to in order to get a message to somone it is
not next to. There is no master list of edges in existance anywhere (I
once did see a master listing of the graph of the network in the early
days before automatic routing, such a map would be impossible to build
today).

Once you move the listing of links into the node itself, as opposed to
having a separate listing of all the vertices, and make the links to
node rather than names (to avoid an extra unneeded lookup) then you get
the recursive data structure.

For your example you say that can 

[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Jonathan Fine
Hi Jeff

> But you're right, that's a better word for discussing the problem.
> Steven's problem data structures are cyclic graphs. I don't agree that such
> structures are a sign one is doing something wrong (outside the naive
> approach to printing them out).
>

Thank you for agreeing with me, and nicely disagreeing with me. However,
the problem of 'walking a graph' to find a spanning tree is first of all a
matter of choosing an algorithm. The algorithm to be used and the
representation of the graph are two different matters.

Suppose the vertices are well ordered, for example (0, 1, 2, 3, .., n).
Write each edge as an ORDERED pair, with the first vertex being less than
the second vertex. Now list the edges in lexicographic order. That solves
the problem, and gives a unique representation, provided we are given a
well ordering of the vertices.

Under certain circumstances, we can use the Python id to provide the well
ordering of vertices. However, that doesn't always work. Two strings (or
large ints, or tuples) can be equal even though they have different ids.
However, I think it reasonable to assume that we have a sensible equality
relation on the vertices.

Now things get interesting. For a very large graph, it would be very
helpful if every vertex has a hash. This would make determining equality
easier. But if the data structure representing the graph has Python cycles
in it (as in Steven's use case) then there won't be a hash available.

Do you agree with me, Jeff, that the problem of printing out a
vertices-and-edges graph is much easier (or at least quicker) if each
vertex has a hash. More exactly, we want a quick and easy way to determine
if two vertices are equal.

It would be nice to have some further input from Steven, whose original
post stated the problem he wished to solve.

-- 
Jonathan
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/3AHIT7CQ6PW4SRIXNFP3UWPFKYO3KDGN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Peter Ludemann
A variant of the problem is the "occurs check", used in theorem proving:
https://en.wikipedia.org/wiki/Occurs_check
Removing the occurs check reduces the cost of unifying two terms (t1, t2)
from *O(size(t1) + size(t2))* to *O(min(size(t1), size(t2)))*, which is why
Prolog implementations don't do the occurs check by default (but provide
unify_with_occurs_check and acyclic_term predicates for when the check is
needed).

One use for cyclic terms is in representing grammars. Of course, the loops
can be avoided by changing the representation from a graph to a BNF-style
set of clauses - that is, by naming the nodes.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/EY7K2T2S6DNG3JPTWEJVKNC6TQVLYEGF/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Jeff Allen

On 01/01/2021 14:50, Jonathan Fine wrote:

...
To summarise, it's loops / cycles in the representation of the data 
that's the problem. ... .


Final idea. Perhaps a tool to detect cycles in data might be a good idea.

Detecting cycles will involve a lot of bikeshedding. (Sorry, couldn't 
resist.)


But you're right, that's a better word for discussing the problem. 
Steven's problem data structures are cyclic graphs. I don't agree that 
such structures are a sign one is doing something wrong (outside the 
naive approach to printing them out).


If memory serves me right, what we're talking about is finding and 
walking a "spanning tree" of the graph. There is a twist in our case 
that we would like to insert a special object where a link was elided to 
break the cycle: an object that maybe contains the original reference 
but would reproduce as "..." in print.


Since it appears necessary to create a data structure with as many nodes 
as there are in the tree, just to remember where we've been, we may as 
well say that the required result takes the form of an iterable of the 
nodes, which may subsequently be iterated by a consumer. Actually, I can 
think of two styles of iterator: one where "..." objects are skipped by 
a "processing" iterator, and one where they are delivered by a "print" 
iterator. It is valid to do this with objects that can't be hashed, so 
the structure is perhaps a dict keyed by id.


Would this be possible as a general capability? I think only if the 
references were in the __dict__ of each instance. With a built-in, one 
is at the mercy of the implementor. But we must have to do something 
very similar to pickle arbitrary (potentially cyclic) structures, which 
is likewise dependent on special help from the particular built-in type. 
Can something be founded on |||__getstate__| ?


Happy New Year

Jeff Allen


___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/HOUAPRNH7CXYGH7PBAQBRB5JSA7MIGTQ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Jonathan Fine
Hi Richard

You suggested

A very simple and in my mind reasonable use for this is to build a
> representation of a graph, where each node is represented by a list (or
> some other collection), and the connections are denoted by adding to
> that collection the nodes that are adjacent (or maybe a tuple of the
> node and the distance). This naturally creates a recursive structure
> unless connections are unidirectional and the graph is acyclical (i.e. a
> tree).
>
> For example, a 3 node fully connected graph might be:
>
> a = []
> b = []
> c = []
>
> a.append(b)
> a.append(c)
> b.append(a)
> b.append(c)
> c.append(a)
> c.append(b)
>

 According to https://en.wikipedia.org/wiki/Graph_theory#Graph, a graph is
an ordered pair
>>> G = (V, E)
where V is a set of vertices, and E is a set of edges, where an edge is an
unordered pair of vertices.

You've given the complete graph on 3 vertices (although without explicitly
stating the vertex set V).
https://en.wikipedia.org/wiki/Complete_graph

If we use the wikipedia definition (with vertices 'a', 'b' and 'c') we have
>>> V = {'a', 'b', 'c'}
>>> E = {{'a', 'b'}, {'a', 'c'}, {'b', 'c'}}

I've shown how to represent graphs, by directly translating the
mathematical definition into Python (without introducing any additional
classes). You've produced a different way to represent graphs.

Mathematically, your representation is this. There is a set V of vertices.
Further, for each v in V there is associated a subset V_v of V. Further, if
w in V_v then v in V_w.

I wouldn't call what you've written down the Python representation of this
mathematical definition. To explain this, I'll write down what I think it
should be.

Suppose G is a graph with vertices {'a', 'b', 'c', 'd'}. Using a dict to
represent the map that v goes to V_v, we can write G as
{
'a': {...},
'b': {...},
# etc
'e': {...},
}

My initial post suggests that, purely on the basis of the loop in the data
structure, there's a code smell in the representation you provided. I'd say
that not agreeing with its own mathematical definition is a code smell (or
worse).

I'll put it another way. Suppose our vertices are called (or represent)
Alice, Bob, Carol, Dave and Eve. Now let G be the graph whose edges
represent the existence of cryptographically secure communication between X
and Y. I claim that your code isn't able to represent such a graph.

Thank you Richard for forcing me to think things through and find what I
consider to be a flaw in the code you supplied.

-- 
Jonathan
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BULYB6UCIDT7IO6PNCEJZOTGVRKP7TWK/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Richard Damon
On 1/1/21 9:50 AM, Jonathan Fine wrote:
> Hi
>
> I'd say the problem isn't recursion. Here I'm using the definitions
> given in:
> https://en.wikipedia.org/wiki/Recursion
> 
> https://www.cs.utah.edu/~germain/PPS/Topics/recursion.html
> 
>
> Rather, it's that the data has a loop (or cycle) in it. A simple
> example of this is
>     >>> x = []
>     >>> x.append(x)
>     >>> x
>     [[...]]
>     >>> x[0] is x
>     True
>
> So far as I know, it's not possible to create such using only
> immutable objects. And anything created using only immutable objects
> will have a hash.
>
> A beginner can easily by mistake create an object such as x above, and
> without the short-circuit provided by
>     >>> x
>     [[...]]
> the result is an unprintable object, that further raises an exception
> (or worse). That's really bad for the poor beginner.
>
> As a first approximation, my opinion is that data having such a loop /
> cycle in it is at least a code smell, and perhaps worse. However,
> there may be examples that I've not considered that would make me
> change my mind.

A very simple and in my mind reasonable use for this is to build a
representation of a graph, where each node is represented by a list (or
some other collection), and the connections are denoted by adding to
that collection the nodes that are adjacent (or maybe a tuple of the
node and the distance). This naturally creates a recursive structure
unless connections are unidirectional and the graph is acyclical (i.e. a
tree).

For example, a 3 node fully connected graph might be:

a = []

b = []

c = []

a.append(b)

a.append(c)

b.append(a)

b.append(c)

c.append(a)

c.append(b)

Maybe more generally the nodes would be dicts of properties, one of
which is a connection property with that list of nodes, but you still
end up with the same recursion.

>
> By the way, in the mathematics of the symmetric group (permutations)
> the two tuples
>     >>> (0, 1, 2, 3)
>     >>> (1, 2, 3, 0)
> are different representations of the same cycle (where here the word
> cycle has a different meaning).
>
> Also by the way, determining if two vertex-edge graphs are isomorphic
> is a hard problem. I've been told that the best tool for this is
> https://www3.cs.stonybrook.edu/~algorith/implement/nauty/implement.shtml
> 
>
> To summarise, it's loops / cycles in the representation of the data
> that's the problem. Without a use case to the contrary, I'd say don't
> do this. That's my opinion. Yours may be different.
>
> Final idea. Perhaps a tool to detect cycles in data might be a good idea.
>
> Finally, Happy New Year.
>
>  -- 
> Jonathan
>

-- 
Richard Damon
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/HC6KXTAD2ROTXRSZXTXD4CVZ25SB2L2Q/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Jonathan Fine
Hi

I'd say the problem isn't recursion. Here I'm using the definitions given
in:
https://en.wikipedia.org/wiki/Recursion
https://www.cs.utah.edu/~germain/PPS/Topics/recursion.html

Rather, it's that the data has a loop (or cycle) in it. A simple example of
this is
>>> x = []
>>> x.append(x)
>>> x
[[...]]
>>> x[0] is x
True

So far as I know, it's not possible to create such using only immutable
objects. And anything created using only immutable objects will have a hash.

A beginner can easily by mistake create an object such as x above, and
without the short-circuit provided by
>>> x
[[...]]
the result is an unprintable object, that further raises an exception (or
worse). That's really bad for the poor beginner.

As a first approximation, my opinion is that data having such a loop /
cycle in it is at least a code smell, and perhaps worse. However, there may
be examples that I've not considered that would make me change my mind.

By the way, in the mathematics of the symmetric group (permutations) the
two tuples
>>> (0, 1, 2, 3)
>>> (1, 2, 3, 0)
are different representations of the same cycle (where here the word cycle
has a different meaning).

Also by the way, determining if two vertex-edge graphs are isomorphic is a
hard problem. I've been told that the best tool for this is
https://www3.cs.stonybrook.edu/~algorith/implement/nauty/implement.shtml

To summarise, it's loops / cycles in the representation of the data that's
the problem. Without a use case to the contrary, I'd say don't do this.
That's my opinion. Yours may be different.

Final idea. Perhaps a tool to detect cycles in data might be a good idea.

Finally, Happy New Year.

 --
Jonathan
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/LPWI7X7RJBWOSMXFQ3VURMADKDSLA5VR/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Stestagg
On Fri, Jan 1, 2021 at 2:21 PM Marco Sulla 
wrote:

> On Fri, 1 Jan 2021 at 06:38, Steven D'Aprano  wrote:
> > Relevant: https://bugs.python.org/issue42801
>
> Can't reproduce on the latest trunk (3.10). I get 1989 as a result.
>

There were a spate of bugs about this, and was fixed by
https://github.com/python/cpython/pull/23744 quite recently

___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/2FFECDJEXGYY5UDLPNOUOMOCDIEBXWLZ/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/F3T5M7OBKRZMAUDWDUMJM7TI3L4DCHLV/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread 2QdxY4RzWzUUiLuE
On 2021-01-01 at 16:34:15 +1100,
Steven D'Aprano  wrote:

> This isn't so much an idea for Python, as a request for ideas to solve a 
> problem in Python.

[examples of recurive data snipped]

> The built-ins handle these cases okay. Likewise the copy.deepcopy
> function takes care to avoid getting stuck in a loop with recursive
> data structures.

The POSIX utilities tar and find (and I'm sure there are more) also deal
with this.

> On the other hand, when I write code to process nested data
> structures, I always worry about such recursive patterns, but actually
> doing something about it is too hard. So I just ignore the risk and
> feel guilty about it. I'm sure I'm not the only one who feels guilty
> about ignoring this problem.

Sorry about the guilt.

Know your requirements!  Specify your constraints and limitations!
Document, document, document!  But you know that stuff already.

> To say nothing of those who don't even know there's a problem to worry
> about.

*sigh*

> Is there something we can do to make it easier for people to deal with
> recursive data structures, like the list repr and deepcopy code does?
> Is there a pattern we can abstract out and provide as a tool or recipe
> for people to use in their own code?

My first thought was that all of those things (printing a list,
navigating a tree^H^H^H^H cyclic graph, deepcopying data, etc.) are the
recipes, and that every use case is different, but that's kind of a cop
out.

If I had to "abstract out" a pattern or an algorithm, it might look
something like this:

1.  initialize been_there to an empty set
2.  initialize go_here to a collection containing the root of the data
  (e.g., go_here = [root])
3.  if go_here is empty, then stop
4.  remove one element from go_here, assign it to current_node
5.  if current_node is in been_there, go back to step 3
6.  deal with current_node
7.  add current_node to been_there
8.  add each of current_node's "next node"s to go_here
9.  go back to step 3

IOW, keep track of where you've been, and don't go there again.  I
probably should say out loud that go_here should contain the id of nodes
rather than references to nodes (to avoid having to know too much about
how to determine equality amongst nodes), but I was trying to stay away
from too many Python- or implementation- specific details.

Yes, it takes additional time and space not to dump core.  No, you don't
have to do this if you're certain that your data isn't recursive (even
if your data structure is).

But that seems too obvious and (at least at this level) too simple.
It's almost right out of any graph (vertexes and edges) library that
handles non-acyclic graphs, directed or not.  So maybe it doesn't
provide for a pretty "..." at the right place in the report.  What am I
missing?
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/O62LCGZXLEW2N3X7QOGRS2U3EOIRGUV2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Standard tool for iterating over recursive data structures?

2021-01-01 Thread Marco Sulla
On Fri, 1 Jan 2021 at 06:38, Steven D'Aprano  wrote:
> Relevant: https://bugs.python.org/issue42801

Can't reproduce on the latest trunk (3.10). I get 1989 as a result.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2FFECDJEXGYY5UDLPNOUOMOCDIEBXWLZ/
Code of Conduct: http://python.org/psf/codeofconduct/