On 2018-01-15 06:15, Frank Millman wrote:
"Christian Gollwitzer"  wrote in message news:p3gh84$kfm$1...@dont-email.me...

Am 14.01.18 um 22:04 schrieb Christian Gollwitzer:
> Am 14.01.18 um 09:30 schrieb Frank Millman:
>> I need to detect when a 'cycle' occurs - when a path loops back on >> itself and revisits a node previously visited. I have figured out a way >> to do this, but I have a problem.
>
> I don't know if that helps, but there is a classic graph theory > algorithm called "Floyd's cycle detector". The idea is to have a pointer > move along the graph and a second one which runs at double the speed. If > they meet, you found a cycle. It is not straight-forward to come up with > this algorithm, which even runs in constant storage. ISTR that there are > some variants which can give you the split point of the cycle, too.

And here is an algorithm to enumerate all cycles in a directed graph:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.335.5999&rep=rep1&type=pdf

with an implementation in C++ here:

https://github.com/hellogcc/circuit-finding-algorithm


I appreciate the input, Christian, but I am afraid both of those were over
my head :-(

I think/hope that a business process graph does not require such a complex
solution. Here is my cycle-detecting algorithm.

In BPMN terms, each node can have 0->* incoming connections, and 0->*
outgoing connections.

Any node with 0 incoming is deemed to start the process. Normally there is
only one such node.

Any node with 0 outgoing represents the end of an 'active path'. If there is
more than one such node, all 'active' ones must reach the end before the
process is finished. There is no formal definition of an 'active path', and
I can think of a few corner cases which could prove problematic, but
normally the meaning is clear.

I start my cycle-detection with a node with 0 incoming connections.

def find_cycle(node, path):
     for output in node.outputs:
         if output in path:
             print('Cycle found in', path)
         else:
             new_path = path[:]
             new_path.append(output)
             find_cycle(output, new_path)

find_cycle(node, [node])

This traverses every possible path in the graph. I think/hope that a typical
business process will not grow so large as to cause a problem.

If anyone can see a flaw in this, please let me know.

A couple of suggestions:

1. Instead of copying the list and appending, I'd do:

    find_cycle(output, path + [output])

2. Lists are ordered, but searching them is O(n), so I'd pass a set too to speed it up a bit:

    def find_cycle(node, path, visited):
        for output in node.outputs:
            if output in visited:
                print('Cycle found in', path)
            else:
                find_cycle(output, path + [output], visited | {output})

That will help as the paths get longer, although on a small graph, it won't matter.
--
https://mail.python.org/mailman/listinfo/python-list

Reply via email to