By studying useful idioms in untyped languages, one can gain a better
understanding of the challenges faced in designing a good type system.
Here is the classic union-find algorithm in Python. Turns out that using
just a single field at each node works fairly well. Assume each node has
a uniquely identifying index (an int). Initially the field is a set (of
cohorts), and eventually it transitions to an int (the index of the node
this one is merged into).
For a strongly typed language this is the basic challenge of dealing
with typestate. There may be some leverage to be gained by designing a
system that admits only a bounded number of type transitions (in this
case, 2).
(This "overlay" notion goes back to the 50s and 60s, when memory was in
short supply and reusing the same storage block in different ways at
different points in the algorithm made sense. We don't have that
motivation any more. For me, the motivation was simply clarity of the
code. By using *one* variable, I ensure that I do not have to manage
semantic invariants across multiple distinct variables, e.g. an
`in_cohort` in field and a `cohorts` int set field.)
def find_root(node0, nodes):
"""
The find in union find. nodes are dicts with fields index and
cohorts.
It is required that nodes[i]['index']==i.
nodes[i]['cohorts'] is either an int index for some other node,
or a set of indices.
"""
node0_idx = node0['index']
node = node0
while isinstance(node['cohorts'], int):
node = nodes[node['cohorts']]
node_idx = node['index']
assert node0_idx==node_idx or isinstance(node['cohorts'], set),
('Ugh. no cohort set!',
node_idx,
'original node', node0_idx)
return node
def connected(nodes0, edge_fn):
"""
Return connected component in the graph whose list of nodes are
given by nodes0.
For any pair (x, y) of nodes edge_fn(x,y) should return True iff
there is an edge.
A connected component is a set of indices of the nodes in that
component.
e.g.
connected([0,1,2,3], lambda x,y:True) ==> [{0,1,2,3}]
connected([0,1,2,3], lambda x,y:False) ==>
[{0},{1},{2},{3}]
connected([0, 1, 2, 3], lambda x, y: x + y<= 2) ==>
[{0,1,2},{3}]
connected([0, 1, 2, 3], lambda x, y: x + y % 2 ==0) ==>
[{0,2},{1,3}]
"""
# Key implementation decision: Keep just one field. Starts out as a
set, becomes
# an int when (the set associated with) this node is merged into
another.
# initialize with every node in its own cohort group.
nodes = [{'node':nodes_[i], 'cohorts':set([i]), 'index':i}
for i in range(len(nodes_))]
for i in range(len(nodes)):
root = find_root(nodes[i], nodes)
root_idx = root['index']
for other in nodes[i+1:]:
if edge_fn(node['node'], other['node']):
other_root = find_root(other, nodes)
if root_idx != other_root['index']:
root['cohorts'] |= other_root['cohorts']
other_root['cohorts'] = root_idx
result = [node['cohorts'] for node in nodes if
isinstance(node['cohorts'], set)]
return result
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
X10-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/x10-users