[ https://issues.apache.org/jira/browse/CALCITE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17134573#comment-17134573 ]
Julian Hyde commented on CALCITE-4049: -------------------------------------- The advantage of ConsList is that lists can share tails. For example, the paths [a, d, e] and [a, b, c, d, e] share the tail [d, e]. Sharing tails saves some memory (but I've not done any experiments to see whether there's a significant advantage). A single-linked list could in principle do the same, but java's LinkedList class (or in fact any double-linked list) does not share links. The original complaint in this bug is that when you get a path from a frozen graph, calling List.size() is expensive. But when I look at the code, I see that the "freeze" process ({{Graph.makeImmutable}}) heavily relies on List.size() and therefore will perform badly. A better implementation of "freeze" would perhaps use Floyd-Warshall algorithm to compute and store the lengths of the shortest paths between any two nodes. It would only store the length, not the path. The path can be reconstructed fairly cheaply. > Reduce the time complexity of getting shortest distances > -------------------------------------------------------- > > Key: CALCITE-4049 > URL: https://issues.apache.org/jira/browse/CALCITE-4049 > Project: Calcite > Issue Type: Improvement > Components: core > Reporter: Liya Fan > Assignee: Liya Fan > Priority: Major > Fix For: 1.24.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Currently, we have {{Graphs#makeImmutable}} to compute the shortest paths > between all pairs of nodes. For many scenarios, however, we do not need the > exact paths between nodes. Instead, we are only interested in the lengths of > the shortest paths. > To get the path length, we need to get the shortest path first, which is > returned as a {{List}}, then we call the {{List#size()}} method. According to > the current implementation, the returned list is of type {{ConsList}}. The > time complexity of {{ConsList#size}} is O(p) (p is the number of vertices on > the path), which is inefficient. > In this issue, we revise the implementation of {{ConsList#size}} so that it > takes O(1) time. In addition, we also give a utiltiy to get the shortest > distances between nodes. -- This message was sent by Atlassian Jira (v8.3.4#803005)