[ 
https://issues.apache.org/jira/browse/CALCITE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135665#comment-17135665
 ] 

Liya Fan edited comment on CALCITE-4049 at 6/15/20, 12:20 PM:
--------------------------------------------------------------

Hi all, thank you for the fruitful discussion.

I think we all agree that we should not modify the general ConsList data 
structure. In addition, it seems that the path itself is not that important, 
and is not frequently used. 

For the problem of high cost of getting path length, I think we have some other 
ways to solve it:

1. We can have a list implementation that wraps the ConsList and have a size 
field.
2. We can store the shortest path length along with the shortest path in 

{{Map<Pair<V, V>, List<V>> shortestPaths = new HashMap<>();}}

3. We can employ Floyd's algorithm to get the shortest path lengths, and then 
restore the path in O(p) time, whenever necessary. 

Method 3 was suggested by Julian in the above discussion. I am also in favor of 
it, because storing all the shortest paths can be expensive. In addition, it 
speeds up algorithm convergence. What do you think?

I will prepare a PR ASAP, and we can have further discussions based on it. 
Thank you.




was (Author: fan_li_ya):
Hi all, thank you for the fruitful discussion.

I think we all agree that we should not modify the general ConsList data 
structure. In addition, it seems that the path itself is not that important, 
and is not frequently used. 

For the problem of high cost of getting path length, I think we have some other 
ways to solve it:

1. We can have a list implementation that wraps the ConsList and have a size 
field.
2. We can store the shortest path length along with the shortest path in 

{{Map<Pair<V, V>, List<V>> shortestPaths = new HashMap<>();}}

3. We can employ Floyd's algorithm to get the shortest path lengths, and then 
restore the path in O(p) time, whenever necessary. 

Method 3 was suggested by Julian in the above discussion. I am also in favor of 
it, because storing all the shortest paths can be expensive.

I will prepare a PR ASAP, and we can have further discussions based on it. 
Thank you.



> Reduce the time complexity of getting shortest distances
> --------------------------------------------------------
>
>                 Key: CALCITE-4049
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4049
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Liya Fan
>            Assignee: Liya Fan
>            Priority: Major
>             Fix For: 1.24.0
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Currently, we have {{Graphs#makeImmutable}} to compute the shortest paths 
> between all pairs of nodes. For many scenarios, however, we do not need the 
> exact paths between nodes. Instead, we are only interested in the lengths of 
> the shortest paths. 
> To get the path length, we need to get the shortest path first, which is 
> returned as a {{List}}, then we call the {{List#size()}} method. According to 
> the current implementation, the returned list is of type {{ConsList}}. The 
> time complexity of {{ConsList#size}} is O(p) (p is the number of vertices on 
> the path), which is inefficient. 
> In this issue, we revise the implementation of {{ConsList#size}} so that it 
> takes O(1) time. In addition, we also give a utiltiy to get the shortest 
> distances between nodes. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to