[ 
https://issues.apache.org/jira/browse/CALCITE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17132852#comment-17132852
 ] 

Liya Fan commented on CALCITE-4049:
-----------------------------------

[~julianhyde] Thanks a lot for your comments. 
The down-side of adding the size field is that it makes the object size 4 bytes 
larger (I am not sure if it is 50% larger, as an object has other meta-data to 
store). The up-side is reducing the time complexity of getting size from O(n) 
to O(1). 

IMO, a general trend for performance optimization is to reduce time complexity 
by taking more spaces, as memory/disk spaces are becoming less expensive today 
and modern computers tend to have larger memory/disk spaces. 

For example, we build indices for relation databases to reduce the query time, 
at the expense of extra memory/disk spaces. We can also find examples from 
Calcite code base. For example, in class 
[RexCall|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexCall.java],
 we have the {{nodeCount}} field, which introduces an extra 4 byte. However, 
since it reduces the time complexity of getting node count, it should be 
considered a good optimization. 


> Reduce the time complexity of getting shortest distances
> --------------------------------------------------------
>
>                 Key: CALCITE-4049
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4049
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Liya Fan
>            Assignee: Liya Fan
>            Priority: Major
>             Fix For: 1.24.0
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Currently, we have {{Graphs#makeImmutable}} to compute the shortest paths 
> between all pairs of nodes. For many scenarios, however, we do not need the 
> exact paths between nodes. Instead, we are only interested in the lengths of 
> the shortest paths. 
> To get the path length, we need to get the shortest path first, which is 
> returned as a {{List}}, then we call the {{List#size()}} method. According to 
> the current implementation, the returned list is of type {{ConsList}}. The 
> time complexity of {{ConsList#size}} is O(p) (p is the number of vertices on 
> the path), which is inefficient. 
> In this issue, we revise the implementation of {{ConsList#size}} so that it 
> takes O(1) time. In addition, we also give a utiltiy to get the shortest 
> distances between nodes. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to