[
https://issues.apache.org/jira/browse/CALCITE-849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994872#comment-14994872
]
Jesse Yates commented on CALCITE-849:
-------------------------------------
Ok, finally getting some time to work on this and taking a fresh look -
cooperative multi-tasking totally makes sense, if you are assuming multiple
thread access. It might be nice to have a switch/config option for if you are
definitely _not_ going to have multithreaded access (otherwise, you end up
creating a bunch more 'stuff' per request that isn't needed).
From there, my open question is how two different calls to 'run' on
different threads should be viewed by the callee on the same statement. Should
the second one see the next set of rows _or_ should it go back to the beginning
and see the results from there?
Design-wise, I'm thinking about managing requests through a queue with a simple
shared state of:
- running
- next N rows.
State starts to get more complicated if we assume that another caller to 'run'
goes back to the beginning, since that means tracking another 'list' of the
'next N' per caller. Each caller then gets its own row list, essentially
forking the request.
I'm partial to the former, you just see the next batch, because of the
simplicity in implementation and lower concerns about blowing out memory
without intending to. However, I see how it might be nicer logically to just go
back to the beginning.
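To make the "next batch" option concrete, here is a minimal sketch in plain
Java. It is not Calcite or Avatica code: the class SharedBatchCursor and its
methods are hypothetical names, and a fair lock stands in for the request
queue.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: one cursor shared by every caller of run(). */
final class SharedBatchCursor<T> {
    // Underlying rows; consumed exactly once, regardless of caller.
    private final Iterator<T> source;
    // Fair lock: waiting callers are served roughly in arrival order,
    // which stands in for the request queue.
    private final ReentrantLock lock = new ReentrantLock(true);

    SharedBatchCursor(Iterator<T> source) {
        this.source = source;
    }

    /** Shared state "running": true while some caller is fetching. */
    boolean isRunning() {
        return lock.isLocked();
    }

    /** Shared state "next N rows": returns up to n rows not yet seen by anyone. */
    List<T> run(int n) {
        lock.lock();
        try {
            List<T> batch = new ArrayList<>();
            while (batch.size() < n && source.hasNext()) {
                batch.add(source.next());
            }
            return batch;
        } finally {
            lock.unlock();
        }
    }
}
```

With this shape a second caller simply continues where the first left off,
so no rows are buffered per caller. Going back to the beginning would need
a retained row list per caller on top of this.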
Thoughts?
> Streams/Slow iterators dont close on statement close
> ----------------------------------------------------
>
> Key: CALCITE-849
> URL: https://issues.apache.org/jira/browse/CALCITE-849
> Project: Calcite
> Issue Type: Bug
> Reporter: Jesse Yates
> Assignee: Julian Hyde
> Fix For: 1.5.0
>
> Attachments: calcite-849-bug.patch
>
>
> This is easily seen when querying an infinite stream with a clause that
> cannot be matched
> {code}
> select stream PRODUCT from orders where PRODUCT LIKE 'noMatch';
> select stream * from orders where PRODUCT LIKE 'noMatch';
> {code}
> The issue arises when accessing the results in a multi-threaded context. Yes,
> it's not a good idea (and things will break, like here). However, this case
> feels like it ought to be an exception.
> Suppose you are accessing a stream and have a query that doesn't match
> anything on the stream for a long time. Because of the way a ResultSet is
> built, the call to executeQuery() will hang until the first matching result
> is received. In that case, you might want to cancel the query because it's
> taking so long. You also want the thing that's accessing the stream (the
> StreamTable implementation) to cancel the querying/collection - via a call to
> close on the passed iterator/enumerable.
> Since the first result was never generated, the ResultSet was never returned
> to the caller. You can get around this by using a second thread and keeping a
> handle to the creating statement. When you go to close that statement though,
> you end up not closing the cursor (and the underlying iterables/enumerables)
> because it never finished getting created.
> It gets even more problematic if you use select * as the iterable doesn't
> finish getting created in the AvaticaResultSet.
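The behaviour the description asks for (a close that reaches a source still
waiting for its first row) can be sketched in plain Java. This is only an
illustration under assumptions: ClosableRowSource is a hypothetical stand-in
for the iterator handed out by a StreamTable implementation, not an actual
Calcite or Avatica class.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical row source whose blocking read can be ended by close(). */
final class ClosableRowSource implements AutoCloseable {
    private final BlockingQueue<String> rows = new LinkedBlockingQueue<>();
    // Set by close(), possibly from a different thread than the reader.
    private volatile boolean closed = false;

    /** Called by the stream producer when a matching row arrives. */
    void offer(String row) {
        rows.add(row);
    }

    /** Blocks until a row arrives; returns null once the source is closed. */
    String next() {
        try {
            while (!closed) {
                // Poll with a timeout so the closed flag is re-checked regularly.
                String row = rows.poll(50, TimeUnit.MILLISECONDS);
                if (row != null) {
                    return row;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return null;
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

A thread stuck in next() because nothing matches returns shortly after
another thread calls close(). The bug described above is that closing the
statement never reaches this point, because the cursor was never fully
created.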