[jira] [Resolved] (IMPALA-8925) Consider replacing ClientRequestState ResultCache with result spooling

2020-10-12 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-8925.
--
Resolution: Later

This would be nice to have, but I am not seeing a strong reason to do this at 
the moment, so closing as "Later".

> Consider replacing ClientRequestState ResultCache with result spooling
> --
>
> Key: IMPALA-8925
> URL: https://issues.apache.org/jira/browse/IMPALA-8925
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend, Clients
> Reporter: Sahil Takiar
> Priority: Minor
>
> The {{ClientRequestState}} maintains an internal results cache (which is 
> really just a {{QueryResultSet}}) in order to provide support for the 
> {{TFetchOrientation.FETCH_FIRST}} fetch orientation (used by Hue - see 
> [https://github.com/apache/impala/commit/6b769d011d2016a73483f63b311e108d17d9a083]).
> The cache itself has some limitations:
>  * It caches all results in a {{QueryResultSet}} with limited admission 
> control integration
>  * It has a max size; if the size is exceeded, the cache is emptied
>  * It cannot spill to disk
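
As a rough illustration of the limitations listed above, here is a minimal Python sketch of a size-capped, in-memory cache that is emptied once its cap is exceeded. It is a toy model, not Impala's implementation: the class `BoundedResultCache`, its methods, and the row-count cap are all hypothetical names and assumptions made for this example.

```python
from typing import List, Optional


class BoundedResultCache:
    """Toy model of a size-capped, in-memory result cache (hypothetical names)."""

    def __init__(self, max_rows: int) -> None:
        self.max_rows = max_rows
        # None means the cache overflowed and was emptied for good.
        self.rows: Optional[List[tuple]] = []

    def add(self, batch: List[tuple]) -> None:
        if self.rows is None:
            return  # already invalidated, nothing is cached any more
        if len(self.rows) + len(batch) > self.max_rows:
            self.rows = None  # exceeding the cap empties the cache
        else:
            self.rows.extend(batch)

    def can_restart(self) -> bool:
        """A FETCH_FIRST-style restart only works while the cache is intact."""
        return self.rows is not None


cache = BoundedResultCache(max_rows=3)
cache.add([(1,), (2,)])
restart_ok_before = cache.can_restart()  # cache still holds every row so far
cache.add([(3,), (4,)])                  # 4 rows would exceed the cap of 3
restart_ok_after = cache.can_restart()   # cache was emptied, restart is lost
```

In this model everything lives in memory and nothing can spill to disk, so a large result set simply loses the ability to restart from the first row.
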
> Result spooling could potentially replace the query result cache and provide 
> a few benefits; it should be able to fit more rows since it can spill to 
> disk. The memory is better tracked as well since it integrates with both 
> admitted and reserved memory. Hue currently sets the max result set fetch 
> size to the value set in 
> [https://github.com/cloudera/hue/blob/master/apps/impala/src/impala/impala_flags.py#L61]; 
> it would be good to check how well that value works for Hue users so we can 
> decide if replacing the current result cache with result spooling makes sense.
> This would require some changes to result spooling as well. Currently it 
> discards rows whenever it reads them from the underlying 
> {{BufferedTupleStream}}. It would need the ability to reset the read cursor, 
> which would require some changes to the {{PlanRootSink}} interface as well.
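
To make the difference between discarding rows on read and resetting a read cursor concrete, here is a small Python sketch. It is only a model of the idea under stated assumptions: `DiscardingStream`, `RewindableStream`, and `reset_read_cursor` are hypothetical names for this example and do not correspond to the actual `BufferedTupleStream` or `PlanRootSink` APIs.

```python
from collections import deque
from typing import Deque, List


class DiscardingStream:
    """Rows are dropped as soon as they are read, so no rewind is possible."""

    def __init__(self) -> None:
        self._rows: Deque[tuple] = deque()

    def add(self, row: tuple) -> None:
        self._rows.append(row)

    def read(self, n: int) -> List[tuple]:
        # popleft() removes each row from the stream as it is returned.
        return [self._rows.popleft() for _ in range(min(n, len(self._rows)))]


class RewindableStream:
    """Rows are retained; reads only advance a cursor that can be reset."""

    def __init__(self) -> None:
        self._rows: List[tuple] = []
        self._cursor = 0

    def add(self, row: tuple) -> None:
        self._rows.append(row)

    def read(self, n: int) -> List[tuple]:
        out = self._rows[self._cursor:self._cursor + n]
        self._cursor += len(out)
        return out

    def reset_read_cursor(self) -> None:
        # Hypothetical hook that a FETCH_FIRST-style request would call.
        self._cursor = 0


discarding, rewindable = DiscardingStream(), RewindableStream()
for i in range(3):
    discarding.add((i,))
    rewindable.add((i,))

first_discarding = discarding.read(2)
first_rewindable = rewindable.read(2)
rewindable.reset_read_cursor()
again_rewindable = rewindable.read(3)  # all rows are available again
left_discarding = discarding.read(3)   # only the unread row remains
```

The trade-off in this sketch is that the rewindable variant must keep every row until the query is closed, which is where the ability to spill to disk would matter.
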



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
