[jira] [Resolved] (IMPALA-8845) Close ExecNode tree prior to calling FlushFinal in FragmentInstanceState

2019-08-27 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-8845.
--
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Close ExecNode tree prior to calling FlushFinal in FragmentInstanceState
> 
>
> Key: IMPALA-8845
> URL: https://issues.apache.org/jira/browse/IMPALA-8845
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>  Reporter: Sahil Takiar
>  Assignee: Michael Ho
>  Priority: Major
> Fix For: Impala 3.4.0
>
>
> While testing IMPALA-8818, I found that IMPALA-8780 does not always cause 
> all non-coordinator fragments to shut down. In certain setups, TopN queries 
> ({{select * from [table] order by [col] limit [limit]}}) whose results are 
> all successfully spooled still keep non-coordinator fragments alive.
> The issue is that the {{DATASTREAM SINK}} for the TopN <-- Scan Node 
> fragment sometimes ends up blocked waiting for a response to a 
> {{TransmitData()}} RPC, which prevents the fragment from shutting down.
> I haven't traced the issue exactly, but what I *think* is happening is that 
> the {{MERGING-EXCHANGE}} operator in the coordinator fragment hits {{eos}} 
> once it has received enough rows to reach the limit defined in the query, 
> which can occur before the {{DATASTREAM SINK}} has sent all the rows from 
> the TopN / Scan Node fragment. Once the exchange stops reading, the sink's 
> pending RPC is left waiting, so the TopN / Scan Node fragments hang until 
> they are explicitly closed.
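>
> To make the suspected interaction concrete, here is a toy, self-contained 
> model (the bounded channel, capacities, and names below are illustrative 
> stand-ins, not Impala code): the receiver stops consuming once it reaches 
> its limit, and the sender then wedges trying to push the remaining rows, 
> much like the sink blocking on {{TransmitData()}}.
> {code:cpp}
> #include <chrono>
> #include <condition_variable>
> #include <iostream>
> #include <mutex>
> #include <queue>
> #include <thread>
>
> // Bounded channel standing in for the data-stream RPC path. A Push() that
> // cannot complete models the sink stuck waiting on TransmitData().
> class BoundedChannel {
>  public:
>   // Returns false if the channel stays full for one second, i.e. the
>   // sender would block indefinitely because nobody is reading.
>   bool Push(int row) {
>     std::unique_lock<std::mutex> lock(mu_);
>     if (!cv_.wait_for(lock, std::chrono::seconds(1),
>                       [this] { return queue_.size() < kCapacity; })) {
>       return false;
>     }
>     queue_.push(row);
>     cv_.notify_all();
>     return true;
>   }
>
>   int Pop() {
>     std::unique_lock<std::mutex> lock(mu_);
>     cv_.wait(lock, [this] { return !queue_.empty(); });
>     int row = queue_.front();
>     queue_.pop();
>     cv_.notify_all();
>     return row;
>   }
>
>  private:
>   static constexpr size_t kCapacity = 2;  // tiny "in-flight RPC" window
>   std::mutex mu_;
>   std::condition_variable cv_;
>   std::queue<int> queue_;
> };
>
> int main() {
>   BoundedChannel channel;
>   const int kLimit = 3;        // the query's LIMIT
>   const int kTotalRows = 100;  // rows the TopN / scan fragment produces
>
>   // The "MERGING-EXCHANGE": reads until the limit is reached, then hits
>   // eos and stops consuming without draining the channel.
>   std::thread receiver([&channel, kLimit] {
>     for (int i = 0; i < kLimit; ++i) channel.Pop();
>     std::cout << "receiver: limit reached (eos), no longer reading\n";
>   });
>
>   // The "DATASTREAM SINK": keeps sending and eventually wedges.
>   for (int i = 0; i < kTotalRows; ++i) {
>     if (!channel.Push(i)) {
>       std::cout << "sender: stuck on row " << i
>                 << "; without an explicit close it would hang here\n";
>       break;
>     }
>   }
>   receiver.join();
>   return 0;
> }
> {code}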
> The fix is to close the {{ExecNode}} tree in {{FragmentInstanceState}} as 
> eagerly as possible: moving the close call to before the call to 
> {{DataSink::FlushFinal}} fixes the issue. This has the added benefit of 
> shutting down and releasing all {{ExecNode}} resources as early as 
> possible, which is particularly important when result spooling is enabled, 
> since {{FlushFinal}} might block until the consumer reads all rows.
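>
> As a minimal sketch of the reordering, with hypothetical mocks standing in 
> for the exec tree and sink (the real logic lives in 
> {{FragmentInstanceState}} in Impala's backend; this only demonstrates the 
> before/after ordering, not the actual patch):
> {code:cpp}
> #include <iostream>
>
> // Hypothetical stand-ins: the real ExecNode tree and DataSink are Impala
> // backend classes; these mocks only show the teardown ordering.
> struct MockExecTree {
>   void Close() { std::cout << "exec tree closed, node resources released\n"; }
> };
>
> struct MockSink {
>   // With result spooling, FlushFinal() may block until the consumer has
>   // read every buffered row; here it just returns.
>   void FlushFinal() { std::cout << "sink FlushFinal() completed\n"; }
> };
>
> int main() {
>   MockExecTree exec_tree;
>   MockSink sink;
>
>   // Before the fix: FlushFinal() ran first, so a blocked flush (or a
>   // blocked TransmitData() RPC) kept the whole ExecNode tree alive.
>   //   sink.FlushFinal();
>   //   exec_tree.Close();
>
>   // After the fix: close the tree first so its resources are released
>   // even if the flush blocks on a slow consumer.
>   exec_tree.Close();
>   sink.FlushFinal();
>   return 0;
> }
> {code}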



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

