[jira] [Commented] (IMPALA-9154) KRPC DataStreamService threads blocked in PublishFilter

2019-11-21 Thread Michael Ho (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979682#comment-16979682
 ] 

Michael Ho commented on IMPALA-9154:


Given the fix is non-trivial, it may make sense to back out the offending 
change for now.

> KRPC DataStreamService threads blocked in PublishFilter
> ---
>
> Key: IMPALA-9154
> URL: https://issues.apache.org/jira/browse/IMPALA-9154
> Project: IMPALA
> Issue Type: Bug
> Components: Distributed Exec
> Affects Versions: Impala 3.4.0
> Reporter: Tim Armstrong
> Assignee: Fang-Yu Rao
> Priority: Blocker
> Labels: hang
> Attachments: image-2019-11-13-08-30-27-178.png, pstack-exchange.txt
>
>
> I hit this on primitive_many_fragments when doing a single node perf run:
> {noformat}
> ./bin/single_node_perf_run.py --num_impalads=1 --scale=30 --ninja --workloads=targeted-perf --iterations=5
> {noformat}
> I noticed that the query was hung and the execution threads were hung sending 
> row batches. Then looking at the RPCz page, all of the threads were busy:
>  !image-2019-11-13-08-30-27-178.png! 
> Multiple threads were stuck in UpdateFilter() - see  [^pstack-exchange.txt]. 
> It looks like this is a deadlock bug because a KRPC thread is blocked waiting 
> for an RPC that needs to be served by one of the limited threads from that 
> same thread pool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9154) KRPC DataStreamService threads blocked in PublishFilter

2019-11-22 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980610#comment-16980610
 ] 

ASF subversion and git services commented on IMPALA-9154:
-

Commit e716e76cccf59c2780571429b1b945d6bbc61b8d in impala's branch 
refs/heads/master from Fang-Yu Rao
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e716e76 ]

IMPALA-9154: Revert "IMPALA-7984: Port runtime filter from Thrift RPC to KRPC"

The previous patch porting runtime filters from Thrift RPC to KRPC
introduces a deadlock when only a very limited number of threads is
available on the Impala cluster.

Specifically, in that patch the Coordinator used a synchronous KRPC to
propagate an aggregated filter to other hosts. A deadlock happens if no
thread is available on the receiving side to answer that KRPC,
especially when the calling and receiving threads come from the same
thread pool. One possible way to address this issue is to make the call
that propagates a runtime filter asynchronous, which frees the calling
thread. Until that fix is in place, we revert the patch.
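
The failure mode described above can be reproduced in miniature. The
following is only a sketch under stated assumptions, not Impala or KRPC
code: TinyPool and NestedCallCompletes are hypothetical names, the pool
stands in for a service's fixed set of RPC threads, and a timed wait
replaces what would otherwise be an indefinite block so that the example
terminates.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical fixed-size pool standing in for a service's RPC threads.
class TinyPool {
 public:
  explicit TinyPool(int num_threads) {
    for (int i = 0; i < num_threads; ++i) {
      workers_.emplace_back([this] { Run(); });
    }
  }

  // Drains any queued work, then joins the workers.
  ~TinyPool() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      stop_ = true;
    }
    cv_.notify_all();
    for (std::thread& t : workers_) t.join();
  }

  void Submit(std::function<void()> fn) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      queue_.push(std::move(fn));
    }
    cv_.notify_one();
  }

 private:
  void Run() {
    for (;;) {
      std::function<void()> fn;
      {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return stop_ || !queue_.empty(); });
        if (queue_.empty()) return;  // stop requested and nothing left to run
        fn = std::move(queue_.front());
        queue_.pop();
      }
      fn();
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> queue_;
  bool stop_ = false;
  std::vector<std::thread> workers_;
};

// An outer handler running on the pool issues a nested request to the same
// pool. In synchronous mode the handler waits for the nested request before
// returning; in asynchronous mode it returns right away. Returns true if the
// nested request finished within the timeout.
bool NestedCallCompletes(int num_threads, bool synchronous) {
  const auto kTimeout = std::chrono::milliseconds(300);
  auto inner_done = std::make_shared<std::promise<void>>();
  std::shared_future<void> inner = inner_done->get_future().share();
  auto outer_result = std::make_shared<std::promise<bool>>();
  std::future<bool> outer = outer_result->get_future();

  TinyPool pool(num_threads);  // declared last, so it is destroyed first
  pool.Submit([&pool, inner_done, inner, outer_result, synchronous, kTimeout] {
    pool.Submit([inner_done] { inner_done->set_value(); });
    bool ok = true;
    if (synchronous) {
      // Waiting here pins this worker. With a single worker nobody is left
      // to run the nested request, so only the timeout ends the wait.
      ok = inner.wait_for(kTimeout) == std::future_status::ready;
    }
    outer_result->set_value(ok);
  });

  if (synchronous) return outer.get();
  return inner.wait_for(kTimeout) == std::future_status::ready;
}
```

With one worker, NestedCallCompletes(1, true) times out, while
NestedCallCompletes(1, false) succeeds because the handler returns and
frees the only worker. Adding a second worker also lets the synchronous
variant finish, which is consistent with the hang showing up only when
threads are scarce.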

This reverts commit ec11c18884988e838a8838e1e8ecc37461e1a138.

Change-Id: I32371a515fb607da396914502da8c7fb071406bc
Reviewed-on: http://gerrit.cloudera.org:8080/14780
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 





[jira] [Commented] (IMPALA-9154) KRPC DataStreamService threads blocked in PublishFilter

2020-01-20 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019784#comment-17019784
 ] 

ASF subversion and git services commented on IMPALA-9154:
-

Commit 79aae231443a305ce8503dbc7b4335e8ae3f3946 in impala's branch 
refs/heads/master from Fang-Yu Rao
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=79aae23 ]

IMPALA-9154: Make runtime filter propagation asynchronous

This patch fixes a bug introduced by IMPALA-7984, which ported the
functions implementing the aggregation and propagation of runtime
filters from Thrift RPC to KRPC.

Specifically, in IMPALA-7984, the propagation of an aggregated
runtime filter was implemented using a synchronous KRPC. Hence, when
there is a very limited number of KRPC threads for Impala's data stream
service, e.g., 1, a deadlock occurs if the node running the Coordinator
tries to propagate the aggregated filter to itself, since no thread is
available to receive the aggregated filter.

This patch makes the propagation of an aggregated runtime filter
asynchronous to address the issue described above. To prevent the
memory consumed by the aggregated filter from being reclaimed while the
filter is still referenced by inflight KRPCs, we add a field to the
class Coordinator::FilterState that tracks the number of inflight KRPCs
propagating this aggregated filter, so that the memory is reclaimed
only after all the associated KRPCs have completed. Moreover, when the
Coordinator invokes ReleaseExecResources() to release all the resources
associated with query execution, including the memory consumed by the
aggregated runtime filters, the memory of each aggregated filter is
released only after its inflight KRPCs have finished.
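
The bookkeeping described above can be sketched roughly as follows. This
is an illustration only and not the code from the patch:
FilterStateSketch, StartRpc, Release and the other member names are
hypothetical, and the sketch defers the free until the last completion
callback runs, whereas the actual change may well block until the
counter reaches zero instead.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for per-filter state that owns the aggregated
// filter's memory and counts the RPCs that still reference it.
class FilterStateSketch {
 public:
  explicit FilterStateSketch(std::string payload)
      : payload_(std::move(payload)) {}

  // Records one outgoing asynchronous RPC and returns the callback that
  // must be invoked when that RPC completes. The object must outlive
  // every callback it hands out.
  std::function<void()> StartRpc() {
    std::lock_guard<std::mutex> lock(mu_);
    ++num_inflight_;
    return [this] { OnRpcDone(); };
  }

  // Requests that the payload be freed. The free happens immediately only
  // if no RPC is in flight; otherwise the last completion performs it.
  void Release() {
    std::lock_guard<std::mutex> lock(mu_);
    release_requested_ = true;
    MaybeFreeLocked();
  }

  bool freed() const {
    std::lock_guard<std::mutex> lock(mu_);
    return freed_;
  }

  int num_inflight() const {
    std::lock_guard<std::mutex> lock(mu_);
    return num_inflight_;
  }

 private:
  void OnRpcDone() {
    std::lock_guard<std::mutex> lock(mu_);
    --num_inflight_;
    MaybeFreeLocked();
  }

  // Caller must hold mu_.
  void MaybeFreeLocked() {
    if (release_requested_ && num_inflight_ == 0 && !freed_) {
      payload_.clear();
      payload_.shrink_to_fit();
      freed_ = true;
    }
  }

  mutable std::mutex mu_;
  std::string payload_;
  int num_inflight_ = 0;
  bool release_requested_ = false;
  bool freed_ = false;
};
```

In this sketch, calling Release() while two RPCs are outstanding leaves
the payload intact, and the payload is freed only when the second
completion callback runs.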

Testing:
- Passed primitive_many_fragments.test with the database tpch30 in an
  Impala minicluster started with the parameter
  --impalad_args=--datastream_service_num_svc_threads=1.
- Passed the exhaustive tests in the DEBUG build.
- Passed the core tests in the ASAN build.

Change-Id: Ifb6726d349be701f3a0602b2ad5a934082f188a0
Reviewed-on: http://gerrit.cloudera.org:8080/14975
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 

