[ 
https://issues.apache.org/jira/browse/IMPALA-13040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845175#comment-17845175
 ] 

ASF subversion and git services commented on IMPALA-13040:
----------------------------------------------------------

Commit 09d2f10f4ddf3499b6255a6d14653e7738c2928b in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=09d2f10f4 ]

IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote

An UpdateFilterFromRemote RPC can arrive at an impalad executor before
the QueryState of the destination query is created or has completed
initialization. This patch adds a wait mechanism in the
UpdateFilterFromRemote RPC endpoint that waits a few milliseconds until
the QueryState exists and has completed initialization.

The wait time is fixed at 500ms, with an exponentially increasing sleep
period between checks. If the wait time passes and the QueryState is
still not found or not initialized, the UpdateFilterFromRemote RPC is
deemed failed and query execution moves on without the complete filter.
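
The waiting scheme described above can be sketched roughly as follows.
This is a minimal illustration, not the code from the patch: the helper
name WaitUntilReady, its parameters, the 1ms initial sleep, and the
doubling factor are all assumptions made for the example. Only the
500ms total budget comes from the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not Impala's actual implementation): polls `ready`
// until it returns true or `total_wait` has elapsed, doubling the sleep
// period after every unsuccessful check.
inline bool WaitUntilReady(const std::function<bool()>& ready,
    std::chrono::milliseconds total_wait = std::chrono::milliseconds(500),
    std::chrono::milliseconds first_sleep = std::chrono::milliseconds(1)) {
  using Clock = std::chrono::steady_clock;
  using std::chrono::milliseconds;
  // A non-positive sleep period would turn the loop into a busy spin.
  assert(first_sleep.count() > 0);

  const Clock::time_point deadline = Clock::now() + total_wait;
  milliseconds sleep_period = first_sleep;
  while (true) {
    if (ready()) return true;
    const Clock::time_point now = Clock::now();
    if (now >= deadline) return false;  // Budget exhausted: give up.
    // Cap the sleep so it does not run far past the deadline, but always
    // sleep at least 1ms.
    const milliseconds remaining =
        std::chrono::duration_cast<milliseconds>(deadline - now);
    std::this_thread::sleep_for(
        std::min(sleep_period, std::max(remaining, milliseconds(1))));
    sleep_period *= 2;
  }
}
```

In an RPC handler, `ready` would be a check such as "the QueryState
exists and has finished initialization"; a false return corresponds to
the case where the RPC is deemed failed and the query proceeds without
the complete filter.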

Testing:
- Add BE tests in network-util-test.cc
- Add test_runtime_filter_aggregation.py::TestLateQueryStateInit
- Pass exhaustive runs of test_runtime_filter_aggregation.py,
  test_query_live.py, and test_query_log.py

Change-Id: I156d1f0c694b91ba34be70bc53ae9bacf924b3b9
Reviewed-on: http://gerrit.cloudera.org:8080/21383
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> SIGSEGV in  QueryState::UpdateFilterFromRemote
> ----------------------------------------------
>
>                 Key: IMPALA-13040
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13040
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Csaba Ringhofer
>            Priority: Critical
>
> {code}
> Crash reason:  SIGSEGV /SEGV_MAPERR
> Crash address: 0x48
> Process uptime: not available
> Thread 114 (crashed)
>  0  libpthread.so.0 + 0x9d00
>     rax = 0x000000019e57ad00   rdx = 0x000000002a656720
>     rcx = 0x00000000059a9860   rbx = 0x0000000000000000
>     rsi = 0x000000019e57ad00   rdi = 0x0000000000000038
>     rbp = 0x00007f6233d544e0   rsp = 0x00007f6233d544a8
>      r8 = 0x0000000006a53540    r9 = 0x0000000000000039
>     r10 = 0x0000000000000000   r11 = 0x000000000000000a
>     r12 = 0x000000019e57ad00   r13 = 0x00007f62a2f997d0
>     r14 = 0x00007f6233d544f8   r15 = 0x000000001632c0f0
>     rip = 0x00007f62a2f96d00
>     Found by: given as instruction pointer in context
>  1  impalad!impala::QueryState::UpdateFilterFromRemote(impala::UpdateFilterParamsPB const&, kudu::rpc::RpcContext*) [query-state.cc : 1033 + 0x5]
>     rbp = 0x00007f6233d54520   rsp = 0x00007f6233d544f0
>     rip = 0x00000000015c0837
>     Found by: previous frame's frame pointer
>  2  impalad!impala::DataStreamService::UpdateFilterFromRemote(impala::UpdateFilterParamsPB const*, impala::UpdateFilterResultPB*, kudu::rpc::RpcContext*) [data-stream-service.cc : 134 + 0xb]
>     rbp = 0x00007f6233d54640   rsp = 0x00007f6233d54530
>     rip = 0x00000000017c05de
>     Found by: previous frame's frame pointer
> {code}
> The line that crashes is 
> https://github.com/apache/impala/blob/b39cd79ae84c415e0aebec2c2b4d7690d2a0cc7a/be/src/runtime/query-state.cc#L1033
> My guess is that the actual segfault is inside WaitForPrepare(), which was
> inlined. I am not sure whether a remote filter can arrive before
> QueryState::Init has finished. That would explain the issue, as
> instances_prepared_barrier_ is not yet created at that point.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
