[ 
https://issues.apache.org/jira/browse/IMPALA-7906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16862627#comment-16862627
 ] 

WangSheng commented on IMPALA-7906:
-----------------------------------

Our Impala cluster hit the same problem, and we cannot reproduce it. Our Impala 
version is 2.12.0. Here is the error info from hs_err_pid*.log:

{code:java}
[error occurred during error reporting (printing register info), id 0xb]

Stack: [0x00007fce79458000,0x00007fce79559000],  sp=0x00007fce795577a0,  free space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x984ab2]  oopDesc* PSPromotionManager::copy_to_survivor_space<false>(oopDesc*)+0x162
V  [libjvm.so+0x988276]  StealTask::do_it(GCTaskManager*, unsigned int)+0x2f6
V  [libjvm.so+0x5d459f]  GCTaskThread::run()+0x12f
V  [libjvm.so+0x91d9d8]  java_start(Thread*)+0x108
{code}

> Crash in JVM PSPromotionManager::copy_to_survivor_space
> -------------------------------------------------------
>
>                 Key: IMPALA-7906
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7906
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 3.2.0
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Critical
>              Labels: broken-build, crash
>         Attachments: hs_err_pid6290.log
>
>
> {noformat}
> #0  0x00007f44ca5261f7 in raise () from /lib64/libc.so.6
> #1  0x00007f44ca5278e8 in abort () from /lib64/libc.so.6
> #2  0x00007f44cd726185 in os::abort(bool) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #3  0x00007f44cd8c8593 in VMError::report_and_die() () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #4  0x00007f44cd8c8a7e in crash_handler(int, siginfo*, void*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #5  0x00007f44cd724f72 in os::Linux::chained_handler(int, siginfo*, void*) () 
> from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #6  0x00007f44cd72b5f6 in JVM_handle_linux_signal () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #7  0x00007f44cd721be3 in signalHandler(int, siginfo*, void*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #8  <signal handler called>
> #9  0x00007f44cd713e95 in oopDesc::print_on(outputStream*) const () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #10 0x00007f44cd72afdb in os::print_register_info(outputStream*, void*) () 
> from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #11 0x00007f44cd8c6c13 in VMError::report(outputStream*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #12 0x00007f44cd8c818a in VMError::report_and_die() () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #13 0x00007f44cd72b68f in JVM_handle_linux_signal () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #14 0x00007f44cd721be3 in signalHandler(int, siginfo*, void*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #15 <signal handler called>
> #16 0x00007f44cd78f562 in oopDesc* 
> PSPromotionManager::copy_to_survivor_space<false>(oopDesc*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #17 0x00007f44cd7924a5 in PSRootsClosure<false>::do_oop(oopDesc**) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #18 0x00007f44cd716a96 in InterpreterOopMap::iterate_oop(OffsetClosure*) 
> const () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #19 0x00007f44cd38f789 in frame::oops_interpreted_do(OopClosure*, 
> CLDClosure*, RegisterMap const*, bool) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #20 0x00007f44cd86eaa1 in JavaThread::oops_do(OopClosure*, CLDClosure*, 
> CodeBlobClosure*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #21 0x00007f44cd79270f in ThreadRootsTask::do_it(GCTaskManager*, unsigned 
> int) () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #22 0x00007f44cd3d7ecf in GCTaskThread::run() () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #23 0x00007f44cd727338 in java_start(Thread*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #24 0x00007f44ca8bbe25 in start_thread () from /lib64/libpthread.so.0
> #25 0x00007f44ca5e934d in clone () from /lib64/libc.so.6
> {noformat}
> These are the tests running at the time
> {noformat}
> 06:53:04 [gw1] PASSED 
> query_test/test_mem_usage_scaling.py::TestQueryMemLimitScaling::test_mem_usage_scaling[mem_limit:
>  -1 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
> 0} | table_format: parquet/none] 
> 06:53:07 
> query_test/test_mem_usage_scaling.py::TestQueryMemLimitScaling::test_mem_usage_scaling[mem_limit:
>  400m | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
> 0} | table_format: parquet/none] 
> 06:53:07 [gw5] PASSED 
> query_test/test_analytic_tpcds.py::TestAnalyticTpcds::test_analytic_functions_tpcds[batch_size:
>  1 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
> 0} | table_format: parquet/none] 
> 06:53:08 
> query_test/test_cancellation.py::TestCancellationParallel::test_cancel_select[protocol:
>  beeswax | table_format: text/gzip/block | exec_option: {'batch_size': 0, 
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': 
> False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | query_type: SELECT | wait_action: 
> 0:GETNEXT:WAIT | cancel_delay: 0.01 | cpu_limit_s: 100000 | query: select * 
> from lineitem limit 50 | fail_rpc_action: 
> COORD_CANCEL_QUERY_FINSTANCES_RPC:FAIL | join_before_close: True | 
> buffer_pool_limit: 0] 
> 06:53:08 [gw5] PASSED 
> query_test/test_cancellation.py::TestCancellationParallel::test_cancel_select[protocol:
>  beeswax | table_format: text/gzip/block | exec_option: {'batch_size': 0, 
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': 
> False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | query_type: SELECT | wait_action: 
> 0:GETNEXT:WAIT | cancel_delay: 0.01 | cpu_limit_s: 100000 | query: select * 
> from lineitem limit 50 | fail_rpc_action: 
> COORD_CANCEL_QUERY_FINSTANCES_RPC:FAIL | join_before_close: True | 
> buffer_pool_limit: 0] 
> 06:53:08 [gw2] PASSED 
> query_test/test_decimal_casting.py::TestDecimalCasting::test_min_max_zero_null[cast_from:
>  number | decimal_type: (31, 14) | exec_option: {'decimal_v2': 'true'}] 
> 06:53:09 
> query_test/test_decimal_casting.py::TestDecimalCasting::test_min_max_zero_null[cast_from:
>  number | decimal_type: (31, 22) | exec_option: {'decimal_v2': 'true'}] 
> 06:54:07 
> query_test/test_cancellation.py::TestCancellationParallel::test_cancel_select[protocol:
>  beeswax | table_format: kudu/none | exec_option: {'batch_size': 0, 
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': 
> False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | query_type: SELECT | wait_action: 
> 0:GETNEXT:WAIT | cancel_delay: 0 | cpu_limit_s: 100000 | query: compute stats 
> lineitem | fail_rpc_action: COORD_CANCEL_QUERY_FINSTANCES_RPC:FAIL | 
> join_before_close: True | buffer_pool_limit: 0] 
> 06:54:08 [gw6] FAILED 
> query_test/test_decimal_fuzz.py::TestDecimalFuzz::test_decimal_ops[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0}] 
> 06:54:08 
> query_test/test_decimal_fuzz.py::TestDecimalFuzz::test_width_bucket[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0}] 
> 06:54:08 [gw6] FAILED 
> query_test/test_decimal_fuzz.py::TestDecimalFuzz::test_width_bucket[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0}] 
> 06:54:08 
> query_test/test_decimal_queries.py::TestDecimalQueries::test_queries[protocol:
>  beeswax | exec_option: {'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': 'false', 'decimal_v2': 'false', 'batch_size': 0} | 
> table_format: text/none] 
> 06:54:08 [gw6] ERROR 
> query_test/test_decimal_queries.py::TestDecimalQueries::test_queries[protocol:
>  beeswax | exec_option: {'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': 'false', 'decimal_v2': 'false', 'batch_size': 0} | 
> table_format: text/none] 
> 06:54:08 
> query_test/test_decimal_queries.py::TestDecimalQueries::test_queries[protocol:
>  hs2 | exec_option: {'disable_codegen_rows_threshold': 0, 'disable_codegen': 
> 'true', 'decimal_v2': 'false', 'batch_size': 0} | table_format: parquet/none] 
> {noformat}
> One thing that's a little interesting is that it's running select 
> repeat('AZ', 128 * 1024 * 1024), which passes a large string from the backend 
> to frontend - maybe something went wrong there?


