[jira] [Commented] (KUDU-2753) kudu cluster rebalance crashes with core dump
[ https://issues.apache.org/jira/browse/KUDU-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16805435#comment-16805435 ]

Will Berkeley commented on KUDU-2753:
-------------------------------------

For others who might run into this issue, a response from Cloudera can be found [on their community forums|https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Kudu-rebalance-crash/m-p/88462#M5486].

> kudu cluster rebalance crashes with core dump
> ---------------------------------------------
>
> Key: KUDU-2753
> URL: https://issues.apache.org/jira/browse/KUDU-2753
> Project: Kudu
> Issue Type: Bug
> Components: CLI
> Affects Versions: 1.7.0
> Environment: kudu-master-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> kudu-client-devel-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> kudu-tserver-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> kudu-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> kudu-client0-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> Reporter: Arseniy Tashoyan
> Assignee: Will Berkeley
> Priority: Major
> Fix For: n/a
>
>
> The utility crashes:
> {code}
> -bash-4.2$ kudu cluster rebalance host1,host2,host3
> terminate called after throwing an instance of 'std::regex_error'
> what(): regex_error
> *** Aborted at 1553854510 (unix time) try "date -d @1553854510" if you are using GNU date ***
> PC: @ 0x7f9287fd6207 __GI_raise
> *** SIGABRT (@0x3ca0006ab69) received by PID 437097 (TID 0x7f928a61ea00) from PID 437097; stack trace: ***
> @ 0x7f9289fe1680 (unknown)
> @ 0x7f9287fd6207 __GI_raise
> @ 0x7f9287fd78f8 __GI_abort
> @ 0x7f92888e57d5 __gnu_cxx::__verbose_terminate_handler()
> @ 0x7f92888e3746 (unknown)
> @ 0x7f92888e3773 std::terminate()
> @ 0x7f92888e3993 __cxa_throw
> @ 0x7f9288938dd5 std::__throw_regex_error()
> @ 0x931c32 std::__detail::_Compiler<>::_M_bracket_expression()
> @ 0x931e3a std::__detail::_Compiler<>::_M_atom()
> @ 0x932469 std::__detail::_Compiler<>::_M_alternative()
> @ 0x9324c4 std::__detail::_Compiler<>::_M_alternative()
> @ 0x932649 std::__detail::_Compiler<>::_M_disjunction()
> @ 0x93297b std::__detail::_Compiler<>::_Compiler()
> @ 0x932cb7 std::__detail::__compile<>()
> @ 0x92bfc6 (unknown)
> @ 0x92c664 std::_Function_handler<>::_M_invoke()
> @ 0xde6672 kudu::tools::Action::Run()
> @ 0x9957d7 kudu::tools::DispatchCommand()
> @ 0x99619b kudu::tools::RunTool()
> @ 0x8dee4d main
> @ 0x7f9287fc23d5 __libc_start_main
> @ 0x9284b5 (unknown)
> Aborted (core dumped)
> {code}
> The same behavior occurs when ports are specified:
> 'host1:7150,host2:7150,host3:7150'. I cannot attach the core dump due to the
> file size limit.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Resolved] (KUDU-2753) kudu cluster rebalance crashes with core dump
[ https://issues.apache.org/jira/browse/KUDU-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Will Berkeley resolved KUDU-2753.
---------------------------------
Resolution: Won't Fix
Fix Version/s: n/a

This is a problem with the downstream Kudu vendor's code, specifically with how it uses the std::regex library to do some version detection. That causes crashes on some platforms because the regex library was broken there. You should approach Cloudera about a fix or workaround.

> kudu cluster rebalance crashes with core dump
> ---------------------------------------------
>
> Key: KUDU-2753
> URL: https://issues.apache.org/jira/browse/KUDU-2753
> Project: Kudu
> Issue Type: Bug
> Components: CLI
> Affects Versions: 1.7.0
> Environment: kudu-master-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> kudu-client-devel-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> kudu-tserver-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> kudu-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> kudu-client0-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> Reporter: Arseniy Tashoyan
> Assignee: Will Berkeley
> Priority: Major
> Fix For: n/a
>
>
> The utility crashes:
> {code}
> -bash-4.2$ kudu cluster rebalance host1,host2,host3
> terminate called after throwing an instance of 'std::regex_error'
> what(): regex_error
> *** Aborted at 1553854510 (unix time) try "date -d @1553854510" if you are using GNU date ***
> PC: @ 0x7f9287fd6207 __GI_raise
> *** SIGABRT (@0x3ca0006ab69) received by PID 437097 (TID 0x7f928a61ea00) from PID 437097; stack trace: ***
> @ 0x7f9289fe1680 (unknown)
> @ 0x7f9287fd6207 __GI_raise
> @ 0x7f9287fd78f8 __GI_abort
> @ 0x7f92888e57d5 __gnu_cxx::__verbose_terminate_handler()
> @ 0x7f92888e3746 (unknown)
> @ 0x7f92888e3773 std::terminate()
> @ 0x7f92888e3993 __cxa_throw
> @ 0x7f9288938dd5 std::__throw_regex_error()
> @ 0x931c32 std::__detail::_Compiler<>::_M_bracket_expression()
> @ 0x931e3a std::__detail::_Compiler<>::_M_atom()
> @ 0x932469 std::__detail::_Compiler<>::_M_alternative()
> @ 0x9324c4 std::__detail::_Compiler<>::_M_alternative()
> @ 0x932649 std::__detail::_Compiler<>::_M_disjunction()
> @ 0x93297b std::__detail::_Compiler<>::_Compiler()
> @ 0x932cb7 std::__detail::__compile<>()
> @ 0x92bfc6 (unknown)
> @ 0x92c664 std::_Function_handler<>::_M_invoke()
> @ 0xde6672 kudu::tools::Action::Run()
> @ 0x9957d7 kudu::tools::DispatchCommand()
> @ 0x99619b kudu::tools::RunTool()
> @ 0x8dee4d main
> @ 0x7f9287fc23d5 __libc_start_main
> @ 0x9284b5 (unknown)
> Aborted (core dumped)
> {code}
> The same behavior occurs when ports are specified:
> 'host1:7150,host2:7150,host3:7150'. I cannot attach the core dump due to the
> file size limit.
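The stack trace in this issue shows a std::regex_error thrown while a pattern is being compiled (the _Compiler<>::_M_bracket_expression frame) and reaching std::terminate() because nothing catches it. Below is a minimal sketch of how version detection could guard against that. The helper name ParseMajorMinor and the pattern are hypothetical illustrations and are not taken from the Kudu or vendor source; the sketch only shows that constructing a std::regex can throw and that the exception can be turned into an ordinary failure.

```cpp
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical helper (an assumption, not Kudu code): extracts the leading
// "<major>.<minor>" from a version string such as "1.7.0+cdh5.16.1".
// Returns false instead of aborting the process if the pattern cannot be
// compiled or the input does not match.
bool ParseMajorMinor(const std::string& text, int* major, int* minor) {
  try {
    // Constructing std::regex compiles the pattern. A broken or incomplete
    // regex implementation may throw std::regex_error right here.
    const std::regex version_re("^([0-9]+)\\.([0-9]+)");
    std::smatch match;
    if (!std::regex_search(text, match, version_re)) {
      return false;
    }
    *major = std::stoi(match[1].str());
    *minor = std::stoi(match[2].str());
    return true;
  } catch (const std::regex_error&) {
    // Pattern compilation or matching failed inside the library.
    return false;
  } catch (const std::out_of_range&) {
    // The numeric component does not fit in an int.
    return false;
  }
}
```

A caller would treat a false return as "version unknown" and report an error rather than letting the exception propagate out of the tool's action handler.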
[jira] [Assigned] (KUDU-2753) kudu cluster rebalance crashes with core dump
[ https://issues.apache.org/jira/browse/KUDU-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Will Berkeley reassigned KUDU-2753:
----------------------------------

Assignee: Will Berkeley

> kudu cluster rebalance crashes with core dump
> ---------------------------------------------
>
> Key: KUDU-2753
> URL: https://issues.apache.org/jira/browse/KUDU-2753
> Project: Kudu
> Issue Type: Bug
> Components: CLI
> Affects Versions: 1.7.0
> Environment: kudu-master-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> kudu-client-devel-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> kudu-tserver-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> kudu-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> kudu-client0-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> Reporter: Arseniy Tashoyan
> Assignee: Will Berkeley
> Priority: Major
>
> The utility crashes:
> {code}
> -bash-4.2$ kudu cluster rebalance host1,host2,host3
> terminate called after throwing an instance of 'std::regex_error'
> what(): regex_error
> *** Aborted at 1553854510 (unix time) try "date -d @1553854510" if you are using GNU date ***
> PC: @ 0x7f9287fd6207 __GI_raise
> *** SIGABRT (@0x3ca0006ab69) received by PID 437097 (TID 0x7f928a61ea00) from PID 437097; stack trace: ***
> @ 0x7f9289fe1680 (unknown)
> @ 0x7f9287fd6207 __GI_raise
> @ 0x7f9287fd78f8 __GI_abort
> @ 0x7f92888e57d5 __gnu_cxx::__verbose_terminate_handler()
> @ 0x7f92888e3746 (unknown)
> @ 0x7f92888e3773 std::terminate()
> @ 0x7f92888e3993 __cxa_throw
> @ 0x7f9288938dd5 std::__throw_regex_error()
> @ 0x931c32 std::__detail::_Compiler<>::_M_bracket_expression()
> @ 0x931e3a std::__detail::_Compiler<>::_M_atom()
> @ 0x932469 std::__detail::_Compiler<>::_M_alternative()
> @ 0x9324c4 std::__detail::_Compiler<>::_M_alternative()
> @ 0x932649 std::__detail::_Compiler<>::_M_disjunction()
> @ 0x93297b std::__detail::_Compiler<>::_Compiler()
> @ 0x932cb7 std::__detail::__compile<>()
> @ 0x92bfc6 (unknown)
> @ 0x92c664 std::_Function_handler<>::_M_invoke()
> @ 0xde6672 kudu::tools::Action::Run()
> @ 0x9957d7 kudu::tools::DispatchCommand()
> @ 0x99619b kudu::tools::RunTool()
> @ 0x8dee4d main
> @ 0x7f9287fc23d5 __libc_start_main
> @ 0x9284b5 (unknown)
> Aborted (core dumped)
> {code}
> The same behavior occurs when ports are specified:
> 'host1:7150,host2:7150,host3:7150'. I cannot attach the core dump due to the
> file size limit.
[jira] [Assigned] (KUDU-2754) Keep a maximum number of old log files
[ https://issues.apache.org/jira/browse/KUDU-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-2754:
--------------------------------

Assignee: Grant Henke

> Keep a maximum number of old log files
> --------------------------------------
>
> Key: KUDU-2754
> URL: https://issues.apache.org/jira/browse/KUDU-2754
> Project: Kudu
> Issue Type: Improvement
> Reporter: Grant Henke
> Assignee: Grant Henke
> Priority: Major
>
> Kudu generates several kinds of log files
> (INFO, WARNING, ERROR, diagnostic, minidumps, etc.). To avoid running out of
> logging space, it would be nice if a user could configure the maximum
> number of files of each log type to keep.
[jira] [Commented] (KUDU-2754) Keep a maximum number of old log files
[ https://issues.apache.org/jira/browse/KUDU-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16805152#comment-16805152 ]

Grant Henke commented on KUDU-2754:
-----------------------------------

It looks like this feature already exists, but configuring the maximum number is "experimental". This jira can just track changing `max_log_files` to stable.

https://github.com/apache/kudu/blob/master/src/kudu/util/logging.cc#L69

> Keep a maximum number of old log files
> --------------------------------------
>
> Key: KUDU-2754
> URL: https://issues.apache.org/jira/browse/KUDU-2754
> Project: Kudu
> Issue Type: Improvement
> Reporter: Grant Henke
> Priority: Major
>
> Kudu generates several kinds of log files
> (INFO, WARNING, ERROR, diagnostic, minidumps, etc.). To avoid running out of
> logging space, it would be nice if a user could configure the maximum
> number of files of each log type to keep.
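To make the requested behavior concrete, here is a small sketch of a "keep at most N files per log type" policy. It is not how Kudu's `max_log_files` is implemented; the LogFile struct and the FilesToDelete function are hypothetical names introduced only for illustration. The idea is to sort the files of one log type from newest to oldest and mark everything past the limit for deletion.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one log file on disk (an assumption for this
// sketch, not a Kudu type).
struct LogFile {
  std::string path;
  long long mtime;  // last modification time, seconds since the epoch
};

// Given all files of a single log type, returns the paths that should be
// removed so that only the `max_files` most recently modified files remain.
std::vector<std::string> FilesToDelete(std::vector<LogFile> files,
                                       std::size_t max_files) {
  // Newest first.
  std::sort(files.begin(), files.end(),
            [](const LogFile& a, const LogFile& b) { return a.mtime > b.mtime; });
  std::vector<std::string> to_delete;
  for (std::size_t i = max_files; i < files.size(); ++i) {
    to_delete.push_back(files[i].path);
  }
  return to_delete;
}
```

A caller would run this once per log type (INFO, WARNING, ERROR, and so on) so that a burst of one type cannot evict the files of another.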
[jira] [Created] (KUDU-2754) Keep a maximum number of old log files
Grant Henke created KUDU-2754:
---------------------------------

Summary: Keep a maximum number of old log files
Key: KUDU-2754
URL: https://issues.apache.org/jira/browse/KUDU-2754
Project: Kudu
Issue Type: Improvement
Reporter: Grant Henke

Kudu generates several kinds of log files (INFO, WARNING, ERROR, diagnostic, minidumps, etc.). To avoid running out of logging space, it would be nice if a user could configure the maximum number of files of each log type to keep.
[jira] [Commented] (KUDU-1395) Scanner KeepAlive requests can get starved on an overloaded server
[ https://issues.apache.org/jira/browse/KUDU-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16805109#comment-16805109 ]

Grant Henke commented on KUDU-1395:
-----------------------------------

FWIW the Java client retries keepAlive requests (KUDU-2710)

> Scanner KeepAlive requests can get starved on an overloaded server
> ------------------------------------------------------------------
>
> Key: KUDU-1395
> URL: https://issues.apache.org/jira/browse/KUDU-1395
> Project: Kudu
> Issue Type: Bug
> Components: impala, rpc, tserver
> Affects Versions: 0.8.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Major
> Labels: backup
>
> As of 0.8.0, the RPC system schedules RPCs on an earliest-deadline-first
> basis, rejecting those with later deadlines. This works well for RPCs which
> are retried on SERVER_TOO_BUSY errors, since the retries maintain the
> original deadline and thus get higher and higher priority as they get closer
> to timing out.
> We don't, however, do any retries on scanner KeepAlive RPCs. So, if a
> keepalive RPC arrives at a heavily overloaded tserver, it will likely get
> rejected, and won't retry. This means that Impala queries or other long scans
> that rely on KeepAlives will likely fail on overloaded clusters since the
> KeepAlive never gets through.
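The scheduling behavior described in this issue can be sketched as a bounded queue ordered by deadline. This is an illustration under stated assumptions, not Kudu's actual service queue: the class name DeadlineQueue, its methods, and the use of plain integers for deadlines and call ids are all hypothetical. When the queue is full, the call with the latest deadline is rejected, which may be the call that just arrived.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <set>
#include <utility>

// Hypothetical earliest-deadline-first queue with a fixed capacity.
class DeadlineQueue {
 public:
  explicit DeadlineQueue(std::size_t capacity) : capacity_(capacity) {}

  // Enqueues a call. If the queue overflows, the call with the latest
  // deadline is dropped and its id is returned (the caller would answer it
  // with a "server too busy" error). Returns -1 if nothing was rejected.
  int64_t Put(int64_t deadline, int64_t call_id) {
    queue_.emplace(deadline, call_id);
    if (queue_.size() <= capacity_) {
      return -1;
    }
    auto latest = std::prev(queue_.end());
    const int64_t rejected = latest->second;
    queue_.erase(latest);
    return rejected;
  }

  // Dequeues the call with the earliest deadline. Returns false if empty.
  bool Take(int64_t* call_id) {
    if (queue_.empty()) {
      return false;
    }
    *call_id = queue_.begin()->second;
    queue_.erase(queue_.begin());
    return true;
  }

 private:
  std::size_t capacity_;
  // Ordered by (deadline, call id), so begin() is the most urgent call.
  std::multiset<std::pair<int64_t, int64_t>> queue_;
};
```

A retried call re-enters with its original deadline, so each retry sorts closer to the front and is less likely to be the one rejected. A call that is never retried, like the KeepAlive described above, gets exactly one chance.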
[jira] [Updated] (KUDU-1395) Scanner KeepAlive requests can get starved on an overloaded server
[ https://issues.apache.org/jira/browse/KUDU-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke updated KUDU-1395:
------------------------------
Labels: backup (was: )

> Scanner KeepAlive requests can get starved on an overloaded server
> ------------------------------------------------------------------
>
> Key: KUDU-1395
> URL: https://issues.apache.org/jira/browse/KUDU-1395
> Project: Kudu
> Issue Type: Bug
> Components: impala, rpc, tserver
> Affects Versions: 0.8.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Major
> Labels: backup
>
> As of 0.8.0, the RPC system schedules RPCs on an earliest-deadline-first
> basis, rejecting those with later deadlines. This works well for RPCs which
> are retried on SERVER_TOO_BUSY errors, since the retries maintain the
> original deadline and thus get higher and higher priority as they get closer
> to timing out.
> We don't, however, do any retries on scanner KeepAlive RPCs. So, if a
> keepalive RPC arrives at a heavily overloaded tserver, it will likely get
> rejected, and won't retry. This means that Impala queries or other long scans
> that rely on KeepAlives will likely fail on overloaded clusters since the
> KeepAlive never gets through.
[jira] [Updated] (KUDU-2753) kudu cluster rebalance crashes with core dump
[ https://issues.apache.org/jira/browse/KUDU-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arseniy Tashoyan updated KUDU-2753:
----------------------------------
Priority: Major (was: Minor)
Description:
The utility crashes:
{code}
-bash-4.2$ kudu cluster rebalance host1,host2,host3
terminate called after throwing an instance of 'std::regex_error'
what(): regex_error
*** Aborted at 1553854510 (unix time) try "date -d @1553854510" if you are using GNU date ***
PC: @ 0x7f9287fd6207 __GI_raise
*** SIGABRT (@0x3ca0006ab69) received by PID 437097 (TID 0x7f928a61ea00) from PID 437097; stack trace: ***
@ 0x7f9289fe1680 (unknown)
@ 0x7f9287fd6207 __GI_raise
@ 0x7f9287fd78f8 __GI_abort
@ 0x7f92888e57d5 __gnu_cxx::__verbose_terminate_handler()
@ 0x7f92888e3746 (unknown)
@ 0x7f92888e3773 std::terminate()
@ 0x7f92888e3993 __cxa_throw
@ 0x7f9288938dd5 std::__throw_regex_error()
@ 0x931c32 std::__detail::_Compiler<>::_M_bracket_expression()
@ 0x931e3a std::__detail::_Compiler<>::_M_atom()
@ 0x932469 std::__detail::_Compiler<>::_M_alternative()
@ 0x9324c4 std::__detail::_Compiler<>::_M_alternative()
@ 0x932649 std::__detail::_Compiler<>::_M_disjunction()
@ 0x93297b std::__detail::_Compiler<>::_Compiler()
@ 0x932cb7 std::__detail::__compile<>()
@ 0x92bfc6 (unknown)
@ 0x92c664 std::_Function_handler<>::_M_invoke()
@ 0xde6672 kudu::tools::Action::Run()
@ 0x9957d7 kudu::tools::DispatchCommand()
@ 0x99619b kudu::tools::RunTool()
@ 0x8dee4d main
@ 0x7f9287fc23d5 __libc_start_main
@ 0x9284b5 (unknown)
Aborted (core dumped)
{code}
The same behavior occurs when ports are specified: 'host1:7150,host2:7150,host3:7150'. I cannot attach the core dump due to the file size limit.

was:
When specifying masters without ports, the utility crashes:
{code}
-bash-4.2$ kudu cluster rebalance host1,host2,host3 -report_only
terminate called after throwing an instance of 'std::regex_error'
what(): regex_error
*** Aborted at 1553854510 (unix time) try "date -d @1553854510" if you are using GNU date ***
PC: @ 0x7f9287fd6207 __GI_raise
*** SIGABRT (@0x3ca0006ab69) received by PID 437097 (TID 0x7f928a61ea00) from PID 437097; stack trace: ***
@ 0x7f9289fe1680 (unknown)
@ 0x7f9287fd6207 __GI_raise
@ 0x7f9287fd78f8 __GI_abort
@ 0x7f92888e57d5 __gnu_cxx::__verbose_terminate_handler()
@ 0x7f92888e3746 (unknown)
@ 0x7f92888e3773 std::terminate()
@ 0x7f92888e3993 __cxa_throw
@ 0x7f9288938dd5 std::__throw_regex_error()
@ 0x931c32 std::__detail::_Compiler<>::_M_bracket_expression()
@ 0x931e3a std::__detail::_Compiler<>::_M_atom()
@ 0x932469 std::__detail::_Compiler<>::_M_alternative()
@ 0x9324c4 std::__detail::_Compiler<>::_M_alternative()
@ 0x932649 std::__detail::_Compiler<>::_M_disjunction()
@ 0x93297b std::__detail::_Compiler<>::_Compiler()
@ 0x932cb7 std::__detail::__compile<>()
@ 0x92bfc6 (unknown)
@ 0x92c664 std::_Function_handler<>::_M_invoke()
@ 0xde6672 kudu::tools::Action::Run()
@ 0x9957d7 kudu::tools::DispatchCommand()
@ 0x99619b kudu::tools::RunTool()
@ 0x8dee4d main
@ 0x7f9287fc23d5 __libc_start_main
@ 0x9284b5 (unknown)
Aborted (core dumped)
{code}
When masters are specified with ports, the utility works fine: 'host1:7150,host2:7150,host3:7150'

Summary: kudu cluster rebalance crashes with core dump (was: kudu cluster rebalance crashes when unable to parse masters)

> kudu cluster rebalance crashes with core dump
> ---------------------------------------------
>
> Key: KUDU-2753
> URL: https://issues.apache.org/jira/browse/KUDU-2753
> Project: Kudu
> Issue Type: Bug
> Components: CLI
> Affects Versions: 1.7.0
> Environment: kudu-master-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> kudu-client-devel-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> kudu-tserver-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> kudu-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> kudu-client0-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
> Reporter: Arseniy Tashoyan
> Priority: Major
>
> The utility crashes:
> {code}
> -bash-4.2$ kudu cluster rebalance host1,host2,host3
> terminate called after throwing an instance of 'std::regex_error'
> what(): regex_error
> *** Aborted at 1553854510 (unix time) try "date -d @1553854510" if you are using GNU date ***
> PC: @ 0x7f9287fd6207 __GI_raise
> *** SIGABRT (@0x3ca0006ab69) received by PID 437097 (TID 0x7f928a61ea00) from
[jira] [Created] (KUDU-2753) kudu cluster rebalance crashes when unable to parse masters
Arseniy Tashoyan created KUDU-2753:
--------------------------------------

Summary: kudu cluster rebalance crashes when unable to parse masters
Key: KUDU-2753
URL: https://issues.apache.org/jira/browse/KUDU-2753
Project: Kudu
Issue Type: Bug
Components: CLI
Affects Versions: 1.7.0
Environment: kudu-master-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
kudu-client-devel-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
kudu-tserver-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
kudu-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
kudu-client0-1.7.0+cdh5.16.1+0-1.cdh5.16.1.p0.3.el7.x86_64
Reporter: Arseniy Tashoyan

When specifying masters without ports, the utility crashes:
{code}
-bash-4.2$ kudu cluster rebalance host1,host2,host3 -report_only
terminate called after throwing an instance of 'std::regex_error'
what(): regex_error
*** Aborted at 1553854510 (unix time) try "date -d @1553854510" if you are using GNU date ***
PC: @ 0x7f9287fd6207 __GI_raise
*** SIGABRT (@0x3ca0006ab69) received by PID 437097 (TID 0x7f928a61ea00) from PID 437097; stack trace: ***
@ 0x7f9289fe1680 (unknown)
@ 0x7f9287fd6207 __GI_raise
@ 0x7f9287fd78f8 __GI_abort
@ 0x7f92888e57d5 __gnu_cxx::__verbose_terminate_handler()
@ 0x7f92888e3746 (unknown)
@ 0x7f92888e3773 std::terminate()
@ 0x7f92888e3993 __cxa_throw
@ 0x7f9288938dd5 std::__throw_regex_error()
@ 0x931c32 std::__detail::_Compiler<>::_M_bracket_expression()
@ 0x931e3a std::__detail::_Compiler<>::_M_atom()
@ 0x932469 std::__detail::_Compiler<>::_M_alternative()
@ 0x9324c4 std::__detail::_Compiler<>::_M_alternative()
@ 0x932649 std::__detail::_Compiler<>::_M_disjunction()
@ 0x93297b std::__detail::_Compiler<>::_Compiler()
@ 0x932cb7 std::__detail::__compile<>()
@ 0x92bfc6 (unknown)
@ 0x92c664 std::_Function_handler<>::_M_invoke()
@ 0xde6672 kudu::tools::Action::Run()
@ 0x9957d7 kudu::tools::DispatchCommand()
@ 0x99619b kudu::tools::RunTool()
@ 0x8dee4d main
@ 0x7f9287fc23d5 __libc_start_main
@ 0x9284b5 (unknown)
Aborted (core dumped)
{code}
When masters are specified with ports, the utility works fine: 'host1:7150,host2:7150,host3:7150'