[jira] [Created] (IMPALA-7934) Switch to using Java 8's Base64 impl for incremental stats encoding
bharath v created IMPALA-7934:
-
Summary: Switch to using Java 8's Base64 impl for incremental stats encoding
Key: IMPALA-7934
URL: https://issues.apache.org/jira/browse/IMPALA-7934
Project: IMPALA
Issue Type: Bug
Components: Catalog
Affects Versions: Impala 3.1.0
Reporter: bharath v
Attachments: base64.png

Incremental stats are compressed and Base64 encoded before they are chunked and written to the HMS' partition parameters map. When they are read back, we need to Base64 decode and decompress them.

For certain incremental-stats-heavy tables, we noticed that a significant amount of time is spent in these Base64 classes (see the attached image for the stack; unfortunately, I don't have a text version of it).

Java 8 ships its own Base64 implementation, which has shown much better performance [1] than Apache commons-codec's. So consider switching to Java 8's Base64 impl.

[1] http://java-performance.info/base64-encoding-and-decoding-performance/

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
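If this is taken up, java.util.Base64 (JDK 8+) is essentially a drop-in replacement for commons-codec's basic alphabet. A minimal round-trip sketch, with illustrative method names and payload (not Impala's actual code):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64RoundTrip {
    // Encode compressed stats bytes into an ASCII-safe string suitable for
    // chunking into the HMS partition parameters map.
    static String encode(byte[] compressedStats) {
        return Base64.getEncoder().encodeToString(compressedStats);
    }

    // Decode the string form back to the compressed byte representation.
    static byte[] decode(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        byte[] original = "incremental stats payload".getBytes(StandardCharsets.UTF_8);
        String encoded = encode(original);
        byte[] decoded = decode(encoded);
        if (!new String(decoded, StandardCharsets.UTF_8).equals("incremental stats payload")) {
            throw new AssertionError("round trip failed");
        }
        System.out.println("round trip ok, " + encoded.length() + " encoded chars");
    }
}
```

Both codecs produce the same RFC 4648 basic-alphabet output, so stats written by one implementation should decode with the other; the benchmark linked above is what motivates the switch, not a format change.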
[jira] [Updated] (IMPALA-7934) Switch to using Java 8's Base64 impl for incremental stats encoding
[ https://issues.apache.org/jira/browse/IMPALA-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bharath v updated IMPALA-7934: -- Labels: ramp-up (was: ) > Switch to using Java 8's Base64 impl for incremental stats encoding > --- > > Key: IMPALA-7934 > URL: https://issues.apache.org/jira/browse/IMPALA-7934 > Project: IMPALA > Issue Type: Bug > Components: Catalog >Affects Versions: Impala 3.1.0 >Reporter: bharath v >Priority: Major > Labels: ramp-up > Attachments: base64.png > > > Incremental stats are compressed and Base64 encoded before they are chunked > and written to the HMS' partition parameters map. When they are read back, we > need to Base64 decode and decompress. > For certain incremental stats heavy tables, we noticed that a significant > amount of time is spent in these base64 classes (see the attached image for > the stack. Unfortunately, I don't have the text version of it). > Java 8 comes with its own Base64 implementation and that has shown much > better perf results [1] compared to apache codec's impl. So consider > switching to Java 8's base64 impl. > [1] http://java-performance.info/base64-encoding-and-decoding-performance/ > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-7933) Consider using read-write locks for partial fetch requests.
bharath v created IMPALA-7933:
-
Summary: Consider using read-write locks for partial fetch requests.
Key: IMPALA-7933
URL: https://issues.apache.org/jira/browse/IMPALA-7933
Project: IMPALA
Issue Type: Sub-task
Components: Catalog
Affects Versions: Impala 3.1.0
Reporter: bharath v

Partial table fetches currently use an exclusive lock. Should we switch to a read-write lock instead?

{code}
// TODO(todd): consider a read-write lock here.
table.getLock().lock();
try {
  return table.getPartialInfo(req);
} finally {
  table.getLock().unlock();
}
{code}
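The TODO above maps directly onto java.util.concurrent.locks.ReentrantReadWriteLock: concurrent partial fetches can share the read lock, while mutations keep exclusive access through the write lock. A minimal sketch of that idea (the table/partial-info members below are stand-ins, not the actual catalog classes):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class PartialFetchLocking {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private String partialInfo = "initial";

    // Many fetchers may hold the read lock at once, so read-mostly
    // workloads no longer serialize on a single exclusive lock.
    public String getPartialInfo() {
        lock.readLock().lock();
        try {
            return partialInfo;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Mutations still take the exclusive write lock, which blocks
    // until all readers have released.
    public void updatePartialInfo(String info) {
        lock.writeLock().lock();
        try {
            partialInfo = info;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

The trade-off to evaluate: ReentrantReadWriteLock is more expensive per acquisition than a plain ReentrantLock, so the win depends on fetches actually overlapping in practice.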
[jira] [Updated] (IMPALA-7249) Cancel shutdown of impalad
[ https://issues.apache.org/jira/browse/IMPALA-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-7249: -- Description: Following on from IMPALA-1760, it could be useful to cancel shutdown for some use cases. An extension would be to allow extending the deadline. was:Following on from IMPALA-1760, it could be useful to cancel shutdown for some use cases. > Cancel shutdown of impalad > -- > > Key: IMPALA-7249 > URL: https://issues.apache.org/jira/browse/IMPALA-7249 > Project: IMPALA > Issue Type: New Feature > Components: Distributed Exec >Reporter: Tim Armstrong >Priority: Minor > > Following on from IMPALA-1760, it could be useful to cancel shutdown for some > use cases. > An extension would be to allow extending the deadline. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-7932) Cannot change shutdown deadline after issuing initial shutdown command
[ https://issues.apache.org/jira/browse/IMPALA-7932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong resolved IMPALA-7932. --- Resolution: Duplicate I think this is essentially the same use case as IMPALA-7249. I don't have plans to work on that. > Cannot change shutdown deadline after issuing initial shutdown command > -- > > Key: IMPALA-7932 > URL: https://issues.apache.org/jira/browse/IMPALA-7932 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.1.0, Impala 3.2.0 >Reporter: Lars Volker >Assignee: Tim Armstrong >Priority: Major > > Starting Impala Shell without Kerberos authentication > Opened TCP connection to localhost:21000 > Connected to localhost:21000 > Server version: impalad version 3.1.0-SNAPSHOT DEBUG (build > 3d38043e6b9da2bab38490a23dda2103368f4e0a) > *** > Welcome to the Impala shell. > (Impala Shell v3.1.0-SNAPSHOT (3d38043) built on Mon Dec 3 15:50:55 PST 2018) > After running a query, type SUMMARY to see a summary of where time was spent. > *** > [localhost:21000] default> :shutdown(100); > Query: :shutdown(100) > Query submitted at: 2018-12-05 19:22:07 (Coordinator: http://lv-desktop:25000) > Query progress can be monitored at: > http://lv-desktop:25000/query_plan?query_id=2f41eaf1c21603b8:d6546bf1 > +---+ > | summary > | > +---+ > | startup grace period left: 2m, deadline left: 1m40s, fragment instances: 0, > queries registered: 1 | > +---+ > Fetched 1 row(s) in 0.11s > [localhost:21000] default> :shutdown(10); > Query: :shutdown(10) > Query submitted at: 2018-12-05 19:22:10 (Coordinator: http://lv-desktop:25000) > ERROR: Server is being shut down: startup grace period left: 1m56s, deadline > left: 1m36s, fragment instances: 0, queries registered: 0. 
> [localhost:21000] default> :shutdown(1000); > Query: :shutdown(1000) > Query submitted at: 2018-12-05 19:22:14 (Coordinator: http://lv-desktop:25000) > ERROR: Server is being shut down: startup grace period left: 1m52s, deadline > left: 1m32s, fragment instances: 0, queries registered: 0. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-7931) test_shutdown_executor fails with timeout waiting for query target state
[ https://issues.apache.org/jira/browse/IMPALA-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Volker updated IMPALA-7931: Description: On a recent S3 test run test_shutdown_executor hit a timeout waiting for a query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION). {noformat} 12:51:11 __ TestShutdownCommand.test_shutdown_executor __ 12:51:11 custom_cluster/test_restart_services.py:209: in test_shutdown_executor 12:51:11 assert self.__fetch_and_get_num_backends(QUERY, before_shutdown_handle) == 3 12:51:11 custom_cluster/test_restart_services.py:356: in __fetch_and_get_num_backends 12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20) 12:51:11 common/impala_service.py:267: in wait_for_query_state 12:51:11 target_state, query_state) 12:51:11 E AssertionError: Did not reach query state in time target=4 actual=5 {noformat} >From the logs I can see that the query fails because one of the executors >becomes unreachable: {noformat} I1204 12:31:39.954125 5609 impala-server.cc:1792] Query a34c3a84775e5599:b2b25eb9: Failed due to unreachable impalad(s): jenkins-worker:22001 {noformat} The query was {{select count\(*) from functional_parquet.alltypes where sleep(1) = bool_col}}. It seems that the query took longer than expected and was still running when the executor shut down. was: On a recent S3 test run test_shutdown_executor hit a timeout waiting for a query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION). 
{noformat} 12:51:11 __ TestShutdownCommand.test_shutdown_executor __ 12:51:11 custom_cluster/test_restart_services.py:209: in test_shutdown_executor 12:51:11 assert self.__fetch_and_get_num_backends(QUERY, before_shutdown_handle) == 3 12:51:11 custom_cluster/test_restart_services.py:356: in __fetch_and_get_num_backends 12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20) 12:51:11 common/impala_service.py:267: in wait_for_query_state 12:51:11 target_state, query_state) 12:51:11 E AssertionError: Did not reach query state in time target=4 actual=5 {noformat} >From the logs I can see that the query fails because one of the executors >becomes unreachable: {noformat} I1204 12:31:39.954125 5609 impala-server.cc:1792] Query a34c3a84775e5599:b2b25eb9: Failed due to unreachable impalad(s): jenkins-worker:22001 {noformat} The query was {{select count(*) from functional_parquet.alltypes where sleep(1) = bool_col}}. It seems that the query took longer than expected and was still running when the executor shut down. > test_shutdown_executor fails with timeout waiting for query target state > > > Key: IMPALA-7931 > URL: https://issues.apache.org/jira/browse/IMPALA-7931 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.2.0 >Reporter: Lars Volker >Priority: Critical > Labels: broken-build > > On a recent S3 test run test_shutdown_executor hit a timeout waiting for a > query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION). 
> {noformat} > 12:51:11 __ TestShutdownCommand.test_shutdown_executor > __ > 12:51:11 custom_cluster/test_restart_services.py:209: in > test_shutdown_executor > 12:51:11 assert self.__fetch_and_get_num_backends(QUERY, > before_shutdown_handle) == 3 > 12:51:11 custom_cluster/test_restart_services.py:356: in > __fetch_and_get_num_backends > 12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20) > 12:51:11 common/impala_service.py:267: in wait_for_query_state > 12:51:11 target_state, query_state) > 12:51:11 E AssertionError: Did not reach query state in time target=4 > actual=5 > {noformat} > From the logs I can see that the query fails because one of the executors > becomes unreachable: > {noformat} > I1204 12:31:39.954125 5609 impala-server.cc:1792] Query > a34c3a84775e5599:b2b25eb9: Failed due to unreachable impalad(s): > jenkins-worker:22001 > {noformat} > The query was {{select count\(*) from functional_parquet.alltypes where > sleep(1) = bool_col}}. > It seems that the query took longer than expected and was still running when > the executor shut down. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-7932) Cannot change shutdown deadline after issuing initial shutdown command
Lars Volker created IMPALA-7932:
---
Summary: Cannot change shutdown deadline after issuing initial shutdown command
Key: IMPALA-7932
URL: https://issues.apache.org/jira/browse/IMPALA-7932
Project: IMPALA
Issue Type: Bug
Components: Backend
Affects Versions: Impala 3.1.0, Impala 3.2.0
Reporter: Lars Volker
Assignee: Tim Armstrong

{noformat}
Starting Impala Shell without Kerberos authentication
Opened TCP connection to localhost:21000
Connected to localhost:21000
Server version: impalad version 3.1.0-SNAPSHOT DEBUG (build 3d38043e6b9da2bab38490a23dda2103368f4e0a)
***
Welcome to the Impala shell.
(Impala Shell v3.1.0-SNAPSHOT (3d38043) built on Mon Dec 3 15:50:55 PST 2018)
After running a query, type SUMMARY to see a summary of where time was spent.
***
[localhost:21000] default> :shutdown(100);
Query: :shutdown(100)
Query submitted at: 2018-12-05 19:22:07 (Coordinator: http://lv-desktop:25000)
Query progress can be monitored at: http://lv-desktop:25000/query_plan?query_id=2f41eaf1c21603b8:d6546bf1
+---+
| summary |
+---+
| startup grace period left: 2m, deadline left: 1m40s, fragment instances: 0, queries registered: 1 |
+---+
Fetched 1 row(s) in 0.11s
[localhost:21000] default> :shutdown(10);
Query: :shutdown(10)
Query submitted at: 2018-12-05 19:22:10 (Coordinator: http://lv-desktop:25000)
ERROR: Server is being shut down: startup grace period left: 1m56s, deadline left: 1m36s, fragment instances: 0, queries registered: 0.
[localhost:21000] default> :shutdown(1000);
Query: :shutdown(1000)
Query submitted at: 2018-12-05 19:22:14 (Coordinator: http://lv-desktop:25000)
ERROR: Server is being shut down: startup grace period left: 1m52s, deadline left: 1m32s, fragment instances: 0, queries registered: 0.
{noformat}
[jira] [Assigned] (IMPALA-7931) test_shutdown_executor fails with timeout waiting for query target state
[ https://issues.apache.org/jira/browse/IMPALA-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Volker reassigned IMPALA-7931: --- Assignee: Tim Armstrong > test_shutdown_executor fails with timeout waiting for query target state > > > Key: IMPALA-7931 > URL: https://issues.apache.org/jira/browse/IMPALA-7931 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.2.0 >Reporter: Lars Volker >Assignee: Tim Armstrong >Priority: Critical > Labels: broken-build > > On a recent S3 test run test_shutdown_executor hit a timeout waiting for a > query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION). > {noformat} > 12:51:11 __ TestShutdownCommand.test_shutdown_executor > __ > 12:51:11 custom_cluster/test_restart_services.py:209: in > test_shutdown_executor > 12:51:11 assert self.__fetch_and_get_num_backends(QUERY, > before_shutdown_handle) == 3 > 12:51:11 custom_cluster/test_restart_services.py:356: in > __fetch_and_get_num_backends > 12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20) > 12:51:11 common/impala_service.py:267: in wait_for_query_state > 12:51:11 target_state, query_state) > 12:51:11 E AssertionError: Did not reach query state in time target=4 > actual=5 > {noformat} > From the logs I can see that the query fails because one of the executors > becomes unreachable: > {noformat} > I1204 12:31:39.954125 5609 impala-server.cc:1792] Query > a34c3a84775e5599:b2b25eb9: Failed due to unreachable impalad(s): > jenkins-worker:22001 > {noformat} > The query was {{select count\(*) from functional_parquet.alltypes where > sleep(1) = bool_col}}. > It seems that the query took longer than expected and was still running when > the executor shut down. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7931) test_shutdown_executor fails with timeout waiting for query target state
[ https://issues.apache.org/jira/browse/IMPALA-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710932#comment-16710932 ] Lars Volker commented on IMPALA-7931: - [~tarmstrong] - I’m assigning this to you thinking you might have an idea what’s going on here; feel free to find another person or assign back to me if you're swamped. > test_shutdown_executor fails with timeout waiting for query target state > > > Key: IMPALA-7931 > URL: https://issues.apache.org/jira/browse/IMPALA-7931 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.2.0 >Reporter: Lars Volker >Priority: Critical > Labels: broken-build > > On a recent S3 test run test_shutdown_executor hit a timeout waiting for a > query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION). > {noformat} > 12:51:11 __ TestShutdownCommand.test_shutdown_executor > __ > 12:51:11 custom_cluster/test_restart_services.py:209: in > test_shutdown_executor > 12:51:11 assert self.__fetch_and_get_num_backends(QUERY, > before_shutdown_handle) == 3 > 12:51:11 custom_cluster/test_restart_services.py:356: in > __fetch_and_get_num_backends > 12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20) > 12:51:11 common/impala_service.py:267: in wait_for_query_state > 12:51:11 target_state, query_state) > 12:51:11 E AssertionError: Did not reach query state in time target=4 > actual=5 > {noformat} > From the logs I can see that the query fails because one of the executors > becomes unreachable: > {noformat} > I1204 12:31:39.954125 5609 impala-server.cc:1792] Query > a34c3a84775e5599:b2b25eb9: Failed due to unreachable impalad(s): > jenkins-worker:22001 > {noformat} > The query was {{select count\(*) from functional_parquet.alltypes where > sleep(1) = bool_col}}. > It seems that the query took longer than expected and was still running when > the executor shut down. 
[jira] [Created] (IMPALA-7931) test_shutdown_executor fails with timeout waiting for query target state
Lars Volker created IMPALA-7931:
---
Summary: test_shutdown_executor fails with timeout waiting for query target state
Key: IMPALA-7931
URL: https://issues.apache.org/jira/browse/IMPALA-7931
Project: IMPALA
Issue Type: Bug
Components: Infrastructure
Affects Versions: Impala 3.2.0
Reporter: Lars Volker

On a recent S3 test run test_shutdown_executor hit a timeout waiting for a query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION).

{noformat}
12:51:11 __ TestShutdownCommand.test_shutdown_executor __
12:51:11 custom_cluster/test_restart_services.py:209: in test_shutdown_executor
12:51:11     assert self.__fetch_and_get_num_backends(QUERY, before_shutdown_handle) == 3
12:51:11 custom_cluster/test_restart_services.py:356: in __fetch_and_get_num_backends
12:51:11     self.client.QUERY_STATES['FINISHED'], timeout=20)
12:51:11 common/impala_service.py:267: in wait_for_query_state
12:51:11     target_state, query_state)
12:51:11 E   AssertionError: Did not reach query state in time target=4 actual=5
{noformat}

From the logs I can see that the query fails because one of the executors becomes unreachable:

{noformat}
I1204 12:31:39.954125  5609 impala-server.cc:1792] Query a34c3a84775e5599:b2b25eb9: Failed due to unreachable impalad(s): jenkins-worker:22001
{noformat}

The query was {{select count(*) from functional_parquet.alltypes where sleep(1) = bool_col}}.

It seems that the query took longer than expected and was still running when the executor shut down.
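The failure mode here is a classic poll-until-state race: the test waits a fixed 20 seconds for FINISHED, but with {{sleep(1)}} per row the query can legitimately outlive both the timeout and the executor's shutdown grace period. The waiting pattern the test relies on can be sketched generically like this (in Java for illustration; the real helper is the Python wait_for_query_state in common/impala_service.py):

```java
import java.util.function.Supplier;

public class StateWaiter {
    // Poll a state supplier until it reports the target state or the
    // timeout elapses. Returns true iff the target state was observed.
    // Note the inherent flakiness: a slow-but-healthy query simply
    // returns false here, indistinguishable from a real failure.
    public static boolean waitForState(Supplier<Integer> currentState, int target,
                                       long timeoutMs, long pollIntervalMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (currentState.get() == target) return true;
            Thread.sleep(pollIntervalMs);
        }
        // One final check in case the state changed during the last sleep.
        return currentState.get() == target;
    }
}
```

A fix along the lines suggested by the report would either make the query cheap enough to finish well within the timeout, or size the timeout to the query's worst-case runtime rather than a fixed 20s.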
[jira] [Updated] (IMPALA-7931) test_shutdown_executor fails with timeout waiting for query target state
[ https://issues.apache.org/jira/browse/IMPALA-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Volker updated IMPALA-7931: Labels: broken-build (was: ) > test_shutdown_executor fails with timeout waiting for query target state > > > Key: IMPALA-7931 > URL: https://issues.apache.org/jira/browse/IMPALA-7931 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.2.0 >Reporter: Lars Volker >Priority: Critical > Labels: broken-build > > On a recent S3 test run test_shutdown_executor hit a timeout waiting for a > query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION). > {noformat} > 12:51:11 __ TestShutdownCommand.test_shutdown_executor > __ > 12:51:11 custom_cluster/test_restart_services.py:209: in > test_shutdown_executor > 12:51:11 assert self.__fetch_and_get_num_backends(QUERY, > before_shutdown_handle) == 3 > 12:51:11 custom_cluster/test_restart_services.py:356: in > __fetch_and_get_num_backends > 12:51:11 self.client.QUERY_STATES['FINISHED'], timeout=20) > 12:51:11 common/impala_service.py:267: in wait_for_query_state > 12:51:11 target_state, query_state) > 12:51:11 E AssertionError: Did not reach query state in time target=4 > actual=5 > {noformat} > From the logs I can see that the query fails because one of the executors > becomes unreachable: > {noformat} > I1204 12:31:39.954125 5609 impala-server.cc:1792] Query > a34c3a84775e5599:b2b25eb9: Failed due to unreachable impalad(s): > jenkins-worker:22001 > {noformat} > The query was {{select count(*) from functional_parquet.alltypes where > sleep(1) = bool_col}}. > It seems that the query took longer than expected and was still running when > the executor shut down. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7906) Crash in JVM PSPromotionManager::copy_to_survivor_space
[ https://issues.apache.org/jira/browse/IMPALA-7906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710881#comment-16710881 ] Tim Armstrong commented on IMPALA-7906: --- I tried looping some tests to reproduce with no luck. > Crash in JVM PSPromotionManager::copy_to_survivor_space > --- > > Key: IMPALA-7906 > URL: https://issues.apache.org/jira/browse/IMPALA-7906 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Critical > Labels: broken-build, crash > Attachments: hs_err_pid6290.log > > > {noformat} > #0 0x7f44ca5261f7 in raise () from /lib64/libc.so.6 > #1 0x7f44ca5278e8 in abort () from /lib64/libc.so.6 > #2 0x7f44cd726185 in os::abort(bool) () from > /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #3 0x7f44cd8c8593 in VMError::report_and_die() () from > /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #4 0x7f44cd8c8a7e in crash_handler(int, siginfo*, void*) () from > /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #5 0x7f44cd724f72 in os::Linux::chained_handler(int, siginfo*, void*) () > from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #6 0x7f44cd72b5f6 in JVM_handle_linux_signal () from > /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #7 0x7f44cd721be3 in signalHandler(int, siginfo*, void*) () from > /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #8 > #9 0x7f44cd713e95 in oopDesc::print_on(outputStream*) const () from > /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #10 0x7f44cd72afdb in os::print_register_info(outputStream*, void*) () > from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #11 0x7f44cd8c6c13 in VMError::report(outputStream*) () from > /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #12 0x7f44cd8c818a in VMError::report_and_die() () from > /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #13 0x7f44cd72b68f in JVM_handle_linux_signal () 
from > /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #14 0x7f44cd721be3 in signalHandler(int, siginfo*, void*) () from > /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #15 > #16 0x7f44cd78f562 in oopDesc* > PSPromotionManager::copy_to_survivor_space(oopDesc*) () from > /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #17 0x7f44cd7924a5 in PSRootsClosure::do_oop(oopDesc**) () from > /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #18 0x7f44cd716a96 in InterpreterOopMap::iterate_oop(OffsetClosure*) > const () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #19 0x7f44cd38f789 in frame::oops_interpreted_do(OopClosure*, > CLDClosure*, RegisterMap const*, bool) () from > /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #20 0x7f44cd86eaa1 in JavaThread::oops_do(OopClosure*, CLDClosure*, > CodeBlobClosure*) () from > /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #21 0x7f44cd79270f in ThreadRootsTask::do_it(GCTaskManager*, unsigned > int) () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #22 0x7f44cd3d7ecf in GCTaskThread::run() () from > /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #23 0x7f44cd727338 in java_start(Thread*) () from > /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so > #24 0x7f44ca8bbe25 in start_thread () from /lib64/libpthread.so.0 > #25 0x7f44ca5e934d in clone () from /lib64/libc.so.6 > {noformat} > These are the tests running at the time > {noformat} > 006:53:04 [gw1] PASSED > query_test/test_mem_usage_scaling.py::TestQueryMemLimitScaling::test_mem_usage_scaling[mem_limit: > -1 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': > 0} | table_format: parquet/none] > 06:53:07 > query_test/test_mem_usage_scaling.py::TestQueryMemLimitScaling::test_mem_usage_scaling[mem_limit: > 400m | protocol: beeswax | 
exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': > 0} | table_format: parquet/none] > 06:53:07 [gw5] PASSED > query_test/test_analytic_tpcds.py::TestAnalyticTpcds::test_analytic_functions_tpcds[batch_size: > 1 | protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': > 0} | table_format: parquet/none] > 06:53:08 >
[jira] [Commented] (IMPALA-7930) Crash in thrift-server-test
[ https://issues.apache.org/jira/browse/IMPALA-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710856#comment-16710856 ] Lars Volker commented on IMPALA-7930: - [~twmarshall] - I’m assigning this to you thinking you might have an idea what’s going on here; feel free to find another person or assign back to me if you're swamped. > Crash in thrift-server-test > --- > > Key: IMPALA-7930 > URL: https://issues.apache.org/jira/browse/IMPALA-7930 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: Lars Volker >Priority: Critical > Labels: broken-build, flaky > > I've seen a crash in thrift-server-test during an exhaustive test run. > Unfortunately the core file indicated that it was written by a directory, > which caused the automatic core dump resolution to fail. Here's the resolved > minidump: > {noformat} > Crash reason: SIGABRT > Crash address: 0x7d11d19 > Process uptime: not available > Thread 0 (crashed) > 0 libc-2.17.so + 0x351f7 > rax = 0x rdx = 0x0006 > rcx = 0x rbx = 0x7f1e65876000 > rsi = 0x1d19 rdi = 0x1d19 > rbp = 0x7f1e61dbde68 rsp = 0x7fffc22796d8 > r8 = 0x000a1a10r9 = 0xfefefefefeff092d > r10 = 0x0008 r11 = 0x0202 > r12 = 0x029dca31 r13 = 0x033a5e00 > r14 = 0x r15 = 0x > rip = 0x7f1e61c721f7 > Found by: given as instruction pointer in context > 1 libc-2.17.so + 0x368e8 > rsp = 0x7fffc22796e0 rip = 0x7f1e61c738e8 > Found by: stack scanning > 2 libc-2.17.so + 0x17df70 > rsp = 0x7fffc2279770 rip = 0x7f1e61dbaf70 > Found by: stack scanning > 3 thrift-server-test!_fini + 0xdf918 > rsp = 0x7fffc2279778 rip = 0x02ab0288 > Found by: stack scanning > 4 libc-2.17.so + 0x2fbc3 > rsp = 0x7fffc2279790 rip = 0x7f1e61c6cbc3 > Found by: stack scanning > 5 > thrift-server-test!testing::internal::TestEventRepeater::OnTestProgramEnd(testing::UnitTest > const&) + 0x55 > rsp = 0x7fffc22797b0 rip = 0x028711f5 > Found by: stack scanning > 6 libc-2.17.so + 0x17df70 > rbx = 0x rbp = 0x > rsp = 
0x7fffc22797e0 r12 = 0x > r13 = 0x0005 rip = 0x7f1e61dbaf70 > Found by: call frame info > 7 thrift-server-test!_fini + 0xc0c1 > rsp = 0x7fffc22797f0 rip = 0x029dca31 > Found by: stack scanning > 8 thrift-server-test!_fini + 0x9d5490 > rsp = 0x7fffc22797f8 rip = 0x033a5e00 > Found by: stack scanning > 9 libc-2.17.so + 0x180e68 > rsp = 0x7fffc2279808 rip = 0x7f1e61dbde68 > Found by: stack scanning > 10 libc-2.17.so + 0x2e266 > rsp = 0x7fffc2279810 rip = 0x7f1e61c6b266 > Found by: stack scanning > 11 thrift-server-test!_fini + 0x9d5490 > rsp = 0x7fffc2279818 rip = 0x033a5e00 > Found by: stack scanning > 12 libc-2.17.so + 0x17df70 > rsp = 0x7fffc2279820 rip = 0x7f1e61dbaf70 > Found by: stack scanning > 13 thrift-server-test!_fini + 0xc0c1 > rsp = 0x7fffc2279828 rip = 0x029dca31 > Found by: stack scanning > 14 thrift-server-test!_fini + 0xdf918 > rsp = 0x7fffc2279840 rip = 0x02ab0288 > Found by: stack scanning > 15 thrift-server-test!_fini + 0x9d5490 > rsp = 0x7fffc2279850 rip = 0x033a5e00 > Found by: stack scanning > 16 thrift-server-test!_fini + 0xc0c1 > rsp = 0x7fffc2279860 rip = 0x029dca31 > Found by: stack scanning > 17 thrift-server-test!_fini + 0x9d5490 > rsp = 0x7fffc2279870 rip = 0x033a5e00 > Found by: stack scanning > 18 thrift-server-test!_fini + 0xc0c1 > rsp = 0x7fffc2279878 rip = 0x029dca31 > Found by: stack scanning > 19 thrift-server-test!_fini + 0xdf918 > rsp = 0x7fffc2279880 rip = 0x02ab0288 > Found by: stack scanning > 20 libc-2.17.so + 0x2e312 > rsp = 0x7fffc2279890 rip = 0x7f1e61c6b312 > Found by: stack scanning > 21 > thrift-server-test!boost::shared_array::~shared_array() > + 0x70 > rsp = 0x7fffc22798b0 rip = 0x02719b40 > Found by: stack scanning > 22 > thrift-server-test!boost::detail::sp_counted_impl_p::dispose() > + 0x4f > rsp = 0x7fffc22798c0 rip = 0x0271e1af > Found by: stack scanning > 23 > thrift-server-test!boost::detail::sp_counted_impl_pd boost::checked_array_deleter > >::dispose() + 0xaa > rbx = 0x042f7128 rsp = 0x7fffc22798d0 > rip =
[jira] [Commented] (IMPALA-7351) Add memory estimates for plan nodes and sinks with missing estimates
[ https://issues.apache.org/jira/browse/IMPALA-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710865#comment-16710865 ] Bikramjeet Vig commented on IMPALA-7351: Yes, I still have to look at sinks. Will address those soon. > Add memory estimates for plan nodes and sinks with missing estimates > > > Key: IMPALA-7351 > URL: https://issues.apache.org/jira/browse/IMPALA-7351 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend >Reporter: Tim Armstrong >Assignee: Bikramjeet Vig >Priority: Major > Labels: admission-control, resource-management > > Many plan nodes and sinks, e.g. KuduScanNode, KuduTableSink, ExchangeNode, > etc are missing memory estimates entirely. > We should add a basic estimate for all these cases based on experiments and > data from real workloads. In some cases 0 may be the right estimate (e.g. for > streaming nodes like SelectNode that just pass through data) but we should > remove TODOs and document the reasoning in those cases. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-7930) Crash in thrift-server-test
[ https://issues.apache.org/jira/browse/IMPALA-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Volker reassigned IMPALA-7930: --- Assignee: Lars Volker > Crash in thrift-server-test > --- > > Key: IMPALA-7930 > URL: https://issues.apache.org/jira/browse/IMPALA-7930 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: Lars Volker >Assignee: Lars Volker >Priority: Critical > Labels: broken-build, flaky > > I've seen a crash in thrift-server-test during an exhaustive test run. > Unfortunately the core file indicated that it was written by a directory, > which caused the automatic core dump resolution to fail. Here's the resolved > minidump: > {noformat} > Crash reason: SIGABRT > Crash address: 0x7d11d19 > Process uptime: not available > Thread 0 (crashed) > 0 libc-2.17.so + 0x351f7 > rax = 0x rdx = 0x0006 > rcx = 0x rbx = 0x7f1e65876000 > rsi = 0x1d19 rdi = 0x1d19 > rbp = 0x7f1e61dbde68 rsp = 0x7fffc22796d8 > r8 = 0x000a1a10r9 = 0xfefefefefeff092d > r10 = 0x0008 r11 = 0x0202 > r12 = 0x029dca31 r13 = 0x033a5e00 > r14 = 0x r15 = 0x > rip = 0x7f1e61c721f7 > Found by: given as instruction pointer in context > 1 libc-2.17.so + 0x368e8 > rsp = 0x7fffc22796e0 rip = 0x7f1e61c738e8 > Found by: stack scanning > 2 libc-2.17.so + 0x17df70 > rsp = 0x7fffc2279770 rip = 0x7f1e61dbaf70 > Found by: stack scanning > 3 thrift-server-test!_fini + 0xdf918 > rsp = 0x7fffc2279778 rip = 0x02ab0288 > Found by: stack scanning > 4 libc-2.17.so + 0x2fbc3 > rsp = 0x7fffc2279790 rip = 0x7f1e61c6cbc3 > Found by: stack scanning > 5 > thrift-server-test!testing::internal::TestEventRepeater::OnTestProgramEnd(testing::UnitTest > const&) + 0x55 > rsp = 0x7fffc22797b0 rip = 0x028711f5 > Found by: stack scanning > 6 libc-2.17.so + 0x17df70 > rbx = 0x rbp = 0x > rsp = 0x7fffc22797e0 r12 = 0x > r13 = 0x0005 rip = 0x7f1e61dbaf70 > Found by: call frame info > 7 thrift-server-test!_fini + 0xc0c1 > rsp = 0x7fffc22797f0 rip = 
0x029dca31 > Found by: stack scanning > 8 thrift-server-test!_fini + 0x9d5490 > rsp = 0x7fffc22797f8 rip = 0x033a5e00 > Found by: stack scanning > 9 libc-2.17.so + 0x180e68 > rsp = 0x7fffc2279808 rip = 0x7f1e61dbde68 > Found by: stack scanning > 10 libc-2.17.so + 0x2e266 > rsp = 0x7fffc2279810 rip = 0x7f1e61c6b266 > Found by: stack scanning > 11 thrift-server-test!_fini + 0x9d5490 > rsp = 0x7fffc2279818 rip = 0x033a5e00 > Found by: stack scanning > 12 libc-2.17.so + 0x17df70 > rsp = 0x7fffc2279820 rip = 0x7f1e61dbaf70 > Found by: stack scanning > 13 thrift-server-test!_fini + 0xc0c1 > rsp = 0x7fffc2279828 rip = 0x029dca31 > Found by: stack scanning > 14 thrift-server-test!_fini + 0xdf918 > rsp = 0x7fffc2279840 rip = 0x02ab0288 > Found by: stack scanning > 15 thrift-server-test!_fini + 0x9d5490 > rsp = 0x7fffc2279850 rip = 0x033a5e00 > Found by: stack scanning > 16 thrift-server-test!_fini + 0xc0c1 > rsp = 0x7fffc2279860 rip = 0x029dca31 > Found by: stack scanning > 17 thrift-server-test!_fini + 0x9d5490 > rsp = 0x7fffc2279870 rip = 0x033a5e00 > Found by: stack scanning > 18 thrift-server-test!_fini + 0xc0c1 > rsp = 0x7fffc2279878 rip = 0x029dca31 > Found by: stack scanning > 19 thrift-server-test!_fini + 0xdf918 > rsp = 0x7fffc2279880 rip = 0x02ab0288 > Found by: stack scanning > 20 libc-2.17.so + 0x2e312 > rsp = 0x7fffc2279890 rip = 0x7f1e61c6b312 > Found by: stack scanning > 21 > thrift-server-test!boost::shared_array::~shared_array() > + 0x70 > rsp = 0x7fffc22798b0 rip = 0x02719b40 > Found by: stack scanning > 22 > thrift-server-test!boost::detail::sp_counted_impl_p::dispose() > + 0x4f > rsp = 0x7fffc22798c0 rip = 0x0271e1af > Found by: stack scanning > 23 > thrift-server-test!boost::detail::sp_counted_impl_pd boost::checked_array_deleter > >::dispose() + 0xaa > rbx = 0x042f7128 rsp = 0x7fffc22798d0 > rip = 0x02719cfa > Found by: call frame info > 24 > thrift-server-test!boost::shared_array::~shared_array() > + 0x39 > rbx = 0x0436f900
[jira] [Assigned] (IMPALA-7930) Crash in thrift-server-test
[ https://issues.apache.org/jira/browse/IMPALA-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Volker reassigned IMPALA-7930: --- Assignee: Thomas Tauber-Marshall (was: Lars Volker) > Crash in thrift-server-test > --- > > Key: IMPALA-7930 > URL: https://issues.apache.org/jira/browse/IMPALA-7930 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: Lars Volker >Assignee: Thomas Tauber-Marshall >Priority: Critical > Labels: broken-build, flaky
[jira] [Created] (IMPALA-7930) Crash in thrift-server-test
Lars Volker created IMPALA-7930: --- Summary: Crash in thrift-server-test Key: IMPALA-7930 URL: https://issues.apache.org/jira/browse/IMPALA-7930 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.2.0 Reporter: Lars Volker I've seen a crash in thrift-server-test during an exhaustive test run. Unfortunately the core file indicated that it was written by a directory, which caused the automatic core dump resolution to fail. Here's the resolved minidump: {noformat} Crash reason: SIGABRT Crash address: 0x7d11d19 Process uptime: not available Thread 0 (crashed) 0 libc-2.17.so + 0x351f7 rax = 0x rdx = 0x0006 rcx = 0x rbx = 0x7f1e65876000 rsi = 0x1d19 rdi = 0x1d19 rbp = 0x7f1e61dbde68 rsp = 0x7fffc22796d8 r8 = 0x000a1a10r9 = 0xfefefefefeff092d r10 = 0x0008 r11 = 0x0202 r12 = 0x029dca31 r13 = 0x033a5e00 r14 = 0x r15 = 0x rip = 0x7f1e61c721f7 Found by: given as instruction pointer in context 1 libc-2.17.so + 0x368e8 rsp = 0x7fffc22796e0 rip = 0x7f1e61c738e8 Found by: stack scanning 2 libc-2.17.so + 0x17df70 rsp = 0x7fffc2279770 rip = 0x7f1e61dbaf70 Found by: stack scanning 3 thrift-server-test!_fini + 0xdf918 rsp = 0x7fffc2279778 rip = 0x02ab0288 Found by: stack scanning 4 libc-2.17.so + 0x2fbc3 rsp = 0x7fffc2279790 rip = 0x7f1e61c6cbc3 Found by: stack scanning 5 thrift-server-test!testing::internal::TestEventRepeater::OnTestProgramEnd(testing::UnitTest const&) + 0x55 rsp = 0x7fffc22797b0 rip = 0x028711f5 Found by: stack scanning 6 libc-2.17.so + 0x17df70 rbx = 0x rbp = 0x rsp = 0x7fffc22797e0 r12 = 0x r13 = 0x0005 rip = 0x7f1e61dbaf70 Found by: call frame info 7 thrift-server-test!_fini + 0xc0c1 rsp = 0x7fffc22797f0 rip = 0x029dca31 Found by: stack scanning 8 thrift-server-test!_fini + 0x9d5490 rsp = 0x7fffc22797f8 rip = 0x033a5e00 Found by: stack scanning 9 libc-2.17.so + 0x180e68 rsp = 0x7fffc2279808 rip = 0x7f1e61dbde68 Found by: stack scanning 10 libc-2.17.so + 0x2e266 rsp = 0x7fffc2279810 rip = 0x7f1e61c6b266 Found by: stack scanning 11 
thrift-server-test!_fini + 0x9d5490 rsp = 0x7fffc2279818 rip = 0x033a5e00 Found by: stack scanning 12 libc-2.17.so + 0x17df70 rsp = 0x7fffc2279820 rip = 0x7f1e61dbaf70 Found by: stack scanning 13 thrift-server-test!_fini + 0xc0c1 rsp = 0x7fffc2279828 rip = 0x029dca31 Found by: stack scanning 14 thrift-server-test!_fini + 0xdf918 rsp = 0x7fffc2279840 rip = 0x02ab0288 Found by: stack scanning 15 thrift-server-test!_fini + 0x9d5490 rsp = 0x7fffc2279850 rip = 0x033a5e00 Found by: stack scanning 16 thrift-server-test!_fini + 0xc0c1 rsp = 0x7fffc2279860 rip = 0x029dca31 Found by: stack scanning 17 thrift-server-test!_fini + 0x9d5490 rsp = 0x7fffc2279870 rip = 0x033a5e00 Found by: stack scanning 18 thrift-server-test!_fini + 0xc0c1 rsp = 0x7fffc2279878 rip = 0x029dca31 Found by: stack scanning 19 thrift-server-test!_fini + 0xdf918 rsp = 0x7fffc2279880 rip = 0x02ab0288 Found by: stack scanning 20 libc-2.17.so + 0x2e312 rsp = 0x7fffc2279890 rip = 0x7f1e61c6b312 Found by: stack scanning 21 thrift-server-test!boost::shared_array::~shared_array() + 0x70 rsp = 0x7fffc22798b0 rip = 0x02719b40 Found by: stack scanning 22 thrift-server-test!boost::detail::sp_counted_impl_p::dispose() + 0x4f rsp = 0x7fffc22798c0 rip = 0x0271e1af Found by: stack scanning 23 thrift-server-test!boost::detail::sp_counted_impl_pd >::dispose() + 0xaa rbx = 0x042f7128 rsp = 0x7fffc22798d0 rip = 0x02719cfa Found by: call frame info 24 thrift-server-test!boost::shared_array::~shared_array() + 0x39 rbx = 0x0436f900 rbp = 0x rsp = 0x7fffc2279900 r12 = 0x0001 r13 = 0x04323680 r14 = 0x rip = 0x02719b09 Found by: call frame info 25 libc-2.17.so + 0x38a69 rbx = 0x rbp = 0x7f1e61ff96c8 rsp = 0x7fffc2279920 r12 = 0x0001 r13 = 0x04323680 r14 = 0x rip = 0x7f1e61c75a69 Found by: call frame info 26 thrift-server-test!_GLOBAL__sub_I_json_escaping.cc + 0x2e rsp =
[jira] [Commented] (IMPALA-7802) Implement support for closing idle sessions
[ https://issues.apache.org/jira/browse/IMPALA-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710806#comment-16710806 ] Tim Armstrong commented on IMPALA-7802: --- [~zoram] The thing it does do is report a meaningful error message when the user comes back and tries to use the session - i.e. "Client session expired due to more than...". E.g. a quick google revealed this forum thread where the error message pointed the user in the right direction, https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Query-blah-expired-due-to-client-inactivity-timeout-is-10m/m-p/66842. Maybe the error reporting isn't worth the other hassles though - or maybe we just need to set clearer expectations for client behaviour - that they need to handle sessions being terminated in this way. > Implement support for closing idle sessions > --- > > Key: IMPALA-7802 > URL: https://issues.apache.org/jira/browse/IMPALA-7802 > Project: IMPALA > Issue Type: Improvement > Components: Clients >Affects Versions: Impala 3.0, Impala 2.12.0 >Reporter: Michael Ho >Assignee: Zoram Thanga >Priority: Critical > Labels: supportability > > Currently, the query option {{idle_session_timeout}} specifies a timeout in > seconds after which all running queries of that idle session will be > cancelled and no new queries can be issued to it. However, the idle session > will remain open and it needs to be closed explicitly. Please see the > [documentation|https://www.cloudera.com/documentation/enterprise/latest/topics/impala_idle_session_timeout.html] > for details. > This behavior may be undesirable as each session still consumes an Impala > frontend service thread. The number of frontend service threads is bound by > the flag {{fe_service_threads}}. So, in a multi-tenant environment, an Impala > server can have a lot of idle sessions but they still consume against the > quota of {{fe_service_threads}}.
If the number of sessions established > reaches {{fe_service_threads}}, all new session creations will block until > some of the existing sessions exit. There may be no time bound on when these > zombie idle sessions will be closed and it's at the mercy of the client > implementation to close them. In some sense, leaving many idle sessions open > is a way to launch a denial of service attack on Impala. > To fix this situation, we should have an option to forcefully close a session > when it's considered idle so it won't unnecessarily consume the limited > number of frontend service threads. cc'ing [~zoram] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
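The force-close behaviour proposed in this ticket can be sketched as a periodic reaper that closes any session idle longer than the timeout, freeing its frontend service thread. This is an illustrative model only - `Session` and `reap_idle_sessions` are made-up names, not Impala's actual classes:

```python
import time

class Session:
    """Minimal stand-in for a client session tracked by the server."""
    def __init__(self, session_id, last_active):
        self.session_id = session_id
        self.last_active = last_active  # seconds since epoch of last activity
        self.closed = False

    def close(self):
        self.closed = True

def reap_idle_sessions(sessions, idle_session_timeout_s, now=None):
    """Close every session idle past the timeout; return the ids closed."""
    now = time.time() if now is None else now
    closed = []
    for s in sessions:
        if not s.closed and now - s.last_active > idle_session_timeout_s:
            s.close()  # frees the frontend service thread for new clients
            closed.append(s.session_id)
    return closed
```

The point of the sketch is the contrast with current behaviour: today only the session's queries are cancelled, while the session object (and its thread) lives on until the client closes it.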
[jira] [Comment Edited] (IMPALA-7802) Implement support for closing idle sessions
[ https://issues.apache.org/jira/browse/IMPALA-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710806#comment-16710806 ] Tim Armstrong edited comment on IMPALA-7802 at 12/6/18 12:38 AM: - [~zoram] The thing it does do is report a meaningful error message when the user comes back and tries to use the session - i.e. "Client session expired due to more than...". E.g. a quick google revealed this forum thread where the error message pointed the user in the right direction, https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Query-blah-expired-due-to-client-inactivity-timeout-is-10m/m-p/66842. Maybe the error reporting isn't worth the other hassles though - or maybe we just need to set clearer expectations for client behaviour - that they need to handle sessions being terminated in this way. Edit: definitely glad that you're pushing on this though, the current state of things isn't right. was (Author: tarmstrong): [~zoram] The thing it does do is report a meaningful error message when the user comes back and tries to use the session - i.e. "Client session expired due to more than...". E.g. a quick google revealed this forum thread where the error message pointed the user in the right direction, https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Query-blah-expired-due-to-client-inactivity-timeout-is-10m/m-p/66842. Maybe the error reporting isn't worth the other hassles though - or maybe we just need to set clearer expectations for client behaviour - that they need to handle sessions being terminated in this way.
> Implement support for closing idle sessions > --- > > Key: IMPALA-7802 > URL: https://issues.apache.org/jira/browse/IMPALA-7802 > Project: IMPALA > Issue Type: Improvement > Components: Clients >Affects Versions: Impala 3.0, Impala 2.12.0 >Reporter: Michael Ho >Assignee: Zoram Thanga >Priority: Critical > Labels: supportability > > Currently, the query option {{idle_session_timeout}} specifies a timeout in > seconds after which all running queries of that idle session will be > cancelled and no new queries can be issued to it. However, the idle session > will remain open and it needs to be closed explicitly. Please see the > [documentation|https://www.cloudera.com/documentation/enterprise/latest/topics/impala_idle_session_timeout.html] > for details. > This behavior may be undesirable as each session still consumes an Impala > frontend service thread. The number of frontend service threads is bound by > the flag {{fe_service_threads}}. So, in a multi-tenant environment, an Impala > server can have a lot of idle sessions but they still consume against the > quota of {{fe_service_threads}}. If the number of sessions established > reaches {{fe_service_threads}}, all new session creations will block until > some of the existing sessions exit. There may be no time bound on when these > zombie idle sessions will be closed and it's at the mercy of the client > implementation to close them. In some sense, leaving many idle sessions open > is a way to launch a denial of service attack on Impala. > To fix this situation, we should have an option to forcefully close a session > when it's considered idle so it won't unnecessarily consume the limited > number of frontend service threads. cc'ing [~zoram] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-5397) Set "End Time" earlier rather than on unregistration.
[ https://issues.apache.org/jira/browse/IMPALA-5397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-5397: -- Labels: admission-control query-lifecycle (was: query-lifecycle) > Set "End Time" earlier rather than on unregistration. > - > > Key: IMPALA-5397 > URL: https://issues.apache.org/jira/browse/IMPALA-5397 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.9.0 >Reporter: Mostafa Mokhtar >Priority: Major > Labels: admission-control, query-lifecycle > > When queries are executed from Hue and hit the idle query timeout then the > query duration keeps going up even though the query was cancelled and it is > not actually doing any more work. The end time is only set when the query is > actually unregistered. > Queries below finished in 1s640ms while the reported time is much longer. > |User||Default Db||Statement||Query Type||Start Time||Waiting > Time||Duration||Scan Progress||State||Last Event||# rows fetched||Resource > Pool||Details||Action| > |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select > count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 > 09:38:20.472804000|4m27s|4m32s|261 / 261 ( 100%)|FINISHED|First row > fetched|1|root.default|Details|Close| > |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select > count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 > 08:38:52.780237000|2017-05-31 09:38:20.289582000|59m27s|261 / 261 ( > 100%)|FINISHED|1|root.default|Details| -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-5397) Set "End Time" earlier rather than on unregistration.
[ https://issues.apache.org/jira/browse/IMPALA-5397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710781#comment-16710781 ] Tim Armstrong commented on IMPALA-5397: --- I think we should adopt a definition of "End Time" that more closely aligns to intuition, i.e. when the "real work" of the operation has completed. * For queries, where execution proceeds concurrently with results being fetched, End Time should be set when admission resources are released (when the query is cancelled, or all results are fetched). * For other operations, End Time should be set when the operation enters the FINISHED state. > Set "End Time" earlier rather than on unregistration. > - > > Key: IMPALA-5397 > URL: https://issues.apache.org/jira/browse/IMPALA-5397 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.9.0 >Reporter: Mostafa Mokhtar >Priority: Major > Labels: query-lifecycle > > When queries are executed from Hue and hit the idle query timeout then the > query duration keeps going up even though the query was cancelled and it is > not actually doing any more work. The end time is only set when the query is > actually unregistered. > Queries below finished in 1s640ms while the reported time is much longer. 
> |User||Default Db||Statement||Query Type||Start Time||Waiting > Time||Duration||Scan Progress||State||Last Event||# rows fetched||Resource > Pool||Details||Action| > |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select > count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 > 09:38:20.472804000|4m27s|4m32s|261 / 261 ( 100%)|FINISHED|First row > fetched|1|root.default|Details|Close| > |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select > count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 > 08:38:52.780237000|2017-05-31 09:38:20.289582000|59m27s|261 / 261 ( > 100%)|FINISHED|1|root.default|Details| -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
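The definition proposed in the comment above can be sketched as a tiny state model: End Time is recorded when admission resources are released (query cancelled or all results fetched), and unregistration merely falls back to it rather than defining it. This is a toy illustration of the proposal, not Impala's actual query lifecycle code:

```python
class QueryLifecycle:
    """Toy model of the proposed End Time semantics for IMPALA-5397."""
    def __init__(self):
        self.end_time = None

    def release_admission_resources(self, now):
        # Proposed: queries record End Time here (on cancellation or once all
        # results are fetched), not later at unregistration.
        if self.end_time is None:
            self.end_time = now

    def unregister(self, now):
        # Fallback only: a query that sits unregistered for an hour after its
        # real work finished no longer reports an hour-long duration.
        if self.end_time is None:
            self.end_time = now
```

Under this model the Hue scenario in the description reports ~1s640ms rather than the idle-timeout-inflated duration.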
[jira] [Updated] (IMPALA-5397) Set "End Time" earlier rather than on unregistration.
[ https://issues.apache.org/jira/browse/IMPALA-5397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-5397: -- Summary: Set "End Time" earlier rather than on unregistration. (was: Queries/sessions that are left idle after executing a query report incorrect duration ) > Set "End Time" earlier rather than on unregistration. > - > > Key: IMPALA-5397 > URL: https://issues.apache.org/jira/browse/IMPALA-5397 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.9.0 >Reporter: Mostafa Mokhtar >Priority: Major > Labels: query-lifecycle > > When queries are executed from Hue and hit the idle query timeout then the > query duration keeps going up even though the query was cancelled and it is > not actually doing any more work. The end time is only set when the query is > actually unregistered. > Queries below finished in 1s640ms while the reported time is much longer. > |User||Default Db||Statement||Query Type||Start Time||Waiting > Time||Duration||Scan Progress||State||Last Event||# rows fetched||Resource > Pool||Details||Action| > |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select > count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 > 09:38:20.472804000|4m27s|4m32s|261 / 261 ( 100%)|FINISHED|First row > fetched|1|root.default|Details|Close| > |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select > count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 > 08:38:52.780237000|2017-05-31 09:38:20.289582000|59m27s|261 / 261 ( > 100%)|FINISHED|1|root.default|Details| -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-5397) Queries/sessions that are left idle after executing a query report incorrect duration
[ https://issues.apache.org/jira/browse/IMPALA-5397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-5397: -- Target Version: Impala 3.2.0 Description: When queries are executed from Hue and hit the idle query timeout then the query duration keeps going up even though the query was cancelled and it is not actually doing any more work. The end time is only set when the query is actually unregistered. Queries below finished in 1s640ms while the reported time is much longer. |User||Default Db||Statement||Query Type||Start Time||Waiting Time||Duration||Scan Progress||State||Last Event||# rows fetched||Resource Pool||Details||Action| |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 09:38:20.472804000|4m27s|4m32s|261 / 261 ( 100%)|FINISHED|First row fetched|1|root.default|Details|Close| |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 08:38:52.780237000|2017-05-31 09:38:20.289582000|59m27s|261 / 261 ( 100%)|FINISHED|1|root.default|Details| was: When queries are executed from Hue then the session is left idle and incorrect query duration is reported. As the session is left alive the query duration keeps going up even though the query stats is FINISHED. Queries below finished in 1s640ms while the reported time is much longer. 
|User||Default Db||Statement||Query Type||Start Time||Waiting Time||Duration||Scan Progress||State||Last Event||# rows fetched||Resource Pool||Details||Action| |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 09:38:20.472804000|4m27s|4m32s|261 / 261 ( 100%)|FINISHED|First row fetched|1|root.default|Details|Close| |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 08:38:52.780237000|2017-05-31 09:38:20.289582000|59m27s|261 / 261 ( 100%)|FINISHED|1|root.default|Details| > Queries/sessions that are left idle after executing a query report incorrect > duration > -- > > Key: IMPALA-5397 > URL: https://issues.apache.org/jira/browse/IMPALA-5397 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.9.0 >Reporter: Mostafa Mokhtar >Priority: Major > Labels: query-lifecycle > > When queries are executed from Hue and hit the idle query timeout then the > query duration keeps going up even though the query was cancelled and it is > not actually doing any more work. The end time is only set when the query is > actually unregistered. > Queries below finished in 1s640ms while the reported time is much longer. 
> |User||Default Db||Statement||Query Type||Start Time||Waiting > Time||Duration||Scan Progress||State||Last Event||# rows fetched||Resource > Pool||Details||Action| > |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select > count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 > 09:38:20.472804000|4m27s|4m32s|261 / 261 ( 100%)|FINISHED|First row > fetched|1|root.default|Details|Close| > |hue/va1026.halxg.cloudera@halxg.cloudera.com|tpcds_1000_parquet|select > count(*) from tpcds_1000_parquet.inventory|QUERY|2017-05-31 > 08:38:52.780237000|2017-05-31 09:38:20.289582000|59m27s|261 / 261 ( > 100%)|FINISHED|1|root.default|Details| -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-5958) Remove duplication of 'yarn-extras' AllocationFileLoaderService
[ https://issues.apache.org/jira/browse/IMPALA-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong resolved IMPALA-5958. --- Resolution: Later I don't think the code cleanup is worth tracking as an open JIRA > Remove duplication of 'yarn-extras' AllocationFileLoaderService > --- > > Key: IMPALA-5958 > URL: https://issues.apache.org/jira/browse/IMPALA-5958 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Affects Versions: Impala 2.11.0 >Reporter: Matthew Jacobs >Priority: Trivial > Labels: admission-control, ramp-up > > In IMPALA-5920, some Yarn code that is used by Impala admission control is > brought into the Impala codebase. > In the code review, [~zamsden] pointed out that the > AllocationFileLoaderService thread for monitoring the fair-scheduler.xml file > for changes could be removed if the RequestPoolService used the > impala.util.FileWatcherService. See > https://gerrit.cloudera.org/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java@103 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-7312) Non-blocking mode for Fetch() RPC
[ https://issues.apache.org/jira/browse/IMPALA-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-7312: -- Description: Currently Fetch() can block for an arbitrary amount of time until a batch of rows is produced. It might be helpful to have a mode where it returns quickly when there is no data available, so that threads and RPC slots are not tied up. (was: Currently Fetch() can block for an arbitrary amount of time until a batch of rows is produced. It might be helpful to have a mode where it returns quickly when there is no data available, that that threads and RPC slots are not tied up.) > Non-blocking mode for Fetch() RPC > - > > Key: IMPALA-7312 > URL: https://issues.apache.org/jira/browse/IMPALA-7312 > Project: IMPALA > Issue Type: Improvement > Components: Clients >Reporter: Tim Armstrong >Priority: Major > Labels: resource-management > > Currently Fetch() can block for an arbitrary amount of time until a batch of > rows is produced. It might be helpful to have a mode where it returns quickly > when there is no data available, so that threads and RPC slots are not tied > up. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
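The non-blocking mode described above could look like the following sketch: when no row batch is buffered, Fetch() returns immediately with an empty result instead of parking the thread. The function name and shape are assumptions for illustration, not Impala's Thrift Fetch() API:

```python
import queue

def fetch(row_batches, block=True, timeout_s=0.1):
    """Return (rows, has_more). In non-blocking mode an empty buffer yields
    ([], True) right away, so the caller's thread and RPC slot are freed and
    the client retries later instead of holding a connection open."""
    try:
        rows = row_batches.get(block=block, timeout=timeout_s if block else None)
    except queue.Empty:
        return [], True  # no data available yet; not end-of-stream
    return rows, True
```

A client polling this variant would sleep or do other work between empty responses, which is the trade-off against the current long-blocking call.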
[jira] [Updated] (IMPALA-7312) Non-blocking mode for Fetch() RPC
[ https://issues.apache.org/jira/browse/IMPALA-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-7312: -- Target Version: Impala 3.2.0 > Non-blocking mode for Fetch() RPC > - > > Key: IMPALA-7312 > URL: https://issues.apache.org/jira/browse/IMPALA-7312 > Project: IMPALA > Issue Type: Improvement > Components: Clients >Reporter: Tim Armstrong >Priority: Major > Labels: resource-management > > Currently Fetch() can block for an arbitrary amount of time until a batch of > rows is produced. It might be helpful to have a mode where it returns quickly > when there is no data available, so that threads and RPC slots are not tied > up. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-7672) Play nice with load balancers when shutting down coordinator
[ https://issues.apache.org/jira/browse/IMPALA-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-7672: -- Target Version: Impala 3.2.0 (was: Product Backlog) > Play nice with load balancers when shutting down coordinator > > > Key: IMPALA-7672 > URL: https://issues.apache.org/jira/browse/IMPALA-7672 > Project: IMPALA > Issue Type: Sub-task > Components: Distributed Exec >Reporter: Tim Armstrong >Priority: Major > Labels: resource-management > > This is a placeholder to figure out what we need to do to get load balancers > like HAProxy and F5 to cleanly switch to alternative coordinators when we do > a graceful shutdown. E.g. do we need to stop accepting new TCP connections? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-7814) AggregationNode's memory estimate should be based on NDV only for non-grouping aggs
[ https://issues.apache.org/jira/browse/IMPALA-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-7814: -- Labels: resource-management (was: ) > AggregationNode's memory estimate should be based on NDV only for > non-grouping aggs > > > Key: IMPALA-7814 > URL: https://issues.apache.org/jira/browse/IMPALA-7814 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend >Reporter: Pooja Nilangekar >Assignee: Pooja Nilangekar >Priority: Major > Labels: resource-management > > Currently, the AggregationNode always computes the NDV to estimate the number > of rows. However, for grouping aggregates, the entire input has to be > consumed before the output can be produced, hence its memory estimate should > not consider the NDV. This is acceptable for non-grouping aggregates because > it only needs to store the value expression during the build phase, instead of > the entire tuple. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-7814) AggregationNode's memory estimate should be based on NDV only for non-grouping aggs
[ https://issues.apache.org/jira/browse/IMPALA-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-7814: -- Component/s: Frontend > AggregationNode's memory estimate should be based on NDV only for > non-grouping aggs > > > Key: IMPALA-7814 > URL: https://issues.apache.org/jira/browse/IMPALA-7814 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend >Reporter: Pooja Nilangekar >Assignee: Pooja Nilangekar >Priority: Major > Labels: resource-management > > Currently, the AggregationNode always computes the NDV to estimate the number > of rows. However, for grouping aggregates, the entire input has to be > consumed before the output can be produced, hence its memory estimate should > not consider the NDV. This is acceptable for non-grouping aggregates because > they only need to store the value expression during the build phase, instead of > the entire tuple. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7351) Add memory estimates for plan nodes and sinks with missing estimates
[ https://issues.apache.org/jira/browse/IMPALA-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710758#comment-16710758 ] Tim Armstrong commented on IMPALA-7351: --- [~bikramjeet.vig] is there anything left to do here? I guess some of the sinks still have TODOs. > Add memory estimates for plan nodes and sinks with missing estimates > > > Key: IMPALA-7351 > URL: https://issues.apache.org/jira/browse/IMPALA-7351 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend >Reporter: Tim Armstrong >Assignee: Bikramjeet Vig >Priority: Major > Labels: admission-control, resource-management > > Many plan nodes and sinks, e.g. KuduScanNode, KuduTableSink, ExchangeNode, > etc are missing memory estimates entirely. > We should add a basic estimate for all these cases based on experiments and > data from real workloads. In some cases 0 may be the right estimate (e.g. for > streaming nodes like SelectNode that just pass through data) but we should > remove TODOs and document the reasoning in those cases. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-7389) Admission control should set aside less memory on dedicated coordinator if coordinator fragment is lightweight
[ https://issues.apache.org/jira/browse/IMPALA-7389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-7389: -- Target Version: Impala 3.2.0 (was: Product Backlog) > Admission control should set aside less memory on dedicated coordinator if > coordinator fragment is lightweight > -- > > Key: IMPALA-7389 > URL: https://issues.apache.org/jira/browse/IMPALA-7389 > Project: IMPALA > Issue Type: Improvement > Components: Distributed Exec >Affects Versions: Impala 3.0, Impala 2.12.0 >Reporter: Tim Armstrong >Priority: Major > Labels: admission-control, resource-management > > The current admission control treats all backends symmetrically and sets > aside the mem_limit. This makes sense for now given that we have the same > mem_limit setting for all backends. > One case where this could be somewhat problematic is if you have dedicated > coordinators with less memory than the executors, because the coordinator's > process memory limit will be fully admitted before the executors. > If you have multiple coordinators and queries are distributed between them > this is relatively unlikely to become a problem. If you have a single > coordinator this is more of an issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-6032) Configuration knobs to automatically reject and fail queries
[ https://issues.apache.org/jira/browse/IMPALA-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-6032: -- Priority: Minor (was: Major) > Configuration knobs to automatically reject and fail queries > > > Key: IMPALA-6032 > URL: https://issues.apache.org/jira/browse/IMPALA-6032 > Project: IMPALA > Issue Type: New Feature > Components: Distributed Exec >Reporter: Mostafa Mokhtar >Priority: Minor > Labels: admission-control, resource-management > > Umbrella JIRA for Admission control enhancements. > Query options would be set on a resource pool basis. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-5013) Re-evaluate our approach to per-operator memory estimates
[ https://issues.apache.org/jira/browse/IMPALA-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong resolved IMPALA-5013. --- Resolution: Done Fix Version/s: Not Applicable With IMPALA-7349 the estimate is a guess at the "ideal" memory required to execute the query with full performance. > Re-evaluate our approach to per-operator memory estimates > - > > Key: IMPALA-5013 > URL: https://issues.apache.org/jira/browse/IMPALA-5013 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Affects Versions: Impala 2.8.0 >Reporter: Tim Armstrong >Priority: Major > Labels: resource-management > Fix For: Not Applicable > > > The way that memory estimates are computed for PlanNodes and Sinks is ad-hoc > and in some cases much less accurate than it could be. We should clarify > what the memory estimates mean, how they should be computed and then > systematically fix them. > In general it's difficult to produce accurate memory estimates, because it > depends on having accurate estimates of cardinality and other runtime > parameters, so this JIRA isn't meant to guarantee any specific level of > accuracy of estimates, just to generally improve the estimates and clarify > what they mean and how they should be calculated. > We should also consider deprecating or removing these estimates, unless they > are useful for computing "ideal" memory in IMPALA-3706. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-5043) Flag when Impala daemon is disconnected from statestore
[ https://issues.apache.org/jira/browse/IMPALA-5043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-5043: -- Target Version: Impala 3.2.0 Summary: Flag when Impala daemon is disconnected from statestore (was: When daemons are disconnected from the Statestore they can show incorrect admission control limits) > Flag when Impala daemon is disconnected from statestore > --- > > Key: IMPALA-5043 > URL: https://issues.apache.org/jira/browse/IMPALA-5043 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.6.0 >Reporter: Thomas Scott >Priority: Major > Labels: admission-control, resource-management, supportability > > When (for whatever reason) one or more daemons are disconnected from the > statestore the admission control data held on the daemon goes stale. This can > lead to the daemon accepting queries when there is not capacity or rejecting > queries when there is capacity. > For example, a pool somepool has a limit of 10 concurrent queries and is at > that limit when a daemon is disconnected from the statestore. Even when other > queries in somepool finish and the pool is now empty the disconnected daemon > will report the following when new queries are executed: > ERROR: Admission for query exceeded timeout 6ms. Queued reason: number of > running queries 10 is over limit 10 > Could we have some warning to say that the admission control data is stale > here? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-5063) Enable monitoring of Admission Control queue information
[ https://issues.apache.org/jira/browse/IMPALA-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-5063: -- Target Version: Impala 3.2.0 > Enable monitoring of Admission Control queue information > > > Key: IMPALA-5063 > URL: https://issues.apache.org/jira/browse/IMPALA-5063 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Affects Versions: Impala 2.9.0 >Reporter: Miklos Szurap >Priority: Major > Labels: admission-control, resource-management, supportability > > It would be nice if we could track the Admission Control / queue information > from the StateStore WebUI. > The topics page just shows a summary about "impala-request-queue" but nothing > on the details of the queues / number of queries / mem usage. > Besides showing this on the WebUI, it would be nice to have it logged, so > there would be some kind of historical view. > These would enable tracking of issues when a query is rejected due to admission > control. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-7929) Impala query on HBASE table failing with InternalException: Required field*
[ https://issues.apache.org/jira/browse/IMPALA-7929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang reassigned IMPALA-7929: - Assignee: Yongjun Zhang > Impala query on HBASE table failing with InternalException: Required field* > --- > > Key: IMPALA-7929 > URL: https://issues.apache.org/jira/browse/IMPALA-7929 > Project: IMPALA > Issue Type: Bug >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang >Priority: Major > > This looks like a corner-case bug at the impala-hbase boundary. > The way to reproduce: > Create a table in the hive shell: > {code} > create database abc; > CREATE TABLE abc.test_hbase1 (k STRING, c STRING) STORED BY > 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES > ('hbase.columns.mapping'=':key,cf:c', 'serialization.format'='1') TBLPROPERTIES > ('hbase.table.name'='test_hbase1', > 'storage_handler'='org.apache.hadoop.hive.hbase.HBaseStorageHandler'); > {code} > Then issue a query at the impala shell: > {code} > select * from abc.test_hbase1 where k != "row1"; > {code} > Observe: > {code} > Query: select * from abc.test_hbase1 where k != "row1" > > Query submitted at: 2018-12-04 17:02:42 (Coordinator: http://xyz:25000) > ERROR: InternalException: Required field 'qualifier' was not present! Struct: > THBaseFilter(family::key, qualifier:null, op_ordinal:3, filter_constant:row1) > {code} > More observations: > # Replacing {{k != "row1"}} with {{k <> "row1"}} fails the same way. However, > replacing it with other operators, such as ">", "<", "=", all works. > # Replacing {{k != "row1"}} with {{c != "row1"}}, it succeeded without the > error reported above. > The above example uses a two-column table; creating a similar table with > three columns fails the same way: adding an inequality predicate on the first > column fails, adding an inequality predicate on other columns doesn't fail. > The code that issues the error message is in HBase; it seems Impala did not > pass the needed info to HBase in this special case. 
Also wonder if it's > because the first column of the table is the key in the hbase table, which could > reveal the bug. > {code} > hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TColumnIncrement.java: > throw new org.apache.thrift.protocol.TProtocolException("Required field > 'qualifier' was not present! Struct: " + toString()); > hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TColumnValue.java: > throw new org.apache.thrift.protocol.TProtocolException("Required field > 'qualifier' was not present! Struct: " + toString()); > hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/THBaseService.java: > throw new org.apache.thrift.protocol.TProtocolException("Required > field 'qualifier' was not present! Struct: " + toString()); > hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/THBaseService.java: > throw new org.apache.thrift.protocol.TProtocolException("Required > field 'qualifier' was not present! Struct: " + toString()); > hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/THBaseService.java: > throw new org.apache.thrift.protocol.TProtocolException("Required > field 'qualifier' was not present! Struct: " + toString()); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-7929) Impala query on HBASE table failing with InternalException: Required field*
Yongjun Zhang created IMPALA-7929: - Summary: Impala query on HBASE table failing with InternalException: Required field* Key: IMPALA-7929 URL: https://issues.apache.org/jira/browse/IMPALA-7929 Project: IMPALA Issue Type: Bug Reporter: Yongjun Zhang This looks like a corner-case bug at the impala-hbase boundary. The way to reproduce: Create a table in the hive shell: {code} create database abc; CREATE TABLE abc.test_hbase1 (k STRING, c STRING) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ('hbase.columns.mapping'=':key,cf:c', 'serialization.format'='1') TBLPROPERTIES ('hbase.table.name'='test_hbase1', 'storage_handler'='org.apache.hadoop.hive.hbase.HBaseStorageHandler'); {code} Then issue a query at the impala shell: {code} select * from abc.test_hbase1 where k != "row1"; {code} Observe: {code} Query: select * from abc.test_hbase1 where k != "row1" Query submitted at: 2018-12-04 17:02:42 (Coordinator: http://xyz:25000) ERROR: InternalException: Required field 'qualifier' was not present! Struct: THBaseFilter(family::key, qualifier:null, op_ordinal:3, filter_constant:row1) {code} More observations: # Replacing {{k != "row1"}} with {{k <> "row1"}} fails the same way. However, replacing it with other operators, such as ">", "<", "=", all works. # Replacing {{k != "row1"}} with {{c != "row1"}}, it succeeded without the error reported above. The above example uses a two-column table; creating a similar table with three columns fails the same way: adding an inequality predicate on the first column fails, adding an inequality predicate on other columns doesn't fail. The code that issues the error message is in HBase; it seems Impala did not pass the needed info to HBase in this special case. Also wonder if it's because the first column of the table is the key in the hbase table, which could reveal the bug. 
{code} hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TColumnIncrement.java: throw new org.apache.thrift.protocol.TProtocolException("Required field 'qualifier' was not present! Struct: " + toString()); hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TColumnValue.java: throw new org.apache.thrift.protocol.TProtocolException("Required field 'qualifier' was not present! Struct: " + toString()); hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/THBaseService.java: throw new org.apache.thrift.protocol.TProtocolException("Required field 'qualifier' was not present! Struct: " + toString()); hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/THBaseService.java: throw new org.apache.thrift.protocol.TProtocolException("Required field 'qualifier' was not present! Struct: " + toString()); hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/generated/THBaseService.java: throw new org.apache.thrift.protocol.TProtocolException("Required field 'qualifier' was not present! Struct: " + toString()); {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7802) Implement support for closing idle sessions
[ https://issues.apache.org/jira/browse/IMPALA-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710704#comment-16710704 ] Zoram Thanga commented on IMPALA-7802: -- The documentation states that: {quote} Once a session is expired, you cannot issue any new query requests to it. The session remains open, but the only operation you can perform is to close it. {quote} This basically says that an expired session serves no useful purpose to anyone - not to Impala as it consumes an fe_service_thread, and not to the client because the only operation allowed on it is to close it. I would like to change the session expiry code to always force-close expired sessions from the server side by calling ImpalaServer::CloseSessionInternal() or a modified version of it. > Implement support for closing idle sessions > --- > > Key: IMPALA-7802 > URL: https://issues.apache.org/jira/browse/IMPALA-7802 > Project: IMPALA > Issue Type: Improvement > Components: Clients >Affects Versions: Impala 3.0, Impala 2.12.0 >Reporter: Michael Ho >Assignee: Zoram Thanga >Priority: Critical > Labels: supportability > > Currently, the query option {{idle_session_timeout}} specifies a timeout in > seconds after which all running queries of that idle session will be > cancelled and no new queries can be issued to it. However, the idle session > will remain open and it needs to be closed explicitly. Please see the > [documentation|https://www.cloudera.com/documentation/enterprise/latest/topics/impala_idle_session_timeout.html] > for details. > This behavior may be undesirable as each session still consumes an Impala > frontend service thread. The number of frontend service threads is bound by > the flag {{fe_service_threads}}. So, in a multi-tenant environment, an Impala > server can have a lot of idle sessions but they still consume against the > quota of {{fe_service_threads}}. 
If the number of sessions established > reaches {{fe_service_threads}}, all new session creations will block until > some of the existing sessions exit. There may be no time bound on when these > zombie idle sessions will be closed and it's at the mercy of the client > implementation to close them. In some sense, leaving many idle sessions open > is a way to launch a denial of service attack on Impala. > To fix this situation, we should have an option to forcefully close a session > when it's considered idle so it won't unnecessarily consume the limited > number of frontend service threads. cc'ing [~zoram] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
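The force-close behavior proposed in the comment above can be sketched as a small reaper pass. This is an illustrative Python sketch, not Impala's actual C++ implementation: the session map, the `close` callback (standing in for something like ImpalaServer::CloseSessionInternal()), and the timestamps are all hypothetical.

```python
import time

def reap_idle_sessions(sessions, idle_session_timeout_s, now=None, close=None):
    """Force-close sessions that have been idle longer than the timeout.

    `sessions` maps a session id to its last-active timestamp (seconds).
    `close`, if given, is a server-side close hook invoked per session.
    Returns the list of session ids that were closed.
    """
    now = time.monotonic() if now is None else now
    closed = []
    for sid, last_active in list(sessions.items()):
        if now - last_active > idle_session_timeout_s:
            del sessions[sid]     # frees the fe_service_thread slot
            if close is not None:
                close(sid)        # server-side force-close
            closed.append(sid)
    return closed
```

The key difference from the current behavior described in the issue is that the session is removed entirely rather than merely marked expired, so it no longer counts against the {{fe_service_threads}} quota.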
[jira] [Work started] (IMPALA-7802) Implement support for closing idle sessions
[ https://issues.apache.org/jira/browse/IMPALA-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-7802 started by Zoram Thanga. > Implement support for closing idle sessions > --- > > Key: IMPALA-7802 > URL: https://issues.apache.org/jira/browse/IMPALA-7802 > Project: IMPALA > Issue Type: Improvement > Components: Clients >Affects Versions: Impala 3.0, Impala 2.12.0 >Reporter: Michael Ho >Assignee: Zoram Thanga >Priority: Critical > Labels: supportability > > Currently, the query option {{idle_session_timeout}} specifies a timeout in > seconds after which all running queries of that idle session will be > cancelled and no new queries can be issued to it. However, the idle session > will remain open and it needs to be closed explicitly. Please see the > [documentation|https://www.cloudera.com/documentation/enterprise/latest/topics/impala_idle_session_timeout.html] > for details. > This behavior may be undesirable as each session still consumes an Impala > frontend service thread. The number of frontend service threads is bound by > the flag {{fe_service_threads}}. So, in a multi-tenant environment, an Impala > server can have a lot of idle sessions but they still consume against the > quota of {{fe_service_threads}}. If the number of sessions established > reaches {{fe_service_threads}}, all new session creations will block until > some of the existing sessions exit. There may be no time bound on when these > zombie idle sessions will be closed and it's at the mercy of the client > implementation to close them. In some sense, leaving many idle sessions open > is a way to launch a denial of service attack on Impala. > To fix this situation, we should have an option to forcefully close a session > when it's considered idle so it won't unnecessarily consume the limited > number of frontend service threads. 
cc'ing [~zoram] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7928) Investigate consistent placement of remote scan ranges
[ https://issues.apache.org/jira/browse/IMPALA-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710544#comment-16710544 ] Philip Zeyliger commented on IMPALA-7928: - I'm interested in the results even in the currently common case of the number of nodes not changing, but I agree that we'll eventually want more stability than that. > Investigate consistent placement of remote scan ranges > -- > > Key: IMPALA-7928 > URL: https://issues.apache.org/jira/browse/IMPALA-7928 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: Joe McDonnell >Priority: Major > > With the file handle cache, it is useful for repeated scans of the same file > to go to the same node, as that node will already have a file handle cached. > When scheduling remote ranges, the scheduler introduces randomness that can > spread reads across all of the nodes. Repeated executions of queries on the > same set of files will not schedule the remote reads on the same nodes. This > causes a large amount of duplication across file handle caches on different > nodes. This reduces the efficiency of the cache significantly. > It may be useful for the scheduler to introduce some determinism in > scheduling remote reads to take advantage of the file handle cache. This is a > variation on the well-known tradeoff between skew and locality. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7928) Investigate consistent placement of remote scan ranges
[ https://issues.apache.org/jira/browse/IMPALA-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710507#comment-16710507 ] Joe McDonnell commented on IMPALA-7928: --- One problem I ran into when implementing this is that a simple hash will have bad behavior if the number of nodes changes. I'm taking a look at consistent hashes to see if that makes sense. Example: http://highscalability.com/blog/2018/6/18/how-ably-efficiently-implemented-consistent-hashing.html > Investigate consistent placement of remote scan ranges > -- > > Key: IMPALA-7928 > URL: https://issues.apache.org/jira/browse/IMPALA-7928 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: Joe McDonnell >Priority: Major > > With the file handle cache, it is useful for repeated scans of the same file > to go to the same node, as that node will already have a file handle cached. > When scheduling remote ranges, the scheduler introduces randomness that can > spread reads across all of the nodes. Repeated executions of queries on the > same set of files will not schedule the remote reads on the same nodes. This > causes a large amount of duplication across file handle caches on different > nodes. This reduces the efficiency of the cache significantly. > It may be useful for the scheduler to introduce some determinism in > scheduling remote reads to take advantage of the file handle cache. This is a > variation on the well-known tradeoff between skew and locality. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
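The consistent-hashing approach Joe mentions can be illustrated with a minimal hash ring. This is a generic sketch under assumed parameters (the virtual-node count and MD5-based hashing are arbitrary choices, not the scheduler's design): unlike `hash(file) % num_nodes`, removing a node only remaps the keys that node owned, so repeated scans of the same file keep landing on the same impalad.

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # Stable 64-bit hash; Python's built-in hash() is salted per process.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

class ConsistentHashRing:
    """Maps keys (e.g. file names) to nodes via a sorted ring of hash
    points; each node contributes `vnodes` points for load balance."""

    def __init__(self, nodes, vnodes=100):
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes for i in range(vnodes))
        self._points = [p for p, _ in self._ring]

    def node_for(self, key: str) -> str:
        # A key is owned by the first ring point at or after its hash,
        # wrapping around to the start of the ring.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

With this structure, dropping node "c" leaves every key previously assigned to "a" or "b" untouched, which is exactly the stability property the file handle cache wants.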
[jira] [Closed] (IMPALA-5605) document how to increase thread resource limits
[ https://issues.apache.org/jira/browse/IMPALA-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Rodoni closed IMPALA-5605. --- Resolution: Fixed Fix Version/s: (was: Impala 2.10.0) Impala 3.2.0 > document how to increase thread resource limits > --- > > Key: IMPALA-5605 > URL: https://issues.apache.org/jira/browse/IMPALA-5605 > Project: IMPALA > Issue Type: Task > Components: Docs >Affects Versions: Impala 2.9.0 >Reporter: Matthew Mulder >Assignee: Alex Rodoni >Priority: Major > Fix For: Impala 3.2.0 > > > Depending on the workload, Impala may need to create a very large number of > threads. If so, it is necessary to configure the system correctly to prevent > Impala from crashing because of resource limitations. Such a crash would look > like this:{code}F0629 08:20:02.956413 29088 llvm-codegen.cc:111] LLVM hit > fatal error: Unable to allocate section memory! > terminate called after throwing an instance of > 'boost::exception_detail::clone_impl > >'{code}To prevent this, each Impala host should be configured like > this:{code}echo 200 > /proc/sys/kernel/threads-max > echo 200 > /proc/sys/kernel/pid_max > echo 800 > /proc/sys/vm/max_map_count{code}In /etc/security/limits.conf > add{code}impala soft nproc 262144 > impala hard nproc 262144{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-2424) Rack-aware scheduling
[ https://issues.apache.org/jira/browse/IMPALA-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710273#comment-16710273 ] Peter Ebert commented on IMPALA-2424: - This is becoming increasingly important for scaling and for separation of storage and compute. If Impala is installed on a subset of nodes, or on distinct compute-only nodes, remote reads would be essentially random and cross-rack traffic may become saturated; this could be a problem especially at large scale, where network over-subscription is common. With proper distribution of Impala and storage nodes per rack, rack-aware scheduling could keep traffic within the TOR switches and improve performance. > Rack-aware scheduling > - > > Key: IMPALA-2424 > URL: https://issues.apache.org/jira/browse/IMPALA-2424 > Project: IMPALA > Issue Type: Improvement > Components: Distributed Exec >Affects Versions: Impala 2.2.4 >Reporter: Marcel Kornacker >Priority: Minor > Labels: scalability, scheduling > > Currently, Impala makes an effort to schedule plan fragments local to the > data that is being scanned; when no collocated impalad is available, the plan > fragment is placed randomly. > In order to support configurations where Impala is run on a subset of the > nodes in a cluster, we should schedule fragments within the same rack that > holds the assigned scan ranges (if a collocated impalad isn't available). > See https://issues.apache.org/jira/browse/HADOOP-692 for details of how rack > locality is recorded in hdfs. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
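The placement policy the issue describes - collocated first, then same-rack, then random - can be sketched as a short selection function. This is a hypothetical illustration of the proposal, not Impala's scheduler code; the host names, the `rack_of` topology map, and the function itself are invented for the example.

```python
import random

def pick_executor(replica_hosts, impalad_hosts, rack_of, rng=random):
    """Pick an impalad for a scan range: prefer a host collocated with a
    replica, then any impalad in the same rack as a replica, and only
    then fall back to a random impalad (the current behavior)."""
    collocated = [h for h in sorted(impalad_hosts) if h in replica_hosts]
    if collocated:
        return rng.choice(collocated)
    replica_racks = {rack_of[h] for h in replica_hosts}
    same_rack = [h for h in sorted(impalad_hosts)
                 if rack_of[h] in replica_racks]
    if same_rack:
        return rng.choice(same_rack)
    return rng.choice(sorted(impalad_hosts))
```

The middle tier is the new step: it keeps remote reads within the top-of-rack switch whenever any impalad shares a rack with a replica of the scanned block.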