[jira] [Commented] (IMPALA-11501) Add flag to allow metadata-cache operations on masked tables
[ https://issues.apache.org/jira/browse/IMPALA-11501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17580333#comment-17580333 ] Kurt Deschler commented on IMPALA-11501: What is the proposed default value? > Add flag to allow metadata-cache operations on masked tables > > > Key: IMPALA-11501 > URL: https://issues.apache.org/jira/browse/IMPALA-11501 > Project: IMPALA > Issue Type: New Feature > Components: Security >Reporter: Quanlong Huang >Priority: Critical > > "REFRESH " and "INVALIDATE METADATA " are the table-level > metadata-cache operations that are only used in Impala (not in Hive, SparkSQL, or > other engines). > In the Hive-Ranger plugin, when a table is masked (either by a column-masking or > row-filtering policy) for a user, the user can't perform any modification > (insert/delete/update) on the table (RANGER-1087, RANGER-1100). However, Hive > doesn't have those metadata-cache operations. It's a grey area whether we > should block them or not. > Currently, Impala blocks metadata-cache operations as well (IMPALA-10554, > IMPALA-11281). However, it's possible that, before an upgrade, some > data-consumer jobs already have REFRESH in them. It'd be better to have a > flag to allow such operations for a smooth upgrade process. > The flag can be something like "allow_refresh_by_masked_users". > CC [~fangyurao], [~csringhofer] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
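The rule described in IMPALA-11501 can be sketched roughly as below. This is only an illustration of the decision logic, not Impala's or Ranger's actual authorization code: the function name and the operation sets are hypothetical, the flag name is the one suggested in the issue, and no default value is assumed for it since the issue does not state one. The sketch models only the masking rule and ignores all other privilege checks.

```python
# Hypothetical sketch of the masking rule discussed in IMPALA-11501.
# Not Impala code; names other than the suggested flag are made up.

MODIFYING_OPS = {"INSERT", "DELETE", "UPDATE"}
METADATA_CACHE_OPS = {"REFRESH", "INVALIDATE METADATA"}


def is_allowed(op: str,
               table_is_masked_for_user: bool,
               allow_refresh_by_masked_users: bool) -> bool:
    """Return True if the masking rule alone would permit `op`."""
    if not table_is_masked_for_user:
        # No column-masking or row-filtering policy applies to this user.
        return True
    if op in MODIFYING_OPS:
        # Modifications on masked tables are always blocked.
        return False
    if op in METADATA_CACHE_OPS:
        # The proposed flag decides the grey-area case.
        return allow_refresh_by_masked_users
    return True
```

With the flag off, a masked user's REFRESH is rejected (the current behaviour described in the issue); with the flag on, it is permitted while INSERT, DELETE and UPDATE stay blocked.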
[jira] [Created] (IMPALA-11633) metadata.test_ddl.TestAsyncDDL.test_get_operation_status_for_async_ddl timeout in CTAS
Kurt Deschler created IMPALA-11633: -- Summary: metadata.test_ddl.TestAsyncDDL.test_get_operation_status_for_async_ddl timeout in CTAS Key: IMPALA-11633 URL: https://issues.apache.org/jira/browse/IMPALA-11633 Project: IMPALA Issue Type: Bug Reporter: Kurt Deschler metadata.test_ddl.TestAsyncDDL.test_get_operation_status_for_async_ddl[protocol: hs2-http | exec_option: {'sync_ddl': 0, 'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none] (from pytest) The following CTAS timed out: create table test_get_operation_status_for_async_ddl_f3067b3f.alltypes_clone as select * from functional_parquet.alltypes; https://master-03.jenkins.cloudera.com/view/Impala/view/Evergreen-asf-master/job/impala-asf-master-core-s3/222/testReport/junit/metadata.test_ddl/TestAsyncDDL/test_get_operation_status_for_async_ddl_protocol__hs2_http___exec_optionsync_ddl___0___test_replan___1___batch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/
[jira] [Reopened] (IMPALA-9320) test_udf_concurrency.TestUdfConcurrency.test_concurrent_jar_drop_use failed with error hdfs path doesn't exist
[ https://issues.apache.org/jira/browse/IMPALA-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler reopened IMPALA-9320: --- https://master-03.jenkins.cloudera.com/view/Impala/view/Evergreen-asf-master/job/impala-asf-master-exhaustive-data-cache/207/testReport/junit/custom_cluster.test_udf_concurrency/TestUdfConcurrency/test_concurrent_jar_drop_use_protocol__beeswax___exec_optiontest_replan___1___batch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/ > test_udf_concurrency.TestUdfConcurrency.test_concurrent_jar_drop_use failed > with error hdfs path doesn't exist > -- > > Key: IMPALA-9320 > URL: https://issues.apache.org/jira/browse/IMPALA-9320 > Project: IMPALA > Issue Type: Test > Components: Infrastructure >Affects Versions: Impala 3.4.0 >Reporter: Xiaomeng Zhang >Assignee: Joe McDonnell >Priority: Major > Labels: broken-build > > {code:java} > custom_cluster/test_udf_concurrency.py:162: in test_concurrent_jar_drop_use > self.filesystem_client.copy_from_local(udf_src_path, udf_tgt_path) > util/hdfs_util.py:82: in copy_from_local > self.hdfs_filesystem_client.copy_from_local(src, dst) > util/hdfs_util.py:256: in copy_from_local > src, dst) + stderr + '; ' + stdout > E AssertionError: HDFS copy from > /data/jenkins/workspace/impala-cdpd-master-exhaustive/repos/Impala/testdata/udfs/impala-hive-udfs.jar > to > /test-warehouse/test_concurrent_jar_drop_use_91093fa5.db/impala-hive-udfs.jar > failed: copyFromLocal: > `/test-warehouse/test_concurrent_jar_drop_use_91093fa5.db/impala-hive-udfs.jar': > No such file or directory: > `hdfs://localhost:20500/test-warehouse/test_concurrent_jar_drop_use_91093fa5.db/impala-hive-udfs.jar' > E ; > {code} > 
[https://master-02.jenkins.cloudera.com/job/impala-cdpd-master-exhaustive/244/testReport/junit/custom_cluster.test_udf_concurrency/TestUdfConcurrency/test_concurrent_jar_drop_use_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/] > This test has been failing continuously in the last 10 builds.
[jira] [Assigned] (IMPALA-11653) Identify and time out connections that are not from a supported Impala client more eagerly
[ https://issues.apache.org/jira/browse/IMPALA-11653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler reassigned IMPALA-11653: -- Assignee: Qifan Chen > Identify and time out connections that are not from a supported Impala client > more eagerly > -- > > Key: IMPALA-11653 > URL: https://issues.apache.org/jira/browse/IMPALA-11653 > Project: IMPALA > Issue Type: Improvement >Affects Versions: Impala 4.1.0 >Reporter: Vincent Tran >Assignee: Qifan Chen >Priority: Major > Attachments: simple_tcp_client.py > > > When a TCP client opens a connection to an Impala client interface (hs2 or > beeswax), the connection is accepted immediately after the 3-way handshake > (SYN, SYN-ACK, ACK) and is queued for > *TAcceptQueueServer::SetupConnection()*. However, if the client sends > nothing else, the ImpalaServer will block in > *apache::thrift::transport::TSocket::read()* until the client sends a RST/FIN > or until *sasl_connect_tcp_timeout_ms* elapses (5 minutes by > default). > The stack trace of the connection setup thread during this period can be > observed below. 
> {noformat} > (gdb) bt > #0 0x7f3b972ee20d in poll () from ./lib64/libc.so.6 > #1 0x02dcd5bc in apache::thrift::transport::TSocket::read(unsigned > char*, unsigned int) () > #2 0x02dd1803 in unsigned int > apache::thrift::transport::readAll(apache::thrift::transport::TSocket&, > unsigned char*, unsigned int) () > #3 0x01330cc9 in readAll (len=5, buf=0x7f3277ea4f8b "", > this=) at > ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TTransport.h:121 > #4 apache::thrift::transport::TSaslTransport::receiveSaslMessage > (this=this@entry=0x278a96b0, status=status@entry=0x7f3277ea500c, > length=length@entry=0x7f3277ea5008) at TSaslTransport.cpp:259 > #5 0x0132db14 in > apache::thrift::transport::TSaslServerTransport::handleSaslStartMessage > (this=0x278a96b0) at TSaslServerTransport.cpp:95 > #6 0x01330e33 in > apache::thrift::transport::TSaslTransport::doSaslNegotiation > (this=0x278a96b0) at TSaslTransport.cpp:81 > #7 0x0132e723 in open (this=0x12e29750) at > ../../../toolchain/toolchain-packages-gcc7.5.0/thrift-0.9.3-p8/include/thrift/transport/TBufferTransports.h:218 > #8 apache::thrift::transport::TSaslServerTransport::Factory::getTransport > (this=0xf825a70, trans=...) at TSaslServerTransport.cpp:173 > #9 0x010cd49d in > apache::thrift::server::TAcceptQueueServer::SetupConnection (this=0x174270c0, > entry=...) at TAcceptQueueServer.cpp:233 > #10 0x010cef4d in operator() (tid=, item=..., > __closure=) at TAcceptQueueServer.cpp:323 > #11 > boost::detail::function::void_function_obj_invoker2 const boost::shared_ptr&)>, void, > int, const > boost::shared_ptr&>::invoke(boost::detail::function::function_buffer > &, int, const boost::shared_ptr > &) (function_obj_ptr=..., a0=, a1=...) 
> at > ../../../toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:159 > #12 0x010d3e59 in operator() (a1=..., a0=1, this=0x7f3279ea9510) at > ../../../toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:770 > #13 > impala::ThreadPool > >::WorkerThread (this=0x7f3279ea94c0, thread_id=1) at > ../util/thread-pool.h:166 > #14 0x0144f8f2 in operator() (this=0x7f3277ea5b40) at > ../../../toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:770 > #15 impala::Thread::SuperviseThread(std::__cxx11::basic_string std::char_traits, std::allocator > const&, > std::__cxx11::basic_string, std::allocator > > const&, boost::function, impala::ThreadDebugInfo const*, > impala::Promise*) (name=..., category=..., > functor=..., parent_thread_info=, > thread_started=0x7f3279ea9110) at thread.cc:360 > #16 0x01450d6b in operator() std::__cxx11::basic_string&, const std::__cxx11::basic_string&, > boost::function, const impala::ThreadDebugInfo*, impala::Promise int>*), boost::_bi::list0> (a=, > f=@0x1417ccf8: 0x144f5f0 > std::char_traits, std::allocator > const&, > std::__cxx11::basic_string, std::allocator > > const&, boost::function, impala::ThreadDebugInfo const*, > impala::Promise*)>, this=0x1417cd00) at > ../../../toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/bind/bind.hpp:531 > #17 operator() (this=0x1417ccf8) at > ../../../toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222 > #18 boost::detail::thread_data (*)(std::__cxx11::basic_string, > std::allocator > const&, std::__cxx11::basic_string std::char_traits, std::allocator > const&, boost::
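The scenario in IMPALA-11653 (a client that completes the TCP handshake and then sends nothing) can be reproduced with a few lines of Python. The sketch below is not the simple_tcp_client.py attached to the issue, whose contents are not shown here; the function name and parameters are hypothetical, and the host and port to use depend on the deployment.

```python
import socket
import time


def hold_idle_connection(host: str, port: int, hold_seconds: float) -> float:
    """Open a TCP connection, send nothing, and return the seconds it was held.

    The three-way handshake completes inside create_connection(); after that
    the client stays silent, so a server waiting for the first bytes of a
    SASL negotiation would block in its read until the client closes the
    socket or the server-side timeout elapses.
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=10):
        time.sleep(hold_seconds)  # deliberately send no data
    # Leaving the with-block closes the socket, which sends a FIN.
    return time.monotonic() - start
```

Pointing this at an hs2 or beeswax port and attaching gdb to the server while the client sleeps should show a connection setup thread blocked in a read, as in the trace above.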
[jira] [Commented] (IMPALA-11669) Make Thrift max message size configuration
[ https://issues.apache.org/jira/browse/IMPALA-11669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17619497#comment-17619497 ] Kurt Deschler commented on IMPALA-11669: This sizing really needs to be automatic so existing applications don't start hitting errors. > Make Thrift max message size configuration > -- > > Key: IMPALA-11669 > URL: https://issues.apache.org/jira/browse/IMPALA-11669 > Project: IMPALA > Issue Type: Task > Components: Backend >Affects Versions: Impala 4.2.0 >Reporter: Joe McDonnell >Priority: Critical > > With the upgrade to Thrift 0.16, Thrift now has a protection against > malicious messages in the form of a maximum size for messages. This is > currently set to 100MB by default. Impala should add the ability to override > this default value. In particular, it seems like communication between > coordinators and the catalogd may need a larger value.
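The kind of limit discussed in IMPALA-11669 can be illustrated with a small size check whose threshold is overridable. This is a sketch only and does not use Thrift's API: the function and constant names are hypothetical, and the 100 MB default is taken from the issue text (the exact byte count is an assumption).

```python
# Hypothetical illustration of a configurable maximum message size.
# 100 MB is the default mentioned in the issue; byte count assumed.
DEFAULT_MAX_MESSAGE_SIZE = 100 * 1024 * 1024


def check_message_size(size_bytes: int,
                       max_message_size: int = DEFAULT_MAX_MESSAGE_SIZE) -> None:
    """Raise ValueError if a message exceeds the configured maximum."""
    if size_bytes > max_message_size:
        raise ValueError(
            f"message of {size_bytes} bytes exceeds limit of "
            f"{max_message_size} bytes")
```

A caller that needs larger messages, such as the coordinator-to-catalogd path mentioned in the issue, would pass a larger max_message_size rather than rely on the default.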
[jira] [Assigned] (IMPALA-11653) Identify and time out connections that are not from a supported Impala client more eagerly
[ https://issues.apache.org/jira/browse/IMPALA-11653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler reassigned IMPALA-11653: -- Assignee: Fang-Yu Rao (was: Qifan Chen) > Identify and time out connections that are not from a supported Impala client > more eagerly > -- > > Key: IMPALA-11653 > URL: https://issues.apache.org/jira/browse/IMPALA-11653 > Project: IMPALA > Issue Type: Improvement >Affects Versions: Impala 4.1.0 >Reporter: Vincent Tran >Assignee: Fang-Yu Rao >Priority: Major > Attachments: simple_tcp_client.py
[jira] [Assigned] (IMPALA-9339) Revise explanation of RowMaterializationTimer
[ https://issues.apache.org/jira/browse/IMPALA-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler reassigned IMPALA-9339: - Assignee: Pranav Yogi Lodha (was: Sahil Takiar) > Revise explanation of RowMaterializationTimer > - > > Key: IMPALA-9339 > URL: https://issues.apache.org/jira/browse/IMPALA-9339 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Sahil Takiar >Assignee: Pranav Yogi Lodha >Priority: Major > > IMPALA-8825 added the following explanation of the counter > {{RowMaterializationTimer}} > {quote} > /// Tracks the time spent materializing rows and converting them into a > QueryResultSet. > /// The QueryResultSet format used depends on the client, for Beeswax clients > an ASCII > /// representation is used, whereas for HS2 clients (using > TCLIService.thrift) rows are > /// converted into a TRowSet. Materializing rows includes evaluating any yet > unevaluated > /// expressions using ScalarExprEvaluators. > {quote} > This isn't accurate because the actual timer measures time taken running > {{Coordinator::GetNext}} which can block if any downstream operators in the > plan take a long time to materialize rows. > We should revise the explanation of {{RowMaterializationTimer}} and consider > adding a separate timer that truly measures the amount of time taken to > materialize rows (e.g. measure time taken in {{QueryResultSet::AddRows}}).
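The distinction drawn in IMPALA-9339 can be shown with a small Python analogy. This is not Impala code: slow_source stands in for operators that are slow to produce rows, and the two accumulators are hypothetical stand-ins for a timer around the whole fetch call versus a timer around only the conversion into the result set.

```python
import time


def slow_source(batches, delay_s):
    """Yield batches after a delay, standing in for a slow plan operator."""
    for batch in batches:
        time.sleep(delay_s)
        yield batch


def fetch_with_timers(source, result_set):
    """Drain `source` into `result_set`, returning (get_next_s, add_rows_s).

    get_next_s covers waiting for each batch plus converting it, like a
    timer wrapped around the whole fetch call. add_rows_s covers only the
    conversion step, like a timer wrapped around adding rows.
    """
    get_next_s = 0.0
    add_rows_s = 0.0
    it = iter(source)
    while True:
        t0 = time.monotonic()
        try:
            batch = next(it)  # may block while rows are produced
        except StopIteration:
            get_next_s += time.monotonic() - t0
            break
        t1 = time.monotonic()
        result_set.extend(str(row) for row in batch)  # the conversion step
        t2 = time.monotonic()
        get_next_s += t2 - t0
        add_rows_s += t2 - t1
    return get_next_s, add_rows_s
```

When the source is slow, the first value is dominated by waiting and says little about conversion cost, which is the inaccuracy the issue describes.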
[jira] [Created] (IMPALA-11857) Join build fragments not displaying correctly in graphical plan
Kurt Deschler created IMPALA-11857: -- Summary: Join build fragments not displaying correctly in graphical plan Key: IMPALA-11857 URL: https://issues.apache.org/jira/browse/IMPALA-11857 Project: IMPALA Issue Type: Bug Components: Backend Reporter: Kurt Deschler Assignee: Kurt Deschler Repro: Run a query with mt_dop enabled and look at the plan output in the web server. Join build fragments are not connected properly to the DAG. use tpch_parquet; set mt_dop = 10; select count(*) from part join partsupp on ps_partkey=p_partkey and ps_suppkey=10;
[jira] [Commented] (IMPALA-11857) Join build fragments not displaying correctly in graphical plan
[ https://issues.apache.org/jira/browse/IMPALA-11857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17681419#comment-17681419 ] Kurt Deschler commented on IMPALA-11857: [~joemcdonnell] I reviewed the other patch from Shant after I posted the patch and noticed the conflict. I decided to go ahead with this patch to fix the bug part since that made plan diagrams unreadable. We will rebase the other patch and we can spend time iterating there on appropriate changes to statistics in conjunction with other rendering changes that we have in progress. > Join build fragments not displaying correctly in graphical plan > --- > > Key: IMPALA-11857 > URL: https://issues.apache.org/jira/browse/IMPALA-11857 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Minor > > Repro: > Run a query with mt_dop enabled and look at the plan output in the web > server. Join build fragments are not connected properly to the DAG. > use tpch_parquet; > set mt_dop = 10; > select count(*) from part join partsupp on ps_partkey=p_partkey and > ps_suppkey=10;
[jira] [Resolved] (IMPALA-11857) Join build fragments not displaying correctly in graphical plan
[ https://issues.apache.org/jira/browse/IMPALA-11857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler resolved IMPALA-11857. Resolution: Fixed > Join build fragments not displaying correctly in graphical plan > --- > > Key: IMPALA-11857 > URL: https://issues.apache.org/jira/browse/IMPALA-11857 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Minor > > Repro: > Run a query with mt_dop enabled and look at the plan output in the web > server. Join build fragments are not connected properly to the DAG. > use tpch_parquet; > set mt_dop = 10; > select count(*) from part join partsupp on ps_partkey=p_partkey and > ps_suppkey=10;
[jira] [Commented] (IMPALA-11617) Pool service should be made aware of cpu-usage limit for each executor group set
[ https://issues.apache.org/jira/browse/IMPALA-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17682732#comment-17682732 ] Kurt Deschler commented on IMPALA-11617: Please update the description here to match what was committed. > Pool service should be made aware of cpu-usage limit for each executor group > set > > > Key: IMPALA-11617 > URL: https://issues.apache.org/jira/browse/IMPALA-11617 > Project: IMPALA > Issue Type: Improvement >Reporter: Qifan Chen >Assignee: Wenzhe Zhou >Priority: Major > > IMPALA-11604 enables the planner to compute CPU usage for certain queries and > to select suitable executor groups to run. Here the CPU usage is expressed as > the total amount of data to be processed per instance. > The limit on the total amount of data that each executor group can handle > should be provided by the pool service.
[jira] [Work started] (IMPALA-11970) Add query timing display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-11970 started by Kurt Deschler. -- > Add query timing display to Impala WebUI > > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow.
[jira] [Created] (IMPALA-11970) Add query timing display to Impala WebUI
Kurt Deschler created IMPALA-11970: -- Summary: Add query timing display to Impala WebUI Key: IMPALA-11970 URL: https://issues.apache.org/jira/browse/IMPALA-11970 Project: IMPALA Issue Type: New Feature Components: Backend Reporter: Kurt Deschler Assignee: Kurt Deschler Query profiles contain timing information for fragments and plan nodes that is difficult to analyze in a text format, especially for complex and highly-parallel execution plans. A graphical display in the WebUI that renders timing information in Gantt chart form will make the execution timing and dependencies much easier to follow.
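The Gantt-chart idea in IMPALA-11970 can be illustrated with a plain-text rendering. This is a toy sketch, not the WebUI implementation: the function name, the (label, start, end) tuple format, and the fragment labels in the usage note are all hypothetical, and a real display would read its timings from the query profile.

```python
def render_gantt(spans, width=40):
    """Render (label, start_s, end_s) tuples as a text Gantt chart.

    Each row is offset and sized in proportion to its start time and
    duration, so overlap and ordering between rows are visible at a glance.
    """
    t_min = min(start for _, start, _ in spans)
    t_max = max(end for _, _, end in spans)
    # Characters per second; 0.0 if every span is instantaneous.
    scale = width / (t_max - t_min) if t_max > t_min else 0.0
    label_w = max(len(label) for label, _, _ in spans)
    lines = []
    for label, start, end in spans:
        offset = int((start - t_min) * scale)
        length = max(1, int((end - start) * scale))  # always draw something
        lines.append(f"{label:<{label_w}} |{' ' * offset}{'#' * length}")
    return "\n".join(lines)
```

For example, render_gantt([("F00", 0.0, 4.0), ("F01", 0.0, 2.0), ("F02", 2.0, 4.0)], width=8) draws F01 and F02 back to back underneath a full-width F00 bar, which makes the ordering between the two shorter spans obvious in a way a list of numbers does not.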
[jira] [Updated] (IMPALA-11970) Add query timing display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-11970: --- Attachment: timing_mt_dop_on.png timing_mt_dop_off.png > Add query timing display to Impala WebUI > > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > Attachments: timing_mt_dop_off.png, timing_mt_dop_on.png > > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow.
[jira] [Updated] (IMPALA-11970) Add query timing display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-11970: --- Attachment: (was: timing_mt_dop_on.png) > Add query timing display to Impala WebUI > > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > Attachments: timing_mt_dop_off.png > > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow.
[jira] [Updated] (IMPALA-11970) Add query timing display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-11970: --- Attachment: timing_mt_dop_off.png > Add query timing display to Impala WebUI > > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > Attachments: timing_mt_dop_off.png > > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow.
[jira] [Updated] (IMPALA-11970) Add query timing display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-11970: --- Attachment: (was: timing_mt_dop_off.png) > Add query timing display to Impala WebUI > > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > Attachments: timing_mt_dop_off.png > > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow.
[jira] [Updated] (IMPALA-11970) Add query timing display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-11970: --- Attachment: timing_mt_dop_on.png > Add query timing display to Impala WebUI > > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > Attachments: timing_mt_dop_off.png, timing_mt_dop_on.png > > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow.
[jira] [Commented] (IMPALA-11970) Add query timing display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696639#comment-17696639 ] Kurt Deschler commented on IMPALA-11970: Sample Output: !timing_mt_dop_off.png! !timing_mt_dop_on.png! > Add query timing display to Impala WebUI > > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > Attachments: timing_mt_dop_off.png, timing_mt_dop_on.png > > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow.
[jira] [Comment Edited] (IMPALA-11970) Add query timing display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696639#comment-17696639 ] Kurt Deschler edited comment on IMPALA-11970 at 3/6/23 12:33 AM: - Sample Output: mt_dop off !timing_mt_dop_off.png! mt_dop on !timing_mt_dop_on.png! was (Author: kdeschle): Sample Output: !timing_mt_dop_off.png! !timing_mt_dop_on.png! > Add query timing display to Impala WebUI > > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > Attachments: timing_mt_dop_off.png, timing_mt_dop_on.png > > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow.
[jira] [Comment Edited] (IMPALA-11970) Add query timing display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696639#comment-17696639 ] Kurt Deschler edited comment on IMPALA-11970 at 3/6/23 12:34 AM: - Sample Output: mt_dop off !timing_mt_dop_off.png|width=801,height=486! mt_dop on !timing_mt_dop_on.png|width=800,height=538! was (Author: kdeschle): Sample Output: mt_dop off !timing_mt_dop_off.png! mt_dop on !timing_mt_dop_on.png! > Add query timing display to Impala WebUI > > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > Attachments: timing_mt_dop_off.png, timing_mt_dop_on.png > > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Comment Edited] (IMPALA-11970) Add query timing display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696639#comment-17696639 ] Kurt Deschler edited comment on IMPALA-11970 at 3/6/23 12:35 AM: - Sample Output: mt_dop off !timing_mt_dop_off.png|width=700,height=425! mt_dop on !timing_mt_dop_on.png|width=700,height=471! was (Author: kdeschle): Sample Output: mt_dop off !timing_mt_dop_off.png|width=801,height=486! mt_dop on !timing_mt_dop_on.png|width=800,height=538! > Add query timing display to Impala WebUI > > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > Attachments: timing_mt_dop_off.png, timing_mt_dop_on.png > > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Comment Edited] (IMPALA-11970) Add query timing display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696639#comment-17696639 ] Kurt Deschler edited comment on IMPALA-11970 at 3/6/23 12:35 AM: - Sample Output: mt_dop off !timing_mt_dop_off.png|width=600,height=364! mt_dop on !timing_mt_dop_on.png|width=603,height=406! was (Author: kdeschle): Sample Output: mt_dop off !timing_mt_dop_off.png|width=700,height=425! mt_dop on !timing_mt_dop_on.png|width=700,height=471! > Add query timing display to Impala WebUI > > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > Attachments: timing_mt_dop_off.png, timing_mt_dop_on.png > > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-11970) Add query timeline display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-11970: --- Summary: Add query timeline display to Impala WebUI (was: Add query timing display to Impala WebUI) > Add query timeline display to Impala WebUI > -- > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > Attachments: timing_mt_dop_off.png, timing_mt_dop_on.png > > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-12014) Output a warning message on failed KeepAlive RPC for a Kudu scanner
[ https://issues.apache.org/jira/browse/IMPALA-12014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703776#comment-17703776 ] Kurt Deschler commented on IMPALA-12014: Warnings aren't really application-friendly. Would it be possible to output the warning as part of the iterator error? > Output a warning message on failed KeepAlive RPC for a Kudu scanner > --- > > Key: IMPALA-12014 > URL: https://issues.apache.org/jira/browse/IMPALA-12014 > Project: IMPALA > Issue Type: Improvement > Components: Backend, be >Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.7.1, Impala 2.9.0, > Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0, Impala 3.1.0, Impala > 3.2.0, Impala 4.0.0, Impala 3.3.0, Impala 3.4.0, Impala 3.4.1, Impala 4.1.0, > Impala 4.0.1, Impala 4.2.0, Impala 4.1.1 >Reporter: Alexey Serbin >Assignee: Alexey Serbin >Priority: Minor > Labels: supportability, troubleshooting > Fix For: Impala 4.3.0 > > > With [IMPALA-3292|https://issues.apache.org/jira/browse/IMPALA-3292], the > code has been modified to ignore failed KeepAlive RPCs for Kudu scanners > because one of the follow-up KeepAlive RPCs usually succeeds within TTL for > an idle Kudu scanner. > However, if a Kudu tablet server had been busy for a long time, it might > happen that all the consecutive KeepAlive requests failed as well (e.g., if > the RPC queue stayed full for the whole interval of the scanner TTL). If that > happened, the corresponding Impala query would fail with an error message in > Impala's query profile like the one below: > {noformat} > ... > Query Type: QUERY > Query State: EXCEPTION > Impala Query State: ERROR > Query Status: Unable to advance iterator for node with id '1' for Kudu > table 'mega_table': Not found: Scanner 4235dd17eb444a36a945f003c23dcf81 not > found (it may have expired) > ... 
> {noformat} > Without a warning message logged by impalad or other clues, it's hard for > people who do not have much knowledge of the Kudu specifics to infer the root > cause of such a situation. > It would be great to at least log a warning message about failed attempts to > send KeepAlive RPC for Kudu scanners. As of now, the message is logged with > {{VLOG(1)}} facility, but verbose logging isn't usually enabled by default > for impalad. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-11970) Add query timeline display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704173#comment-17704173 ] Kurt Deschler commented on IMPALA-11970: Dotted lines/boxes show when results are being received from fragments below. They correspond to the intervals where the senders are producing results. > Add query timeline display to Impala WebUI > -- > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > Attachments: image-2023-03-23-10-29-47-827.png, > timing_mt_dop_off.png, timing_mt_dop_on.png > > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-11970) Add query timeline display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-11970: --- Attachment: image-2023-03-23-10-29-47-827.png > Add query timeline display to Impala WebUI > -- > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > Attachments: image-2023-03-23-10-29-47-827.png, > timing_mt_dop_off.png, timing_mt_dop_on.png > > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-11970) Add query timeline display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-11970: --- Attachment: image-2023-03-23-10-35-24-078.png > Add query timeline display to Impala WebUI > -- > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > Attachments: image-2023-03-23-10-29-47-827.png, > image-2023-03-23-10-35-24-078.png, timing_mt_dop_off.png, timing_mt_dop_on.png > > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-11970) Add query timeline display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704177#comment-17704177 ] Kurt Deschler commented on IMPALA-11970: Q21 example showing the DAG connections on the right that were necessary to properly render complex plans. Previous examples showed multiple edges into exchanges, which was incorrect. !image-2023-03-23-10-29-47-827.png|width=603,height=544! Same example with plan-order printing: !image-2023-03-23-10-35-24-078.png|width=603,height=565! > Add query timeline display to Impala WebUI > -- > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > Attachments: image-2023-03-23-10-29-47-827.png, > image-2023-03-23-10-35-24-078.png, timing_mt_dop_off.png, timing_mt_dop_on.png > > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-12014) Output a warning message on failed KeepAlive RPC for a Kudu scanner
[ https://issues.apache.org/jira/browse/IMPALA-12014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704866#comment-17704866 ] Kurt Deschler commented on IMPALA-12014: Yes, the last KeepAlive status should be fine. > Output a warning message on failed KeepAlive RPC for a Kudu scanner > --- > > Key: IMPALA-12014 > URL: https://issues.apache.org/jira/browse/IMPALA-12014 > Project: IMPALA > Issue Type: Improvement > Components: Backend, be >Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.7.1, Impala 2.9.0, > Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0, Impala 3.1.0, Impala > 3.2.0, Impala 4.0.0, Impala 3.3.0, Impala 3.4.0, Impala 3.4.1, Impala 4.1.0, > Impala 4.0.1, Impala 4.2.0, Impala 4.1.1 >Reporter: Alexey Serbin >Assignee: Alexey Serbin >Priority: Minor > Labels: supportability, troubleshooting > Fix For: Impala 4.3.0 > > > With [IMPALA-3292|https://issues.apache.org/jira/browse/IMPALA-3292], the > code has been modified to ignore failed KeepAlive RPCs for Kudu scanners > because one of the follow-up KeepAlive RPCs usually succeeds within TTL for > an idle Kudu scanner. > However, if a Kudu tablet server had been busy for a long time, it might > happen that all the consecutive KeepAlive requests failed as well (e.g., if > the RPC queue stayed full for the whole interval of the scanner TTL). If that > happened, the corresponding Impala query would fail with an error message in > Impala's query profile like the one below: > {noformat} > ... > Query Type: QUERY > Query State: EXCEPTION > Impala Query State: ERROR > Query Status: Unable to advance iterator for node with id '1' for Kudu > table 'mega_table': Not found: Scanner 4235dd17eb444a36a945f003c23dcf81 not > found (it may have expired) > ... > {noformat} > Without a warning message logged by impalad or other clues, it's hard for > people who do not have much knowledge of the Kudu specifics to infer the root > cause of such a situation. 
> It would be great to at least log a warning message about failed attempts to > send KeepAlive RPC for Kudu scanners. As of now, the message is logged with > {{VLOG(1)}} facility, but verbose logging isn't usually enabled by default > for impalad. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-12033) Impalad crashes when --dump_exec_request_path is used
Kurt Deschler created IMPALA-12033: -- Summary: Impalad crashes when --dump_exec_request_path is used Key: IMPALA-12033 URL: https://issues.apache.org/jira/browse/IMPALA-12033 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 4.1.0 Reporter: Kurt Deschler As part of https://issues.apache.org/jira/browse/IMPALA-10535, a new flag was added to dump TExecRequest objects from the backend. This currently crashes the backend in DumpTExecReq() when the flag is enabled. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-12033) Impalad crashes when --dump_exec_request_path is used
[ https://issues.apache.org/jira/browse/IMPALA-12033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler reassigned IMPALA-12033: -- Assignee: Kurt Deschler > Impalad crashes when --dump_exec_request_path is used > - > > Key: IMPALA-12033 > URL: https://issues.apache.org/jira/browse/IMPALA-12033 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 4.1.0 >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > As part of https://issues.apache.org/jira/browse/IMPALA-10535, a new flag was > added to dump TExecRequest objects from the backend. This currently crashes > the backend in DumpTExecReq() when the flag is enabled. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-12036) Web UI incorrectly shows root.default resource pool for all queries in /queries page
[ https://issues.apache.org/jira/browse/IMPALA-12036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler reassigned IMPALA-12036: -- Assignee: Kurt Deschler > Web UI incorrectly shows root.default resource pool for all queries in > /queries page > > > Key: IMPALA-12036 > URL: https://issues.apache.org/jira/browse/IMPALA-12036 > Project: IMPALA > Issue Type: Improvement >Reporter: Abhishek Rawat >Assignee: Kurt Deschler >Priority: Major > > Web UI seems to be always showing root.default resource pool even if a > different resource pool is used by the query. I also forced a different > resource pool by using `set REQUEST_POOL=root.group-set-00`, but Web UI still > showed `root.default`. This is reflected correctly in the query profiles but > not in the Web UI. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-11970) Add query timeline display to Impala WebUI
[ https://issues.apache.org/jira/browse/IMPALA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler resolved IMPALA-11970. Resolution: Fixed > Add query timeline display to Impala WebUI > -- > > Key: IMPALA-11970 > URL: https://issues.apache.org/jira/browse/IMPALA-11970 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > Attachments: image-2023-03-23-10-29-47-827.png, > image-2023-03-23-10-35-24-078.png, timing_mt_dop_off.png, timing_mt_dop_on.png > > > Query profiles contain timing information for fragments and plan nodes that > is difficult to analyze in a text format, especially for complex and > highly-parallel execution plans. A graphical display in the WebUI that > renders timing information in Gantt chart form will make the execution timing > and dependencies much easier to follow. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns
[ https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler reassigned IMPALA-10186: -- Assignee: Michael Smith > Write invalid parquet PageLocations which table sort by some columns > > > Key: IMPALA-10186 > URL: https://issues.apache.org/jira/browse/IMPALA-10186 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: guojingfeng >Assignee: Michael Smith >Priority: Major > Labels: parquet > > The current Parquet writer writes -1 to PageLocation.offset and > PageLocation.first_row_index when it encounters an empty page. > hdfs-parquet-file-writer.cc Line: 808 ~ 819 > {code:java} > // Write data pages > for (const DataPage& page : pages_) { > if (page.header.data_page_header.num_values == 0) { > // Skip empty pages > location.offset = -1; > location.compressed_page_size = 0; > location.first_row_index = -1; > AddLocationToOffsetIndex(location); > continue; > } > {code} > But the -1 values may cause the ComputeCandidatePages function to run into an > unexpected state. > {code:java} > bool ComputeCandidatePages( > const vector& page_locations, > const vector& candidate_ranges, > const int64_t num_rows, vector* candidate_pages) { > if (!ValidatePageLocations(page_locations, num_rows)) return false; > {code} > which in turn causes IMPALA-9952 > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
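To illustrate why the -1 placeholders described in IMPALA-10186 are a problem for readers, here is a minimal sketch of a page-location check. It is not Impala's ValidatePageLocations: the PageLocation struct below is a simplified stand-in, and LooksValid and its rules (non-negative offsets, in-range and strictly increasing first_row_index) are assumptions chosen only to show how a single empty-page entry can invalidate the whole index.

```cpp
#include <cstdint>
#include <vector>

// Simplified stand-in for the Parquet PageLocation struct (assumption:
// only these three fields matter for the illustration).
struct PageLocation {
  int64_t offset;
  int32_t compressed_page_size;
  int64_t first_row_index;
};

// Hypothetical validator, NOT Impala's ValidatePageLocations. It rejects
// negative offsets and row indexes that are out of range or not strictly
// increasing from one page to the next.
bool LooksValid(const std::vector<PageLocation>& locations, int64_t num_rows) {
  int64_t prev_first_row = -1;
  for (const PageLocation& loc : locations) {
    if (loc.offset < 0) return false;
    if (loc.first_row_index < 0 || loc.first_row_index >= num_rows) return false;
    if (loc.first_row_index <= prev_first_row) return false;
    prev_first_row = loc.first_row_index;
  }
  return true;
}
```

Under these assumed rules, an offset index containing an entry such as {-1, 0, -1} for an empty page is rejected as a whole, even though every non-empty page in it is described correctly.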
[jira] [Created] (IMPALA-12113) Elapsed time incorrect in query timeline
Kurt Deschler created IMPALA-12113: -- Summary: Elapsed time incorrect in query timeline Key: IMPALA-12113 URL: https://issues.apache.org/jira/browse/IMPALA-12113 Project: IMPALA Issue Type: Bug Components: Backend Reporter: Kurt Deschler Assignee: Kurt Deschler -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-12113) Elapsed time incorrect in query timeline
[ https://issues.apache.org/jira/browse/IMPALA-12113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-12113: --- Description: Elapsed time at the bottom of WebUI query timeline is supposed to show the time at the end of each interval. Currently the start time is shown. > Elapsed time incorrect in query timeline > > > Key: IMPALA-12113 > URL: https://issues.apache.org/jira/browse/IMPALA-12113 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Minor > > Elapsed time at the bottom of WebUI query timeline is supposed to show the > time at the end of each interval. Currently the start time is shown. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-12113) Elapsed time incorrect in query timeline
[ https://issues.apache.org/jira/browse/IMPALA-12113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-12113 started by Kurt Deschler. -- > Elapsed time incorrect in query timeline > > > Key: IMPALA-12113 > URL: https://issues.apache.org/jira/browse/IMPALA-12113 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Minor > > Elapsed time at the bottom of WebUI query timeline is supposed to show the > time at the end of each interval. Currently the start time is shown. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-12113) Elapsed time incorrect in query timeline
[ https://issues.apache.org/jira/browse/IMPALA-12113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler resolved IMPALA-12113. Resolution: Fixed > Elapsed time incorrect in query timeline > > > Key: IMPALA-12113 > URL: https://issues.apache.org/jira/browse/IMPALA-12113 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Minor > > Elapsed time at the bottom of WebUI query timeline is supposed to show the > time at the end of each interval. Currently the start time is shown. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
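The start-versus-end labelling issue in IMPALA-12113 can be sketched as follows. This is only an illustration under stated assumptions: the actual timeline is rendered in the WebUI, and the function name IntervalEndLabels, the nanosecond unit, and the equal-width intervals are all hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: returns one elapsed-time label per interval for a
// timeline that splits [0, total_ns] into num_intervals equal parts.
std::vector<int64_t> IntervalEndLabels(int64_t total_ns, int num_intervals) {
  std::vector<int64_t> labels;
  if (num_intervals <= 0) return labels;
  labels.reserve(num_intervals);
  for (int i = 0; i < num_intervals; ++i) {
    // (i + 1) yields the END of interval i. Using i instead would yield the
    // start of the interval, which is the behaviour the issue reports.
    labels.push_back(total_ns * (i + 1) / num_intervals);
  }
  return labels;
}
```

With this convention the last label equals the total elapsed time, whereas start-of-interval labels would begin at zero and never show the total.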
[jira] [Created] (IMPALA-12129) Query timeline not working for running query
Kurt Deschler created IMPALA-12129: -- Summary: Query timeline not working for running query Key: IMPALA-12129 URL: https://issues.apache.org/jira/browse/IMPALA-12129 Project: IMPALA Issue Type: Bug Components: Backend Reporter: Kurt Deschler Two issues: 1) JSON for running queries is missing fields, which the timeline logic does not handle. 2) Timeline refresh checks Query State == FINISHED, which does not seem to properly reflect when a query ends. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-12129) Query timeline not working for running query
[ https://issues.apache.org/jira/browse/IMPALA-12129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler reassigned IMPALA-12129: -- Assignee: Kurt Deschler > Query timeline not working for running query > > > Key: IMPALA-12129 > URL: https://issues.apache.org/jira/browse/IMPALA-12129 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Minor > > Two issues: > 1) JSON for running queries is missing fields, which the timeline logic does > not handle. > 2) Timeline refresh checks Query State == FINISHED, which does not seem > to properly reflect when a query ends. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-12129) Query timeline not working for running query
[ https://issues.apache.org/jira/browse/IMPALA-12129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-12129 started by Kurt Deschler. -- > Query timeline not working for running query > > > Key: IMPALA-12129 > URL: https://issues.apache.org/jira/browse/IMPALA-12129 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Minor > > Two issues: > 1) JSON for running queries is missing fields, which the timeline logic does > not handle. > 2) Timeline refresh checks Query State == FINISHED, which does not seem > to properly reflect when a query ends. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-12134) Optimize row materialization time
Kurt Deschler created IMPALA-12134: -- Summary: Optimize row materialization time Key: IMPALA-12134 URL: https://issues.apache.org/jira/browse/IMPALA-12134 Project: IMPALA Issue Type: Improvement Components: Backend Reporter: Kurt Deschler Assignee: Kurt Deschler IMPALA-12111 addressed the most significant contributors to slow row materialization. However, there is still room for significant improvement with the following optimizations: * Specialized implementation for default Date and Timestamp formatting. * Caching deserialized column metadata -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-12129) Query timeline not working for running query
[ https://issues.apache.org/jira/browse/IMPALA-12129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler resolved IMPALA-12129. Resolution: Fixed > Query timeline not working for running query > > > Key: IMPALA-12129 > URL: https://issues.apache.org/jira/browse/IMPALA-12129 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Minor > > Two issues: > 1) JSON for running queries is missing fields, which the timeline logic does > not handle. > 2) Timeline refresh checks Query State == FINISHED, which does not seem > to properly reflect when a query ends. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-12134) Optimize row materialization time
[ https://issues.apache.org/jira/browse/IMPALA-12134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-12134 started by Kurt Deschler. -- > Optimize row materialization time > - > > Key: IMPALA-12134 > URL: https://issues.apache.org/jira/browse/IMPALA-12134 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > IMPALA-12111 addressed the most significant contributors to slow row > materialization. However, there is still room for significant improvement > with the following optimizations: > * Specialized implementation for default Date and Timestamp formatting. > * Caching deserialized column metadata -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-12176) Improve client fetch metrics
Kurt Deschler created IMPALA-12176: -- Summary: Improve client fetch metrics Key: IMPALA-12176 URL: https://issues.apache.org/jira/browse/IMPALA-12176 Project: IMPALA Issue Type: Improvement Reporter: Kurt Deschler Assignee: Kurt Deschler These changes address limitations with the current metrics: * ClientFetchWaitTimer includes both Thrift serialization/write and client time. * RowMaterializationTimer includes both coordinator fetch time and time to convert rows to the client protocol. Proposed changes: * Add CreateResultSetTime metric to PLAN_ROOT_SINK node in the query profile. This will isolate the cost to convert fetched rows to the client protocol. * Add read/write times for RPCs in /rpcz webui. These will be hidden by default with a checkbox to enable display. * Show client RPC read/write/count stats in the query profile -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-12176) Improve client fetch metrics
[ https://issues.apache.org/jira/browse/IMPALA-12176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-12176: --- Description: These changes address limitations with the current metrics: * ClientFetchWaitTimer includes both Thrift serialization/write and client time. * RowMaterializationTimer includes both coordinator fetch time and time to convert rows to the client protocol. Proposed changes: * Add CreateResultSetTime metric to PLAN_ROOT_SINK node in the query profile. This will isolate the cost to convert fetched rows to the client protocol. * Add read/write times for RPCs in /rpcz webui. These will be hidden by default with a checkbox to enable display. * Add sum to RPC histogram metrics * Show client RPC read/write/count stats in the query profile was: These changes address limitations with the current metrics: * ClientFetchWaitTimer includes both Thrift serialization/write and client time. * RowMaterializationTimer includes both coordinator fetch time and time to convert rows to the client protocol. Proposed changes: * Add CreateResultSetTime metric to PLAN_ROOT_SINK node in the query profile. This will isolate the cost to convert fetched rows to the client protocol. * Add read/write times for RPCs in /rpcz webui. These will be hidden by default with a checkbox to enable display. * Show client RPC read/write/count stats in the query profile > Improve client fetch metrics > > > Key: IMPALA-12176 > URL: https://issues.apache.org/jira/browse/IMPALA-12176 > Project: IMPALA > Issue Type: Improvement >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > These changes address limitations with the current metrics: > * ClientFetchWaitTimer includes both Thrift serialization/write and client > time. > * RowMaterializationTimer includes both coordinator fetch time and time to > convert rows to the client protocol. 
> Proposed changes: > * Add CreateResultSetTime metric to PLAN_ROOT_SINK node in the query > profile. This will isolate the cost to convert fetched rows to the client > protocol. > * Add read/write times for RPCs in /rpcz webui. These will be hidden by > default with a checkbox to enable display. > * Add sum to RPC histogram metrics > * Show client RPC read/write/count stats in the query profile -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-532) Impala should tolerate bad locale settings.
[ https://issues.apache.org/jira/browse/IMPALA-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler reassigned IMPALA-532: Assignee: Pranav Yogi Lodha > Impala should tolerate bad locale settings. > --- > > Key: IMPALA-532 > URL: https://issues.apache.org/jira/browse/IMPALA-532 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 1.1 >Reporter: Ishaan Joshi >Assignee: Pranav Yogi Lodha >Priority: Major > Labels: newbie, ramp-up, supportability > > Currently, the Statestore does not tolerate a bad locale setting and crashes > while starting up. > {code} > USE_DEBUG_BUILD=false > + perl -pi -e > 's#{{CMF_CONF_DIR}}#/var/run/cloudera-scm-agent/process/2469-impala-STATESTORE#g' > > /var/run/cloudera-scm-agent/process/2469-impala-STATESTORE/impala-conf/state_store_flags > perl: warning: Setting locale failed. > perl: warning: Please check that your locale settings: > LANGUAGE = (unset), > LC_ALL = (unset), > LANG = "fr_FR.UTF-8" > are supported and installed on your system. > perl: warning: Falling back to the standard locale ("C"). > + '[' -f > /var/run/cloudera-scm-agent/process/2469-impala-STATESTORE/impala-conf/.htpasswd > ']' > + chmod 600 > /var/run/cloudera-scm-agent/process/2469-impala-STATESTORE/impala-conf/.htpasswd > + false > + export > IMPALA_BIN=/opt/cloudera/parcels/IMPALA-1.1-1.p0.8/lib/impala/sbin-retail > + IMPALA_BIN=/opt/cloudera/parcels/IMPALA-1.1-1.p0.8/lib/impala/sbin-retail > + '[' impalad = statestore ']' > + '[' statestore = statestore ']' > + exec > /opt/cloudera/parcels/IMPALA-1.1-1.p0.8/lib/impala/../../bin/statestored > --flagfile=/var/run/cloudera-scm-agent/process/2469-impala-STATESTORE/impala-conf/state_store_flags > terminate called after throwing an instance of 'std::runtime_error' > what(): locale::facet::_S_create_c_locale name not valid > {code} > It should fall back to the standard locale ("C"), if the user's locale is > messed up. 
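The fallback requested above can be illustrated with a minimal sketch. It is written in Python for brevity; the statestore itself is C++, and the `what()` text in the log comes from the libstdc++ `std::locale` constructor, so this is not the Impala fix. The helper name `set_locale_with_fallback` is hypothetical.

```python
import locale


def set_locale_with_fallback(requested=""):
    """Try the requested locale; fall back to the standard "C" locale.

    An empty string asks for the locale named by the environment
    (LANG, LC_ALL, ...), which is the case that fails in the log above.
    """
    try:
        return locale.setlocale(locale.LC_ALL, requested)
    except locale.Error:
        # The requested locale is not installed or not valid:
        # tolerate it instead of aborting startup.
        return locale.setlocale(locale.LC_ALL, "C")
```

The design choice is simply to treat an unusable locale as a recoverable condition, which mirrors what perl does in the same log ("Falling back to the standard locale").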
[jira] [Created] (IMPALA-12208) DCHECK(pending_rpcs_.empty()) fails with concurrent RPCs
Kurt Deschler created IMPALA-12208: -- Summary: DCHECK(pending_rpcs_.empty()) fails with concurrent RPCs Key: IMPALA-12208 URL: https://issues.apache.org/jira/browse/IMPALA-12208 Project: IMPALA Issue Type: Bug Reporter: Kurt Deschler Assignee: Kurt Deschler There is a race condition where concurrent RPCs can register with the ClientRequestState after Finalize has been called. This causes the DCHECK(pending_rpcs_.empty()) since these RPCs are not unregistered in Finalize() as expected. Issue reproduces with ASAN builds running tests/hs2/test_hs2.py::TestHS2::test_concurrent_unregister in a loop. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
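The race described above can be sketched in a few lines. This is a simplified Python model, not the Impala C++ code: the class `RequestState` and its methods are hypothetical stand-ins for ClientRequestState, and rejecting late registrations under the same lock that Finalize takes is only one possible way to close the window, not necessarily the fix that was merged.

```python
import threading


class RequestState:
    """Hypothetical stand-in for a per-query state that tracks pending RPCs."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = set()
        self._finalized = False

    def register_rpc(self, rpc_id):
        """Register an RPC; refuse if finalize() has already run."""
        with self._lock:
            if self._finalized:
                return False
            self._pending.add(rpc_id)
            return True

    def unregister_rpc(self, rpc_id):
        with self._lock:
            self._pending.discard(rpc_id)

    def finalize(self):
        """Mark the state finalized and unregister whatever is still pending."""
        with self._lock:
            self._finalized = True
            drained = len(self._pending)
            self._pending.clear()
            return drained

    def pending_count(self):
        with self._lock:
            return len(self._pending)
```

Because the finalized flag and the pending set are updated under one lock, no RPC can slip in between the drain and the later emptiness check.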
[jira] [Work started] (IMPALA-12208) DCHECK(pending_rpcs_.empty()) fails with concurrent RPCs
[ https://issues.apache.org/jira/browse/IMPALA-12208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-12208 started by Kurt Deschler. -- > DCHECK(pending_rpcs_.empty()) fails with concurrent RPCs > > > Key: IMPALA-12208 > URL: https://issues.apache.org/jira/browse/IMPALA-12208 > Project: IMPALA > Issue Type: Bug >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > There is a race condition where concurrent RPCs can register with the > ClientRequestState after Finalize has been called. This causes the > DCHECK(pending_rpcs_.empty()) since these RPCs are not unregistered in > Finalize() as expected. > Issue reproduces with ASAN builds running > tests/hs2/test_hs2.py::TestHS2::test_concurrent_unregister in a loop. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-10180) Add average size of fetch requests in runtime profile
[ https://issues.apache.org/jira/browse/IMPALA-10180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler resolved IMPALA-10180. Resolution: Fixed > Add average size of fetch requests in runtime profile > - > > Key: IMPALA-10180 > URL: https://issues.apache.org/jira/browse/IMPALA-10180 > Project: IMPALA > Issue Type: Improvement > Components: Clients >Reporter: Sahil Takiar >Assignee: Joe McDonnell >Priority: Major > > When queries have a high {{ClientFetchWaitTimer}}, it would be useful to know > the average number of rows requested by the client per fetch request. This > can help determine if setting a higher fetch size would help improve fetch > performance where the network RTT between the client and Impala is high. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-12138) Suboptimal vector allocation of HS2 results
[ https://issues.apache.org/jira/browse/IMPALA-12138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler reassigned IMPALA-12138: -- Assignee: Csaba Ringhofer > Suboptimal vector allocation of HS2 results > --- > > Key: IMPALA-12138 > URL: https://issues.apache.org/jira/browse/IMPALA-12138 > Project: IMPALA > Issue Type: Improvement >Reporter: Daniel Becker >Assignee: Csaba Ringhofer >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-12138) Suboptimal vector allocation of HS2 results
[ https://issues.apache.org/jira/browse/IMPALA-12138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler resolved IMPALA-12138. Resolution: Fixed > Suboptimal vector allocation of HS2 results > --- > > Key: IMPALA-12138 > URL: https://issues.apache.org/jira/browse/IMPALA-12138 > Project: IMPALA > Issue Type: Improvement >Reporter: Daniel Becker >Assignee: Csaba Ringhofer >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-12134) Optimize row materialization time
[ https://issues.apache.org/jira/browse/IMPALA-12134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler resolved IMPALA-12134. Resolution: Fixed > Optimize row materialization time > - > > Key: IMPALA-12134 > URL: https://issues.apache.org/jira/browse/IMPALA-12134 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > IMPALA-12111 addressed the most significant contributors to slow row > materialization. However, there is still room for significant improvement > with the following optimizations: > * Specialized implementation for default Date and Timestamp formatting. > * Caching deserialized column metadata -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-12176) Improve client fetch metrics
[ https://issues.apache.org/jira/browse/IMPALA-12176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler resolved IMPALA-12176. Resolution: Fixed > Improve client fetch metrics > > > Key: IMPALA-12176 > URL: https://issues.apache.org/jira/browse/IMPALA-12176 > Project: IMPALA > Issue Type: Improvement >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > These changes address limitations with the current metrics: > * ClientFetchWaitTimer includes both Thrift serialization/write and client > time. > * RowMaterializationTimer includes both coordinator fetch time and time to > convert rows to the client protocol. > Proposed changes: > * Add CreateResultSetTime metric to PLAN_ROOT_SINK node in the query > profile. This will isolate the cost to convert fetched rows to the client > protocol. > * Add read/write times for RPCs in /rpcz webui. These will be hidden by > default with a checkbox to enable display. > * Add sum to RPC histogram metrics > * Show client RPC read/write/count stats in the query profile -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-12208) DCHECK(pending_rpcs_.empty()) fails with concurrent RPCs
[ https://issues.apache.org/jira/browse/IMPALA-12208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler resolved IMPALA-12208. Resolution: Fixed > DCHECK(pending_rpcs_.empty()) fails with concurrent RPCs > > > Key: IMPALA-12208 > URL: https://issues.apache.org/jira/browse/IMPALA-12208 > Project: IMPALA > Issue Type: Bug >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > There is a race condition where concurrent RPCs can register with the > ClientRequestState after Finalize has been called. This causes the > DCHECK(pending_rpcs_.empty()) since these RPCs are not unregistered in > Finalize() as expected. > Issue reproduces with ASAN builds running > tests/hs2/test_hs2.py::TestHS2::test_concurrent_unregister in a loop. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-12148) Create table as select (CTAS) tests time out
[ https://issues.apache.org/jira/browse/IMPALA-12148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-12148: --- Summary: Create table as select (CTAS) tests time out (was: Create table as select (CTAS) tests occasionally time out in ASAN builds) > Create table as select (CTAS) tests time out > > > Key: IMPALA-12148 > URL: https://issues.apache.org/jira/browse/IMPALA-12148 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 4.3.0 >Reporter: Laszlo Gaal >Assignee: Joe McDonnell >Priority: Major > > This is quite similar to the earlier IMPALA-11633; however there is the > additional symptom of connection failures in the captured {{stderror}} of the > tests:{code} > SET > client_identifier=metadata/test_ddl.py::TestAsyncDDLTiming::()::test_ctas[enable_async_ddl_execution:True|protocol:beeswax|exec_option:{'sync_ddl':0;'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'; > -- connecting to: localhost:21000 > -- 2023-05-15 22:16:39,043 INFO MainThread: Could not connect to ('::1', > 21000, 0, 0) > Traceback (most recent call last): > File > "/data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc10.4.0/thrift-0.16.0-p3/python/lib/python2.7/site-packages/thrift/transport/TSocket.py", > line 137, in open > handle.connect(sockaddr) > File > "/data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/socket.py", > line 228, in meth > return getattr(self._sock,name)(*args) > error: [Errno 111] Connection refused > -- connecting to localhost:21050 with impyla > -- 2023-05-15 22:16:39,043 INFO MainThread: Could not connect to ('::1', > 21050, 0, 0) > Traceback (most recent call last): > File > 
"/data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc10.4.0/thrift-0.16.0-p3/python/lib/python2.7/site-packages/thrift/transport/TSocket.py", > line 137, in open > handle.connect(sockaddr) > File > "/data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/socket.py", > line 228, in meth > return getattr(self._sock,name)(*args) > error: [Errno 111] Connection refused > -- 2023-05-15 22:16:39,057 INFO MainThread: Closing active operation > -- connecting to localhost:28000 with impyla > -- 2023-05-15 22:16:39,074 INFO MainThread: Closing active operation{code} > or later:{code} > SET > client_identifier=metadata/test_ddl.py::TestAsyncDDLTiming::()::test_ctas[enable_async_ddl_execution:True|protocol:beeswax|exec_option:{'sync_ddl':0;'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'; > -- connecting to: localhost:21000 > -- 2023-05-15 22:16:44,429 INFO MainThread: Could not connect to ('::1', > 21000, 0, 0) > Traceback (most recent call last): > File > "/data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc10.4.0/thrift-0.16.0-p3/python/lib/python2.7/site-packages/thrift/transport/TSocket.py", > line 137, in open > handle.connect(sockaddr) > File > "/data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/socket.py", > line 228, in meth > return getattr(self._sock,name)(*args) > error: [Errno 111] Connection refused > -- executing against localhost:21000{code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-12148) Create table as select (CTAS) tests time out
[ https://issues.apache.org/jira/browse/IMPALA-12148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17733523#comment-17733523 ] Kurt Deschler commented on IMPALA-12148: Also seen in impala-asf-master-core-ozone-erasure-coding. Updated title to be more broad. > Create table as select (CTAS) tests time out > > > Key: IMPALA-12148 > URL: https://issues.apache.org/jira/browse/IMPALA-12148 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 4.3.0 >Reporter: Laszlo Gaal >Assignee: Joe McDonnell >Priority: Major > > This is quite similar to the earlier IMPALA-11633; however there is the > additional symptom of connection failures in the captured {{stderror}} of the > tests:{code} > SET > client_identifier=metadata/test_ddl.py::TestAsyncDDLTiming::()::test_ctas[enable_async_ddl_execution:True|protocol:beeswax|exec_option:{'sync_ddl':0;'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'; > -- connecting to: localhost:21000 > -- 2023-05-15 22:16:39,043 INFO MainThread: Could not connect to ('::1', > 21000, 0, 0) > Traceback (most recent call last): > File > "/data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc10.4.0/thrift-0.16.0-p3/python/lib/python2.7/site-packages/thrift/transport/TSocket.py", > line 137, in open > handle.connect(sockaddr) > File > "/data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/socket.py", > line 228, in meth > return getattr(self._sock,name)(*args) > error: [Errno 111] Connection refused > -- connecting to localhost:21050 with impyla > -- 2023-05-15 22:16:39,043 INFO MainThread: Could not connect to ('::1', > 21050, 0, 0) > Traceback (most recent call last): > File > 
"/data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc10.4.0/thrift-0.16.0-p3/python/lib/python2.7/site-packages/thrift/transport/TSocket.py", > line 137, in open > handle.connect(sockaddr) > File > "/data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/socket.py", > line 228, in meth > return getattr(self._sock,name)(*args) > error: [Errno 111] Connection refused > -- 2023-05-15 22:16:39,057 INFO MainThread: Closing active operation > -- connecting to localhost:28000 with impyla > -- 2023-05-15 22:16:39,074 INFO MainThread: Closing active operation{code} > or later:{code} > SET > client_identifier=metadata/test_ddl.py::TestAsyncDDLTiming::()::test_ctas[enable_async_ddl_execution:True|protocol:beeswax|exec_option:{'sync_ddl':0;'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'; > -- connecting to: localhost:21000 > -- 2023-05-15 22:16:44,429 INFO MainThread: Could not connect to ('::1', > 21000, 0, 0) > Traceback (most recent call last): > File > "/data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc10.4.0/thrift-0.16.0-p3/python/lib/python2.7/site-packages/thrift/transport/TSocket.py", > line 137, in open > handle.connect(sockaddr) > File > "/data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/socket.py", > line 228, in meth > return getattr(self._sock,name)(*args) > error: [Errno 111] Connection refused > -- executing against localhost:21000{code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-10585) retry_failed_queries=true should not apply to DMLs
[ https://issues.apache.org/jira/browse/IMPALA-10585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17733560#comment-17733560 ] Kurt Deschler commented on IMPALA-10585: [~csringhofer] Let's close out this ticket since there is already a fix merged and move future work to a new ticket. > retry_failed_queries=true should not apply to DMLs > -- > > Key: IMPALA-10585 > URL: https://issues.apache.org/jira/browse/IMPALA-10585 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 4.0.0 >Reporter: Joe McDonnell >Assignee: Csaba Ringhofer >Priority: Major > > I noticed that retry_failed_queries=true will retry insert statements: > {noformat} > [localhost:21050] joetest> insert into retrytest select count(*) from > functional.alltypes where bool_col = sleep(50); > Query: insert into retrytest select count(*) from functional.alltypes where > bool_col = sleep(50) > Query submitted at: 2021-03-15 10:23:32 (Coordinator: > http://joemcdonnell:25000) > Query progress can be monitored at: > http://joemcdonnell:25000/query_plan?query_id=5f4b8c0224faa31a:4a585cf7 > ... > Failed due to unreachable impalad(s): joemcdonnell:27002 > > Retried query link: > http://joemcdonnell:25000/query_plan?query_id=824b6b103ea68ea3:bc804b4Failed > due to unreachable impalad(s): joemcdonnell:27002 > ... > Query has been retried using query id: 824b6b103ea68ea3:bc804b45 > Retried query link: > http://joemcdonnell:25000/query_plan?query_id=824b6b103ea68ea3:bc804b45 > Modified 1 row(s) in 47.71s{noformat} > I don't think this was intended to work, because > https://issues.apache.org/jira/browse/IMPALA-9734 was closed saying that we > don't do retries for write statements. There also aren't any tests for these > cases. > I think we intended to exempt DML statements from retry_failed_queries=true. > We should implement that and add tests to make sure DMLs don't get retried. 
[jira] [Assigned] (IMPALA-10585) retry_failed_queries=true should not apply to DMLs
[ https://issues.apache.org/jira/browse/IMPALA-10585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler reassigned IMPALA-10585: -- Assignee: Csaba Ringhofer > retry_failed_queries=true should not apply to DMLs > -- > > Key: IMPALA-10585 > URL: https://issues.apache.org/jira/browse/IMPALA-10585 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 4.0.0 >Reporter: Joe McDonnell >Assignee: Csaba Ringhofer >Priority: Major > > I noticed that retry_failed_queries=true will retry insert statements: > {noformat} > [localhost:21050] joetest> insert into retrytest select count(*) from > functional.alltypes where bool_col = sleep(50); > Query: insert into retrytest select count(*) from functional.alltypes where > bool_col = sleep(50) > Query submitted at: 2021-03-15 10:23:32 (Coordinator: > http://joemcdonnell:25000) > Query progress can be monitored at: > http://joemcdonnell:25000/query_plan?query_id=5f4b8c0224faa31a:4a585cf7 > ... > Failed due to unreachable impalad(s): joemcdonnell:27002 > > Retried query link: > http://joemcdonnell:25000/query_plan?query_id=824b6b103ea68ea3:bc804b4Failed > due to unreachable impalad(s): joemcdonnell:27002 > ... > Query has been retried using query id: 824b6b103ea68ea3:bc804b45 > Retried query link: > http://joemcdonnell:25000/query_plan?query_id=824b6b103ea68ea3:bc804b45 > Modified 1 row(s) in 47.71s{noformat} > I don't think this was intended to work, because > https://issues.apache.org/jira/browse/IMPALA-9734 was closed saying that we > don't do retries for write statements. There also aren't any tests for these > cases. > I think we intended to exempt DML statements from retry_failed_queries=true. > We should implement that and add tests to make sure DMLs don't get retried. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-12189) updateCatalog not releasing the catalog lock if createTblTransaction() throws exceptions
[ https://issues.apache.org/jira/browse/IMPALA-12189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17733561#comment-17733561 ] Kurt Deschler commented on IMPALA-12189: [~stigahuang] Can this ticket get resolved now? > updateCatalog not releasing the catalog lock if createTblTransaction() throws > exceptions > > > Key: IMPALA-12189 > URL: https://issues.apache.org/jira/browse/IMPALA-12189 > Project: IMPALA > Issue Type: Bug > Components: Catalog >Reporter: Quanlong Huang >Assignee: Quanlong Huang >Priority: Critical > > We saw an issue where catalogd can't finish RPC requests after this error: > {code:java} > I0605 21:04:49.356642 6145 jni-util.cc:288] > org.apache.impala.common.TransactionException: Internal error processing > allocate_table_write_ids > at > org.apache.impala.catalog.Hive3MetastoreShimBase.allocateTableWriteId(Hive3MetastoreShimBase.java:763) > at > org.apache.impala.catalog.Hive3MetastoreShimBase.createTblTransaction(Hive3MetastoreShimBase.java:129) > at > org.apache.impala.service.CatalogOpExecutor.updateCatalog(CatalogOpExecutor.java:6394) > at > org.apache.impala.service.JniCatalog.updateCatalog(JniCatalog.java:507) > I0605 21:04:49.356665 6145 status.cc:129] TransactionException: Internal > error processing allocate_table_write_ids > {code} > Code snippet of the downstream branch: > {code:java} > 6370 public TUpdateCatalogResponse updateCatalog(TUpdateCatalogRequest > update) > 6371 throws ImpalaException { > 6372 TUpdateCatalogResponse response = new TUpdateCatalogResponse(); > 6373 // Only update metastore for Hdfs tables. 
> 6374 Table table = getExistingTable(update.getDb_name(), > update.getTarget_table(), > 6375 "Load for INSERT"); > 6376 if (!(table instanceof FeFsTable)) { > 6377 throw new InternalException("Unexpected table type: " + > 6378 update.getTarget_table()); > 6379 } > 6380 > 6381 tryWriteLock(table, "updating the catalog"); > 6382 final Timer.Context context > 6383 = > table.getMetrics().getTimer(HdfsTable.CATALOG_UPDATE_DURATION_METRIC).time(); > 6384 > 6385 long transactionId = -1; > 6386 TblTransaction tblTxn = null; > 6387 if (update.isSetTransaction_id()) { > 6388 transactionId = update.getTransaction_id(); > 6389 Preconditions.checkState(transactionId > 0); > 6390 try (MetaStoreClient msClient = catalog_.getMetaStoreClient()) { > 6391 // Setup transactional parameters needed to do alter > table/partitions later. > 6392 // TODO: Could be optimized to possibly save some RPCs, as > these parameters are > 6393 // not always needed + the writeId of the INSERT could be > probably reused. > 6394 tblTxn = MetastoreShim.createTblTransaction( > 6395 msClient.getHiveClient(), table.getMetaStoreTable(), > transactionId); > 6396 } > 6397 } > 6398 > 6399 try { > 6400 // Get new catalog version for table in insert. > 6401 long newCatalogVersion = catalog_.incrementAndGetCatalogVersion(); > 6402 catalog_.getLock().writeLock().unlock(); > ... > 6617 } finally { > 6618 context.stop(); > 6619 UnlockWriteLockIfErronouslyLocked(); > 6620 table.releaseWriteLock(); > 6621 } > {code} > The catalog lock (versionLock) is acquired at line 6381 if the current thread > gets the table lock. In the normal case, it will be released at line 6402. > However, if MetastoreShim.createTblTransaction() throws an exception, there is > no place to release the lock. Note that there is a finally-clause at line > 6619 that can release the lock. But it's not guarding the code that calls > createTblTransaction(). 
> If the write lock of versionLock is not released, other threads can't proceed > with their catalog operations, including table loading and the event-processor. > I'm able to reproduce the issue by modifying the code to explicitly throw an > exception at > [https://github.com/apache/impala/blob/4cf0bfa83f9641eb95d83c76af7962e6a3f1e064/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L6636] > CC [~csringhofer] [~gfurnstahl] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
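The lock leak described in this ticket follows a general pattern that can be shown in a small sketch. It is a Python model under stated assumptions, not the CatalogOpExecutor code: `version_lock`, `create_tbl_transaction`, and both `update_catalog_*` functions are hypothetical names, and the real method releases the catalog lock early on the success path rather than only in the finally-clause.

```python
import threading

# Hypothetical stand-in for the catalog's versionLock.
version_lock = threading.Lock()


def create_tbl_transaction(fail):
    """Stand-in for the metastore call that may throw."""
    if fail:
        raise RuntimeError("Internal error processing allocate_table_write_ids")
    return "txn"


def update_catalog_leaky(fail):
    """Mirrors the reported shape: the throwing call sits before the try block."""
    version_lock.acquire()
    txn = create_tbl_transaction(fail)  # an exception here skips the finally
    try:
        return txn
    finally:
        version_lock.release()


def update_catalog_guarded(fail):
    """Same work, but everything after the acquire is covered by the finally."""
    version_lock.acquire()
    try:
        return create_tbl_transaction(fail)
    finally:
        version_lock.release()
```

In the leaky variant a failure leaves `version_lock` held, so every later caller blocks, which matches the symptom of catalogd not finishing RPC requests.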
[jira] [Assigned] (IMPALA-12233) Partitioned hash join with a limit can hang when using mt_dop>0
[ https://issues.apache.org/jira/browse/IMPALA-12233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler reassigned IMPALA-12233: -- Assignee: Gergely Fürnstáhl > Partitioned hash join with a limit can hang when using mt_dop>0 > --- > > Key: IMPALA-12233 > URL: https://issues.apache.org/jira/browse/IMPALA-12233 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 4.3.0 >Reporter: Joe McDonnell >Assignee: Gergely Fürnstáhl >Priority: Blocker > > After encountering a hung query on an Impala cluster, we were able to > reproduce it in the Impala developer environment with these steps: > {noformat} > use tpcds; > set mt_dop=2; > select ss_cdemo_sk from store_sales where ss_sold_date_sk = (select > max(ss_sold_date_sk) from store_sales) group by ss_cdemo_sk limit 1;{noformat} > The problem reproduces with limit values up to 183, then at limit 184 and > higher it doesn't reproduce. > Taking stack traces show a thread waiting for a cyclic barrier: > {noformat} > 0 libpthread.so.0!__pthread_cond_wait + 0x216 > 1 > impalad!impala::CyclicBarrier::Wait int64_t*, impala::BufferPool::ClientHandle*, impala::RuntimeProfile*, > std::deque >*, > impala::RowBatch*):: > [condition-variable.h : 49 + 0xc] > 2 impalad!impala::PhjBuilder::DoneProbingHashPartitions(long const*, > impala::BufferPool::ClientHandle*, impala::RuntimeProfile*, > std::deque std::default_delete >, > std::allocator std::default_delete > > >*, impala::RowBatch*) > [partitioned-hash-join-builder.cc : 766 + 0x25] > 3 > impalad!impala::PartitionedHashJoinNode::DoneProbing(impala::RuntimeState*, > impala::RowBatch*) [partitioned-hash-join-node.cc : 1189 + 0x28] > 4 impalad!impala::PartitionedHashJoinNode::GetNext(impala::RuntimeState*, > impala::RowBatch*, bool*) [partitioned-hash-join-node.cc : 599 + 0x15] > 5 > impalad!impala::StreamingAggregationNode::GetRowsStreaming(impala::RuntimeState*, > impala::RowBatch*) [streaming-aggregation-node.cc : 115 + 0x14] 
> 6 impalad!impala::StreamingAggregationNode::GetNext(impala::RuntimeState*, > impala::RowBatch*, bool*) [streaming-aggregation-node.cc : 77 + 0x15] > 7 impalad!impala::FragmentInstanceState::ExecInternal() > [fragment-instance-state.cc : 446 + 0x15] > 8 impalad!impala::FragmentInstanceState::Exec() [fragment-instance-state.cc > : 104 + 0xf] > 9 impalad!impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) > [query-state.cc : 956 + 0xf]{noformat} > Adding some debug logging around locations that go through that cyclic > barrier, we see one Impalad where it is expecting two threads and only one > arrives: > {noformat} > I0621 18:28:19.926551 210363 partitioned-hash-join-builder.cc:766] > 2a4787b28425372d:ac6bd9620004] DoneProbingHashPartitions: > num_probe_threads_=2 > I0621 18:28:19.927855 210362 streaming-aggregation-node.cc:136] > 2a4787b28425372d:ac6bd9620003] the number of rows (93) returned from the > streaming aggregation node has exceeded the limit of 1 > I0621 18:28:19.928887 210362 query-state.cc:958] > 2a4787b28425372d:ac6bd9620003] Instance completed. > instance_id=2a4787b28425372d:ac6bd9620003 #in-flight=4 status=OK{noformat} > Other instances that don't have a stuck thread see both threads arrive: > {noformat} > I0621 18:28:19.926223 210358 partitioned-hash-join-builder.cc:766] > 2a4787b28425372d:ac6bd9620005] DoneProbingHashPartitions: > num_probe_threads_=2 > I0621 18:28:19.926326 210359 partitioned-hash-join-builder.cc:766] > 2a4787b28425372d:ac6bd9620006] DoneProbingHashPartitions: > num_probe_threads_=2{noformat} > So, there must be a codepath that skips going through the cyclic barrier. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
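The hang mechanism suspected above (a barrier sized for two probe threads where only one arrives) can be reproduced in miniature. This sketch uses Python's `threading.Barrier` as a stand-in for Impala's C++ `CyclicBarrier`; the function name `run_probe_threads` and the timeout are illustrative assumptions, and the timeout exists only so that the example terminates instead of hanging like the real query.

```python
import threading


def run_probe_threads(arriving, expected=2, timeout=0.5):
    """Start `arriving` threads against a barrier that expects `expected`."""
    barrier = threading.Barrier(expected)
    results = []

    def worker():
        try:
            barrier.wait(timeout=timeout)
            results.append("passed")
        except threading.BrokenBarrierError:
            # Not enough parties arrived before the timeout. Without a
            # timeout this thread would wait forever.
            results.append("stuck")

    threads = [threading.Thread(target=worker) for _ in range(arriving)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With `arriving=2` both threads pass the barrier; with `arriving=1` the lone thread is stuck, analogous to the instance that finished early after the limit was reached and never reached the barrier.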
[jira] [Created] (IMPALA-10507) Workaround for sporadic TPC-DS testdata insert error
Kurt Deschler created IMPALA-10507: -- Summary: Workaround for sporadic TPC-DS testdata insert error Key: IMPALA-10507 URL: https://issues.apache.org/jira/browse/IMPALA-10507 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 3.4.0 Reporter: Kurt Deschler Assignee: Kurt Deschler A TPC-DS insert statement generated from testdata/datasets/tpcds/tpcds_schema_template.sql fails with Tez errors trying to allocate space. This Jira is to work around the error in the test. Statement: insert overwrite table store_sales partition(ss_sold_date_sk) select ... from store_sales_unpartitioned WHERE 2451272 <= ss_sold_date_sk and ss_sold_date_sk < 2451728 distribute by ss_sold_date_sk Error: Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /test-warehouse/tpcds.store_sales/.hive-staging_hive_2021-01-04_11-08-20_683_8822236846070344153-996/_task_tmp.-ext-10002/ss_sold_date_sk=2451574/_tmp.00_1 could only be written to 0 of the 1 minReplication nodes. There are 3 datanode(s) running and 3 node(s) are excluded in this operation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-10507) Workaround for sporadic TPC-DS testdata insert error
[ https://issues.apache.org/jira/browse/IMPALA-10507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-10507 started by Kurt Deschler. -- > Workaround for sporadic TPC-DS testdata insert error > > > Key: IMPALA-10507 > URL: https://issues.apache.org/jira/browse/IMPALA-10507 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure > Affects Versions: Impala 3.4.0 > Reporter: Kurt Deschler > Assignee: Kurt Deschler > Priority: Major > > A TPC-DS insert statement generated from > testdata/datasets/tpcds/tpcds_schema_template.sql fails with Tez errors > while trying to allocate space. This Jira is to work around the error in the test. > Statement: > insert overwrite table store_sales partition(ss_sold_date_sk) > select ... from store_sales_unpartitioned > WHERE 2451272 <= ss_sold_date_sk and ss_sold_date_sk < 2451728 > distribute by ss_sold_date_sk > Error: > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.ipc.RemoteException(java.io.IOException): File > /test-warehouse/tpcds.store_sales/.hive-staging_hive_2021-01-04_11-08-20_683_8822236846070344153-996/_task_tmp.-ext-10002/ss_sold_date_sk=2451574/_tmp.00_1 > could only be written to 0 of the 1 minReplication nodes. There are 3 > datanode(s) running and 3 node(s) are excluded in this operation.
[jira] [Commented] (IMPALA-10507) Workaround for sporadic TPC-DS testdata insert error
[ https://issues.apache.org/jira/browse/IMPALA-10507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17284250#comment-17284250 ] Kurt Deschler commented on IMPALA-10507: http://gerrit.cloudera.org:8080/17065 > Workaround for sporadic TPC-DS testdata insert error > > > Key: IMPALA-10507 > URL: https://issues.apache.org/jira/browse/IMPALA-10507 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure > Affects Versions: Impala 3.4.0 > Reporter: Kurt Deschler > Assignee: Kurt Deschler > Priority: Major > > A TPC-DS insert statement generated from > testdata/datasets/tpcds/tpcds_schema_template.sql fails with Tez errors > while trying to allocate space. This Jira is to work around the error in the test. > Statement: > insert overwrite table store_sales partition(ss_sold_date_sk) > select ... from store_sales_unpartitioned > WHERE 2451272 <= ss_sold_date_sk and ss_sold_date_sk < 2451728 > distribute by ss_sold_date_sk > Error: > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.ipc.RemoteException(java.io.IOException): File > /test-warehouse/tpcds.store_sales/.hive-staging_hive_2021-01-04_11-08-20_683_8822236846070344153-996/_task_tmp.-ext-10002/ss_sold_date_sk=2451574/_tmp.00_1 > could only be written to 0 of the 1 minReplication nodes. There are 3 > datanode(s) running and 3 node(s) are excluded in this operation.
[jira] [Commented] (IMPALA-10507) Workaround for sporadic TPC-DS testdata insert error
[ https://issues.apache.org/jira/browse/IMPALA-10507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286051#comment-17286051 ] Kurt Deschler commented on IMPALA-10507: The issue no longer reproduces after the IMPALA-9777 change. > Workaround for sporadic TPC-DS testdata insert error > > > Key: IMPALA-10507 > URL: https://issues.apache.org/jira/browse/IMPALA-10507 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure > Affects Versions: Impala 3.4.0 > Reporter: Kurt Deschler > Assignee: Kurt Deschler > Priority: Major > > A TPC-DS insert statement generated from > testdata/datasets/tpcds/tpcds_schema_template.sql fails with Tez errors > while trying to allocate space. This Jira is to work around the error in the test. > Statement: > insert overwrite table store_sales partition(ss_sold_date_sk) > select ... from store_sales_unpartitioned > WHERE 2451272 <= ss_sold_date_sk and ss_sold_date_sk < 2451728 > distribute by ss_sold_date_sk > Error: > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.ipc.RemoteException(java.io.IOException): File > /test-warehouse/tpcds.store_sales/.hive-staging_hive_2021-01-04_11-08-20_683_8822236846070344153-996/_task_tmp.-ext-10002/ss_sold_date_sk=2451574/_tmp.00_1 > could only be written to 0 of the 1 minReplication nodes. There are 3 > datanode(s) running and 3 node(s) are excluded in this operation.
[jira] [Closed] (IMPALA-10507) Workaround for sporadic TPC-DS testdata insert error
[ https://issues.apache.org/jira/browse/IMPALA-10507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler closed IMPALA-10507. -- Resolution: Abandoned > Workaround for sporadic TPC-DS testdata insert error > > > Key: IMPALA-10507 > URL: https://issues.apache.org/jira/browse/IMPALA-10507 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure > Affects Versions: Impala 3.4.0 > Reporter: Kurt Deschler > Assignee: Kurt Deschler > Priority: Major > > A TPC-DS insert statement generated from > testdata/datasets/tpcds/tpcds_schema_template.sql fails with Tez errors > while trying to allocate space. This Jira is to work around the error in the test. > Statement: > insert overwrite table store_sales partition(ss_sold_date_sk) > select ... from store_sales_unpartitioned > WHERE 2451272 <= ss_sold_date_sk and ss_sold_date_sk < 2451728 > distribute by ss_sold_date_sk > Error: > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.ipc.RemoteException(java.io.IOException): File > /test-warehouse/tpcds.store_sales/.hive-staging_hive_2021-01-04_11-08-20_683_8822236846070344153-996/_task_tmp.-ext-10002/ss_sold_date_sk=2451574/_tmp.00_1 > could only be written to 0 of the 1 minReplication nodes. There are 3 > datanode(s) running and 3 node(s) are excluded in this operation.
[jira] [Created] (IMPALA-10535) Add interface to ImpalaServer for execution of externally compiled statements
Kurt Deschler created IMPALA-10535: -- Summary: Add interface to ImpalaServer for execution of externally compiled statements Key: IMPALA-10535 URL: https://issues.apache.org/jira/browse/IMPALA-10535 Project: IMPALA Issue Type: New Feature Components: Backend Reporter: Kurt Deschler Assignee: Kurt Deschler Add new interface ImpalaServer::ExecutePlannedStatement to allow an external frontend to execute a compiled TExecRequest on the Impala backend. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-10535) Add interface to ImpalaServer for execution of externally compiled statements
[ https://issues.apache.org/jira/browse/IMPALA-10535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17288090#comment-17288090 ] Kurt Deschler commented on IMPALA-10535: http://gerrit.cloudera.org:8080/17104 > Add interface to ImpalaServer for execution of externally compiled statements > - > > Key: IMPALA-10535 > URL: https://issues.apache.org/jira/browse/IMPALA-10535 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > Add new interface ImpalaServer::ExecutePlannedStatement to allow an external > frontend to execute a compiled TExecRequest on the Impala backend. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-10522) Support external use of frontend libraries
[ https://issues.apache.org/jira/browse/IMPALA-10522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-10522: --- Summary: Support external use of frontend libraries (was: Avoid creating Frontend object for external FE) > Support external use of frontend libraries > -- > > Key: IMPALA-10522 > URL: https://issues.apache.org/jira/browse/IMPALA-10522 > Project: IMPALA > Issue Type: Sub-task > Components: Backend, Frontend >Reporter: Aman Sinha >Assignee: Aman Sinha >Priority: Major > > For external frontend, the initialization of FeSupport indirectly > instantiates the Frontend object which brings in dependencies and side > effects (such as re-initialization of the BackendConfig structure). We should > avoid creating the Frontend object since it is not needed for external > frontend. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-10522) Support external use of frontend libraries
[ https://issues.apache.org/jira/browse/IMPALA-10522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-10522: --- Description: Existing Impala frontend library initialization is largely driven by and complements backend initialization. There is a second initialization path that is designed to support standalone test applications. These paths need to be reconciled so that the frontend libraries can also be used by an external frontend that has different initialization requirements. (was: For external frontend, the initialization of FeSupport indirectly instantiates the Frontend object which brings in dependencies and side effects (such as re-initialization of the BackendConfig structure). We should avoid creating the Frontend object since it is not needed for external frontend.) > Support external use of frontend libraries > -- > > Key: IMPALA-10522 > URL: https://issues.apache.org/jira/browse/IMPALA-10522 > Project: IMPALA > Issue Type: Sub-task > Components: Backend, Frontend > Reporter: Aman Sinha > Assignee: Aman Sinha > Priority: Major > > Existing Impala frontend library initialization is largely driven by and > complements backend initialization. There is a second initialization path that > is designed to support standalone test applications. These paths need to be > reconciled so that the frontend libraries can also be used by an external > frontend that has different initialization requirements.
[jira] [Commented] (IMPALA-10522) Support external use of frontend libraries
[ https://issues.apache.org/jira/browse/IMPALA-10522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290264#comment-17290264 ] Kurt Deschler commented on IMPALA-10522: http://gerrit.cloudera.org:8080/17115 > Support external use of frontend libraries > -- > > Key: IMPALA-10522 > URL: https://issues.apache.org/jira/browse/IMPALA-10522 > Project: IMPALA > Issue Type: Sub-task > Components: Backend, Frontend > Reporter: Aman Sinha > Assignee: Aman Sinha > Priority: Major > > Existing Impala frontend library initialization is largely driven by and > complements backend initialization. There is a second initialization path that > is designed to support standalone test applications. These paths need to be > reconciled so that the frontend libraries can also be used by an external > frontend that has different initialization requirements.
[jira] [Created] (IMPALA-10546) Add ImpalaServer interface to Retrieve BackendConfig from impalad
Kurt Deschler created IMPALA-10546: -- Summary: Add ImpalaServer interface to Retrieve BackendConfig from impalad Key: IMPALA-10546 URL: https://issues.apache.org/jira/browse/IMPALA-10546 Project: IMPALA Issue Type: New Feature Reporter: Kurt Deschler Assignee: Kurt Deschler The impala backend uses the TBackendGflags structure to pass flags and state information to the frontend during initialization. In order to support an external frontend, an interface is needed to populate and return TBackendGflags fields. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-10546) Add ImpalaServer interface to Retrieve BackendConfig from impalad
[ https://issues.apache.org/jira/browse/IMPALA-10546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-10546 started by Kurt Deschler. -- > Add ImpalaServer interface to Retrieve BackendConfig from impalad > - > > Key: IMPALA-10546 > URL: https://issues.apache.org/jira/browse/IMPALA-10546 > Project: IMPALA > Issue Type: New Feature >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > The impala backend uses the TBackendGflags structure to pass flags and state > information to the frontend during initialization. In order to support an > external frontend, an interface is needed to populate and return > TBackendGflags fields. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-10546) Add ImpalaServer interface to Retrieve BackendConfig from impalad
[ https://issues.apache.org/jira/browse/IMPALA-10546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290577#comment-17290577 ] Kurt Deschler commented on IMPALA-10546: http://gerrit.cloudera.org:8080/17116 > Add ImpalaServer interface to Retrieve BackendConfig from impalad > - > > Key: IMPALA-10546 > URL: https://issues.apache.org/jira/browse/IMPALA-10546 > Project: IMPALA > Issue Type: New Feature >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > The impala backend uses the TBackendGflags structure to pass flags and state > information to the frontend during initialization. In order to support an > external frontend, an interface is needed to populate and return > TBackendGflags fields. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-10549) Add interface to register transaction from external frontend
Kurt Deschler created IMPALA-10549: -- Summary: Add interface to register transaction from external frontend Key: IMPALA-10549 URL: https://issues.apache.org/jira/browse/IMPALA-10549 Project: IMPALA Issue Type: New Feature Reporter: Kurt Deschler Assignee: Kurt Deschler This change adds a frontend interface to register transactions that were started by an external frontend so that coordinator keepalive can track them properly. The Impala backend keepalive logic requires that transactions are registered with the coordinator in order to be tracked while they are executing. For external frontends to work correctly with transactional tables, an interface is needed to register externally created transactions with the coordinator. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-10549) Add interface to register transaction from external frontend
[ https://issues.apache.org/jira/browse/IMPALA-10549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-10549 started by Kurt Deschler. -- > Add interface to register transaction from external frontend > > > Key: IMPALA-10549 > URL: https://issues.apache.org/jira/browse/IMPALA-10549 > Project: IMPALA > Issue Type: New Feature >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > This change adds a frontend interface to register transactions that > were started by an external frontend so that coordinator keepalive can > track them properly. > The Impala backend keepalive logic requires that transactions are registered > with the coordinator in order to be tracked while they are executing. For > external frontends to work correctly with transactional tables, an interface > is needed to register externally created transactions with the coordinator. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-10549) Register transactions from external frontends
[ https://issues.apache.org/jira/browse/IMPALA-10549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-10549: --- Summary: Register transactions from external frontends (was: Add interface to register transaction from external frontend) > Register transactions from external frontends > - > > Key: IMPALA-10549 > URL: https://issues.apache.org/jira/browse/IMPALA-10549 > Project: IMPALA > Issue Type: New Feature >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > This change adds a frontend interface to register transactions that > were started by an external frontend so that coordinator keepalive can > track them properly. > The Impala backend keepalive logic requires that transactions are registered > with the coordinator in order to be tracked while they are executing. For > external frontends to work correctly with transactional tables, an interface > is needed to register externally created transactions with the coordinator. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-10549) Register transactions from external frontends
[ https://issues.apache.org/jira/browse/IMPALA-10549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-10549: --- Description: The Impala backend keepalive logic requires that transactions are registered with the coordinator in order to be tracked while they are executing. For external frontends to work correctly with transactional tables, any externally started transactions must register with the coordinator. (was: This change adds a frontend interface to register transactions that were started by an external frontend so that coordinator keepalive can track them properly. The Impala backend keepalive logic requires that transactions are registered with the coordinator in order to be tracked while they are executing. For external frontends to work correctly with transactional tables, an interface is needed to register externally created transactions with the coordinator.) > Register transactions from external frontends > - > > Key: IMPALA-10549 > URL: https://issues.apache.org/jira/browse/IMPALA-10549 > Project: IMPALA > Issue Type: New Feature >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > The Impala backend keepalive logic requires that transactions are registered > with the coordinator in order to be tracked while they are executing. For > external frontends to work correctly with transactional tables, any > externally started transactions must register with the coordinator. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-10549) Register transactions from external frontends
[ https://issues.apache.org/jira/browse/IMPALA-10549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-10549: --- Description: The Impala backend keepalive logic requires that readwrite transactions are registered with the coordinator in order to be tracked while they are executing. For external frontends to work correctly with transactional tables, any externally started transactions must register with the coordinator. (was: The Impala backend keepalive logic requires that transactions are registered with the coordinator in order to be tracked while they are executing. For external frontends to work correctly with transactional tables, any externally started transactions must register with the coordinator.) > Register transactions from external frontends > - > > Key: IMPALA-10549 > URL: https://issues.apache.org/jira/browse/IMPALA-10549 > Project: IMPALA > Issue Type: New Feature >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > The Impala backend keepalive logic requires that readwrite transactions are > registered with the coordinator in order to be tracked while they are > executing. For external frontends to work correctly with transactional > tables, any externally started transactions must register with the > coordinator. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-10549) Register transactions from external frontends
[ https://issues.apache.org/jira/browse/IMPALA-10549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291044#comment-17291044 ] Kurt Deschler commented on IMPALA-10549: http://gerrit.cloudera.org:8080/17122 > Register transactions from external frontends > - > > Key: IMPALA-10549 > URL: https://issues.apache.org/jira/browse/IMPALA-10549 > Project: IMPALA > Issue Type: New Feature >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > The Impala backend keepalive logic requires that readwrite transactions are > registered with the coordinator in order to be tracked while they are > executing. For external frontends to work correctly with transactional > tables, any externally started transactions must register with the > coordinator. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-10535) Add interface to ImpalaServer for execution of externally compiled statements
[ https://issues.apache.org/jira/browse/IMPALA-10535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-10535: --- Parent: IMPALA-10517 Issue Type: Sub-task (was: New Feature) > Add interface to ImpalaServer for execution of externally compiled statements > - > > Key: IMPALA-10535 > URL: https://issues.apache.org/jira/browse/IMPALA-10535 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > Add new interface ImpalaServer::ExecutePlannedStatement to allow an external > frontend to execute a compiled TExecRequest on the Impala backend. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-10546) Add ImpalaServer interface to Retrieve BackendConfig from impalad
[ https://issues.apache.org/jira/browse/IMPALA-10546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-10546: --- Parent: IMPALA-10517 Issue Type: Sub-task (was: New Feature) > Add ImpalaServer interface to Retrieve BackendConfig from impalad > - > > Key: IMPALA-10546 > URL: https://issues.apache.org/jira/browse/IMPALA-10546 > Project: IMPALA > Issue Type: Sub-task >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > The impala backend uses the TBackendGflags structure to pass flags and state > information to the frontend during initialization. In order to support an > external frontend, an interface is needed to populate and return > TBackendGflags fields. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-10549) Register transactions from external frontends
[ https://issues.apache.org/jira/browse/IMPALA-10549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Deschler updated IMPALA-10549: --- Parent: IMPALA-10517 Issue Type: Sub-task (was: New Feature) > Register transactions from external frontends > - > > Key: IMPALA-10549 > URL: https://issues.apache.org/jira/browse/IMPALA-10549 > Project: IMPALA > Issue Type: Sub-task >Reporter: Kurt Deschler >Assignee: Kurt Deschler >Priority: Major > > The Impala backend keepalive logic requires that readwrite transactions are > registered with the coordinator in order to be tracked while they are > executing. For external frontends to work correctly with transactional > tables, any externally started transactions must register with the > coordinator. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-10552) Support external frontends supplying timeline for profile
[ https://issues.apache.org/jira/browse/IMPALA-10552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298455#comment-17298455 ] Kurt Deschler commented on IMPALA-10552: https://gerrit.cloudera.org/#/c/17145/ > Support external frontends supplying timeline for profile > - > > Key: IMPALA-10552 > URL: https://issues.apache.org/jira/browse/IMPALA-10552 > Project: IMPALA > Issue Type: Sub-task >Reporter: John Sherman >Assignee: John Sherman >Priority: Major > > External frontends may want to populate their own timeline events so they > show up in the impala profiles. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-10551) Add result sink support
[ https://issues.apache.org/jira/browse/IMPALA-10551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298457#comment-17298457 ] Kurt Deschler commented on IMPALA-10551: https://gerrit.cloudera.org/#/c/17144/ > Add result sink support > --- > > Key: IMPALA-10551 > URL: https://issues.apache.org/jira/browse/IMPALA-10551 > Project: IMPALA > Issue Type: Sub-task >Reporter: John Sherman >Assignee: John Sherman >Priority: Major > > The intent of this feature is to allow external frontends to reuse the > Hdfs.*Sink code to control where query results are written to. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-10466) Handle deprecated TWO_LEVEL Parquet arrays more gracefully
[ https://issues.apache.org/jira/browse/IMPALA-10466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300360#comment-17300360 ] Kurt Deschler commented on IMPALA-10466: Per the example in IMPALA-4725, it is not safe to auto-detect the format, since there are cases that decode successfully as both 2-level and 3-level. We should, however, error out instead of returning NULLs (item b. in the description). That should probably be the default behavior, with an option that turns off the NULL check so that data can still be recovered if the formats get mixed. > Handle deprecated TWO_LEVEL Parquet arrays more gracefully > -- > > Key: IMPALA-10466 > URL: https://issues.apache.org/jira/browse/IMPALA-10466 > Project: IMPALA > Issue Type: Improvement > Reporter: Csaba Ringhofer > Priority: Minor > > The default of PARQUET_ARRAY_RESOLUTION was changed from > TWO_LEVEL_THEN_THREE_LEVEL to THREE_LEVEL in IMPALA-4725. This solved > incorrectly detecting some ambiguous cases, but now old TWO_LEVEL Parquet > lists are not read correctly by default, replacing values with NULL without > any error message. > I would prefer a solution that: > a. detects the correct resolution when possible > b. returns a clear warning/error when resolution is not possible or > ambiguous, and points the user toward the query option that needs to be set > manually
[jira] [Commented] (IMPALA-10518) Add server interface to retrieve executor membership
[ https://issues.apache.org/jira/browse/IMPALA-10518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302977#comment-17302977 ] Kurt Deschler commented on IMPALA-10518: https://gerrit.cloudera.org/#/c/17181/ > Add server interface to retrieve executor membership > > > Key: IMPALA-10518 > URL: https://issues.apache.org/jira/browse/IMPALA-10518 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Reporter: Aman Sinha >Assignee: Aman Sinha >Priority: Major > > Add support to ImpalaServer and associated classes to retrieve executor > membership for use by external FE. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-10579) Deadloop in table metadata loading when using an invalid RemoteIterator
[ https://issues.apache.org/jira/browse/IMPALA-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303688#comment-17303688 ]

Kurt Deschler commented on IMPALA-10579:

Created a patch to avoid looping indefinitely
http://gerrit.cloudera.org:8080/17192

> Deadloop in table metadata loading when using an invalid RemoteIterator
>
> Key: IMPALA-10579
> URL: https://issues.apache.org/jira/browse/IMPALA-10579
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 3.4.0
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
>
> The file listing thread in catalogd will go into a dead loop if it gets a
> RemoteIterator on a non-existing path. The first call of
> RemoteIterator.hasNext() will throw a FileNotFoundException. However, this
> exception is caught and the loop continues, which results in a dead
> loop. Related code:
> [https://github.com/apache/impala/blob/d89c04bf806682d3449c566ce979632bd2ac5b29/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java#L789-L814]
> {code:java}
> static class FilterIterator implements RemoteIterator {
>   ...
>   public boolean hasNext() throws IOException {
>     ...
>     while (curFile_ == null) {
>       FileStatus next;
>       try {
>         if (!baseIterator_.hasNext()) return false; // <- throws FileNotFoundException
>         ...
>         next = baseIterator_.next();
>       } catch (FileNotFoundException ex) {
>         ...
>         LOG.warn(ex.getMessage());
>         continue; // <- catch the exception and continue into a dead loop
>       }
>       if (!isInIgnoredDirectory(startPath_, next)) {
>         curFile_ = next;
>         return true;
>       }
>     }
>     return true;
>   }
> {code}
> *When will the path being loaded not exist?*
> It happens when the metadata (table/partition location) in HMS still has the
> path, but the path has actually been removed from the storage.
> *When will Impala get such an invalid RemoteIterator?*
> For FileSystem implementations that don't override the
> FileSystem#listStatusIterator() interface, e.g. S3AFileSystem before
> HADOOP-17281, AzureBlobFileSystem, and GoogleHadoopFileSystem.
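The loop in the quoted FilterIterator excerpt never makes progress when hasNext() on the base iterator itself throws, because retrying the same call throws again. The following is a minimal sketch of one way to break that cycle. It is not the patch referenced in the comment and not Impala's code. SimpleRemoteIterator is a stand-in for Hadoop's RemoteIterator so the example compiles without Hadoop, and SafeFilterIterator, FilterIteratorSketch, and the toy "hidden file" filter are hypothetical names introduced only for illustration.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Stand-in for org.apache.hadoop.fs.RemoteIterator (assumption: same two methods).
interface SimpleRemoteIterator<T> {
  boolean hasNext() throws IOException;
  T next() throws IOException;
}

// Hypothetical wrapper: treats a failing hasNext() as the end of the listing.
class SafeFilterIterator implements SimpleRemoteIterator<String> {
  private final SimpleRemoteIterator<String> base;
  private String cur;

  SafeFilterIterator(SimpleRemoteIterator<String> base) { this.base = base; }

  @Override
  public boolean hasNext() throws IOException {
    while (cur == null) {
      boolean more;
      try {
        more = base.hasNext();
      } catch (FileNotFoundException e) {
        // The base iterator cannot advance (e.g. the listed path is gone).
        // Calling hasNext() again would throw again, so stop iterating.
        return false;
      }
      if (!more) return false;
      String next;
      try {
        next = base.next();
      } catch (FileNotFoundException e) {
        // Assumption: a failed next() still moves the base iterator forward,
        // so skipping this entry makes progress.
        continue;
      }
      if (!next.startsWith(".")) cur = next; // toy filter: skip hidden names
    }
    return true;
  }

  @Override
  public String next() throws IOException {
    if (!hasNext()) throw new NoSuchElementException();
    String result = cur;
    cur = null;
    return result;
  }
}

class FilterIteratorSketch {
  // Simulates an iterator created on a non-existing path: every call throws.
  static SimpleRemoteIterator<String> missingPath() {
    return new SimpleRemoteIterator<String>() {
      public boolean hasNext() throws IOException { throw new FileNotFoundException("no such path"); }
      public String next() throws IOException { throw new FileNotFoundException("no such path"); }
    };
  }

  static SimpleRemoteIterator<String> fromList(List<String> names) {
    final Iterator<String> it = names.iterator();
    return new SimpleRemoteIterator<String>() {
      public boolean hasNext() { return it.hasNext(); }
      public String next() { return it.next(); }
    };
  }

  // Collects all remaining entries; wraps IOException so callers stay simple.
  static List<String> drain(SimpleRemoteIterator<String> it) {
    List<String> out = new ArrayList<>();
    try {
      while (it.hasNext()) out.add(it.next());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out;
  }

  public static void main(String[] args) {
    // Terminates with an empty listing instead of spinning forever.
    System.out.println(drain(new SafeFilterIterator(missingPath())));
    List<String> names = new ArrayList<>();
    names.add("a.parq");
    names.add(".tmp");
    names.add("b.parq");
    System.out.println(drain(new SafeFilterIterator(fromList(names))));
  }
}
```

Separating the two try blocks keeps the original intent of skipping individual entries that fail in next() while ensuring that a failure in hasNext() can never be retried indefinitely. Whether returning false or rethrowing is the right reaction depends on how the caller should treat a missing path.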
[jira] [Issue Comment Deleted] (IMPALA-10552) Support external frontends supplying timeline for profile
[ https://issues.apache.org/jira/browse/IMPALA-10552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kurt Deschler updated IMPALA-10552:

Comment: was deleted

(was: https://gerrit.cloudera.org/#/c/17145/)

> Support external frontends supplying timeline for profile
>
> Key: IMPALA-10552
> URL: https://issues.apache.org/jira/browse/IMPALA-10552
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: John Sherman
> Assignee: John Sherman
> Priority: Major
>
> External frontends may want to populate their own timeline events so they
> show up in the impala profiles.
[jira] [Work started] (IMPALA-10535) Add interface to ImpalaServer for execution of externally compiled statements
[ https://issues.apache.org/jira/browse/IMPALA-10535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on IMPALA-10535 started by Kurt Deschler.

> Add interface to ImpalaServer for execution of externally compiled statements
>
> Key: IMPALA-10535
> URL: https://issues.apache.org/jira/browse/IMPALA-10535
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Reporter: Kurt Deschler
> Assignee: Kurt Deschler
> Priority: Major
>
> Add new interface ImpalaServer::ExecutePlannedStatement to allow an external
> frontend to execute a compiled TExecRequest on the Impala backend.
[jira] [Resolved] (IMPALA-10535) Add interface to ImpalaServer for execution of externally compiled statements
[ https://issues.apache.org/jira/browse/IMPALA-10535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kurt Deschler resolved IMPALA-10535.

Fix Version/s: Impala 4.0
Resolution: Fixed

> Add interface to ImpalaServer for execution of externally compiled statements
>
> Key: IMPALA-10535
> URL: https://issues.apache.org/jira/browse/IMPALA-10535
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Reporter: Kurt Deschler
> Assignee: Kurt Deschler
> Priority: Major
> Fix For: Impala 4.0
>
> Add new interface ImpalaServer::ExecutePlannedStatement to allow an external
> frontend to execute a compiled TExecRequest on the Impala backend.
[jira] [Resolved] (IMPALA-10504) Add tracing for remote block reads
[ https://issues.apache.org/jira/browse/IMPALA-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kurt Deschler resolved IMPALA-10504.

Fix Version/s: Impala 4.0
Resolution: Fixed

> Add tracing for remote block reads
>
> Key: IMPALA-10504
> URL: https://issues.apache.org/jira/browse/IMPALA-10504
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Kurt Deschler
> Assignee: Kurt Deschler
> Priority: Major
> Fix For: Impala 4.0
>
> While chasing performance issues, there were a large number of remote block
> read messages in the logs. Need tracing to track down the source of these.
> {noformat}
> Errors: Read 3.07 GB of data across network that was expected to be local.
> Block locality metadata for table 'tpcds_600_parquet.store_sales' may be stale.
> This only affects query performance and not result correctness.
> One of the common causes for this warning is HDFS rebalancer moving some of the file's blocks.
> If the issue persists, consider running "INVALIDATE METADATA `tpcds_600_parquet`.`store_sales`"
> {noformat}
[jira] [Resolved] (IMPALA-10546) Add ImpalaServer interface to Retrieve BackendConfig from impalad
[ https://issues.apache.org/jira/browse/IMPALA-10546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kurt Deschler resolved IMPALA-10546.

Fix Version/s: Impala 4.0
Resolution: Fixed

> Add ImpalaServer interface to Retrieve BackendConfig from impalad
>
> Key: IMPALA-10546
> URL: https://issues.apache.org/jira/browse/IMPALA-10546
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Kurt Deschler
> Assignee: Kurt Deschler
> Priority: Major
> Fix For: Impala 4.0
>
> The impala backend uses the TBackendGflags structure to pass flags and state
> information to the frontend during initialization. In order to support an
> external frontend, an interface is needed to populate and return
> TBackendGflags fields.
[jira] [Resolved] (IMPALA-10549) Register transactions from external frontends
[ https://issues.apache.org/jira/browse/IMPALA-10549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kurt Deschler resolved IMPALA-10549.

Fix Version/s: Impala 4.0
Resolution: Fixed

> Register transactions from external frontends
>
> Key: IMPALA-10549
> URL: https://issues.apache.org/jira/browse/IMPALA-10549
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Kurt Deschler
> Assignee: Kurt Deschler
> Priority: Major
> Fix For: Impala 4.0
>
> The Impala backend keepalive logic requires that readwrite transactions are
> registered with the coordinator in order to be tracked while they are
> executing. For external frontends to work correctly with transactional
> tables, any externally started transactions must register with the
> coordinator.
[jira] [Commented] (IMPALA-10645) Expose metrics for catalogd's HMS endpoint
[ https://issues.apache.org/jira/browse/IMPALA-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17319723#comment-17319723 ]

Kurt Deschler commented on IMPALA-10645:

https://gerrit.cloudera.org/#/c/17284/

> Expose metrics for catalogd's HMS endpoint
>
> Key: IMPALA-10645
> URL: https://issues.apache.org/jira/browse/IMPALA-10645
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Vihang Karajgaonkar
> Priority: Major
>
> Catalogd's HMS endpoint should expose metrics to help with supportability and
> to identify performance issues.
[jira] [Created] (IMPALA-10655) Add ImpalaServer interface to Initialize TQueryCtx for external frontends
Kurt Deschler created IMPALA-10655:

Summary: Add ImpalaServer interface to Initialize TQueryCtx for external frontends
Key: IMPALA-10655
URL: https://issues.apache.org/jira/browse/IMPALA-10655
Project: IMPALA
Issue Type: Improvement
Components: Backend
Reporter: Kurt Deschler
Assignee: Kurt Deschler

During query initialization, the backend populates the TQueryCtx thrift
structure based on parameter settings, startup flags and other metadata. A new
interface needs to be added so that external frontends can share this logic by
retrieving an initialized TQueryCtx structure.
[jira] [Updated] (IMPALA-10655) Add ImpalaServer interface to Initialize TQueryCtx for external frontends
[ https://issues.apache.org/jira/browse/IMPALA-10655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kurt Deschler updated IMPALA-10655:

Parent: IMPALA-10517
Issue Type: Sub-task (was: Improvement)

> Add ImpalaServer interface to Initialize TQueryCtx for external frontends
>
> Key: IMPALA-10655
> URL: https://issues.apache.org/jira/browse/IMPALA-10655
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Reporter: Kurt Deschler
> Assignee: Kurt Deschler
> Priority: Major
>
> During query initialization, the backend populates the TQueryCtx thrift
> structure based on parameter settings, startup flags and other metadata. A
> new interface needs to be added so that external frontends can share this
> logic by retrieving an initialized TQueryCtx structure.
[jira] [Commented] (IMPALA-10655) Add ImpalaServer interface to Initialize TQueryCtx for external frontends
[ https://issues.apache.org/jira/browse/IMPALA-10655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17320172#comment-17320172 ]

Kurt Deschler commented on IMPALA-10655:

http://gerrit.cloudera.org:8080/17312

> Add ImpalaServer interface to Initialize TQueryCtx for external frontends
>
> Key: IMPALA-10655
> URL: https://issues.apache.org/jira/browse/IMPALA-10655
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Reporter: Kurt Deschler
> Assignee: Kurt Deschler
> Priority: Major
>
> During query initialization, the backend populates the TQueryCtx thrift
> structure based on parameter settings, startup flags and other metadata. A
> new interface needs to be added so that external frontends can share this
> logic by retrieving an initialized TQueryCtx structure.
[jira] [Created] (IMPALA-10672) Fix non-deterministic RANK queries in tests
Kurt Deschler created IMPALA-10672:

Summary: Fix non-deterministic RANK queries in tests
Key: IMPALA-10672
URL: https://issues.apache.org/jira/browse/IMPALA-10672
Project: IMPALA
Issue Type: Bug
Components: Infrastructure
Reporter: Kurt Deschler
Assignee: Kurt Deschler

The following queries were producing sporadic failures with an external
frontend. On inspection, the rank ordering is not strong enough to be
deterministic.

functional-query/queries/limit-pushdown-analytic.test

select tinyint_col, string_col, id, rnk from (
  select *, rank() over (partition by tinyint_col *order by string_col*) rnk
  from alltypestiny) v
where rnk <= 5
order by tinyint_col, string_col desc, id desc
limit 10

select tinyint_col, string_col, id, rnk from (
  select *, rank() over (partition by tinyint_col *order by string_col*) rnk
  from alltypestiny) v
where rnk <= 5
order by tinyint_col, string_col desc, id desc
limit 5

Ordering needs to include id

tpch/queries/limit-pushdown-analytic.test

select l_orderkey, l_partkey, l_suppkey, l_linenumber, l_shipmode, rnk from (
  select *, rank() over (partition by l_partkey *order by l_shipmode*) rnk
  from lineitem) v
where rnk <= 50
order by l_partkey, l_orderkey, l_suppkey, l_linenumber, l_shipmode
limit 50

Ordering needs to include l_orderkey and l_linenumber
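The ticket does not describe the exact mechanism behind the sporadic failures, so the following is only a generic illustration of the remedy it names: adding a unique column to an ordering so that ties cannot be broken differently from run to run. It is plain Java, not Impala code. The Row class, the sample values, and the RankTieSketch name are hypothetical. The sketch sorts the same rows in two different arrival orders and takes the first two, once with a key that leaves ties and once with id added as a tiebreaker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class RankTieSketch {
  // Hypothetical row holding only the columns relevant to the tie.
  static final class Row {
    final String stringCol;
    final int id;
    Row(String stringCol, int id) { this.stringCol = stringCol; this.id = id; }
  }

  // Orders by string_col only: rows with equal string_col are ties.
  static final Comparator<Row> WEAK = Comparator.comparing((Row r) -> r.stringCol);
  // Adds the unique id as a tiebreaker, which makes the order total.
  static final Comparator<Row> STRONG = WEAK.thenComparingInt(r -> r.id);

  // Sorts a copy of the input and returns the ids of the first `limit` rows.
  static List<Integer> topIds(List<Row> input, Comparator<Row> order, int limit) {
    List<Row> sorted = new ArrayList<>(input);
    sorted.sort(order); // List.sort is stable: tied rows keep their arrival order
    List<Integer> ids = new ArrayList<>();
    for (Row r : sorted.subList(0, Math.min(limit, sorted.size()))) ids.add(r.id);
    return ids;
  }

  static List<Row> sampleRows() {
    return Arrays.asList(new Row("0", 0), new Row("0", 2), new Row("0", 4), new Row("1", 1));
  }

  // Same rows in the opposite arrival order, standing in for a different plan or scan order.
  static List<Row> reversed(List<Row> rows) {
    List<Row> copy = new ArrayList<>(rows);
    Collections.reverse(copy);
    return copy;
  }

  public static void main(String[] args) {
    List<Row> a = sampleRows();
    List<Row> b = reversed(a);
    // Weak key: the result depends on arrival order.
    System.out.println(topIds(a, WEAK, 2) + " vs " + topIds(b, WEAK, 2));     // [0, 2] vs [4, 2]
    // Strong key: the result is the same for both arrival orders.
    System.out.println(topIds(a, STRONG, 2) + " vs " + topIds(b, STRONG, 2)); // [0, 2] vs [0, 2]
  }
}
```

With the weak key, which of the three tied rows land in the first two positions depends entirely on the order in which rows arrive. With id included, every pair of rows compares unequal, so the result is fixed regardless of input order.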