[jira] [Resolved] (IMPALA-10469) Support pushing quickstart images to Apache repo
[ https://issues.apache.org/jira/browse/IMPALA-10469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-10469.
    Fix Version/s: Impala 4.0
       Resolution: Fixed

> Support pushing quickstart images to Apache repo
>
> Key: IMPALA-10469
> URL: https://issues.apache.org/jira/browse/IMPALA-10469
> Project: IMPALA
> Issue Type: Sub-task
> Components: Infrastructure
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
> Fix For: Impala 4.0
>
> We need a naming scheme and maybe a script to do the push. We've so far
> assumed a different repository for each image, but in the Apache docker, we
> only have a single repository and need to encode the image type and version
> into the tag
> See https://hub.docker.com/repository/docker/apache/kudu for an example.
> They have:
> apache/kudu:
> apache/kudu:kudu-python-
> apache/kudu:impala-latest
> Airflow does the opposite, and this might be easier to use with
> IMPALA_QUICKSTART_IMAGE_PREFIX:
> https://hub.docker.com/repository/registry-1.docker.io/apache/airflow/tags?page=1=last_updated

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-10469) Support pushing quickstart images to Apache repo
[ https://issues.apache.org/jira/browse/IMPALA-10469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282270#comment-17282270 ]

ASF subversion and git services commented on IMPALA-10469:

Commit 79bee3befbc6cdcd358373822a0a3b4d19ab5ce0 in impala's branch refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=79bee3b ]

IMPALA-10469: push quickstart to apache repo

This adds a script, docker/publish_images_to_apache.sh,
that allows uploading images to the apache/impala docker hub
repo, prefixed with a version string. E.g. with the following
commands:

  ninja docker_images quickstart_docker_images
  ./docker/publish_images_to_apache.sh -v 81d5377c2

The uploaded images can then be used for the quickstart cluster,
as documented in docker/README.

Updated docs for quickstart to use a prefix from apache/impala.

Remove IMPALA_QUICKSTART_VERSION, which doesn't interact well with
the tagging since the image name and version are now encoded in the
tag.

Fix an incorrect image name added to docker-images.txt:
impala_profile_tool_image.

Testing:
Ran Impala quickstart with data loading using instructions in README.

  export IMPALA_QUICKSTART_IMAGE_PREFIX="apache/impala:81d5377c2-"
  docker network create -d bridge quickstart-network
  export QUICKSTART_IP=$(docker network inspect quickstart-network -f '{{(index .IPAM.Config 0).Gateway}}')
  export QUICKSTART_LISTEN_ADDR=$QUICKSTART_IP
  docker-compose -f docker/quickstart.yml \
    -f docker/quickstart-kudu-minimal.yml \
    -f docker/quickstart-load-data.yml up -d
  docker run --network=quickstart-network -it \
    ${IMPALA_QUICKSTART_IMAGE_PREFIX}impala_quickstart_client impala-shell

Change-Id: I535d77e565b73d732ae511d7525193467086c76a
Reviewed-on: http://gerrit.cloudera.org:8080/17030
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> Support pushing quickstart images to Apache repo
>
> Key: IMPALA-10469
> URL: https://issues.apache.org/jira/browse/IMPALA-10469
> Project: IMPALA
> Issue Type: Sub-task
> Components: Infrastructure
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
> Fix For: Impala 4.0
>
> We need a naming scheme and maybe a script to do the push. We've so far
> assumed a different repository for each image, but in the Apache docker, we
> only have a single repository and need to encode the image type and version
> into the tag
> See https://hub.docker.com/repository/docker/apache/kudu for an example.
> They have:
> apache/kudu:
> apache/kudu:kudu-python-
> apache/kudu:impala-latest
> Airflow does the opposite, and this might be easier to use with
> IMPALA_QUICKSTART_IMAGE_PREFIX:
> https://hub.docker.com/repository/registry-1.docker.io/apache/airflow/tags?page=1=last_updated
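The commit message above encodes both the version and the image name in the tag of a single repository. As a rough illustration only, the Python sketch below shows how such a reference could be composed. The helper name quickstart_image_ref is hypothetical and is not part of the Impala repo or of publish_images_to_apache.sh; only the prefix format and the impala_quickstart_client image name are taken from the message.

```python
def quickstart_image_ref(image_name, version, repo="apache/impala"):
    """Return '<repo>:<version>-<image_name>'.

    Hypothetical helper for illustration; it mirrors the
    IMPALA_QUICKSTART_IMAGE_PREFIX convention shown in the commit message.
    """
    if not image_name or not version:
        raise ValueError("image_name and version must be non-empty")
    # This is the value IMPALA_QUICKSTART_IMAGE_PREFIX holds in the example.
    prefix = "{}:{}-".format(repo, version)
    return prefix + image_name


print(quickstart_image_ref("impala_quickstart_client", "81d5377c2"))
# prints: apache/impala:81d5377c2-impala_quickstart_client
```

Because the image name is a suffix of the tag, one prefix variable is enough to address every quickstart image of a given version.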
[jira] [Commented] (IMPALA-10397) TestAutoScaling.test_single_workload failed in exhaustive release build
[ https://issues.apache.org/jira/browse/IMPALA-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282269#comment-17282269 ]

ASF subversion and git services commented on IMPALA-10397:

Commit f888d362951454b273114c98686193498c0d3fe0 in impala's branch refs/heads/master from Bikramjeet Vig
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=f888d36 ]

IMPALA-10397 : Reduce flakiness in test_single_workload

This test failed recently due to a timeout waiting for executors to
come up. The logs showed that the executors came up on time but this
was not recognized by the coordinator. This patch attempts to reduce
flakiness by increasing the timeout and adding more logging in case
this happens in the future.

Testing:
Ran in a loop on my local machine for a few hours.

Change-Id: I73ea5eb663db6d03832b19ed323670590946f514
Reviewed-on: http://gerrit.cloudera.org:8080/17028
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> TestAutoScaling.test_single_workload failed in exhaustive release build
>
> Key: IMPALA-10397
> URL: https://issues.apache.org/jira/browse/IMPALA-10397
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Reporter: Zoltán Borók-Nagy
> Assignee: Bikramjeet Vig
> Priority: Major
> Labels: broken-build
>
> TestAutoScaling.test_single_workload failed in an exhaustive release build.
> *Error details*
> AssertionError: Number of backends did not reach 5 within 45 s assert
> any( at 0x7f772c155e10>)
> *Stack trace*
> {noformat}
> custom_cluster/test_auto_scaling.py:95: in test_single_workload
> assert any(self._get_num_backends() >= cluster_size or sleep(1)
> E AssertionError: Number of backends did not reach 5 within 45 s
> E assert any( at 0x7f772c155e10>){noformat}
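The patch described above increases the timeout and adds logging around the wait for backends. The sketch below is not the code from that commit; it is a generic poll-with-timeout loop, assuming a caller-supplied get_num_backends callable, to show what "wait longer and log each attempt" can look like. The names wait_for_backends, interval_s, sleep and clock are all hypothetical.

```python
import logging
import time

LOG = logging.getLogger("autoscaling_wait_sketch")


def wait_for_backends(get_num_backends, expected, timeout_s,
                      interval_s=1.0, sleep=time.sleep, clock=time.monotonic):
    """Poll get_num_backends() until it reaches `expected` or time runs out.

    Illustrative sketch only: logs every unsuccessful poll so that a
    timeout leaves a trace of what the caller actually observed.
    """
    deadline = clock() + timeout_s
    while True:
        seen = get_num_backends()
        if seen >= expected:
            return seen
        if clock() >= deadline:
            raise AssertionError(
                "Number of backends did not reach %d within %s s (last seen: %d)"
                % (expected, timeout_s, seen))
        LOG.info("Waiting for backends: have %d, need %d", seen, expected)
        sleep(interval_s)


# Usage with a fake backend counter that grows on each poll.
fake_counts = iter([1, 3, 5])
print(wait_for_backends(lambda: next(fake_counts), 5, timeout_s=10,
                        sleep=lambda _: None))
```

Including the last observed value in the failure message is the kind of extra detail that helps tell "executors never started" apart from "executors started but were not registered in time".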
[jira] [Commented] (IMPALA-10497) test_no_fd_caching_on_cached_data failing
[ https://issues.apache.org/jira/browse/IMPALA-10497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282208#comment-17282208 ]

Riza Suminto commented on IMPALA-10497:

I'm able to reproduce this test failure after several retries. I'm still not sure what the root cause of this flakiness is, though. I'll try increasing the sleep time to see if the flakiness goes away.

> test_no_fd_caching_on_cached_data failing
>
> Key: IMPALA-10497
> URL: https://issues.apache.org/jira/browse/IMPALA-10497
> Project: IMPALA
> Issue Type: Bug
> Reporter: Bikramjeet Vig
> Assignee: Riza Suminto
> Priority: Major
> Labels: broken-build
>
> {noformat}
> Error Message
> assert 1 == 0 + where 1 = >() +
> where > =
> 0x7f22dfe5aa10>.cached_handles
> Stacktrace
> custom_cluster/test_hdfs_fd_caching.py:202: in
> test_no_fd_caching_on_cached_data
> assert self.cached_handles() == 0
> E assert 1 == 0
> E+ where 1 = >()
> E+where > =
> 0x7f22dfe5aa10>.cached_handles
> Standard Error
> -- 2021-02-08 06:40:41,413 INFO MainThread: Starting cluster with
> command:
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/bin/start-impala-cluster.py
> '--state_store_args=--statestore_update_frequency_ms=50
> --statestore_priority_update_frequency_ms=50
> --statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3
> --log_dir=/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests
> --log_level=1 '--impalad_args=--max_cached_file_handles=16
> --unused_file_handle_timeout_sec=5 --data_cache=/tmp:500MB
> --always_use_data_cache=true ' '--state_store_args=None '
> '--catalogd_args=--load_catalog_in_background=false '
> --impalad_args=--default_query_options=
> 06:40:42 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
> 06:40:42 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/statestored.INFO
> 06:40:42 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
> 06:40:42 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/impalad.INFO
> 06:40:42 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
> 06:40:42 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
> 06:40:45 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 06:40:45 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 06:40:45 MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25000
> 06:40:45 MainThread: Debug webpage not yet available: ('Connection aborted.', error(111, 'Connection refused'))
> 06:40:47 MainThread: Debug webpage did not become available in expected time.
> 06:40:47 MainThread: Waiting for num_known_live_backends=3. Current value: None
> 06:40:48 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 06:40:48 MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25000
> 06:40:48 MainThread: Waiting for num_known_live_backends=3. Current value: 0
> 06:40:49 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 06:40:49 MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25000
> 06:40:49 MainThread: num_known_live_backends has reached value: 3
> 06:40:49 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 06:40:49 MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25001
> 06:40:49 MainThread: num_known_live_backends has reached value: 3
> 06:40:50 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 06:40:50 MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25002
> 06:40:50 MainThread: num_known_live_backends has reached value: 3
> 06:40:50 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3 executors).
> -- 2021-02-08 06:40:51,049 DEBUG MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> -- 2021-02-08 06:40:51,049 INFO MainThread: Getting metric: statestore.live-backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25010
> -- 2021-02-08 06:40:51,050 INFO MainThread: Starting
[jira] [Updated] (IMPALA-9955) Internal error for a query with large rows and spilling
[ https://issues.apache.org/jira/browse/IMPALA-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Quanlong Huang updated IMPALA-9955: --- Component/s: Backend > Internal error for a query with large rows and spilling > --- > > Key: IMPALA-9955 > URL: https://issues.apache.org/jira/browse/IMPALA-9955 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0, Impala 3.3.0, Impala 3.4.0 >Reporter: Quanlong Huang >Assignee: Quanlong Huang >Priority: Major > Fix For: Impala 4.0 > > Attachments: impalad.INFO, impalad_node1.INFO, impalad_node2.INFO > > > Encounter a query failure due to internal error: > {code:java} > create table bigstrs stored as parquet as select *, repeat(uuid(), > cast(random() * 10 as int)) as bigstr from functional.alltypes; > set MAX_ROW_SIZE=3.5MB; > set MEM_LIMIT=4GB; > set DISABLE_CODEGEN=true; > create table my_cnt stored as parquet as select count(*) as cnt, bigstr from > bigstrs group by bigstr; > {code} > The error is > {code:java} > ERROR: Internal error: couldn't pin large page of 4194304 bytes, client only > had 2097152 bytes of unused reservation: > 0xcf9dae0 internal state: { > 0xbdf6ac0 name: GroupingAggregator id=3 ptr=0xcf9d900 write_status: buffers > allocated 2097152 num_pages: 2094 pinned_bytes: 41943040 > dirty_unpinned_bytes: 0 in_flight_write_bytes: 0 reservation: > {: reservation_limit 9223372036854775807 reservation > 46137344 used_reservation 44040192 child_reservations 0 parent: > : reservation_limit 9223372036854775807 reservation > 46137344 used_reservation 0 child_reservations 46137344 parent: > : reservation_limit 9223372036854775807 reservation > 46137344 used_reservation 0 child_reservations 46137344 parent: > : reservation_limit 3435973836 reservation 46137344 > used_reservation 0 child_reservations 46137344 parent: > : reservation_limit 6647046144 reservation 46137344 > used_reservation 0 child_reservations 46137344 parent: > NULL} > 12 pinned pages: 0xc9160a0 
len: 2097152 pin_count: 1 > buf: 0xc916118 client: 0xcf9dae0/0xbdf6ac0 data: > 0x1320 len: 2097152 > 0xc919d40 len: 4194304 pin_count: 1 buf: > 0xc919db8 client: 0xcf9dae0/0xbdf6ac0 data: > 0x12460 len: 4194304 > 0xd42aaa0 len: 4194304 pin_count: 1 buf: > 0xd42ab18 client: 0xcf9dae0/0xbdf6ac0 data: > 0x12b20 len: 4194304 > 0xd42b900 len: 4194304 pin_count: 1 buf: > 0xd42b978 client: 0xcf9dae0/0xbdf6ac0 data: > 0x132a0 len: 4194304 > 0xd42d3e0 len: 2097152 pin_count: 1 buf: > 0xd42d458 client: 0xcf9dae0/0xbdf6ac0 data: > 0xc6a0 len: 2097152 > 0xd42dd40 len: 4194304 pin_count: 1 buf: > 0xd42ddb8 client: 0xcf9dae0/0xbdf6ac0 data: > 0x132e0 len: 4194304 > 0xd42de80 len: 4194304 pin_count: 1 buf: > 0xd42def8 client: 0xcf9dae0/0xbdf6ac0 data: > 0x137c0 len: 4194304 > 0x12d48320 len: 4194304 pin_count: 1 buf: > 0x12d48398 client: 0xcf9dae0/0xbdf6ac0 data: > 0x102c0 len: 4194304 > 0x12d483c0 len: 4194304 pin_count: 1 buf: > 0x12d48438 client: 0xcf9dae0/0xbdf6ac0 data: > 0x108a0 len: 4194304 > 0x12d48780 len: 4194304 pin_count: 1 buf: > 0x12d487f8 client: 0xcf9dae0/0xbdf6ac0 data: > 0x108e0 len: 4194304 > 0x12d492c0 len: 2097152 pin_count: 1 buf: > 0x12d49338 client: 0xcf9dae0/0xbdf6ac0 data: > 0x12760 len: 2097152 > 0x12d4a9e0 len: 2097152 pin_count: 1 buf: > 0x12d4aa58 client: 0xcf9dae0/0xbdf6ac0 data: > 0x12d20 len: 2097152 > 0 dirty unpinned pages: > 0 in flight write pages: } > {code} > Found the stacktrace from the log: > {code} > @ 0x1c9dfbe impala::Status::Status() > @ 0x1ca5a78 impala::Status::Status() > @ 0x2bfe4ec impala::BufferedTupleStream::NextReadPage() > @ 0x2c04b72 impala::BufferedTupleStream::GetNextInternal<>() > @ 0x2c029e6 impala::BufferedTupleStream::GetNextInternal<>() > @ 0x2bffd19 impala::BufferedTupleStream::GetNext() > @ 0x28aa43f impala::GroupingAggregator::ProcessStream<>() > @ 0x28a2e17 impala::GroupingAggregator::BuildSpilledPartition() > @ 0x28a2401 impala::GroupingAggregator::NextPartition() > @ 0x289df5a 
impala::GroupingAggregator::GetRowsFromPartition() > @ 0x289db20 impala::GroupingAggregator::GetNext() > @ 0x28dbfc7 impala::AggregationNode::GetNext() > @ 0x2259dfc impala::FragmentInstanceState::ExecInternal() > @ 0x22564a0 impala::FragmentInstanceState::Exec() > @ 0x22801ed impala::QueryState::ExecFInstance() > @ 0x227e5ef > _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv > @ 0x2281d8e >
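The numbers in the error dump quoted above can be cross-checked with a little arithmetic. The Python sketch below only re-adds figures copied from that dump; it does not reproduce Impala's BufferPool accounting. The statement that the 4194304-byte page is MAX_ROW_SIZE=3.5MB rounded up to a power of two is an assumption made here, not something the report states.

```python
MIB = 1024 * 1024

# Values copied from the error dump in the report.
reservation = 46137344
used_reservation = 44040192
buffers_allocated = 2097152
reported_pinned_bytes = 41943040
needed_page = 4194304

# Lengths of the 12 pinned pages, in the order the dump lists them.
pinned_page_lens = ([2 * MIB] + [4 * MIB] * 3 + [2 * MIB]
                    + [4 * MIB] * 5 + [2 * MIB] * 2)

pinned_bytes = sum(pinned_page_lens)
unused_reservation = reservation - used_reservation
shortfall = needed_page - unused_reservation

# Assumption (not from the report): the page size is MAX_ROW_SIZE
# rounded up to the next power of two.
max_row_size = int(3.5 * MIB)
rounded_page = 1 << (max_row_size - 1).bit_length()

print(len(pinned_page_lens), pinned_bytes == reported_pinned_bytes)
print(pinned_bytes + buffers_allocated == used_reservation)
print(unused_reservation, shortfall, rounded_page)
```

Under these figures the client has exactly 2097152 bytes of unused reservation, which is 2097152 bytes short of the 4194304-byte page it tries to pin, matching the wording of the error message.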
[jira] [Updated] (IMPALA-9957) Impalad crashes when serializing large rows in aggregation spilling
[ https://issues.apache.org/jira/browse/IMPALA-9957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Quanlong Huang updated IMPALA-9957:
    Component/s: Backend

> Impalad crashes when serializing large rows in aggregation spilling
>
> Key: IMPALA-9957
> URL: https://issues.apache.org/jira/browse/IMPALA-9957
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 3.2.0, Impala 3.3.0, Impala 3.4.0
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
> Fix For: Impala 4.0
>
> Queries to reproduce the crash using the testdata:
> {code:sql}
> create table bigstrs stored as parquet as
> select *, repeat(uuid(), cast(random() * 10 as int)) as bigstr
> from functional.alltypes;
> set MAX_ROW_SIZE=3.5MB;
> set MEM_LIMIT=4GB;
> create table my_str_group stored as parquet as
> select group_concat(string_col) as ss, bigstr
> from bigstrs group by bigstr;
> {code}
> The last query 1) has large rows, 2) needs spilling in aggregation, and 3)
> aggregates with functions that need serialization (e.g. group_concat,
> appx_median, min(string), etc.). Together, these 3 conditions trigger this bug.
> The crash stacktraces are different in different build modes.
Crash > stacktrace in RELEASE build with codegen enabled: > {code:java} > Thread 316 (crashed) > 0 impalad!impala::HashTable::Close() [hash-table.cc : 512 + 0x0] > 1 impalad!impala::GroupingAggregator::Partition::Spill(bool) > [grouping-aggregator-partition.cc : 180 + 0x9] > 2 impalad!impala::GroupingAggregator::SpillPartition(bool) > [grouping-aggregator.cc : 904 + 0x10] > 3 0x7f5fba83db3c > 4 impalad!impala::GroupingAggregator::AddBatch(impala::RuntimeState*, > impala::RowBatch*) [grouping-aggregator.cc : 437 + 0x2] > 5 impalad!impala::AggregationNode::Open(impala::RuntimeState*) > [aggregation-node.cc : 70 + 0x6] > 6 libstdc++.so.6.0.24 + 0x120b28 > 7 > impalad!apache::hive::service::cli::thrift::TColumnValue::printTo(std::ostream&) > const [converter_lexical_streams.hpp : 161 + 0x8] > 8 impalad!impala::FragmentInstanceState::Open() [fragment-instance-state.cc > : 396 + 0x11] > 9 impalad!tc_newarray + 0x171 > {code} > Crash stacktrace in RELEASE build with codegen disabled (set > DISABLE_CODEGEN=true): > {code:java} > Thread 320 (crashed) > 0 impalad!impala::HashTable::Close() [hash-table.cc : 512 + 0x0] > 1 impalad!impala::GroupingAggregator::Partition::Spill(bool) > [grouping-aggregator-partition.cc : 180 + 0x9] > 2 impalad!impala::GroupingAggregator::SpillPartition(bool) > [grouping-aggregator.cc : 904 + 0x10] > 3 impalad!impala::Status > impala::GroupingAggregator::AddBatchImpl(impala::RowBatch*, > impala::TPrefetchMode::type, impala::HashTableCtx*) > [grouping-aggregator-ir.cc : 148 + 0x11] > 4 impalad!impala::GroupingAggregator::AddBatch(impala::RuntimeState*, > impala::RowBatch*) [grouping-aggregator.cc : 439 + 0x5] > 5 impalad!impala::AggregationNode::Open(impala::RuntimeState*) > [aggregation-node.cc : 70 + 0x6] > 6 impalad!impala::FragmentInstanceState::Open() [fragment-instance-state.cc > : 396 + 0x11] > 7 impalad!impala::FragmentInstanceState::Exec() [fragment-instance-state.cc > : 97 + 0x12] > 8 
impalad!impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) > [query-state.cc : 815 + 0x19] > 9 impalad!impala::Thread::SuperviseThread(std::__cxx11::basic_string std::char_traits, std::allocator > const&, > std::__cxx11::basic_string, std::allocator > > const&, boost::function, impala::ThreadDebugInfo const*, > impala::Promise*) [function_template.hpp : 770 > + 0x7] > 10 impalad!boost::detail::thread_data (*)(std::__cxx11::basic_string, > std::allocator > const&, std::__cxx11::basic_string std::char_traits, std::allocator > const&, boost::function ()>, impala::ThreadDebugInfo const*, impala::Promise (impala::PromiseMode)0>*), > boost::_bi::list5 std::char_traits, std::allocator > >, > boost::_bi::value, > std::allocator > >, boost::_bi::value >, > boost::_bi::value, > boost::_bi::value*> > > > >::run() [bind.hpp : 531 + 0xc] > 11 impalad!thread_proxy + 0x72 > 12 libpthread-2.23.so + 0x76ba > 13 libc-2.23.so + 0x1074dd > {code} > Crash stacktrace in DEBUG build with codegen disabled is a bit earlier - > crashed at a DCHECK: > {code:java} > F0715 20:29:24.389505 16868 grouping-aggregator-partition.cc:125] > 1d4b40df02e6ad76:433ed5740003] Check failed: !status.ok() Stream was > unpinned - AddRow() only fails on error > *** Check failure stack trace: *** > @ 0x513f31c google::LogMessage::Fail() > @ 0x5140c0c google::LogMessage::SendToLog() > @ 0x513ec7a google::LogMessage::Flush() > @ 0x5142878
[jira] [Updated] (IMPALA-9955) Internal error for a query with large rows and spilling
[ https://issues.apache.org/jira/browse/IMPALA-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Quanlong Huang updated IMPALA-9955: --- Affects Version/s: Impala 3.2.0 Impala 3.3.0 Impala 3.4.0 > Internal error for a query with large rows and spilling > --- > > Key: IMPALA-9955 > URL: https://issues.apache.org/jira/browse/IMPALA-9955 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.2.0, Impala 3.3.0, Impala 3.4.0 >Reporter: Quanlong Huang >Assignee: Quanlong Huang >Priority: Major > Fix For: Impala 4.0 > > Attachments: impalad.INFO, impalad_node1.INFO, impalad_node2.INFO > > > Encounter a query failure due to internal error: > {code:java} > create table bigstrs stored as parquet as select *, repeat(uuid(), > cast(random() * 10 as int)) as bigstr from functional.alltypes; > set MAX_ROW_SIZE=3.5MB; > set MEM_LIMIT=4GB; > set DISABLE_CODEGEN=true; > create table my_cnt stored as parquet as select count(*) as cnt, bigstr from > bigstrs group by bigstr; > {code} > The error is > {code:java} > ERROR: Internal error: couldn't pin large page of 4194304 bytes, client only > had 2097152 bytes of unused reservation: > 0xcf9dae0 internal state: { > 0xbdf6ac0 name: GroupingAggregator id=3 ptr=0xcf9d900 write_status: buffers > allocated 2097152 num_pages: 2094 pinned_bytes: 41943040 > dirty_unpinned_bytes: 0 in_flight_write_bytes: 0 reservation: > {: reservation_limit 9223372036854775807 reservation > 46137344 used_reservation 44040192 child_reservations 0 parent: > : reservation_limit 9223372036854775807 reservation > 46137344 used_reservation 0 child_reservations 46137344 parent: > : reservation_limit 9223372036854775807 reservation > 46137344 used_reservation 0 child_reservations 46137344 parent: > : reservation_limit 3435973836 reservation 46137344 > used_reservation 0 child_reservations 46137344 parent: > : reservation_limit 6647046144 reservation 46137344 > used_reservation 0 child_reservations 46137344 parent: > NULL} > 12 pinned 
pages: 0xc9160a0 len: 2097152 pin_count: 1 > buf: 0xc916118 client: 0xcf9dae0/0xbdf6ac0 data: > 0x1320 len: 2097152 > 0xc919d40 len: 4194304 pin_count: 1 buf: > 0xc919db8 client: 0xcf9dae0/0xbdf6ac0 data: > 0x12460 len: 4194304 > 0xd42aaa0 len: 4194304 pin_count: 1 buf: > 0xd42ab18 client: 0xcf9dae0/0xbdf6ac0 data: > 0x12b20 len: 4194304 > 0xd42b900 len: 4194304 pin_count: 1 buf: > 0xd42b978 client: 0xcf9dae0/0xbdf6ac0 data: > 0x132a0 len: 4194304 > 0xd42d3e0 len: 2097152 pin_count: 1 buf: > 0xd42d458 client: 0xcf9dae0/0xbdf6ac0 data: > 0xc6a0 len: 2097152 > 0xd42dd40 len: 4194304 pin_count: 1 buf: > 0xd42ddb8 client: 0xcf9dae0/0xbdf6ac0 data: > 0x132e0 len: 4194304 > 0xd42de80 len: 4194304 pin_count: 1 buf: > 0xd42def8 client: 0xcf9dae0/0xbdf6ac0 data: > 0x137c0 len: 4194304 > 0x12d48320 len: 4194304 pin_count: 1 buf: > 0x12d48398 client: 0xcf9dae0/0xbdf6ac0 data: > 0x102c0 len: 4194304 > 0x12d483c0 len: 4194304 pin_count: 1 buf: > 0x12d48438 client: 0xcf9dae0/0xbdf6ac0 data: > 0x108a0 len: 4194304 > 0x12d48780 len: 4194304 pin_count: 1 buf: > 0x12d487f8 client: 0xcf9dae0/0xbdf6ac0 data: > 0x108e0 len: 4194304 > 0x12d492c0 len: 2097152 pin_count: 1 buf: > 0x12d49338 client: 0xcf9dae0/0xbdf6ac0 data: > 0x12760 len: 2097152 > 0x12d4a9e0 len: 2097152 pin_count: 1 buf: > 0x12d4aa58 client: 0xcf9dae0/0xbdf6ac0 data: > 0x12d20 len: 2097152 > 0 dirty unpinned pages: > 0 in flight write pages: } > {code} > Found the stacktrace from the log: > {code} > @ 0x1c9dfbe impala::Status::Status() > @ 0x1ca5a78 impala::Status::Status() > @ 0x2bfe4ec impala::BufferedTupleStream::NextReadPage() > @ 0x2c04b72 impala::BufferedTupleStream::GetNextInternal<>() > @ 0x2c029e6 impala::BufferedTupleStream::GetNextInternal<>() > @ 0x2bffd19 impala::BufferedTupleStream::GetNext() > @ 0x28aa43f impala::GroupingAggregator::ProcessStream<>() > @ 0x28a2e17 impala::GroupingAggregator::BuildSpilledPartition() > @ 0x28a2401 impala::GroupingAggregator::NextPartition() > @ 0x289df5a 
impala::GroupingAggregator::GetRowsFromPartition() > @ 0x289db20 impala::GroupingAggregator::GetNext() > @ 0x28dbfc7 impala::AggregationNode::GetNext() > @ 0x2259dfc impala::FragmentInstanceState::ExecInternal() > @ 0x22564a0 impala::FragmentInstanceState::Exec() > @ 0x22801ed impala::QueryState::ExecFInstance() > @ 0x227e5ef > _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv > @ 0x2281d8e >
[jira] [Updated] (IMPALA-9957) Impalad crashes when serializing large rows in aggregation spilling
[ https://issues.apache.org/jira/browse/IMPALA-9957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Quanlong Huang updated IMPALA-9957:
    Affects Version/s: Impala 3.2.0
                       Impala 3.3.0
                       Impala 3.4.0

> Impalad crashes when serializing large rows in aggregation spilling
>
> Key: IMPALA-9957
> URL: https://issues.apache.org/jira/browse/IMPALA-9957
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 3.2.0, Impala 3.3.0, Impala 3.4.0
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
> Fix For: Impala 4.0
>
> Queries to reproduce the crash using the testdata:
> {code:sql}
> create table bigstrs stored as parquet as
> select *, repeat(uuid(), cast(random() * 10 as int)) as bigstr
> from functional.alltypes;
> set MAX_ROW_SIZE=3.5MB;
> set MEM_LIMIT=4GB;
> create table my_str_group stored as parquet as
> select group_concat(string_col) as ss, bigstr
> from bigstrs group by bigstr;
> {code}
> The last query 1) has large rows, 2) needs spilling in aggregation, and 3)
> aggregates with functions that need serialization (e.g. group_concat,
> appx_median, min(string), etc.). Together, these 3 conditions trigger this bug.
> The crash stacktraces are different in different build modes.
Crash > stacktrace in RELEASE build with codegen enabled: > {code:java} > Thread 316 (crashed) > 0 impalad!impala::HashTable::Close() [hash-table.cc : 512 + 0x0] > 1 impalad!impala::GroupingAggregator::Partition::Spill(bool) > [grouping-aggregator-partition.cc : 180 + 0x9] > 2 impalad!impala::GroupingAggregator::SpillPartition(bool) > [grouping-aggregator.cc : 904 + 0x10] > 3 0x7f5fba83db3c > 4 impalad!impala::GroupingAggregator::AddBatch(impala::RuntimeState*, > impala::RowBatch*) [grouping-aggregator.cc : 437 + 0x2] > 5 impalad!impala::AggregationNode::Open(impala::RuntimeState*) > [aggregation-node.cc : 70 + 0x6] > 6 libstdc++.so.6.0.24 + 0x120b28 > 7 > impalad!apache::hive::service::cli::thrift::TColumnValue::printTo(std::ostream&) > const [converter_lexical_streams.hpp : 161 + 0x8] > 8 impalad!impala::FragmentInstanceState::Open() [fragment-instance-state.cc > : 396 + 0x11] > 9 impalad!tc_newarray + 0x171 > {code} > Crash stacktrace in RELEASE build with codegen disabled (set > DISABLE_CODEGEN=true): > {code:java} > Thread 320 (crashed) > 0 impalad!impala::HashTable::Close() [hash-table.cc : 512 + 0x0] > 1 impalad!impala::GroupingAggregator::Partition::Spill(bool) > [grouping-aggregator-partition.cc : 180 + 0x9] > 2 impalad!impala::GroupingAggregator::SpillPartition(bool) > [grouping-aggregator.cc : 904 + 0x10] > 3 impalad!impala::Status > impala::GroupingAggregator::AddBatchImpl(impala::RowBatch*, > impala::TPrefetchMode::type, impala::HashTableCtx*) > [grouping-aggregator-ir.cc : 148 + 0x11] > 4 impalad!impala::GroupingAggregator::AddBatch(impala::RuntimeState*, > impala::RowBatch*) [grouping-aggregator.cc : 439 + 0x5] > 5 impalad!impala::AggregationNode::Open(impala::RuntimeState*) > [aggregation-node.cc : 70 + 0x6] > 6 impalad!impala::FragmentInstanceState::Open() [fragment-instance-state.cc > : 396 + 0x11] > 7 impalad!impala::FragmentInstanceState::Exec() [fragment-instance-state.cc > : 97 + 0x12] > 8 
impalad!impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) > [query-state.cc : 815 + 0x19] > 9 impalad!impala::Thread::SuperviseThread(std::__cxx11::basic_string std::char_traits, std::allocator > const&, > std::__cxx11::basic_string, std::allocator > > const&, boost::function, impala::ThreadDebugInfo const*, > impala::Promise*) [function_template.hpp : 770 > + 0x7] > 10 impalad!boost::detail::thread_data (*)(std::__cxx11::basic_string, > std::allocator > const&, std::__cxx11::basic_string std::char_traits, std::allocator > const&, boost::function ()>, impala::ThreadDebugInfo const*, impala::Promise (impala::PromiseMode)0>*), > boost::_bi::list5 std::char_traits, std::allocator > >, > boost::_bi::value, > std::allocator > >, boost::_bi::value >, > boost::_bi::value, > boost::_bi::value*> > > > >::run() [bind.hpp : 531 + 0xc] > 11 impalad!thread_proxy + 0x72 > 12 libpthread-2.23.so + 0x76ba > 13 libc-2.23.so + 0x1074dd > {code} > Crash stacktrace in DEBUG build with codegen disabled is a bit earlier - > crashed at a DCHECK: > {code:java} > F0715 20:29:24.389505 16868 grouping-aggregator-partition.cc:125] > 1d4b40df02e6ad76:433ed5740003] Check failed: !status.ok() Stream was > unpinned - AddRow() only fails on error > *** Check failure stack trace: *** > @ 0x513f31c google::LogMessage::Fail() > @ 0x5140c0c google::LogMessage::SendToLog() > @ 0x513ec7a
[jira] [Resolved] (IMPALA-8721) Wrong result when Impala reads a Hive written parquet TimeStamp column
[ https://issues.apache.org/jira/browse/IMPALA-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-8721.
    Fix Version/s: Impala 4.0
       Resolution: Fixed

> Wrong result when Impala reads a Hive written parquet TimeStamp column
>
> Key: IMPALA-8721
> URL: https://issues.apache.org/jira/browse/IMPALA-8721
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Abhishek Rawat
> Assignee: Tim Armstrong
> Priority: Critical
> Labels: Interoperability, correctness, hive, impala, parquet, timestamp
> Fix For: Impala 4.0
>
> Easy to repro on latest upstream:
> {code:java}
> hive> create table t1_hive(c1 timestamp) stored as parquet;
> hive> insert into t1_hive values('2009-03-09 01:20:03.6');
> hive> select * from t1_hive;
> OK
> 2009-03-09 01:20:03.6
> [localhost:21000] default> invalidate metadata t1_hive;
> [localhost:21000] default> select * from t1_hive;
> Query: select * from t1_hive
> Query submitted at: 2019-06-24 09:55:36 (Coordinator: http://optimus-prime:25000)
> Query progress can be monitored at: http://optimus-prime:25000/query_plan?query_id=b34f85cb5da29c26:d4dfcb24
> +-----------------------+
> | c1                    |
> +-----------------------+
> | 2009-03-09 09:20:03.6 |
> +-----------------------+
> bin/start-impala-cluster.py --impalad_args='-convert_legacy_hive_parquet_utc_timestamps=true'
> [localhost:21000] default> select * from t1_hive;
> Query: select * from t1_hive
> Query submitted at: 2019-06-24 10:00:22 (Coordinator: http://optimus-prime:25000)
> Query progress can be monitored at: http://optimus-prime:25000/query_plan?query_id=d5428bb21fb259b9:7b107034
> +-----------------------+
> | c1                    |
> +-----------------------+
> | 2009-03-09 02:20:03.6 |. <
> +-----------------------+
> {code}
>
> This issue is causing testcase test_hive_impala_interop to fail. Until this
> issue is fixed, the testcase will be updated to not include a timestamp
> column. The test case should be updated to include a timestamp column once
> this issue is fixed.
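The three values in the repro above differ by whole hours, which suggests a UTC-offset mismatch. The Python sketch below is one arithmetic reading of those values and not a confirmed diagnosis: the report does not state the cluster's time zone or the root cause. It assumes the value Impala shows without conversion (09:20:03.6) is the stored UTC instant, and it compares two fixed offsets, UTC-8 and UTC-7, chosen here purely because they reproduce the other two values.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the unconverted value Impala returned is the stored UTC instant.
stored_utc = datetime(2009, 3, 9, 9, 20, 3, 600000, tzinfo=timezone.utc)

# Assumed offsets (not stated in the report).
utc_minus_8 = timezone(timedelta(hours=-8))
utc_minus_7 = timezone(timedelta(hours=-7))


def wall_clock(instant, tz):
    """Format an instant as local wall-clock time with one fractional digit."""
    return instant.astimezone(tz).strftime("%Y-%m-%d %H:%M:%S.%f")[:-5]


print(wall_clock(stored_utc, utc_minus_8))  # 2009-03-09 01:20:03.6
print(wall_clock(stored_utc, utc_minus_7))  # 2009-03-09 02:20:03.6
```

Under these assumptions, a writer using UTC-8 and a reader using UTC-7 for the same instant would produce exactly the one-hour difference between the inserted value and the value returned with -convert_legacy_hive_parquet_utc_timestamps=true.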
> -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
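The three values in the IMPALA-8721 reproduction above (01:20:03.6 written, 09:20:03.6 read without the flag, 02:20:03.6 read with -convert_legacy_hive_parquet_utc_timestamps=true) can be related by simple offset arithmetic. The sketch below is only one possible reading of the numbers: the report does not state the cluster's time zone or which offsets were applied, so the UTC-8 writer offset and UTC-7 reader offset are assumptions chosen because they reproduce the reported values. The helper names are hypothetical and are not Impala or Hive APIs.

```python
from datetime import datetime, timedelta

# Value inserted through Hive in the report.
written = datetime(2009, 3, 9, 1, 20, 3, 600000)

# ASSUMPTION: the writer normalised the value to UTC using a UTC-8 offset and
# the reader converted back using a UTC-7 offset. Neither offset appears in
# the report; they are picked only because they match the reported output.
WRITER_UTC_OFFSET = timedelta(hours=-8)
READER_UTC_OFFSET = timedelta(hours=-7)


def to_utc(local, utc_offset):
    """Convert a naive local timestamp to UTC for a fixed offset."""
    return local - utc_offset


def from_utc(utc, utc_offset):
    """Convert a naive UTC timestamp to local time for a fixed offset."""
    return utc + utc_offset


# What ends up in the Parquet file under the assumption above.
stored = to_utc(written, WRITER_UTC_OFFSET)

# Without the conversion flag the stored value is returned unchanged.
read_without_flag = stored

# With the conversion flag the reader shifts the stored value back to local time.
read_with_flag = from_utc(stored, READER_UTC_OFFSET)

print(read_without_flag)  # 2009-03-09 09:20:03.600000
print(read_with_flag)     # 2009-03-09 02:20:03.600000
```

Under these assumed offsets the one-hour difference between the written and the converted value comes from the writer and the reader disagreeing about the offset, not from the conversion being skipped.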
[jira] [Commented] (IMPALA-8721) Wrong result when Impala reads a Hive written parquet TimeStamp column
[ https://issues.apache.org/jira/browse/IMPALA-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282137#comment-17282137 ] ASF subversion and git services commented on IMPALA-8721: - Commit 1f7b413d11321bd74aaa1a9ea9ed30e4d80d in impala's branch refs/heads/master from Tim Armstrong [ https://gitbox.apache.org/repos/asf?p=impala.git;h=1f7b413 ] IMPALA-8721: re-enable test_hive_impala_interop The test now passes because HIVE-21290 was fixed. Revert "IMPALA-8689: test_hive_impala_interop failing with "Timeout >7200s"" This reverts commit 5d8c99ce74c45a7d04f11e1f252b346d654f02bf. Change-Id: I7e2beabd7082a45a0fc3b60d318cf698079768ff Reviewed-on: http://gerrit.cloudera.org:8080/17042 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Wrong result when Impala reads a Hive written parquet TimeStamp column > -- > > Key: IMPALA-8721 > URL: https://issues.apache.org/jira/browse/IMPALA-8721 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Abhishek Rawat >Assignee: Tim Armstrong >Priority: Critical > Labels: Interoperability, correctness, hive, impala, parquet, > timestamp > > > Easy to repro on latest upstream: > {code:java} > hive> create table t1_hive(c1 timestamp) stored as parquet; > hive> insert into t1_hive values('2009-03-09 01:20:03.6'); > hive> select * from t1_hive; > OK > 2009-03-09 01:20:03.6 > [localhost:21000] default> invalidate metadata t1_hive; > [localhost:21000] default> select * from t1_hive; > Query: select * from t1_hive > Query submitted at: 2019-06-24 09:55:36 (Coordinator: > http://optimus-prime:25000) > Query progress can be monitored at: > http://optimus-prime:25000/query_plan?query_id=b34f85cb5da29c26:d4dfcb24 > +---+ > | c1 | > +---+ > | 2009-03-09 09:20:03.6 | +---+ > bin/start-impala-cluster.py > --impalad_args='-convert_legacy_hive_parquet_utc_timestamps=true' > [localhost:21000] default> select * from t1_hive; > Query: select * from t1_hive > Query submitted at: 2019-06-24 
10:00:22 (Coordinator: > http://optimus-prime:25000) > Query progress can be monitored at: > http://optimus-prime:25000/query_plan?query_id=d5428bb21fb259b9:7b107034 > +---+ > | c1 | > +---+ > | 2009-03-09 02:20:03.6 |. < +---+ > > {code} > > This issue is causing the test case test_hive_impala_interop to fail. Until this > issue is fixed, the test case will be updated to not include a timestamp > column. The test case should be updated to include a timestamp column once > this issue is fixed.
[jira] [Commented] (IMPALA-9641) Query hang when containing alias names as empty backticks
[ https://issues.apache.org/jira/browse/IMPALA-9641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282135#comment-17282135 ] ASF subversion and git services commented on IMPALA-9641: - Commit 701714b10a77aee62cf2ad3e25db9e2dfd418780 in impala's branch refs/heads/master from Tamas Mate [ https://gitbox.apache.org/repos/asf?p=impala.git;h=701714b ] IMPALA-10379: Add missing HiveLexer classes to shared-deps HIVE-19064 introduced additional lexer classes that are required during runtime. This commit adds the missing HiveLexer lexer classes to the shared-deps. Without these classes queries such as 'select 1 as "``"' would fail with 'NoClassDefFoundError'. Testing: - added a misc.test to verify that the classes are available and that IMPALA-9641 is fixed by HIVE-19064 Change-Id: I6e3a00335983f26498c1130ab9f109f6e67256f5 Reviewed-on: http://gerrit.cloudera.org:8080/17019 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Query hang when containing alias names as empty backticks > - > > Key: IMPALA-9641 > URL: https://issues.apache.org/jira/browse/IMPALA-9641 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.4.0 >Reporter: Quanlong Huang >Assignee: Tamas Mate >Priority: Blocker > Labels: hang > Fix For: Impala 4.0 > > > The following query will hang in an infinite loop: > {code:java} > select 1 as "``"; > {code} > Stacktrace of its compiler thread: > {code:java} > "Thread-19" #34 prio=5 os_prio=0 tid=0x12fc nid=0x5514 runnable > [0x7f2abda41000] >java.lang.Thread.State: RUNNABLE > at java.io.FileOutputStream.writeBytes(Native Method) > at java.io.FileOutputStream.write(FileOutputStream.java:326) > at > java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) > at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) > - locked <0x0005cc90f7b8> (a java.io.BufferedOutputStream) > at java.io.PrintStream.write(PrintStream.java:482) > - locked <0x0005cc90f798> (a 
java.io.PrintStream) > at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) > at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291) > at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:104) > - locked <0x0005cc90f8d8> (a java.io.OutputStreamWriter) > at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:185) > at java.io.PrintStream.write(PrintStream.java:527) > - locked <0x0005cc90f798> (a java.io.PrintStream) > at java.io.PrintStream.print(PrintStream.java:669) > at java.io.PrintStream.println(PrintStream.java:806) > - locked <0x0005cc90f798> (a java.io.PrintStream) > at > org.antlr.runtime.BaseRecognizer.emitErrorMessage(BaseRecognizer.java:344) > at > org.antlr.runtime.BaseRecognizer.displayRecognitionError(BaseRecognizer.java:194) > at org.antlr.runtime.Lexer.reportError(Lexer.java:261) > at org.antlr.runtime.Lexer.nextToken(Lexer.java:103) > at > org.apache.impala.analysis.ToSqlUtils.hiveNeedsQuotes(ToSqlUtils.java:145) > at > org.apache.impala.analysis.ToSqlUtils.getIdentSql(ToSqlUtils.java:199) > at org.apache.impala.analysis.SlotRef.(SlotRef.java:58) > at > org.apache.impala.analysis.SelectStmt$SelectAnalyzer.analyzeSelectClause(SelectStmt.java:283) > at > org.apache.impala.analysis.SelectStmt$SelectAnalyzer.analyze(SelectStmt.java:215) > at > org.apache.impala.analysis.SelectStmt$SelectAnalyzer.access$100(SelectStmt.java:199) > at org.apache.impala.analysis.SelectStmt.analyze(SelectStmt.java:192) > at > org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:473) > at > org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:437) > at > org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1530) > at > org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1497) > at > org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1467) > at > org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:154) > {code} > 
org.antlr.runtime.Lexer keeps emitting the same error message to stderr > (which is redirected to impalad.ERROR): > {code:java} > line 1:0 rule Identifier failed predicate: {allowQuotedId()}? > line 1:0 rule Identifier failed predicate: {allowQuotedId()}? > line 1:0 rule Identifier failed predicate: {allowQuotedId()}? > {code}
[jira] [Commented] (IMPALA-10379) NoClassDefFoundError: org/apache/hadoop/hive/ql/parse/Quotation
[ https://issues.apache.org/jira/browse/IMPALA-10379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282134#comment-17282134 ] ASF subversion and git services commented on IMPALA-10379: -- Commit 701714b10a77aee62cf2ad3e25db9e2dfd418780 in impala's branch refs/heads/master from Tamas Mate [ https://gitbox.apache.org/repos/asf?p=impala.git;h=701714b ] IMPALA-10379: Add missing HiveLexer classes to shared-deps HIVE-19064 introduced additional lexer classes that are required during runtime. This commit adds the missing HiveLexer lexer classes to the shared-deps. Without these classes queries such as 'select 1 as "``"' would fail with 'NoClassDefFoundError'. Testing: - added a misc.test to verify that the classes are available and that IMPALA-9641 is fixed by HIVE-19064 Change-Id: I6e3a00335983f26498c1130ab9f109f6e67256f5 Reviewed-on: http://gerrit.cloudera.org:8080/17019 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > NoClassDefFoundError: org/apache/hadoop/hive/ql/parse/Quotation > --- > > Key: IMPALA-10379 > URL: https://issues.apache.org/jira/browse/IMPALA-10379 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 4.0 >Reporter: Quanlong Huang >Assignee: Tamas Mate >Priority: Major > Fix For: Impala 4.0 > > Attachments: org.apache.hadoop.hive.ql.parse.txt > > > Found a NoClassDefFoundError when reexamining IMPALA-9641: > {code} > [localhost:21050] default> select 1 as "``"; > Query: select 1 as "``" > Query submitted at: 2020-12-07 15:30:26 (Coordinator: > http://quanlong-OptiPlex-BJ:25000) > ERROR: NoClassDefFoundError: org/apache/hadoop/hive/ql/parse/Quotation > {code} > Logs: > {code} > I1207 15:30:26.218670 9245 Frontend.java:1581] > bc464dbe4cf418b9:7173a0bd] Analyzing query: select 1 as "``" db: > default > I1207 15:30:26.220055 9245 jni-util.cc:288] > bc464dbe4cf418b9:7173a0bd] java.lang.NoClassDefFoundError: > org/apache/hadoop/hive/ql/parse/Quotation > at > 
org.apache.hadoop.hive.ql.parse.GenericHiveLexer.allowQuotedId(GenericHiveLexer.java:75) > at > org.apache.hadoop.hive.ql.parse.HiveLexer_HiveLexerParent.mIdentifier(HiveLexer_HiveLexerParent.java:10075) > at > org.apache.hadoop.hive.ql.parse.HiveLexer_HiveLexerParent.mTokens(HiveLexer_HiveLexerParent.java:13028) > at > org.apache.hadoop.hive.ql.parse.HiveLexer.mTokens(HiveLexer.java:671) > at org.antlr.runtime.Lexer.nextToken(Lexer.java:89) > at > org.apache.impala.analysis.ToSqlUtils.hiveNeedsQuotes(ToSqlUtils.java:163) > at > org.apache.impala.analysis.ToSqlUtils.getIdentSql(ToSqlUtils.java:217) > at org.apache.impala.analysis.SlotRef.(SlotRef.java:58) > at > org.apache.impala.analysis.SelectStmt$SelectAnalyzer.analyzeSelectClause(SelectStmt.java:370) > at > org.apache.impala.analysis.SelectStmt$SelectAnalyzer.analyze(SelectStmt.java:286) > at > org.apache.impala.analysis.SelectStmt$SelectAnalyzer.access$100(SelectStmt.java:270) > at org.apache.impala.analysis.SelectStmt.analyze(SelectStmt.java:263) > at > org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:481) > at > org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:445) > at > org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1621) > at > org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1588) > at > org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1558) > at > org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:159) > I1207 15:30:26.220113 9245 status.cc:129] bc464dbe4cf418b9:7173a0bd] > NoClassDefFoundError: org/apache/hadoop/hive/ql/parse/Quotation > @ 0x1d88eff impala::Status::Status() > @ 0x27436c3 impala::JniUtil::GetJniExceptionMsg() > @ 0x2540aa4 impala::JniCall::Call<>() > @ 0x253d793 impala::JniUtil::CallJniMethod<>() > @ 0x253b9f6 impala::Frontend::GetExecRequest() > @ 0x2debc9b impala::QueryDriver::RunFrontendPlanner() > @ 0x256d6de impala::ImpalaServer::ExecuteInternal() 
> @ 0x256d09c impala::ImpalaServer::Execute() > @ 0x2616082 impala::ImpalaServer::ExecuteStatement() > @ 0x2c44ec9 > apache::hive::service::cli::thrift::TCLIServiceProcessor::process_ExecuteStatement() > @ 0x2c4359d > apache::hive::service::cli::thrift::TCLIServiceProcessor::dispatchCall() > @ 0x2c02d48 > impala::ImpalaHiveServer2ServiceProcessor::dispatchCall() > @ 0x1d35d81
[jira] [Commented] (IMPALA-9586) Update query option docs to account for interactions with mt_dop
[ https://issues.apache.org/jira/browse/IMPALA-9586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282136#comment-17282136 ] ASF subversion and git services commented on IMPALA-9586: - Commit 8551805875fcd2c701d988df887c1173520fca12 in impala's branch refs/heads/master from Tim Armstrong [ https://gitbox.apache.org/repos/asf?p=impala.git;h=8551805 ] IMPALA-9586: update query option docs for mt_dop There are interactions between mt_dop and num_nodes and num_scanner_threads. Mention these in the docs. Change-Id: I3d9a6f56ffaf211d7d3ca1fad506ee83d516ccbd Reviewed-on: http://gerrit.cloudera.org:8080/17043 Tested-by: Impala Public Jenkins Reviewed-by: Joe McDonnell > Update query option docs to account for interactions with mt_dop > > > Key: IMPALA-9586 > URL: https://issues.apache.org/jira/browse/IMPALA-9586 > Project: IMPALA > Issue Type: Improvement > Components: Docs > Reporter: Tim Armstrong > Assignee: Tim Armstrong > Priority: Major > Fix For: Impala 4.0 > > > In some cases mt_dop changes the behaviour of other options or makes them a > no-op. We need to update the docs to reflect this. > * Setting NUM_NODES=1 along with MT_DOP >=1 effectively reduces MT_DOP to 1, > i.e. only one thread is used. > * NUM_SCANNER_THREADS has no effect when MT_DOP>=1 > * Maybe other changes?
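The two option interactions listed in IMPALA-9586 above can be written down as a small decision function. This is a sketch of the rules exactly as the ticket words them, not Impala's planner logic; the function name and the returned keys are hypothetical, and the ticket itself leaves open whether other options are affected.

```python
def effective_settings(mt_dop, num_nodes, num_scanner_threads):
    """Model the two interactions described in the ticket (hypothetical helper).

    - NUM_NODES=1 together with MT_DOP>=1 effectively reduces MT_DOP to 1.
    - NUM_SCANNER_THREADS has no effect when MT_DOP>=1.
    """
    if mt_dop >= 1 and num_nodes == 1:
        effective_mt_dop = 1
    else:
        effective_mt_dop = mt_dop

    # The scanner thread setting is only honoured when MT_DOP is below 1.
    scanner_threads_applies = mt_dop < 1

    return {
        "mt_dop": effective_mt_dop,
        "num_scanner_threads": num_scanner_threads if scanner_threads_applies else None,
        "num_scanner_threads_applies": scanner_threads_applies,
    }


# MT_DOP=4 with NUM_NODES=1: only one thread is used, scanner threads ignored.
print(effective_settings(mt_dop=4, num_nodes=1, num_scanner_threads=8))
# MT_DOP=0: NUM_SCANNER_THREADS is honoured.
print(effective_settings(mt_dop=0, num_nodes=0, num_scanner_threads=8))
```

Returning None for num_scanner_threads is just one way to signal "ignored" in this model; it does not correspond to any value Impala reports.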
[jira] [Commented] (IMPALA-8689) test_hive_impala_interop failing with "Timeout >7200s"
[ https://issues.apache.org/jira/browse/IMPALA-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282138#comment-17282138 ] ASF subversion and git services commented on IMPALA-8689: - Commit 1f7b413d11321bd74aaa1a9ea9ed30e4d80d in impala's branch refs/heads/master from Tim Armstrong [ https://gitbox.apache.org/repos/asf?p=impala.git;h=1f7b413 ] IMPALA-8721: re-enable test_hive_impala_interop The test now passes because HIVE-21290 was fixed. Revert "IMPALA-8689: test_hive_impala_interop failing with "Timeout >7200s"" This reverts commit 5d8c99ce74c45a7d04f11e1f252b346d654f02bf. Change-Id: I7e2beabd7082a45a0fc3b60d318cf698079768ff Reviewed-on: http://gerrit.cloudera.org:8080/17042 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > test_hive_impala_interop failing with "Timeout >7200s" > -- > > Key: IMPALA-8689 > URL: https://issues.apache.org/jira/browse/IMPALA-8689 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.3.0 >Reporter: Andrew Sherman >Assignee: Abhishek Rawat >Priority: Critical > Labels: broken-build > Fix For: Impala 3.3.0 > > > I think this is the new test added in IMPALA-8617 > {code} > custom_cluster/test_hive_parquet_codec_interop.py:78: in > test_hive_impala_interop > .format(codec, hive_table, impala_table)) > common/impala_test_suite.py:871: in run_stmt_in_hive > (stdout, stderr) = call.communicate() > /usr/lib64/python2.7/subprocess.py:800: in communicate > return self._communicate(input) > /usr/lib64/python2.7/subprocess.py:1401: in _communicate > stdout, stderr = self._communicate_with_poll(input) > /usr/lib64/python2.7/subprocess.py:1455: in _communicate_with_poll > ready = poller.poll() > E Failed: Timeout >7200s > {code}
[jira] [Resolved] (IMPALA-10480) heap-use-after-free crash in ASAN build
[ https://issues.apache.org/jira/browse/IMPALA-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikramjeet Vig resolved IMPALA-10480. - Resolution: Duplicate > heap-use-after-free crash in ASAN build > --- > > Key: IMPALA-10480 > URL: https://issues.apache.org/jira/browse/IMPALA-10480 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 4.0 >Reporter: Bikramjeet Vig >Assignee: Bikramjeet Vig >Priority: Major > Labels: broken-build > > Likely candidates that triggered this: > {noformat} > > query_test.test_tpch_nested_queries.TestTpchNestedQuery.test_tpch_q20[protocol: > beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > orc/def/block] 8.4 sec 1 > query_test.test_tpch_queries.TestTpchQuery.test_tpch[protocol: beeswax | > exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > orc/def/block-TPC-H: Q2]8.4 sec 1 > query_test.test_queries.TestHdfsQueries.test_hdfs_scan_node[protocol: > beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > rc/snap/block] 8.4 sec 1 > {noformat} > Error: > {noformat} > ==28216==ERROR: AddressSanitizer: heap-use-after-free on address > 0x7fb838f33800 at pc 0x01b74b61 bp 0x7fb91d19f0c0 sp 0x7fb91d19e870 > READ of size 1048576 at 0x7fb838f33800 thread T82 (rpc reactor-287) > #0 0x1b74b60 in read_iovec(void*, __sanitizer::__sanitizer_iovec*, > unsigned long, unsigned long) > /mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:904 > #1 0x1b8b1c1 in read_msghdr(void*, __sanitizer::__sanitizer_msghdr*, > long) > 
/mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2781 > #2 0x1b8daa3 in __interceptor_sendmsg > /mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2796 > #3 0x3b1fc7c in kudu::Socket::Writev(iovec const*, int, long*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/net/socket.cc:447:3 > #4 0x36ef1d5 in kudu::rpc::OutboundTransfer::SendBuffer(kudu::Socket&) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/transfer.cc:227:26 > #5 0x36f7c90 in kudu::rpc::Connection::WriteHandler(ev::io&, int) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/connection.cc:802:31 > #6 0x598c3d2 in ev_invoke_pending > (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x598c3d2) > #7 0x3681ffc in kudu::rpc::ReactorThread::InvokePendingCb(ev_loop*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:196:3 > #8 0x598fa7f in ev_run > (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x598fa7f) > #9 0x36821f1 in kudu::rpc::ReactorThread::RunThread() > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:497:9 > #10 0x369392b in boost::_bi::bind_t kudu::rpc::ReactorThread>, > boost::_bi::list1 > > >::operator()() > /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222:16 > #11 0x23f26b6 in boost::function0::operator()() const > /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14 > #12 0x23eef29 in kudu::Thread::SuperviseThread(void*) > 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/thread.cc:675:3 > #13 0x7fc169a0fe24 in start_thread (/lib64/libpthread.so.0+0x7e24) > #14 0x7fc16645934c in __clone (/lib64/libc.so.6+0xf834c) > 0x7fb838f33800 is located 0 bytes inside of 1048577-byte region > [0x7fb838f33800,0x7fb839033801) > freed by thread T117 here: > #0 0x1bfab40 in operator delete(void*) > /mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/asan_new_delete.cc:137 > #1 0x7fc166d5c5a9 in __gnu_cxx::new_allocator::deallocate(char*, > unsigned long) > /mnt/source/gcc/build-7.5.0/x86_64-pc-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:125 > #2 0x7fc166d5c5a9 in std::allocator_traits >
[jira] [Commented] (IMPALA-10480) heap-use-after-free crash in ASAN build
[ https://issues.apache.org/jira/browse/IMPALA-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282125#comment-17282125 ] Bikramjeet Vig commented on IMPALA-10480: - [~fangyurao] Thanks for letting me know, I'll mark this as a duplicate > heap-use-after-free crash in ASAN build > --- > > Key: IMPALA-10480 > URL: https://issues.apache.org/jira/browse/IMPALA-10480 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 4.0 >Reporter: Bikramjeet Vig >Assignee: Bikramjeet Vig >Priority: Major > Labels: broken-build > > Likely candidates that triggered this: > {noformat} > > query_test.test_tpch_nested_queries.TestTpchNestedQuery.test_tpch_q20[protocol: > beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > orc/def/block] 8.4 sec 1 > query_test.test_tpch_queries.TestTpchQuery.test_tpch[protocol: beeswax | > exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > orc/def/block-TPC-H: Q2]8.4 sec 1 > query_test.test_queries.TestHdfsQueries.test_hdfs_scan_node[protocol: > beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > rc/snap/block] 8.4 sec 1 > {noformat} > Error: > {noformat} > ==28216==ERROR: AddressSanitizer: heap-use-after-free on address > 0x7fb838f33800 at pc 0x01b74b61 bp 0x7fb91d19f0c0 sp 0x7fb91d19e870 > READ of size 1048576 at 0x7fb838f33800 thread T82 (rpc reactor-287) > #0 0x1b74b60 in read_iovec(void*, __sanitizer::__sanitizer_iovec*, > unsigned long, unsigned long) > /mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:904 > 
#1 0x1b8b1c1 in read_msghdr(void*, __sanitizer::__sanitizer_msghdr*, > long) > /mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2781 > #2 0x1b8daa3 in __interceptor_sendmsg > /mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2796 > #3 0x3b1fc7c in kudu::Socket::Writev(iovec const*, int, long*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/net/socket.cc:447:3 > #4 0x36ef1d5 in kudu::rpc::OutboundTransfer::SendBuffer(kudu::Socket&) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/transfer.cc:227:26 > #5 0x36f7c90 in kudu::rpc::Connection::WriteHandler(ev::io&, int) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/connection.cc:802:31 > #6 0x598c3d2 in ev_invoke_pending > (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x598c3d2) > #7 0x3681ffc in kudu::rpc::ReactorThread::InvokePendingCb(ev_loop*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:196:3 > #8 0x598fa7f in ev_run > (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x598fa7f) > #9 0x36821f1 in kudu::rpc::ReactorThread::RunThread() > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:497:9 > #10 0x369392b in boost::_bi::bind_t kudu::rpc::ReactorThread>, > boost::_bi::list1 > > >::operator()() > /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222:16 > #11 0x23f26b6 in boost::function0::operator()() const > /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14 > #12 0x23eef29 in 
kudu::Thread::SuperviseThread(void*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/thread.cc:675:3 > #13 0x7fc169a0fe24 in start_thread (/lib64/libpthread.so.0+0x7e24) > #14 0x7fc16645934c in __clone (/lib64/libc.so.6+0xf834c) > 0x7fb838f33800 is located 0 bytes inside of 1048577-byte region > [0x7fb838f33800,0x7fb839033801) > freed by thread T117 here: > #0 0x1bfab40 in operator delete(void*) > /mnt/source/llvm/llvm-5.0.1.src-p3/projects/compiler-rt/lib/asan/asan_new_delete.cc:137 > #1 0x7fc166d5c5a9 in __gnu_cxx::new_allocator::deallocate(char*, > unsigned long) >
[jira] [Created] (IMPALA-10497) test_no_fd_caching_on_cached_data failing
Bikramjeet Vig created IMPALA-10497: --- Summary: test_no_fd_caching_on_cached_data failing Key: IMPALA-10497 URL: https://issues.apache.org/jira/browse/IMPALA-10497 Project: IMPALA Issue Type: Bug Reporter: Bikramjeet Vig Assignee: Riza Suminto {noformat} Error Message assert 1 == 0 + where 1 = >() + where > = .cached_handles Stacktrace custom_cluster/test_hdfs_fd_caching.py:202: in test_no_fd_caching_on_cached_data assert self.cached_handles() == 0 E assert 1 == 0 E+ where 1 = >() E+where > = .cached_handles Standard Error -- 2021-02-08 06:40:41,413 INFO MainThread: Starting cluster with command: /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/bin/start-impala-cluster.py '--state_store_args=--statestore_update_frequency_ms=50 --statestore_priority_update_frequency_ms=50 --statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3 --log_dir=/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests --log_level=1 '--impalad_args=--max_cached_file_handles=16 --unused_file_handle_timeout_sec=5 --data_cache=/tmp:500MB --always_use_data_cache=true ' '--state_store_args=None ' '--catalogd_args=--load_catalog_in_background=false ' --impalad_args=--default_query_options= 06:40:42 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es) 06:40:42 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/statestored.INFO 06:40:42 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/catalogd.INFO 06:40:42 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/impalad.INFO 06:40:42 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO 06:40:42 MainThread: Starting 
Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO 06:40:45 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 06:40:45 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 06:40:45 MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25000 06:40:45 MainThread: Debug webpage not yet available: ('Connection aborted.', error(111, 'Connection refused')) 06:40:47 MainThread: Debug webpage did not become available in expected time. 06:40:47 MainThread: Waiting for num_known_live_backends=3. Current value: None 06:40:48 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 06:40:48 MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25000 06:40:48 MainThread: Waiting for num_known_live_backends=3. Current value: 0 06:40:49 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 06:40:49 MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25000 06:40:49 MainThread: num_known_live_backends has reached value: 3 06:40:49 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 06:40:49 MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25001 06:40:49 MainThread: num_known_live_backends has reached value: 3 06:40:50 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 06:40:50 MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25002 06:40:50 MainThread: num_known_live_backends has reached value: 3 06:40:50 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3 executors). 
-- 2021-02-08 06:40:51,049 DEBUG MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) -- 2021-02-08 06:40:51,049 INFO MainThread: Getting metric: statestore.live-backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25010 -- 2021-02-08 06:40:51,050 INFO MainThread: Starting new HTTP connection (1): impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com -- 2021-02-08 06:40:51,052 INFO MainThread: Metric 'statestore.live-backends' has reached desired value: 4 -- 2021-02-08 06:40:51,052 DEBUG MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25000 -- 2021-02-08 06:40:51,053 INFO MainThread: Starting new HTTP connection (1): impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com -- 2021-02-08 06:40:51,054 INFO MainThread: num_known_live_backends has reached value: 3 -- 2021-02-08 06:40:51,054 DEBUG
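The startup log in IMPALA-10497 above shows a poll-until-value pattern ("Waiting for num_known_live_backends=3. Current value: 0", then "has reached value: 3"). A minimal sketch of that pattern follows. The helper name, its parameters and the injected sleep and clock are hypothetical and are not taken from Impala's test framework, and nothing here is meant to suggest how the failing cached_handles assertion was eventually fixed.

```python
import time


def wait_for_value(get_value, expected, timeout_s=10.0, interval_s=0.5,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll get_value() until it returns expected or timeout_s elapses.

    Hypothetical helper: get_value is any zero-argument callable, for example
    one that fetches a metric from a debug web page.
    """
    deadline = clock() + timeout_s
    while True:
        current = get_value()
        if current == expected:
            return current
        if clock() >= deadline:
            raise TimeoutError(
                "expected %r, last value was %r" % (expected, current))
        sleep(interval_s)


# Simulate the sequence seen in the log: None, then 0, then 3.
observed = iter([None, 0, 3])
result = wait_for_value(lambda: next(observed), expected=3, interval_s=0)
print(result)
```

Passing sleep and clock as parameters keeps the helper easy to exercise without real delays, which is why the usage example sets interval_s to 0.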
[jira] [Created] (IMPALA-10497) test_no_fd_caching_on_cached_data failing
Bikramjeet Vig created IMPALA-10497: --- Summary: test_no_fd_caching_on_cached_data failing Key: IMPALA-10497 URL: https://issues.apache.org/jira/browse/IMPALA-10497 Project: IMPALA Issue Type: Bug Reporter: Bikramjeet Vig Assignee: Riza Suminto {noformat} Error Message assert 1 == 0 + where 1 = >() + where > = .cached_handles Stacktrace custom_cluster/test_hdfs_fd_caching.py:202: in test_no_fd_caching_on_cached_data assert self.cached_handles() == 0 E assert 1 == 0 E+ where 1 = >() E+where > = .cached_handles Standard Error -- 2021-02-08 06:40:41,413 INFO MainThread: Starting cluster with command: /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/bin/start-impala-cluster.py '--state_store_args=--statestore_update_frequency_ms=50 --statestore_priority_update_frequency_ms=50 --statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3 --log_dir=/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests --log_level=1 '--impalad_args=--max_cached_file_handles=16 --unused_file_handle_timeout_sec=5 --data_cache=/tmp:500MB --always_use_data_cache=true ' '--state_store_args=None ' '--catalogd_args=--load_catalog_in_background=false ' --impalad_args=--default_query_options= 06:40:42 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es) 06:40:42 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/statestored.INFO 06:40:42 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/catalogd.INFO 06:40:42 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/impalad.INFO 06:40:42 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO 06:40:42 MainThread: Starting 
Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO 06:40:45 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 06:40:45 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 06:40:45 MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25000 06:40:45 MainThread: Debug webpage not yet available: ('Connection aborted.', error(111, 'Connection refused')) 06:40:47 MainThread: Debug webpage did not become available in expected time. 06:40:47 MainThread: Waiting for num_known_live_backends=3. Current value: None 06:40:48 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 06:40:48 MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25000 06:40:48 MainThread: Waiting for num_known_live_backends=3. Current value: 0 06:40:49 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 06:40:49 MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25000 06:40:49 MainThread: num_known_live_backends has reached value: 3 06:40:49 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 06:40:49 MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25001 06:40:49 MainThread: num_known_live_backends has reached value: 3 06:40:50 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 06:40:50 MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25002 06:40:50 MainThread: num_known_live_backends has reached value: 3 06:40:50 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3 executors). 
-- 2021-02-08 06:40:51,049 DEBUG MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) -- 2021-02-08 06:40:51,049 INFO MainThread: Getting metric: statestore.live-backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25010 -- 2021-02-08 06:40:51,050 INFO MainThread: Starting new HTTP connection (1): impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com -- 2021-02-08 06:40:51,052 INFO MainThread: Metric 'statestore.live-backends' has reached desired value: 4 -- 2021-02-08 06:40:51,052 DEBUG MainThread: Getting num_known_live_backends from impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com:25000 -- 2021-02-08 06:40:51,053 INFO MainThread: Starting new HTTP connection (1): impala-ec2-centos74-r5-4xlarge-ondemand-02df.vpc.cloudera.com -- 2021-02-08 06:40:51,054 INFO MainThread: num_known_live_backends has reached value: 3 -- 2021-02-08 06:40:51,054 DEBUG
[jira] [Commented] (IMPALA-7092) Re-enable EC tests broken by HDFS-13539
[ https://issues.apache.org/jira/browse/IMPALA-7092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281953#comment-17281953 ] Tim Armstrong commented on IMPALA-7092: --- These seem to be marked by @SkipIfEC.oom > Re-enable EC tests broken by HDFS-13539 > > > Key: IMPALA-7092 > URL: https://issues.apache.org/jira/browse/IMPALA-7092 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend, Infrastructure >Affects Versions: Impala 3.1.0 >Reporter: Tianyi Wang >Priority: Major > > With HDFS-13539 and HDFS-13540 fixed, we should be able to re-enable some > tests and diagnose the causes of the remaining failed tests without much > noise. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-10496) Support SAML authentication in Impyla
Csaba Ringhofer created IMPALA-10496: Summary: Support SAML authentication in Impyla Key: IMPALA-10496 URL: https://issues.apache.org/jira/browse/IMPALA-10496 Project: IMPALA Issue Type: Improvement Components: Clients Reporter: Csaba Ringhofer IMPALA-10437 adds SAML2 browser profile support to Impala. Supporting it in Impyla would allow implementing SAML auth for Impala shell, and make SAML related EE tests simpler. The simplest way would be to allow passing a bearer token in https://github.com/cloudera/impyla/blob/0914895830609001b9d4f535573cba8db487d45e/impala/hiveserver2.py#L796 I am not sure about the other parts of the SAML logic (e.g. communication with the browser) - it could be added to Impyla too or reside in Impala shell. -- This message was sent by Atlassian Jira (v8.3.4#803005)
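The "pass a bearer token" idea can be sketched as a small header-building helper. This is only an illustration: `bearer_auth_headers` is a hypothetical name, not part of Impyla's API, and how the resulting headers would be wired into Impyla's HTTP transport is left open here.

```python
def bearer_auth_headers(token, base_headers=None):
    """Return a copy of base_headers with an HTTP bearer Authorization header.

    Hypothetical helper for illustration; not an Impyla function.
    """
    if not token or any(ch.isspace() for ch in token):
        raise ValueError("bearer token must be non-empty and contain no whitespace")
    headers = dict(base_headers or {})  # copy so the caller's dict is not mutated
    headers["Authorization"] = "Bearer " + token
    return headers
```

A client that already has a token (for example, one obtained through the SAML browser flow) could then attach these headers to every HTTP request it sends.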
[jira] [Updated] (IMPALA-10495) Computing correlation coefficient for certain columns can be useful to min/max filters
[ https://issues.apache.org/jira/browse/IMPALA-10495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen updated IMPALA-10495: Description: Selecting good min/max filters for a query during query compilation can be a difficult task as the benefit of such a filter may not be known. However, when a column C is known to have a strong correlation with a partition or a sort column, a min/max filter on C can be chosen and very useful. (was: Selecting good min/max filters for a query during query compilation can be a difficult task as the benefit of such a filter may not known. However, when a column C is known to have a strong correlation with a partition or a sort column, a min/max filter on C can be very useful.) > Computing correlation coefficient for certain columns can be useful to > min/max filters > -- > > Key: IMPALA-10495 > URL: https://issues.apache.org/jira/browse/IMPALA-10495 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Qifan Chen >Priority: Major > > Selecting good min/max filters for a query during query compilation can be a > difficult task as the benefit of such a filter may not be known. However, > when a column C is known to have a strong correlation with a partition or a > sort column, a min/max filter on C can be chosen and very useful. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-9586) Update query option docs to account for interactions with mt_dop
[ https://issues.apache.org/jira/browse/IMPALA-9586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong resolved IMPALA-9586. --- Fix Version/s: Impala 4.0 Resolution: Fixed > Update query option docs to account for interactions with mt_dop > > > Key: IMPALA-9586 > URL: https://issues.apache.org/jira/browse/IMPALA-9586 > Project: IMPALA > Issue Type: Improvement > Components: Docs >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > Fix For: Impala 4.0 > > > in some cases mt_dop changes the behaviour of other options or makes them a > no-op. We need to update docs to reflect this. > * Setting NUM_NODES=1 along with MT_DOP >=1 effectively reduces MT_DOP to 1, > i.e. only one thread is used. > * NUM_SCANNER_THREADS has no effect when MT_DOP>=1 > * Maybe other changes? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
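The two interactions listed in the description can be written down as a toy model. This is a sketch of the documented behaviour only, not Impala code; the function name and the use of `None` to mean "option is ignored" are assumptions made for illustration.

```python
def effective_settings(num_nodes, mt_dop, num_scanner_threads):
    """Toy model of the option interactions described in IMPALA-9586.

    Returns (effective_mt_dop, effective_num_scanner_threads, notes), where
    an effective_num_scanner_threads of None means the option has no effect.
    """
    notes = []
    effective_mt_dop = mt_dop
    effective_scanner_threads = num_scanner_threads
    if mt_dop >= 1:
        if num_nodes == 1:
            # NUM_NODES=1 effectively reduces MT_DOP to 1 (one thread only).
            effective_mt_dop = 1
            notes.append("NUM_NODES=1 reduces MT_DOP to 1")
        # NUM_SCANNER_THREADS is a no-op once MT_DOP >= 1.
        effective_scanner_threads = None
        notes.append("NUM_SCANNER_THREADS has no effect when MT_DOP>=1")
    return effective_mt_dop, effective_scanner_threads, notes
```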
[jira] [Work started] (IMPALA-9382) Prototype denser runtime profile implementation
[ https://issues.apache.org/jira/browse/IMPALA-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-9382 started by Tim Armstrong. - > Prototype denser runtime profile implementation > --- > > Key: IMPALA-9382 > URL: https://issues.apache.org/jira/browse/IMPALA-9382 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > Attachments: profile_504b379400cba9f2_2d2cf007, > tpcds_q10_profile_v1.txt, tpcds_q10_profile_v2.txt, tpcds_q10_profile_v2.txt > > > RuntimeProfile trees can potentially stress the memory allocator and use up a > lot more memory and cache than is really necessary: > * std::map is used throughout, and allocates a node per map entry. We do > depend on the counters being displayed in-order, but we would probably be > better off storing the counters in a vector and lazily sorting when needed > (since the set of counters is generally static after Prepare()). > * We store the same counter names redundantly all over the place. We'd > probably be best off using a pool of constant counter names (we could just > require registering them upfront). > There may be a small gain from switching thrift to using unordered_map, e.g. > for the info strings that appear with some frequency in profiles. > However, I think we need to restructure the thrift representation and > in-memory representation to get significant gains. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
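The two ideas in the description (a flat container that is sorted lazily, and a shared pool of counter names) can be sketched as follows. This is a Python illustration of the technique, not the C++ RuntimeProfile code; `CounterSet` is a hypothetical class, and `sys.intern` stands in for the proposed pool of constant counter names.

```python
import sys


class CounterSet:
    """Counters kept in a flat list and sorted by name only when displayed."""

    def __init__(self):
        self._entries = []   # [name, value] pairs
        self._index = {}     # name -> position in _entries
        self._sorted = True

    def add(self, name, delta=1):
        name = sys.intern(name)  # share one string object per counter name
        pos = self._index.get(name)
        if pos is None:
            self._index[name] = len(self._entries)
            self._entries.append([name, delta])
            self._sorted = False  # defer sorting until items() is called
        else:
            self._entries[pos][1] += delta

    def items(self):
        """Return (name, value) pairs in name order, sorting only if needed."""
        if not self._sorted:
            self._entries.sort(key=lambda entry: entry[0])
            # Positions changed, so rebuild the lookup index.
            self._index = {e[0]: i for i, e in enumerate(self._entries)}
            self._sorted = True
        return [(name, value) for name, value in self._entries]
```

Because the set of counters is generally static after Prepare(), the sort would typically run once, and later updates would only touch existing entries.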
[jira] [Created] (IMPALA-10495) Computing correlation coefficient for certain columns can be useful to min/max filters
Qifan Chen created IMPALA-10495: --- Summary: Computing correlation coefficient for certain columns can be useful to min/max filters Key: IMPALA-10495 URL: https://issues.apache.org/jira/browse/IMPALA-10495 Project: IMPALA Issue Type: Improvement Components: Backend Reporter: Qifan Chen Selecting good min/max filters for a query during query compilation can be a difficult task as the benefit of such a filter may not known. However, when a column C is known to have a strong correlation with a partition or a sort column, a min/max filter on C can be very useful. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
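For reference, the correlation coefficient in question can be computed as a Pearson coefficient over paired column values. The sketch below is a plain-Python illustration; how Impala would actually collect or store such a statistic is not specified in the issue, and the function name is an assumption.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / math.sqrt(var_x * var_y)
```

A coefficient close to 1 or -1 between column C and a partition or sort column would suggest that a min/max filter on C is likely to be selective.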
[jira] [Commented] (IMPALA-9382) Prototype denser runtime profile implementation
[ https://issues.apache.org/jira/browse/IMPALA-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281873#comment-17281873 ] Tim Armstrong commented on IMPALA-9382: --- Actually I should reduce the verbosity of the default option a bit as part 3 > Prototype denser runtime profile implementation > --- > > Key: IMPALA-9382 > URL: https://issues.apache.org/jira/browse/IMPALA-9382 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > Attachments: profile_504b379400cba9f2_2d2cf007, > tpcds_q10_profile_v1.txt, tpcds_q10_profile_v2.txt, tpcds_q10_profile_v2.txt > > > RuntimeProfile trees can potentially stress the memory allocator and use up a > lot more memory and cache than is really necessary: > * std::map is used throughout, and allocates a node per map entry. We do > depend on the counters being displayed in-order, but we would probably be > better off storing the counters in a vector and lazily sorting when needed > (since the set of counters is generally static after Prepare()). > * We store the same counter names redundantly all over the place. We'd > probably be best off using a pool of constant counter names (we could just > require registering them upfront). > There may be a small gain from switching thrift to using unordered_map, e.g. > for the info strings that appear with some frequency in profiles. > However, I think we need to restructure the thrift representation and > in-memory representation to get significant gains. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Reopened] (IMPALA-9382) Prototype denser runtime profile implementation
[ https://issues.apache.org/jira/browse/IMPALA-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong reopened IMPALA-9382: --- > Prototype denser runtime profile implementation > --- > > Key: IMPALA-9382 > URL: https://issues.apache.org/jira/browse/IMPALA-9382 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > Fix For: Impala 4.0 > > Attachments: profile_504b379400cba9f2_2d2cf007, > tpcds_q10_profile_v1.txt, tpcds_q10_profile_v2.txt, tpcds_q10_profile_v2.txt > > > RuntimeProfile trees can potentially stress the memory allocator and use up a > lot more memory and cache than is really necessary: > * std::map is used throughout, and allocates a node per map entry. We do > depend on the counters being displayed in-order, but we would probably be > better off storing the counters in a vector and lazily sorting when needed > (since the set of counters is generally static after Prepare()). > * We store the same counter names redundantly all over the place. We'd > probably be best off using a pool of constant counter names (we could just > require registering them upfront). > There may be a small gain from switching thrift to using unordered_map, e.g. > for the info strings that appear with some frequency in profiles. > However, I think we need to restructure the thrift representation and > in-memory representation to get significant gains. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-9382) Prototype denser runtime profile implementation
[ https://issues.apache.org/jira/browse/IMPALA-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-9382: -- Fix Version/s: (was: Impala 4.0) > Prototype denser runtime profile implementation > --- > > Key: IMPALA-9382 > URL: https://issues.apache.org/jira/browse/IMPALA-9382 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > Attachments: profile_504b379400cba9f2_2d2cf007, > tpcds_q10_profile_v1.txt, tpcds_q10_profile_v2.txt, tpcds_q10_profile_v2.txt > > > RuntimeProfile trees can potentially stress the memory allocator and use up a > lot more memory and cache than is really necessary: > * std::map is used throughout, and allocates a node per map entry. We do > depend on the counters being displayed in-order, but we would probably be > better off storing the counters in a vector and lazily sorting when needed > (since the set of counters is generally static after Prepare()). > * We store the same counter names redundantly all over the place. We'd > probably be best off using a pool of constant counter names (we could just > require registering them upfront). > There may be a small gain from switching thrift to using unordered_map, e.g. > for the info strings that appear with some frequency in profiles. > However, I think we need to restructure the thrift representation and > in-memory representation to get significant gains. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-10494) Making use of the min/max column stats to improve min/max filters
Qifan Chen created IMPALA-10494: --- Summary: Making use of the min/max column stats to improve min/max filters Key: IMPALA-10494 URL: https://issues.apache.org/jira/browse/IMPALA-10494 Project: IMPALA Issue Type: Improvement Components: Backend Reporter: Qifan Chen HMS (hive metastore) API offers means to store the minimal and maximal value per column (https://hive.apache.org/javadocs/r3.0.0/api/org/apache/hadoop/hive/metastore/api/ColumnStatisticsData.html). For example, such stats for an integer column can be captured via a LongColumnStatsData object (https://hive.apache.org/javadocs/r3.0.0/api/org/apache/hadoop/hive/metastore/api/LongColumnStatsData.html). It is desirable to use the min and max stats per column to help the formation of useful min/max filters that can help reduce the data scanned for Parquet tables. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
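One way per-column min/max stats could inform filter selection is to estimate how much of the column's value range would survive a candidate min/max filter. The sketch below assumes an integer column with uniformly distributed values; the function name and the uniformity assumption are illustrative and not taken from the issue.

```python
def surviving_fraction(col_low, col_high, filter_low, filter_high):
    """Fraction of an integer column's [col_low, col_high] range that overlaps
    the filter's [filter_low, filter_high] range (all bounds inclusive)."""
    if col_low > col_high:
        raise ValueError("column stats must satisfy low <= high")
    low = max(col_low, filter_low)
    high = min(col_high, filter_high)
    if low > high:
        return 0.0  # ranges are disjoint: the filter rejects everything
    return (high - low + 1) / (col_high - col_low + 1)
```

A small fraction suggests the filter could skip a large share of the scanned data, while a fraction of 1.0 suggests the filter would not help.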
[jira] [Commented] (IMPALA-10493) Using JOIN ON syntax to join two full ACID collections produces wrong results
[ https://issues.apache.org/jira/browse/IMPALA-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281831#comment-17281831 ] Zoltán Borók-Nagy commented on IMPALA-10493: The problem is in AcidRewriter.splitCollectionRef(). When it creates the new collection ref, it doesn't copy all the information from the old collection ref; e.g. it drops the ON clause. > Using JOIN ON syntax to join two full ACID collections produces wrong results > - > > Key: IMPALA-10493 > URL: https://issues.apache.org/jira/browse/IMPALA-10493 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Reporter: Zoltán Borók-Nagy >Assignee: Zoltán Borók-Nagy >Priority: Major > Labels: CorrectnessBug, impala-acid > > The following query produces wrong results: > {noformat} > use functional_orc_def; // use full ACID tables > select a1.item, a2.item > from complextypestbl.int_array a1 join complextypestbl.int_array a2 > on a1.item=a2.item > where a1.item<2;{noformat} > It creates a CROSS JOIN without the predicate "a1.item = a2.item", generating > too many rows. The expected plan node would be an INNER JOIN on "a1.item = > a2.item". > If we put the JOIN condition to the WHERE clause we get the correct plan: > {noformat} > select a1.item, a2.item > from complextypestbl.int_array a1 join complextypestbl.int_array a2 > where a1.item=a2.item and a1.item<2{noformat} > We also get a correct plan if the right table is non-ACID: > {noformat} > select a1.item, a2.item > from complextypestbl.int_array a1 join > functional_parquet.complextypestbl.int_array a2 > on a1.item=a2.item > where a1.item<2;{noformat} > Or ACID table but the column is non-collection: > {noformat} > select c.id, a1.item > from complextypestbl.int_array a1 join complextypestbl c > on c.id=a1.item > where c.id<2;{noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
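The class of bug described in the comment (a rewritten ref that silently loses fields such as the ON clause) can be avoided by deriving the new object from the old one and overriding only what changes. The sketch below is a Python illustration of that pattern; `CollectionRef` and `split_ref` are hypothetical names and do not reflect the actual Java code in AcidRewriter.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class CollectionRef:
    """Hypothetical stand-in for a table/collection reference in a FROM clause."""
    path: Tuple[str, ...]
    alias: str
    join_op: str = "INNER JOIN"
    on_clause: Optional[str] = None


def split_ref(ref, new_path):
    """Build the rewritten ref from the old one, changing only the path.

    dataclasses.replace copies every other field, so the ON clause, alias,
    and join operator carry over without being listed one by one.
    """
    return replace(ref, path=tuple(new_path))
```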
[jira] [Assigned] (IMPALA-10493) Using JOIN ON syntax to join two full ACID collections produces wrong results
[ https://issues.apache.org/jira/browse/IMPALA-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltán Borók-Nagy reassigned IMPALA-10493: -- Assignee: Zoltán Borók-Nagy > Using JOIN ON syntax to join two full ACID collections produces wrong results > - > > Key: IMPALA-10493 > URL: https://issues.apache.org/jira/browse/IMPALA-10493 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Reporter: Zoltán Borók-Nagy >Assignee: Zoltán Borók-Nagy >Priority: Major > Labels: CorrectnessBug, impala-acid > > The following query produces wrong results: > {noformat} > use functional_orc_def; // use full ACID tables > select a1.item, a2.item > from complextypestbl.int_array a1 join complextypestbl.int_array a2 > on a1.item=a2.item > where a1.item<2;{noformat} > It creates a CROSS JOIN without the predicate "a1.item = a2.item", generating > too many rows. The expected plan node would be an INNER JOIN on "a1.item = > a2.item". > If we put the JOIN condition to the WHERE clause we get the correct plan: > {noformat} > select a1.item, a2.item > from complextypestbl.int_array a1 join complextypestbl.int_array a2 > where a1.item=a2.item and a1.item<2{noformat} > We also get a correct plan if the right table is non-ACID: > {noformat} > select a1.item, a2.item > from complextypestbl.int_array a1 join > functional_parquet.complextypestbl.int_array a2 > on a1.item=a2.item > where a1.item<2;{noformat} > Or ACID table but the column is non-collection: > {noformat} > select c.id, a1.item > from complextypestbl.int_array a1 join complextypestbl c > on c.id=a1.item > where c.id<2;{noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-10464) Performance comparison between ndv() ds_hll_* and ds_theat_* functions
[ https://issues.apache.org/jira/browse/IMPALA-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281796#comment-17281796 ] Gabor Kaszab commented on IMPALA-10464: --- I'd also include the recently introduced CPC sketch in these measurements, as it serves the same purpose as HLL and Theta. > Performance comparison between ndv() ds_hll_* and ds_theat_* functions > -- > > Key: IMPALA-10464 > URL: https://issues.apache.org/jira/browse/IMPALA-10464 > Project: IMPALA > Issue Type: New Feature > Components: Perf Investigation >Reporter: Fucun Chu >Assignee: Fucun Chu >Priority: Major > Fix For: Not Applicable > > > Perf comparison doc: > [https://docs.google.com/spreadsheets/d/1ew7XCENs7wLxIVlr70bymUBP59YAGUCnNXTR0i_uzv0/edit#gid=0] > General observations included in the doc. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-10464) Performance comparison between ndv() ds_hll_* and ds_theta_* functions
[ https://issues.apache.org/jira/browse/IMPALA-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Kaszab updated IMPALA-10464: -- Summary: Performance comparison between ndv() ds_hll_* and ds_theta_* functions (was: Performance comparison between ndv() ds_hll_* and ds_theat_* functions) > Performance comparison between ndv() ds_hll_* and ds_theta_* functions > -- > > Key: IMPALA-10464 > URL: https://issues.apache.org/jira/browse/IMPALA-10464 > Project: IMPALA > Issue Type: New Feature > Components: Perf Investigation >Reporter: Fucun Chu >Assignee: Fucun Chu >Priority: Major > Fix For: Not Applicable > > > Perf comparison doc: > [https://docs.google.com/spreadsheets/d/1ew7XCENs7wLxIVlr70bymUBP59YAGUCnNXTR0i_uzv0/edit#gid=0] > General observations included in the doc. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-10493) Using JOIN ON syntax to join two full ACID collections produces wrong results
[ https://issues.apache.org/jira/browse/IMPALA-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltán Borók-Nagy updated IMPALA-10493: --- Component/s: Frontend > Using JOIN ON syntax to join two full ACID collections produces wrong results > - > > Key: IMPALA-10493 > URL: https://issues.apache.org/jira/browse/IMPALA-10493 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Reporter: Zoltán Borók-Nagy >Priority: Major > > The following query produces wrong results: > {noformat} > use functional_orc_def; // use full ACID tables > select a1.item, a2.item > from complextypestbl.int_array a1 join complextypestbl.int_array a2 > on a1.item=a2.item > where a1.item<2;{noformat} > It creates a CROSS JOIN without the predicate "a1.item = a2.item", generating > too many rows. The expected plan node would be an INNER JOIN on "a1.item = > a2.item". > If we put the JOIN condition to the WHERE clause we get the correct plan: > {noformat} > select a1.item, a2.item > from complextypestbl.int_array a1 join complextypestbl.int_array a2 > where a1.item=a2.item and a1.item<2{noformat} > We also get a correct plan if the right table is non-ACID: > {noformat} > select a1.item, a2.item > from complextypestbl.int_array a1 join > functional_parquet.complextypestbl.int_array a2 > on a1.item=a2.item > where a1.item<2;{noformat} > Or ACID table but the column is non-collection: > {noformat} > select c.id, a1.item > from complextypestbl.int_array a1 join complextypestbl c > on c.id=a1.item > where c.id<2;{noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-10493) Using JOIN ON syntax to join two full ACID collections produces wrong results
[ https://issues.apache.org/jira/browse/IMPALA-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltán Borók-Nagy updated IMPALA-10493: --- Labels: CorrectnessBug impala-acid (was: ) > Using JOIN ON syntax to join two full ACID collections produces wrong results > - > > Key: IMPALA-10493 > URL: https://issues.apache.org/jira/browse/IMPALA-10493 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Reporter: Zoltán Borók-Nagy >Priority: Major > Labels: CorrectnessBug, impala-acid > > The following query produces wrong results: > {noformat} > use functional_orc_def; // use full ACID tables > select a1.item, a2.item > from complextypestbl.int_array a1 join complextypestbl.int_array a2 > on a1.item=a2.item > where a1.item<2;{noformat} > It creates a CROSS JOIN without the predicate "a1.item = a2.item", generating > too many rows. The expected plan node would be an INNER JOIN on "a1.item = > a2.item". > If we put the JOIN condition to the WHERE clause we get the correct plan: > {noformat} > select a1.item, a2.item > from complextypestbl.int_array a1 join complextypestbl.int_array a2 > where a1.item=a2.item and a1.item<2{noformat} > We also get a correct plan if the right table is non-ACID: > {noformat} > select a1.item, a2.item > from complextypestbl.int_array a1 join > functional_parquet.complextypestbl.int_array a2 > on a1.item=a2.item > where a1.item<2;{noformat} > Or ACID table but the column is non-collection: > {noformat} > select c.id, a1.item > from complextypestbl.int_array a1 join complextypestbl c > on c.id=a1.item > where c.id<2;{noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-10493) Using JOIN ON syntax to join two full ACID collections produces wrong results
Zoltán Borók-Nagy created IMPALA-10493: -- Summary: Using JOIN ON syntax to join two full ACID collections produces wrong results Key: IMPALA-10493 URL: https://issues.apache.org/jira/browse/IMPALA-10493 Project: IMPALA Issue Type: Bug Reporter: Zoltán Borók-Nagy The following query produces wrong results: {noformat} use functional_orc_def; // use full ACID tables select a1.item, a2.item from complextypestbl.int_array a1 join complextypestbl.int_array a2 on a1.item=a2.item where a1.item<2;{noformat} It creates a CROSS JOIN without the predicate "a1.item = a2.item", generating too many rows. The expected plan node would be an INNER JOIN on "a1.item = a2.item". If we put the JOIN condition to the WHERE clause we get the correct plan: {noformat} select a1.item, a2.item from complextypestbl.int_array a1 join complextypestbl.int_array a2 where a1.item=a2.item and a1.item<2{noformat} We also get a correct plan if the right table is non-ACID: {noformat} select a1.item, a2.item from complextypestbl.int_array a1 join functional_parquet.complextypestbl.int_array a2 on a1.item=a2.item where a1.item<2;{noformat} Or ACID table but the column is non-collection: {noformat} select c.id, a1.item from complextypestbl.int_array a1 join complextypestbl c on c.id=a1.item where c.id<2;{noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
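The effect of losing the join predicate can be seen with a tiny in-memory model: a cross join followed by the WHERE filter returns more rows than the intended inner join. The array contents below are made-up sample data, not the contents of complextypestbl.

```python
from itertools import product


def cross_join(left, right):
    """Every pairing of left and right items (no join predicate)."""
    return list(product(left, right))


def inner_join_on_equal(left, right):
    """Only pairings that satisfy the predicate a1.item = a2.item."""
    return [(a1, a2) for a1, a2 in product(left, right) if a1 == a2]


items = [1, 2, 3]  # hypothetical int_array contents

# WHERE a1.item < 2 applied after each kind of join
buggy = [(a1, a2) for a1, a2 in cross_join(items, items) if a1 < 2]
expected = [(a1, a2) for a1, a2 in inner_join_on_equal(items, items) if a1 < 2]
```

With this sample data the cross join yields three rows where the inner join yields one, which matches the "too many rows" symptom in the report.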
[jira] [Assigned] (IMPALA-10482) Select-star query on unrelative collection column of transactional table hits IllegalStateException
[ https://issues.apache.org/jira/browse/IMPALA-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltán Borók-Nagy reassigned IMPALA-10482: -- Assignee: Zoltán Borók-Nagy > Select-star query on unrelative collection column of transactional table hits > IllegalStateException > --- > > Key: IMPALA-10482 > URL: https://issues.apache.org/jira/browse/IMPALA-10482 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 4.0 >Reporter: Quanlong Huang >Assignee: Zoltán Borók-Nagy >Priority: Critical > > {{SELECT *}} query on unrelative collection column of transactional ORC table > will hit IllegalStateException. > Reproduce the bug by: > {code:sql} > create table my_complex_orc (id int, int_array array<int>) stored as orc > tblproperties('transactional'='true'); > select * from my_complex_orc.int_array; > {code} > FE stacktrace: > {code:java} > I0206 16:04:42.212499 15294 Frontend.java:1587] > 7e42f06526f5791a:e18eb18e] Analyzing query: select * from > my_complex_orc.int_array db: default > I0206 16:04:42.213887 15294 jni-util.cc:288] > 7e42f06526f5791a:e18eb18e] java.lang.IllegalStateException > at > com.google.common.base.Preconditions.checkState(Preconditions.java:492) > at > org.apache.impala.analysis.StatementBase.castResultExprs(StatementBase.java:114) > at > org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:561) > at > org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:445) > at > org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1627) > at > org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1594) > at > org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1564) > at > org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:159) > {code} > cc [~boroknagyz]
[jira] [Resolved] (IMPALA-10379) NoClassDefFoundError: org/apache/hadoop/hive/ql/parse/Quotation
[ https://issues.apache.org/jira/browse/IMPALA-10379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tamas Mate resolved IMPALA-10379. - Fix Version/s: Impala 4.0 Resolution: Fixed > NoClassDefFoundError: org/apache/hadoop/hive/ql/parse/Quotation > --- > > Key: IMPALA-10379 > URL: https://issues.apache.org/jira/browse/IMPALA-10379 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 4.0 >Reporter: Quanlong Huang >Assignee: Tamas Mate >Priority: Major > Fix For: Impala 4.0 > > Attachments: org.apache.hadoop.hive.ql.parse.txt > > > Found a NoClassDefFoundError when reexamining IMPALA-9641: > {code} > [localhost:21050] default> select 1 as "``"; > Query: select 1 as "``" > Query submitted at: 2020-12-07 15:30:26 (Coordinator: > http://quanlong-OptiPlex-BJ:25000) > ERROR: NoClassDefFoundError: org/apache/hadoop/hive/ql/parse/Quotation > {code} > Logs: > {code} > I1207 15:30:26.218670 9245 Frontend.java:1581] > bc464dbe4cf418b9:7173a0bd] Analyzing query: select 1 as "``" db: > default > I1207 15:30:26.220055 9245 jni-util.cc:288] > bc464dbe4cf418b9:7173a0bd] java.lang.NoClassDefFoundError: > org/apache/hadoop/hive/ql/parse/Quotation > at > org.apache.hadoop.hive.ql.parse.GenericHiveLexer.allowQuotedId(GenericHiveLexer.java:75) > at > org.apache.hadoop.hive.ql.parse.HiveLexer_HiveLexerParent.mIdentifier(HiveLexer_HiveLexerParent.java:10075) > at > org.apache.hadoop.hive.ql.parse.HiveLexer_HiveLexerParent.mTokens(HiveLexer_HiveLexerParent.java:13028) > at > org.apache.hadoop.hive.ql.parse.HiveLexer.mTokens(HiveLexer.java:671) > at org.antlr.runtime.Lexer.nextToken(Lexer.java:89) > at > org.apache.impala.analysis.ToSqlUtils.hiveNeedsQuotes(ToSqlUtils.java:163) > at > org.apache.impala.analysis.ToSqlUtils.getIdentSql(ToSqlUtils.java:217) > at org.apache.impala.analysis.SlotRef.<init>(SlotRef.java:58) > at > org.apache.impala.analysis.SelectStmt$SelectAnalyzer.analyzeSelectClause(SelectStmt.java:370) > at > 
org.apache.impala.analysis.SelectStmt$SelectAnalyzer.analyze(SelectStmt.java:286) > at > org.apache.impala.analysis.SelectStmt$SelectAnalyzer.access$100(SelectStmt.java:270) > at org.apache.impala.analysis.SelectStmt.analyze(SelectStmt.java:263) > at > org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:481) > at > org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:445) > at > org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1621) > at > org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1588) > at > org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1558) > at > org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:159) > I1207 15:30:26.220113 9245 status.cc:129] bc464dbe4cf418b9:7173a0bd] > NoClassDefFoundError: org/apache/hadoop/hive/ql/parse/Quotation > @ 0x1d88eff impala::Status::Status() > @ 0x27436c3 impala::JniUtil::GetJniExceptionMsg() > @ 0x2540aa4 impala::JniCall::Call<>() > @ 0x253d793 impala::JniUtil::CallJniMethod<>() > @ 0x253b9f6 impala::Frontend::GetExecRequest() > @ 0x2debc9b impala::QueryDriver::RunFrontendPlanner() > @ 0x256d6de impala::ImpalaServer::ExecuteInternal() > @ 0x256d09c impala::ImpalaServer::Execute() > @ 0x2616082 impala::ImpalaServer::ExecuteStatement() > @ 0x2c44ec9 > apache::hive::service::cli::thrift::TCLIServiceProcessor::process_ExecuteStatement() > @ 0x2c4359d > apache::hive::service::cli::thrift::TCLIServiceProcessor::dispatchCall() > @ 0x2c02d48 > impala::ImpalaHiveServer2ServiceProcessor::dispatchCall() > @ 0x1d35d81 apache::thrift::TDispatchProcessor::process() > @ 0x226573a > apache::thrift::server::TAcceptQueueServer::Task::run() > @ 0x225ab4e impala::ThriftThread::RunRunnable() > @ 0x225c18a boost::_mfi::mf2<>::operator()() > @ 0x225c01e boost::_bi::list3<>::operator()<>() > @ 0x225bd64 boost::_bi::bind_t<>::operator()() > @ 0x225bc76 > boost::detail::function::void_function_obj_invoker0<>::invoke() 
> @ 0x21d45f5 boost::function0<>::operator()() > @ 0x27f34f3 impala::Thread::SuperviseThread() > @ 0x27fb490 boost::_bi::list5<>::operator()<>() > @ 0x27fb3b4 boost::_bi::bind_t<>::operator()() > @ 0x27fb375 boost::detail::thread_data<>::run() > @
[jira] [Closed] (IMPALA-9641) Query hang when containing alias names as empty backticks
[ https://issues.apache.org/jira/browse/IMPALA-9641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tamas Mate closed IMPALA-9641. -- Resolution: Fixed > Query hang when containing alias names as empty backticks > - > > Key: IMPALA-9641 > URL: https://issues.apache.org/jira/browse/IMPALA-9641 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.4.0 >Reporter: Quanlong Huang >Assignee: Tamas Mate >Priority: Blocker > Labels: hang > Fix For: Impala 4.0 > > > The following query will hang in an infinite loop: > {code:java} > select 1 as "``"; > {code} > Stacktrace of its compiler thread: > {code:java} > "Thread-19" #34 prio=5 os_prio=0 tid=0x12fc nid=0x5514 runnable > [0x7f2abda41000] >java.lang.Thread.State: RUNNABLE > at java.io.FileOutputStream.writeBytes(Native Method) > at java.io.FileOutputStream.write(FileOutputStream.java:326) > at > java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) > at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) > - locked <0x0005cc90f7b8> (a java.io.BufferedOutputStream) > at java.io.PrintStream.write(PrintStream.java:482) > - locked <0x0005cc90f798> (a java.io.PrintStream) > at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) > at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291) > at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:104) > - locked <0x0005cc90f8d8> (a java.io.OutputStreamWriter) > at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:185) > at java.io.PrintStream.write(PrintStream.java:527) > - locked <0x0005cc90f798> (a java.io.PrintStream) > at java.io.PrintStream.print(PrintStream.java:669) > at java.io.PrintStream.println(PrintStream.java:806) > - locked <0x0005cc90f798> (a java.io.PrintStream) > at > org.antlr.runtime.BaseRecognizer.emitErrorMessage(BaseRecognizer.java:344) > at > org.antlr.runtime.BaseRecognizer.displayRecognitionError(BaseRecognizer.java:194) > at 
org.antlr.runtime.Lexer.reportError(Lexer.java:261) > at org.antlr.runtime.Lexer.nextToken(Lexer.java:103) > at > org.apache.impala.analysis.ToSqlUtils.hiveNeedsQuotes(ToSqlUtils.java:145) > at > org.apache.impala.analysis.ToSqlUtils.getIdentSql(ToSqlUtils.java:199) > at org.apache.impala.analysis.SlotRef.<init>(SlotRef.java:58) > at > org.apache.impala.analysis.SelectStmt$SelectAnalyzer.analyzeSelectClause(SelectStmt.java:283) > at > org.apache.impala.analysis.SelectStmt$SelectAnalyzer.analyze(SelectStmt.java:215) > at > org.apache.impala.analysis.SelectStmt$SelectAnalyzer.access$100(SelectStmt.java:199) > at org.apache.impala.analysis.SelectStmt.analyze(SelectStmt.java:192) > at > org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:473) > at > org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:437) > at > org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1530) > at > org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1497) > at > org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1467) > at > org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:154) > {code} > org.antlr.runtime.Lexer keeps emitting the same error message to stderr > (which is redirected to impalad.ERROR): > {code:java} > line 1:0 rule Identifier failed predicate: {allowQuotedId()}? > line 1:0 rule Identifier failed predicate: {allowQuotedId()}? > line 1:0 rule Identifier failed predicate: {allowQuotedId()}? > {code}
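The symptom above (the same error reported at position 1:0 over and over) is what a token loop looks like when the lexer reports an error without consuming any input. The Python sketch below is a toy model of that failure mode and of one generic way to guard against it. It is not the Impala or Hive code and not the fix that was committed; `ToyLexer` and `needs_quotes` are hypothetical names used only for illustration.

```python
class ToyLexer:
    """Hypothetical lexer: on an unrecognised character it records an error
    but does not consume any input, so its position never advances."""

    EOF = None

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.errors = []

    def next_token(self):
        if self.pos >= len(self.text):
            return self.EOF
        ch = self.text[self.pos]
        if ch.isalnum():
            start = self.pos
            while self.pos < len(self.text) and self.text[self.pos].isalnum():
                self.pos += 1
            return self.text[start:self.pos]
        # Error path: report the problem, but leave self.pos unchanged.
        self.errors.append(f"line 1:{self.pos} unexpected {ch!r}")
        return ""


def needs_quotes(ident):
    """Return True if ident does not lex as exactly one plain token.

    The loop stops as soon as the lexer reports an error or fails to
    advance, so it cannot spin forever on the same position.
    """
    lexer = ToyLexer(ident)
    tokens = []
    while True:
        before = lexer.pos
        tok = lexer.next_token()
        if tok is ToyLexer.EOF:
            break
        if lexer.errors or lexer.pos == before:
            return True  # treat un-lexable identifiers as needing quotes
        tokens.append(tok)
    return len(tokens) != 1


print(needs_quotes("``"), needs_quotes("abc"))
```

Without the `lexer.errors or lexer.pos == before` check, the loop would call `next_token()` on the input "``" indefinitely, because the position stays at 0 and EOF is never reached.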
[jira] [Updated] (IMPALA-10492) Lower default MAX_CNF_EXPRS query option
[ https://issues.apache.org/jira/browse/IMPALA-10492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tamas Mate updated IMPALA-10492: Description: Currently MAX_CNF_EXPRS is set to unlimited by default; with a complex query that contains many predicates, the CNF rewrite can lead to significant frontend memory usage and eventually OutOfMemory. A potential default value could be in the range of 1000, but it should work well with the current TPC-DS queries. cc.: [~amansinha], [~drorke] was: Currently MAX_CNF_EXPRS is set to unlimited by default; with a complex query that contains many predicates, the CNF rewrite can lead to significant frontend memory usage and eventually OutOfMemory. A potential default value could be in the range of 1000. cc.: [~amansinha], [~drorke] > Lower default MAX_CNF_EXPRS query option > > > Key: IMPALA-10492 > URL: https://issues.apache.org/jira/browse/IMPALA-10492 > Project: IMPALA > Issue Type: Improvement > Components: Backend, Frontend >Affects Versions: Impala 4.0 >Reporter: Tamas Mate >Priority: Minor > Labels: ramp-up > > Currently MAX_CNF_EXPRS is set to unlimited by default; with a complex query > that contains many predicates, the CNF rewrite can lead to significant > frontend memory usage and eventually OutOfMemory. A potential default value > could be in the range of 1000, but it should work well with the current > TPC-DS queries. > cc.: [~amansinha], [~drorke]
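For readers unfamiliar with why an unlimited CNF rewrite can exhaust memory, the following Python sketch shows the textbook blow-up: distributing OR over AND turns a disjunction of n two-predicate conjunctions into 2**n clauses. This is a simplified model, not Impala's rewrite rule. The `max_cnf_exprs` parameter here is a hypothetical cap that only mimics the idea of the query option, and the real option's exact semantics may differ.

```python
from itertools import product


def cnf_clause_count(disjuncts):
    """Number of clauses produced by distributing OR over AND."""
    count = 1
    for conjunct in disjuncts:
        count *= len(conjunct)
    return count


def to_cnf(disjuncts, max_cnf_exprs=None):
    """Rewrite (c1 AND c2 ...) OR (d1 AND d2 ...) OR ... into CNF.

    Returns a list of clauses (each a tuple of predicates to be OR-ed), or
    None if the rewrite would exceed max_cnf_exprs (a hypothetical cap;
    None means unlimited).
    """
    if max_cnf_exprs is not None and cnf_clause_count(disjuncts) > max_cnf_exprs:
        return None  # skip the rewrite and keep the original predicate
    return list(product(*disjuncts))


def sample(n):
    """Build (a0 AND b0) OR (a1 AND b1) OR ... with n disjuncts."""
    return [[f"a{i}", f"b{i}"] for i in range(n)]


print(cnf_clause_count(sample(3)))
print(cnf_clause_count(sample(20)))
print(to_cnf(sample(20), max_cnf_exprs=1000))
```

With a cap in the range of 1000, the 20-disjunct predicate is left unrewritten instead of being expanded into more than a million clauses, while small predicates are still converted.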
[jira] [Updated] (IMPALA-10492) Lower default MAX_CNF_EXPRS query option
[ https://issues.apache.org/jira/browse/IMPALA-10492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tamas Mate updated IMPALA-10492: Description: Currently MAX_CNF_EXPRS is set to unlimited by default; with a complex query that contains many predicates, the CNF rewrite can lead to significant frontend memory usage and eventually OutOfMemory. A potential default value could be in the range of 1000. cc.: [~amansinha], [~drorke] was: Currently MAX_CNF_EXPRS is set to unlimited by default; with complex queries that contain many predicates, the CNF rewrite can lead to significant frontend memory usage and eventually OutOfMemory. A potential default value could be in the range of 1000. cc.: [~amansinha], [~drorke] > Lower default MAX_CNF_EXPRS query option > > > Key: IMPALA-10492 > URL: https://issues.apache.org/jira/browse/IMPALA-10492 > Project: IMPALA > Issue Type: Improvement > Components: Backend, Frontend >Affects Versions: Impala 4.0 >Reporter: Tamas Mate >Priority: Minor > Labels: ramp-up > > Currently MAX_CNF_EXPRS is set to unlimited by default; with a complex query > that contains many predicates, the CNF rewrite can lead to significant > frontend memory usage and eventually OutOfMemory. A potential default value > could be in the range of 1000. > cc.: [~amansinha], [~drorke]
[jira] [Updated] (IMPALA-10492) Lower default MAX_CNF_EXPRS query option
[ https://issues.apache.org/jira/browse/IMPALA-10492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tamas Mate updated IMPALA-10492: Description: Currently MAX_CNF_EXPRS is set to unlimited by default; with complex queries that contain many predicates, the CNF rewrite can lead to significant frontend memory usage and eventually OutOfMemory. A potential default value could be in the range of 1000. cc.: [~amansinha], [~drorke] was: Currently MAX_CNF_EXPRS is set to unlimited by default; with complex queries, the CNF rewrite can lead to significant frontend memory usage and eventually OutOfMemory. A potential default value could be in the range of 1000. cc.: [~amansinha], [~drorke] > Lower default MAX_CNF_EXPRS query option > > > Key: IMPALA-10492 > URL: https://issues.apache.org/jira/browse/IMPALA-10492 > Project: IMPALA > Issue Type: Improvement > Components: Backend, Frontend >Affects Versions: Impala 4.0 >Reporter: Tamas Mate >Priority: Minor > Labels: ramp-up > > Currently MAX_CNF_EXPRS is set to unlimited by default; with complex queries > that contain many predicates, the CNF rewrite can lead to significant > frontend memory usage and eventually OutOfMemory. A potential default value > could be in the range of 1000. > cc.: [~amansinha], [~drorke]
[jira] [Created] (IMPALA-10492) Lower default MAX_CNF_EXPRS query option
Tamas Mate created IMPALA-10492: --- Summary: Lower default MAX_CNF_EXPRS query option Key: IMPALA-10492 URL: https://issues.apache.org/jira/browse/IMPALA-10492 Project: IMPALA Issue Type: Improvement Components: Backend, Frontend Affects Versions: Impala 4.0 Reporter: Tamas Mate Currently MAX_CNF_EXPRS is set to unlimited by default; with complex queries, the CNF rewrite can lead to significant frontend memory usage and eventually OutOfMemory. A potential default value could be in the range of 1000. cc.: [~amansinha], [~drorke] -- This message was sent by Atlassian Jira (v8.3.4#803005)