[jira] [Created] (IMPALA-8116) Impala Doc: Create Impala Limitations doc
Alex Rodoni created IMPALA-8116: --- Summary: Impala Doc: Create Impala Limitations doc Key: IMPALA-8116 URL: https://issues.apache.org/jira/browse/IMPALA-8116 Project: IMPALA Issue Type: Improvement Components: Docs Affects Versions: Impala 3.1.0 Reporter: Alex Rodoni Assignee: Alex Rodoni Create a separate document that focuses on design limitations more than bugs. It could also include functional limitations like "cannot write nested types", etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IMPALA-8090) DiskIoMgrTest.SyncReadTest hits file_ != nullptr DCHECK in LocalFileReader::ReadFromPos()
[ https://issues.apache.org/jira/browse/IMPALA-8090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong resolved IMPALA-8090. --- Resolution: Fixed Fix Version/s: Impala 3.2.0 > DiskIoMgrTest.SyncReadTest hits file_ != nullptr DCHECK in > LocalFileReader::ReadFromPos() > - > > Key: IMPALA-8090 > URL: https://issues.apache.org/jira/browse/IMPALA-8090 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: David Knupp >Assignee: Tim Armstrong >Priority: Critical > Fix For: Impala 3.2.0 > > > *Test output*: > {noformat} > 45/99 Test #45: disk-io-mgr-test .***Exception: Other 43.29 > sec > Turning perftools heap leak checking off > [==] Running 25 tests from 1 test case. > [--] Global test environment set-up. > [--] 25 tests from DiskIoMgrTest > [ RUN ] DiskIoMgrTest.SingleWriter > 19/01/16 15:57:09 INFO util.JvmPauseMonitor: Starting JVM pause monitor > [ OK ] DiskIoMgrTest.SingleWriter (3407 ms) > [ RUN ] DiskIoMgrTest.InvalidWrite > [ OK ] DiskIoMgrTest.InvalidWrite (281 ms) > [ RUN ] DiskIoMgrTest.WriteErrors > [ OK ] DiskIoMgrTest.WriteErrors (235 ms) > [ RUN ] DiskIoMgrTest.SingleWriterCancel > [ OK ] DiskIoMgrTest.SingleWriterCancel (1165 ms) > [ RUN ] DiskIoMgrTest.SingleReader > [ OK ] DiskIoMgrTest.SingleReader (5835 ms) > [ RUN ] DiskIoMgrTest.SingleReaderSubRanges > [ OK ] DiskIoMgrTest.SingleReaderSubRanges (16404 ms) > [ RUN ] DiskIoMgrTest.AddScanRangeTest > [ OK ] DiskIoMgrTest.AddScanRangeTest (1210 ms) > [ RUN ] DiskIoMgrTest.SyncReadTest > *** Check failure stack trace: *** > @ 0x4825dcc > @ 0x4827671 > @ 0x48257a6 > @ 0x4828d6d > @ 0x1af39ec > @ 0x1ae90a4 > @ 0x1ac30ea > @ 0x1accad3 > @ 0x1acc660 > @ 0x1acbf3e > @ 0x1acb62d > @ 0x1b03671 > @ 0x1f79988 > @ 0x1f82b60 > @ 0x1f82a84 > @ 0x1f82a47 > @ 0x3751579 > @ 0x3ea4807850 > @ 0x3ea44e894c > Wrote minidump to > 
/data/jenkins/workspace/<...>/repos/Impala/logs/be_tests/minidumps/disk-io-mgr-test/5bbf76f7-e5d6-4ac9-bdae9d9b-065c32ec.dmp > {noformat} > *Error*: > {noformat} > Operating system: Linux > 0.0.0 Linux 2.6.32-358.14.1.el6.centos.plus.x86_64 #1 SMP > Tue Jul 16 21:33:24 UTC 2013 x86_64 > CPU: amd64 > family 6 model 45 stepping 7 > 8 CPUs > GPU: UNKNOWN > Crash reason: SIGABRT > Crash address: 0x4522fa1 > Process uptime: not available > Thread 205 (crashed) > 0 libc-2.12.so + 0x328e5 > rax = 0x rdx = 0x0006 > rcx = 0x rbx = 0x06adf9c0 > rsi = 0x0563 rdi = 0x2fa1 > rbp = 0x7f8009b8ffe0 rsp = 0x7f8009b8fc78 > r8 = 0x7f8009b8fd00r9 = 0x0563 > r10 = 0x0008 r11 = 0x0202 > r12 = 0x06adfa40 r13 = 0x001f > r14 = 0x06ae7384 r15 = 0x06adf9c0 > rip = 0x003ea44328e5 > Found by: given as instruction pointer in context > 1 libc-2.12.so + 0x340c5 > rbp = 0x7f8009b8ffe0 rsp = 0x7f8009b8fc80 > rip = 0x003ea44340c5 > Found by: stack scanning > 2 disk-io-mgr-test!boost::_bi::bind_t impala::io::DiskQueue, impala::io::DiskIoMgr*>, > boost::_bi::list2, > boost::_bi::value > >::operator()() > [bind_template.hpp : 20 + 0x21] > rbp = 0x7f8009b8ffe0 rsp = 0x7f8009b8fc88 > rip = 0x01acbf3e > Found by: stack scanning > 3 disk-io-mgr-test!google::LogMessage::Flush() + 0x157 > rbx = 0x0007 rbp = 0x06adf980 > rsp = 0x7f8009b8fff0 rip = 0x048257a7 > Found by: call frame info > 4 disk-io-mgr-test!google::LogMessageFatal::~LogMessageFatal() + 0xe > rbx = 0x7f8009b90110 rbp = 0x7f8009b903f0 > rsp = 0x7f8009b90070 r12 = 0x0001 > r13 = 0x06aee8b8 r14 = 0x0c213538 > r15 = 0x0007 rip = 0x04828d6e > Found by: call frame info > 5 disk-io-mgr-test!impala::io::LocalFileReader::ReadFromPos(long, unsigned > char*, long, long*, bool*) [local-file-reader.cc : 67 + 0x10] > rbx = 0x0001 rbp = 0x7f8009b903f0 > rsp = 0x7f8009b90090 r12 = 0x0001 > r13 = 0x06aee8b8 r14 = 0x0c213538 > r15 = 0x0007 rip = 0x01af39ed > Found by: call frame in
[jira] [Resolved] (IMPALA-8107) Support EXEC_TIME_LIMIT_S in resource pool setting
[ https://issues.apache.org/jira/browse/IMPALA-8107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Quanlong Huang resolved IMPALA-8107. Resolution: Resolved Already supported by IMPALA-2538. Closing this JIRA. > Support EXEC_TIME_LIMIT_S in resource pool setting > -- > > Key: IMPALA-8107 > URL: https://issues.apache.org/jira/browse/IMPALA-8107 > Project: IMPALA > Issue Type: New Feature >Reporter: Quanlong Huang >Priority: Major > Labels: admission-control > > The timeout limit should differ for different kinds of queries. For > example, a resource pool for ad hoc queries may set EXEC_TIME_LIMIT_S to 60s. > A resource pool for building pre-aggregations or other ETL may need a larger > EXEC_TIME_LIMIT_S like 30 minutes.
[jira] [Created] (IMPALA-8115) some jenkins workers slow to spawn due to dpkg lock conflicts
Michael Brown created IMPALA-8115: - Summary: some jenkins workers slow to spawn due to dpkg lock conflicts Key: IMPALA-8115 URL: https://issues.apache.org/jira/browse/IMPALA-8115 Project: IMPALA Issue Type: Bug Components: Infrastructure Reporter: Michael Brown A Jenkins worker for label {{ubuntu-16.04}} took about 15 minutes to start doing real work. I noticed that it was retrying {{apt-get update}}:
{noformat}
++ sudo apt-get --yes install openjdk-8-jdk
E: Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily unavailable)
E: Unable to lock the administration directory (/var/lib/dpkg/), is another process using it?
++ date
Thu Jan 24 23:37:33 UTC 2019
++ sudo apt-get update
++ sleep 10
++ sudo apt-get --yes install openjdk-8-jdk
[etc]
{noformat}
I ssh'd into a host and saw that, yes, something else was holding onto the dpkg lock (confirmed with lsof and not pasted here; dpkg process PID 11459 was the culprit)
{noformat}
root 1750 0.0 0.0 4508 1664 ?Ss 23:21 0:00 /bin/sh /usr/lib/apt/apt.systemd.daily
root 1804 12.3 0.1 141076 80452 ?S23:22 1:24 \_ /usr/bin/python3 /usr/bin/unattended-upgrade
root 3263 0.0 0.1 140960 72896 ?S23:23 0:00 \_ /usr/bin/python3 /usr/bin/unattended-upgrade
root 11459 0.6 0.0 45920 25184 pts/1Ss+ 23:24 0:03 \_ /usr/bin/dpkg --status-fd 10 --unpack --auto-deconfigure /var/cache/apt/archives/tzdata_2018i-0ubuntu0.16.04_all.deb /var/cache/apt/archives/distro-info-data_0.28ubuntu0.9_all.deb /var/cache/apt/archives/file_1%3a5.25-2ubuntu1.1_amd64.deb /var/cache/apt/archives/libmagic1_1%3a5.25-2ubuntu1.1_amd64.deb /var/cache/apt/archives/libisc-export160_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb /var/cache/apt/archives/libdns-export162_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb /var/cache/apt/archives/isc-dhcp-client_4.3.3-5ubuntu12.9_amd64.deb /var/cache/apt/archives/isc-dhcp-common_4.3.3-5ubuntu12.9_amd64.deb /var/cache/apt/archives/libidn11_1.32-3ubuntu1.2_amd64.deb /var/cache/apt/archives/libpng12-0_1.2.54-1ubuntu1.1_amd64.deb 
/var/cache/apt/archives/libtasn1-6_4.7-3ubuntu0.16.04.3_amd64.deb /var/cache/apt/archives/libapparmor-perl_2.10.95-0ubuntu2.10_amd64.deb /var/cache/apt/archives/apparmor_2.10.95-0ubuntu2.10_amd64.deb /var/cache/apt/archives/curl_7.47.0-1ubuntu2.11_amd64.deb /var/cache/apt/archives/libgssapi-krb5-2_1.13.2+dfsg-5ubuntu2.1_amd64.deb /var/cache/apt/archives/libkrb5-3_1.13.2+dfsg-5ubuntu2.1_amd64.deb /var/cache/apt/archives/libkrb5support0_1.13.2+dfsg-5ubuntu2.1_amd64.deb /var/cache/apt/archives/libk5crypto3_1.13.2+dfsg-5ubuntu2.1_amd64.deb /var/cache/apt/archives/libcurl3-gnutls_7.47.0-1ubuntu2.11_amd64.deb /var/cache/apt/archives/apt-transport-https_1.2.29ubuntu0.1_amd64.deb /var/cache/apt/archives/libicu55_55.1-7ubuntu0.4_amd64.deb /var/cache/apt/archives/libxml2_2.9.3+dfsg1-1ubuntu0.6_amd64.deb /var/cache/apt/archives/bind9-host_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb /var/cache/apt/archives/dnsutils_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb /var/cache/apt/archives/libisc160_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb /var/cache/apt/archives/libdns162_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb /var/cache/apt/archives/libisccc140_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb /var/cache/apt/archives/libisccfg140_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb /var/cache/apt/archives/liblwres141_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb /var/cache/apt/archives/libbind9-140_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb /var/cache/apt/archives/openssl_1.0.2g-1ubuntu4.14_amd64.deb /var/cache/apt/archives/ca-certificates_20170717~16.04.1_all.deb /var/cache/apt/archives/libasprintf0v5_0.19.7-2ubuntu3.1_amd64.deb /var/cache/apt/archives/gettext-base_0.19.7-2ubuntu3.1_amd64.deb /var/cache/apt/archives/krb5-locales_1.13.2+dfsg-5ubuntu2.1_all.deb /var/cache/apt/archives/libelf1_0.165-3ubuntu1.1_amd64.deb /var/cache/apt/archives/libglib2.0-data_2.48.2-0ubuntu4.1_all.deb /var/cache/apt/archives/libnuma1_2.0.11-1ubuntu1.1_amd64.deb 
/var/cache/apt/archives/libpolkit-gobject-1-0_0.105-14.1ubuntu0.4_amd64.deb /var/cache/apt/archives/libx11-data_2%3a1.6.3-1ubuntu2.1_all.deb /var/cache/apt/archives/libx11-6_2%3a1.6.3-1ubuntu2.1_amd64.deb /var/cache/apt/archives/openssh-sftp-server_1%3a7.2p2-4ubuntu2.6_amd64.deb /var/cache/apt/archives/openssh-server_1%3a7.2p2-4ubuntu2.6_amd64.deb /var/cache/apt/archives/openssh-client_1%3a7.2p2-4ubuntu2.6_amd64.deb /var/cache/apt/archives/rsync_3.1.1-3ubuntu1.2_amd64.deb /var/cache/apt/archives/tcpdump_4.9.2-0ubuntu0.16.04.1_amd64.deb /var/cache/apt/archives/wget_1.17.1-1ubuntu1.4_amd64.deb /var/cache/apt/archives/python3-problem-report_2.20.1-0ubuntu2.18_all.deb /var/cache/apt/archives/python3-apport_2.20.1-0ubuntu2.18_all.deb /var/cache/apt/archives/apport_2.20.1-0ubuntu2
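The slow spawns come down to ordinary advisory file locking: unattended-upgrade's dpkg child holds an exclusive lock on /var/lib/dpkg/lock, so the bootstrap's apt-get fails with EAGAIN (error 11) until the upgrade finishes. A minimal sketch of those mechanics, using a temp file instead of the real dpkg lock (paths and the retry behavior here are illustrative, not the Jenkins setup):

```python
import fcntl
import os
import tempfile

# Stand-in for /var/lib/dpkg/lock (an illustrative temp file, not the real path).
fd, path = tempfile.mkstemp()
os.close(fd)

holder = open(path, "w")  # plays the role of the unattended-upgrade dpkg process
fcntl.flock(holder, fcntl.LOCK_EX | fcntl.LOCK_NB)

waiter = open(path, "w")  # plays the role of the bootstrap script's apt-get
try:
    fcntl.flock(waiter, fcntl.LOCK_EX | fcntl.LOCK_NB)
    blocked = False
except BlockingIOError:
    # This is the moment apt-get reports:
    # "E: Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily unavailable)"
    blocked = True

holder.close()  # the upgrade finishes and releases the lock
fcntl.flock(waiter, fcntl.LOCK_EX | fcntl.LOCK_NB)  # the "sleep 10 and retry" now succeeds
print("first attempt blocked:", blocked)
os.remove(path)
```

The bootstrap script's sleep-and-retry loop is one valid response; waiting on the lock with a blocking flock would avoid the fixed 10-second polling interval.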
[jira] [Created] (IMPALA-8114) Build test failure in test_breakpad.py
Paul Rogers created IMPALA-8114: --- Summary: Build test failure in test_breakpad.py Key: IMPALA-8114 URL: https://issues.apache.org/jira/browse/IMPALA-8114 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 3.1.0 Reporter: Paul Rogers Assignee: Tim Armstrong Recent builds have failed due to a failure in {{test_breakpad.py}}. Assigning to Tim as the person who most recently touched this file. Test output: {noformat} 09:04:35 ERRORS 09:04:35 ___ ERROR at teardown of TestBreakpadExhaustive.test_minidump_cleanup_thread ___ 09:04:35 custom_cluster/test_breakpad.py:49: in teardown_method 09:04:35 self.kill_cluster(SIGKILL) 09:04:35 custom_cluster/test_breakpad.py:80: in kill_cluster 09:04:35 self.kill_processes(processes, signal) 09:04:35 custom_cluster/test_breakpad.py:85: in kill_processes 09:04:35 process.kill(signal) 09:04:35 common/impala_cluster.py:330: in kill 09:04:35 assert 0, "No processes %s found" % self.cmd 09:04:35 E AssertionError: No processes ['/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/be/build/latest/service/impalad', '-kudu_client_rpc_timeout_ms', '0', '-kudu_master_hosts', 'localhost', '--mem_limit=12884901888', '-logbufsecs=5', '-v=1', '-max_log_files=0', '-log_filename=impalad', '-log_dir=/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/logs/custom_cluster_tests', '-beeswax_port=21000', '-hs2_port=21050', '-be_port=22000', '-krpc_port=27000', '-state_store_subscriber_port=23000', '-webserver_port=25000', '-max_minidumps=2', '-logbufsecs=1', '-minidump_path=/tmp/tmpKaSw_w', '--default_query_options='] found {noformat} Distilled {{TEST-impala-custom-cluster.xml}} output: {noformat} -- 2019-01-23 08:00:43,585 INFO MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) … -- 2019-01-23 08:00:43,667 INFO MainThread: Killing: /data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/be/build/latest/service/statestored -logbufsecs=5 -v=1 -max_log_files=0 
-log_filename=statestored -log_dir=/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/logs/custom_cluster_tests -max_minidumps=2 -logbufsecs=1 -minidump_path=/tmp/tmpKaSw_w (PID: 16809) with signal 10 -- 2019-01-23 08:00:43,692 INFO MainThread: Found 6 impalad/1 statestored/1 catalogd process(es) ... E AssertionError: No processes ['/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/be/build/latest/service/impalad {noformat} Notice that the main thread appears to be killing the statestore, but fails to kill impalad. Notice also that, in the midst of the code that tries to shut down the cluster, a message appears saying that all impalads are running. Is this test multi-threaded? Is there more than one “main thread”? Are these main threads working at cross purposes? What recent change may have caused this? Also, it looks like the script is sending signal 10 (SIGUSR1) while the statestore (in its log) says it got a SIGTERM (15): {noformat} I0123 08:00:44.086009 16868 thrift-client.cc:78] Couldn't open transport for impala-ec2-centoCaught signal: SIGTERM. Daemon will exit. {noformat} Not terribly familiar with this area of the product, so bumping it over to the BE team.
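The signal mismatch above is easy to sanity-check: on Linux, signal number 10 is SIGUSR1, not SIGTERM (15), so the SIGTERM in the statestored log cannot have come directly from the kill with signal 10. A quick check using Python's signal module:

```python
import signal

# On Linux x86-64, number 10 maps to SIGUSR1 (it differs on other platforms);
# 15 is SIGTERM everywhere common.
sent = signal.Signals(10).name    # what kill_cluster sent, per the test log
logged = signal.Signals(15).name  # what the statestored log reported receiving
print(sent, logged)
```

So either something other than the test harness delivered the SIGTERM, or a handler translated the signal along the way.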
[jira] [Created] (IMPALA-8113) test_aggregation and test_avro_primitive_in_list fail in S3
Michael Brown created IMPALA-8113: - Summary: test_aggregation and test_avro_primitive_in_list fail in S3 Key: IMPALA-8113 URL: https://issues.apache.org/jira/browse/IMPALA-8113 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 3.2.0 Reporter: Michael Brown Assignee: Michael Brown Likely more victims of our infra in S3. {noformat} query_test/test_aggregation.py:138: in test_aggregation result = self.execute_query(query, vector.get_value('exec_option')) common/impala_test_suite.py:597: in wrapper return function(*args, **kwargs) common/impala_test_suite.py:628: in execute_query return self.__execute_query(self.client, query, query_options) common/impala_test_suite.py:695: in __execute_query return impalad_client.execute(query, user=user) common/impala_connection.py:174: in execute return self.__beeswax_client.execute(sql_stmt, user=user) beeswax/impala_beeswax.py:182: in execute handle = self.__execute_query(query_string.strip(), user=user) beeswax/impala_beeswax.py:359: in __execute_query self.wait_for_finished(handle) beeswax/impala_beeswax.py:380: in wait_for_finished raise ImpalaBeeswaxException("Query aborted:" + error_log, None) E ImpalaBeeswaxException: ImpalaBeeswaxException: EQuery aborted:Disk I/O error: Error reading from HDFS file: s3a://impala-test-uswest2-1/test-warehouse/alltypesagg_parquet/year=2010/month=1/day=8/5642b2da93dae1ad-494132e5_592013737_data.0.parq E Error(255): Unknown error 255 E Root cause: SdkClientException: Data read has a different length than the expected: dataLength=0; expectedLength=45494; includeSkipped=true; in.getClass()=class com.amazonaws.services.s3.AmazonS3Client$2; markedSupported=false; marked=0; resetSinceLastMarked=false; markCount=0; resetCount=0 {noformat} {noformat} query_test/test_nested_types.py:263: in test_avro_primitive_in_list "AvroPrimitiveInList.parquet", vector) query_test/test_nested_types.py:287: in __test_primitive_in_list result = self.execute_query("select item from 
%s.col1" % full_name, qopts) common/impala_test_suite.py:597: in wrapper return function(*args, **kwargs) common/impala_test_suite.py:628: in execute_query return self.__execute_query(self.client, query, query_options) common/impala_test_suite.py:695: in __execute_query return impalad_client.execute(query, user=user) common/impala_connection.py:174: in execute return self.__beeswax_client.execute(sql_stmt, user=user) beeswax/impala_beeswax.py:182: in execute handle = self.__execute_query(query_string.strip(), user=user) beeswax/impala_beeswax.py:359: in __execute_query self.wait_for_finished(handle) beeswax/impala_beeswax.py:380: in wait_for_finished raise ImpalaBeeswaxException("Query aborted:" + error_log, None) E ImpalaBeeswaxException: ImpalaBeeswaxException: EQuery aborted:Disk I/O error: Failed to open HDFS file s3a://impala-test-uswest2-1/test-warehouse/test_avro_primitive_in_list_38f182c4.db/AvroPrimitiveInList/AvroPrimitiveInList.parquet E Error(2): No such file or directory E Root cause: FileNotFoundException: No such file or directory: s3a://impala-test-uswest2-1/test-warehouse/test_avro_primitive_in_list_38f182c4.db/AvroPrimitiveInList/AvroPrimitiveInList.parquet {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-8112) test_cancel_select with debug action failed with unexpected error
Michael Brown created IMPALA-8112: - Summary: test_cancel_select with debug action failed with unexpected error Key: IMPALA-8112 URL: https://issues.apache.org/jira/browse/IMPALA-8112 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.2.0 Reporter: Michael Brown Assignee: Andrew Sherman Stacktrace {noformat} query_test/test_cancellation.py:241: in test_cancel_select self.execute_cancel_test(vector) query_test/test_cancellation.py:213: in execute_cancel_test assert 'Cancelled' in str(thread.fetch_results_error) E assert 'Cancelled' in "ImpalaBeeswaxException:\n INNER EXCEPTION: \n MESSAGE: Unable to open Kudu table: Network error: recv error from 0.0.0.0:0: Transport endpoint is not connected (error 107)\n" E+ where "ImpalaBeeswaxException:\n INNER EXCEPTION: \n MESSAGE: Unable to open Kudu table: Network error: recv error from 0.0.0.0:0: Transport endpoint is not connected (error 107)\n" = str(ImpalaBeeswaxException()) E+where ImpalaBeeswaxException() = .fetch_results_error {noformat} Standard Error {noformat} SET client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action; -- executing against localhost:21000 use tpch_kudu; -- 2019-01-18 17:50:03,100 INFO MainThread: Started query 4e4b3ab4cc7d:11efc3f5 SET client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action; SET batch_size=0; SET num_nodes=0; SET disable_codegen_rows_threshold=0; SET disable_codegen=False; SET abort_on_error=1; SET cpu_limit_s=10; SET debug_action=0:GETNEXT:WAIT|COORD_CANCEL_QUERY_FINSTANCES_RPC:FAIL; SET exec_single_node_rows_threshold=0; 
SET buffer_pool_limit=0; -- executing async: localhost:21000 select l_returnflag from lineitem; -- 2019-01-18 17:50:03,139 INFO MainThread: Started query fa4ddb9e62a01240:54c86ad SET client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action; -- connecting to: localhost:21000 -- fetching results from: -- getting state for operation: -- canceling operation: -- 2019-01-18 17:50:08,196 INFO Thread-4: Starting new HTTP connection (1): localhost -- closing query for operation handle: {noformat} [~asherman] please take a look since it looks like you touched code around this area last. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-8111) Document workaround for some authentication issues with KRPC
Michael Ho created IMPALA-8111: -- Summary: Document workaround for some authentication issues with KRPC Key: IMPALA-8111 URL: https://issues.apache.org/jira/browse/IMPALA-8111 Project: IMPALA Issue Type: Task Components: Docs Affects Versions: Impala 3.1.0, Impala 2.12.0 Reporter: Michael Ho Assignee: Alex Rodoni There have been complaints from users about not being able to use Impala after upgrading to Impala version with KRPC enabled due to authentication issues. Please document them in the known issues or best practice guide. 1. https://issues.apache.org/jira/browse/IMPALA-7585: *Symptoms*: When using Impala with LDAP enabled, a user may hit the following: {noformat} Not authorized: Client connection negotiation failed: client connection to 127.0.0.1:27000: SASL(-1): generic failure: All-whitespace username. {noformat} *Root cause*: The following sequence can lead to the user "impala" not being created in /etc/passwd. {quote}time 1: no impala in LDAP; things get installed; impala created in /etc/passwd time 2: impala added to LDAP time 3: new machine added {quote} *Workaround*: - Manually edit /etc/passwd to add the impala user - Upgrade to a version of Impala with the patch IMPALA-7585 2. https://issues.apache.org/jira/browse/IMPALA-7298 *Symptoms*: When running with Kerberos enabled, a user may hit the following error: {noformat} WARNINGS: TransmitData() to X.X.X.X:27000 failed: Not authorized: Client connection negotiation failed: client connection to X.X.X.X:27000: Server impala/x.x@vpc.cloudera.com not found in Kerberos database {noformat} *Root cause*: KrpcDataStreamSender passes a resolved IP address when creating a proxy. Instead, we should pass both the resolved address and the hostname when creating the proxy so that we won't end up using the IP address as the hostname in the Kerberos principal. *Workaround*: - Set rdns=true in /etc/krb5.conf - Upgrade to a version of Impala with the fix of IMPALA-7298 3. 
https://issues.apache.org/jira/browse/KUDU-2198 *Symptoms*: When running with Kerberos enabled, a user may hit the following error message, where the reported username is some random string which doesn't match the primary in the Kerberos principal: {noformat} WARNINGS: TransmitData() to X.X.X.X:27000 failed: Remote error: Not authorized: {username='', principal='impala/redacted'} is not allowed to access DataStreamService {noformat} *Root cause*: Due to the system "auth_to_local" mapping, the principal may be mapped to some local name. *Workaround*: - Start Impala with the flag {{--use_system_auth_to_local=false}}
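The rdns workaround for IMPALA-7298 above amounts to a one-line change in /etc/krb5.conf; a minimal sketch (placement per standard krb5.conf conventions — merge with any existing [libdefaults] section rather than duplicating it):

{noformat}
[libdefaults]
    rdns = true
{noformat}

With rdns enabled, the Kerberos library reverse-resolves the IP address back to a hostname before building the service principal, so the "Server impala/x.x@... not found in Kerberos database" error is avoided.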
[jira] [Resolved] (IMPALA-7832) Support IF NOT EXISTS in alter table add columns
[ https://issues.apache.org/jira/browse/IMPALA-7832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fredy Wijaya resolved IMPALA-7832. -- Resolution: Fixed Fix Version/s: Impala 3.2.0 > Support IF NOT EXISTS in alter table add columns > > > Key: IMPALA-7832 > URL: https://issues.apache.org/jira/browse/IMPALA-7832 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Affects Versions: Impala 3.1.0 >Reporter: Thomas Tauber-Marshall >Assignee: Fredy Wijaya >Priority: Minor > Labels: ramp-up > Fix For: Impala 3.2.0 > > > alter table <table> add [if not exists] columns (<col> <type> [, > ...]) > would add the column only if a column of the same name does not already exist > Probably worth checking out what other databases do in different situations, > e.g. if the column already exists but with a different type, if "replace" is > used instead of "add", etc.
[jira] [Created] (IMPALA-8110) Parquet stat filtering does not handle narrowed int types correctly
Csaba Ringhofer created IMPALA-8110: --- Summary: Parquet stat filtering does not handle narrowed int types correctly Key: IMPALA-8110 URL: https://issues.apache.org/jira/browse/IMPALA-8110 Project: IMPALA Issue Type: Improvement Components: Backend Reporter: Csaba Ringhofer Impala can read int32 Parquet columns as tiny/smallint SQL columns. If the value does not fit into the 8/16 bit signed int's range, the value will overflow, e.g. writing 128 as int32 and then rereading it as int8 will return -128. This is normal as far as I understand, but min/max stat filtering does not handle this case correctly:
{noformat}
create table tnarrow (i int) stored as parquet;
insert into tnarrow values (1), (201);
alter table tnarrow change column i i tinyint;
set PARQUET_READ_STATISTICS=0;
select * from tnarrow where i < 0;  -> returns 1 row: -56
set PARQUET_READ_STATISTICS=1;      -> the same select returns 0 rows
{noformat}
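The wrap-around described above is plain two's-complement truncation, reproducible outside Impala. A sketch using Python's ctypes (only the 128 → -128 case is taken from the report; `narrow_to_int8` is an illustrative helper, not an Impala function):

```python
import ctypes

def narrow_to_int8(v):
    """Reinterpret the low 8 bits of an integer as a signed int8,
    mimicking Impala rereading an int32 Parquet column as TINYINT."""
    return ctypes.c_int8(v).value

print(narrow_to_int8(128))  # -128, the example from the report
print(narrow_to_int8(201))  # also wraps to a negative value, so "i < 0" matches it
```

The bug follows directly: the Parquet row-group statistics still describe the stored int32 values (min=1, max=201, both non-negative), so with PARQUET_READ_STATISTICS=1 the predicate i < 0 prunes the row group, even though the narrowed value the scan would produce is negative.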
[jira] [Created] (IMPALA-8109) Impala cannot read gzip files bigger than 2 GB
hakki created IMPALA-8109: - Summary: Impala cannot read gzip files bigger than 2 GB Key: IMPALA-8109 URL: https://issues.apache.org/jira/browse/IMPALA-8109 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 2.12.0 Reporter: hakki When querying a partition containing gzip files, the query fails with the error below:
{noformat}
WARNINGS: Disk I/O error: Error seeking to -2147483648 in file: hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz: Error(255): Unknown error 255
Root cause: EOFException: Cannot seek to negative offset
{noformat}
The file hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz is bigger than 2 GB (approx. 2.4 GB).
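The negative offset in the error is consistent with a 32-bit signed overflow: a seek offset of 2 GB (2147483648) no longer fits in an int32 and wraps to exactly the -2147483648 seen in the log. A sketch of the arithmetic (`to_int32` is an illustrative helper, not Impala's actual code path):

```python
import ctypes

def to_int32(v):
    # Reinterpret the low 32 bits of an integer as a signed int32.
    return ctypes.c_int32(v).value

two_gb = 2 * 1024 ** 3       # 2147483648, one past INT32_MAX
print(to_int32(two_gb))      # wraps negative, matching "Error seeking to -2147483648"
```

Any offset into the ~2.4 GB file that exceeds INT32_MAX would wrap this way, which suggests a seek offset being carried in a 32-bit integer somewhere on the gzip read path.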
[jira] [Created] (IMPALA-8108) Impala query returns TIMESTAMP values in different types
Robbie Zhang created IMPALA-8108: Summary: Impala query returns TIMESTAMP values in different types Key: IMPALA-8108 URL: https://issues.apache.org/jira/browse/IMPALA-8108 Project: IMPALA Issue Type: Improvement Components: Backend Reporter: Robbie Zhang When a timestamp's fractional-second part is all zeros (.0, .00, or .000), the timestamp is displayed with no fraction of a second. For example:
{code:java}
select cast(ts as timestamp) from (values
  ('2019-01-11 10:40:18' as ts),
  ('2019-01-11 10:40:19.0'),
  ('2019-01-11 10:40:19.00'),
  ('2019-01-11 10:40:19.000'),
  ('2019-01-11 10:40:19.'),
  ('2019-01-11 10:40:19.0'),
  ('2019-01-11 10:40:19.00'),
  ('2019-01-11 10:40:19.000'),
  ('2019-01-11 10:40:19.'),
  ('2019-01-11 10:40:19.0'),
  ('2019-01-11 10:40:19.1')
) t;{code}
The output is:
{code:java}
+-----------------------+
| cast(ts as timestamp) |
+-----------------------+
| 2019-01-11 10:40:18   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19.1 |
+-----------------------+
{code}
As we can see, values of the same column are returned in two different formats. The inconsistency breaks some downstream use cases. The reason is that Impala uses the function boost::posix_time::to_simple_string(time_duration) to convert a timestamp to a string, and to_simple_string() removes the fractional seconds if they are all zeros. Perhaps we can append ".0" if the length of the time string is 8 (HH:MM:SS). For now we can work around it by using the function from_timestamp(ts, '-mm-dd hh:mm.ss.s') to unify the output (convert to string), or using the function millisecond(ts) to get the fractional seconds.
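The ".0"-appending fix suggested above can also be applied as client-side post-processing until the server-side behavior changes. A sketch (`normalize_ts` is a hypothetical helper, not an Impala function; it assumes 'yyyy-MM-dd HH:mm:ss[.fff]' shaped strings):

```python
def normalize_ts(ts_str):
    """Append '.0' when the time part has no fractional seconds, so every
    row of a TIMESTAMP column prints in a uniform shape."""
    date_part, _, time_part = ts_str.partition(" ")
    if "." not in time_part:  # bare HH:MM:SS, length 8, as noted in the report
        time_part += ".0"
    return date_part + " " + time_part

print(normalize_ts("2019-01-11 10:40:18"))    # gains a trailing .0
print(normalize_ts("2019-01-11 10:40:19.1"))  # already fractional, unchanged
```

This mirrors the proposed server-side change: only strings whose time component lacks a '.' are touched, so values like '...19.1' pass through untouched.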
[jira] [Created] (IMPALA-8107) Support EXEC_TIME_LIMIT_S in resource pool setting
Quanlong Huang created IMPALA-8107: -- Summary: Support EXEC_TIME_LIMIT_S in resource pool setting Key: IMPALA-8107 URL: https://issues.apache.org/jira/browse/IMPALA-8107 Project: IMPALA Issue Type: New Feature Reporter: Quanlong Huang The timeout limit should differ for different kinds of queries. For example, a resource pool for ad hoc queries may set EXEC_TIME_LIMIT_S to 60s, while a resource pool for building pre-aggregations or other ETL may need a larger EXEC_TIME_LIMIT_S like 30 minutes.