[jira] [Commented] (IMPALA-8250) Impala crashes with -Xcheck:jni

2019-02-26 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16778687#comment-16778687
 ] 

Philip Zeyliger commented on IMPALA-8250:
-

The first few of these were easy and https://gerrit.cloudera.org/#/c/12582/ 
captures them.

To look at the rest, I've been compiling OpenJDK8 
(http://cr.openjdk.java.net/~ihse/demo-new-build-readme/common/doc/building.html#tldr-instructions-for-the-impatient
 is surprisingly reasonable) with the following patch applied to the hotspot 
subdirectory:
{code}
$hg diff
diff -r 76a9c9cf14f1 src/share/vm/prims/jniCheck.cpp
--- a/src/share/vm/prims/jniCheck.cpp   Tue Jan 15 10:43:31 2019 +
+++ b/src/share/vm/prims/jniCheck.cpp   Tue Feb 26 15:22:56 2019 -0800
@@ -143,11 +143,18 @@
 static const char * fatal_instance_field_mismatch = "Field type (instance) 
mismatch in JNI get/set field operations";
 static const char * fatal_non_string = "JNI string operation received a 
non-string";

+// PHIL
+static inline void dump_native_stack(JavaThread* thr) {
+  frame fr = os::current_frame();
+  char buf[6000];
+  print_native_stack(tty, fr, thr, buf, sizeof(buf));
+}

 // When in VM state:
 static void ReportJNIWarning(JavaThread* thr, const char *msg) {
   tty->print_cr("WARNING in native method: %s", msg);
   thr->print_stack();
+  dump_native_stack(thr);
 }

 // When in NATIVE state:
@@ -199,11 +206,15 @@
   tty->print_cr("WARNING in native method: JNI call made without checking 
exceptions when required to from %s",
 thr->get_pending_jni_exception_check());
   thr->print_stack();
+
+  dump_native_stack(thr);
 )
 thr->clear_pending_jni_exception_check(); // Just complain once
   }
 }

+
+
 /**
  * Add to the planned number of handles. I.e. plus current live & warning 
threshold
  */
@@ -254,6 +265,7 @@
   tty->print_cr("WARNING: JNI local refs: %zu, exceeds capacity: %zu",
   live_handles, planned_capacity);
   thr->print_stack();
+  dump_native_stack(thr);
 )
 // Complain just the once, reset to current + warn threshold
 add_planned_handle_capacity(handles, 0);
{code}

This told me that the vast majority of issues are in Impala's HBase code. 

> Impala crashes with -Xcheck:jni
> ---
>
> Key: IMPALA-8250
> URL: https://issues.apache.org/jira/browse/IMPALA-8250
> Project: IMPALA
>  Issue Type: Task
>Reporter: Philip Zeyliger
>Priority: Major
>
> The JVM has a checker for JNI usage, and Impala (and libhdfs) have some 
> violations. This ticket captures figuring that out. At least one of the 
> issues can crash Impala.






[jira] [Created] (IMPALA-8250) Impala crashes with -Xcheck:jni

2019-02-26 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-8250:
---

 Summary: Impala crashes with -Xcheck:jni
 Key: IMPALA-8250
 URL: https://issues.apache.org/jira/browse/IMPALA-8250
 Project: IMPALA
  Issue Type: Task
Reporter: Philip Zeyliger


The JVM has a checker for JNI usage, and Impala (and libhdfs) have some 
violations. This ticket captures figuring that out. At least one of the issues 
can crash Impala.






[jira] [Commented] (IMPALA-8188) Some SSDs are not properly detected as non-rotational

2019-02-11 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16765552#comment-16765552
 ] 

Philip Zeyliger commented on IMPALA-8188:
-

https://unix.stackexchange.com/posts/308724/revisions suggests following 
{{/sys/class/block/nvme0n1p1}} (or similar) to determine this. It's all 
messy.
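
To make that concrete, here's a minimal sketch of that approach (Python, purely 
illustrative; Impala's actual logic lives in be/src/util/disk-info.cc, and the 
climb-to-parent loop is my assumption about the sysfs layout):
{code:python}
import os

def is_rotational(dev_name):
  """Resolve a block device (possibly a partition such as nvme0n1p1) to the
  sysfs directory that has queue/rotational, then read the flag."""
  # /sys/class/block/<dev> is a symlink into /sys/devices/...; for a
  # partition, the parent directory is the whole disk (e.g. nvme0n1).
  path = os.path.realpath(os.path.join("/sys/class/block", dev_name))
  while path != "/":
    rotational = os.path.join(path, "queue", "rotational")
    if os.path.exists(rotational):
      with open(rotational) as f:
        return f.read().strip() == "1"
    path = os.path.dirname(path)  # climb from partition to parent device
  raise ValueError("no queue/rotational found for %s" % dev_name)

print(is_rotational("nvme0n1p1"))  # expected: False on an NVMe SSD
{code}
This avoids trimming trailing digits from device names entirely, so it would 
handle /dev/sda2 and /dev/nvme0n1p1 uniformly.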

> Some SSDs are not properly detected as non-rotational
> -
>
> Key: IMPALA-8188
> URL: https://issues.apache.org/jira/browse/IMPALA-8188
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Joe McDonnell
>Priority: Critical
>
> Here is an example Impala log:
>  
> {noformat}
> I0211 10:50:40.650727 18344 init.cc:288] Disk Info: 
>   Num disks 2: 
> nvme0n (rotational=true)
> nvme0n1p (rotational=true){noformat}
> I logged into an equivalent machine, and the OS sees these as not rotational:
>  
>  
> {noformat}
> # cat /sys/block/nvme0n1/queue/rotational
> 0
> {noformat}
> Device names that end in a number get trimmed (e.g. /dev/sda2 becomes 
> /dev/sda). See 
> [https://github.com/apache/impala/blob/master/be/src/util/disk-info.cc#L73-L74]
> These devices don't follow that pattern, so we don't find the right files. 
> Neither /sys/block/nvme0n nor /sys/block/nvme0n1p exist, so both fall back to 
> being rotational.
>  






[jira] [Commented] (IMPALA-6590) Disable expr rewrites and codegen for VALUES() statements

2019-02-09 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16764266#comment-16764266
 ] 

Philip Zeyliger commented on IMPALA-6590:
-

For purposes of reproduction, the following shows how non-linear we are in the 
number of columns in the VALUES statement:
{code}
$ for i in 256 512 1024 2048 4096 8192 16384 32768; do (echo 'VALUES ('; for x in $(seq $i); do echo "cast($x as string),"; done; echo "NULL); profile;") | time impala-shell.sh -f /dev/stdin |& grep Analysis; done
   - Analysis finished: 35.027ms (34.359ms)
   - Analysis finished: 76.808ms (75.678ms)
   - Analysis finished: 188.936ms (186.829ms)
   - Analysis finished: 499.325ms (494.968ms)
   - Analysis finished: 1s606ms (1s598ms)
   - Analysis finished: 6s663ms (6s647ms)
   - Analysis finished: 29s844ms (29s812ms)
   - Analysis finished: 2m37s (2m37s)
{code}

My ad-hoc jstacking suggests that there's an issue in the code below, as well as 
in the calls into native code (done serially, thereby possibly incurring a lot 
of JNI overhead). Looking at the source, SelectStmt.java:291 is inside a loop 
over every expression in the statement, and it ends up inserting each one into 
a List. So the number of {{equals()}} calls is quadratic.

{code}
"Thread-50" #70 prio=5 os_prio=0 tid=0x0b471000 nid=0x10cc runnable 
[0x7ff90190a000]
   java.lang.Thread.State: RUNNABLE
at org.apache.impala.analysis.SlotRef.localEquals(SlotRef.java:193)
at org.apache.impala.analysis.SlotRef$1.matches(SlotRef.java:206)
at org.apache.impala.analysis.Expr.matches(Expr.java:841)
at org.apache.impala.analysis.Expr.equals(Expr.java:865)
at 
org.apache.impala.analysis.ExprSubstitutionMap.get(ExprSubstitutionMap.java:67)
at 
org.apache.impala.analysis.SelectStmt$SelectAnalyzer.analyzeSelectClause(SelectStmt.java:291)
at 
org.apache.impala.analysis.SelectStmt$SelectAnalyzer.analyze(SelectStmt.java:223)
at 
org.apache.impala.analysis.SelectStmt$SelectAnalyzer.access$100(SelectStmt.java:207)
at org.apache.impala.analysis.SelectStmt.analyze(SelectStmt.java:200)
at 
org.apache.impala.analysis.UnionStmt$UnionOperand.analyze(UnionStmt.java:88)
at 
org.apache.impala.analysis.UnionStmt.analyzeOperands(UnionStmt.java:280)
at org.apache.impala.analysis.UnionStmt.analyze(UnionStmt.java:219)
at 
org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:448)
at 
org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:418)
at 
org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1282)
at 
org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1249)
at 
org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1219)
at 
org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:168)
{code}
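
To illustrate the shape of the problem, here's a toy model (not Impala's actual 
data structures): a membership check against a growing list does O(n) 
{{equals()}} calls per expression, so n expressions cost O(n^2) comparisons in 
total, while a hash-based index stays roughly linear:
{code:python}
import time

def analyze(n, indexed):
  """Toy model of the lookup in ExprSubstitutionMap.get(): each new
  expression is compared against everything seen so far."""
  seen_list, seen_set = [], set()
  start = time.time()
  for expr in range(n):
    if indexed:
      _ = expr in seen_set   # O(1) average per lookup
      seen_set.add(expr)
    else:
      _ = expr in seen_list  # O(n) scan per lookup => O(n^2) total
      seen_list.append(expr)
  return time.time() - start

for n in (2048, 4096, 8192, 16384):
  # The list column should roughly quadruple as n doubles, mirroring the
  # superlinear analysis times measured above.
  print(n, "list: %.3fs" % analyze(n, False), "set: %.3fs" % analyze(n, True))
{code}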

> Disable expr rewrites and codegen for VALUES() statements
> -
>
> Key: IMPALA-6590
> URL: https://issues.apache.org/jira/browse/IMPALA-6590
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, Impala 2.11.0
>Reporter: Alexander Behm
>Priority: Major
>  Labels: perf, planner, ramp-up, regression
>
> The analysis of statements with big VALUES clauses like INSERT INTO  
> VALUES is slow due to expression rewrites like constant folding. The 
> performance of such statements has regressed since the introduction of expr 
> rewrites and constant folding in IMPALA-1788.
> We should skip expr rewrites for VALUES altogether since it mostly provides 
> no benefit but can have a large overhead due to evaluation of expressions in 
> the backend (constant folding). These expressions are ultimately evaluated 
> and materialized in the backend anyway, so there's no point in folding them 
> during analysis.

[jira] [Commented] (IMPALA-8140) Grouping aggregation with limit breaks asan build

2019-01-31 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757501#comment-16757501
 ] 

Philip Zeyliger commented on IMPALA-8140:
-

If it's helpful, this reproduces in test-with-docker if using "--all-suites".

> Grouping aggregation with limit breaks asan build
> -
>
> Key: IMPALA-8140
> URL: https://issues.apache.org/jira/browse/IMPALA-8140
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0, Impala 3.2.0
>Reporter: Lars Volker
>Assignee: Lars Volker
>Priority: Blocker
>  Labels: asan, crash
>
> Commit 4af3a7853e9 for IMPALA-7333 breaks the following query on ASAN:
> {code:sql}
> select count(*) from tpch_parquet.orders o group by o.o_clerk limit 10;
> {code}
> {noformat}
> ==30219==ERROR: AddressSanitizer: use-after-poison on address 0x631000c4569c 
> at pc 0x020163cc bp 0x7f73a12a5700 sp 0x7f73a12a56f8
> READ of size 1 at 0x631000c4569c thread T276
> #0 0x20163cb in impala::Tuple::IsNull(impala::NullIndicatorOffset const&) 
> const /tmp/be/src/runtime/tuple.h:241:13
> #1 0x280c3d1 in 
> impala::AggFnEvaluator::SerializeOrFinalize(impala::Tuple*, 
> impala::SlotDescriptor const&, impala::Tuple*, void*) 
> /tmp/be/src/exprs/agg-fn-evaluator.cc:393:29
> #2 0x2777bc8 in 
> impala::AggFnEvaluator::Finalize(std::vector std::allocator > const&, impala::Tuple*, 
> impala::Tuple*) /tmp/be/src/exprs/agg-fn-evaluator.h:307:15
> #3 0x27add96 in 
> impala::GroupingAggregator::CleanupHashTbl(std::vector  std::allocator > const&, 
> impala::HashTable::Iterator) /tmp/be/src/exec/grouping-aggregator.cc:351:7
> #4 0x27ae2b2 in impala::GroupingAggregator::ClosePartitions() 
> /tmp/be/src/exec/grouping-aggregator.cc:930:5
> #5 0x27ae5f4 in impala::GroupingAggregator::Close(impala::RuntimeState*) 
> /tmp/be/src/exec/grouping-aggregator.cc:383:3
> #6 0x27637f7 in impala::AggregationNode::Close(impala::RuntimeState*) 
> /tmp/be/src/exec/aggregation-node.cc:139:32
> #7 0x206b7e9 in impala::FragmentInstanceState::Close() 
> /tmp/be/src/runtime/fragment-instance-state.cc:368:42
> #8 0x2066b1a in impala::FragmentInstanceState::Exec() 
> /tmp/be/src/runtime/fragment-instance-state.cc:99:3
> #9 0x2080e12 in 
> impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) 
> /tmp/be/src/runtime/query-state.cc:584:24
> #10 0x1d79036 in boost::function0::operator()() const 
> /opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:766:14
> #11 0x24bbe06 in impala::Thread::SuperviseThread(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) 
> /tmp/be/src/util/thread.cc:359:3
> #12 0x24c72f8 in void boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> 
> >::operator() boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), 
> boost::_bi::list0>(boost::_bi::type, void (*&)(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), boost::_bi::list0&, int) 
> /opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525:9
> #13 0x24c714b in boost::_bi::bind_t std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), 
> boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> > 
> >::operator()() 
> /opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind_template.hpp:20:16
> #14 0x3c83949 in thread_proxy 
> (/home/lv/i4/be/build/debug/service/impalad+0x3c83949)
> #15 0x7f768ce73183 in start_thread 
> /build/eglibc-ripdx6/eglibc-2.19/nptl/pthread_create.c:312
> #16 0x7f768c98a03c in clone 
> /build/eglibc-ripdx6/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:111
> {noformat}
> The problem seems to be that we call 
> {{output_partition_->aggregated_row_stream->Close()}} in 
> be/src/exec/grouping-aggregator.cc:284 when hitting the limit, and then later 
> the tuple creation in {{CleanupHashTbl()}} in 
> be/src/exec/grouping-aggregator.cc:341 reads from poisoned memory.
> A similar query does not show the crash:
> {code:sql}
> select count(*) from functional_parquet.alltypes a group by a.string_col 
> limit 2;
> {code}
> [~tarmstrong] - Do you have an idea why the query on a much smaller dataset 
> wouldn't crash?






[jira] [Commented] (IMPALA-8115) some jenkins workers slow to do real work due to dpkg lock conflicts

2019-01-28 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754462#comment-16754462
 ] 

Philip Zeyliger commented on IMPALA-8115:
-

The unattended upgrades seem to be triggered by a cron job 
("/usr/lib/apt/apt.systemd.daily"). It's possible to disable them (Googling 
turns up a few options), but the question is how to get that code to run before 
they run. I think Jenkins lets you configure "user-data", which lets you run 
custom stuff, assuming the instance has cloud-init, which it probably does. See 
https://aws.amazon.com/premiumsupport/knowledge-center/execute-user-data-ec2/ 
for example.

> some jenkins workers slow to do real work due to dpkg lock conflicts
> 
>
> Key: IMPALA-8115
> URL: https://issues.apache.org/jira/browse/IMPALA-8115
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Michael Brown
>Priority: Major
>
> A Jenkins worker for label {{ubuntu-16.04}} took about 15 minutes to start 
> doing real work. I noticed that it was retrying {{apt-get update}}:
> {noformat}
> ++ sudo apt-get --yes install openjdk-8-jdk
> E: Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily 
> unavailable)
> E: Unable to lock the administration directory (/var/lib/dpkg/), is another 
> process using it?
> ++ date
> Thu Jan 24 23:37:33 UTC 2019
> ++ sudo apt-get update
> ++ sleep 10
> ++ sudo apt-get --yes install openjdk-8-jdk
> [etc]
> {noformat}
> I ssh'd into a host and saw that, yes, something else was holding onto the 
> dpkg lock (confirmed with lsof, not pasted here; dpkg process PID 11459 was 
> the culprit):
> {noformat}
> root   1750  0.0  0.0   4508  1664 ?Ss   23:21   0:00 /bin/sh 
> /usr/lib/apt/apt.systemd.daily
> root   1804 12.3  0.1 141076 80452 ?S23:22   1:24  \_ 
> /usr/bin/python3 /usr/bin/unattended-upgrade
> root   3263  0.0  0.1 140960 72896 ?S23:23   0:00  \_ 
> /usr/bin/python3 /usr/bin/unattended-upgrade
> root  11459  0.6  0.0  45920 25184 pts/1Ss+  23:24   0:03  \_ 
> /usr/bin/dpkg --status-fd 10 --unpack --auto-deconfigure 
> /var/cache/apt/archives/tzdata_2018i-0ubuntu0.16.04_all.deb 
> /var/cache/apt/archives/distro-info-data_0.28ubuntu0.9_all.deb 
> /var/cache/apt/archives/file_1%3a5.25-2ubuntu1.1_amd64.deb 
> /var/cache/apt/archives/libmagic1_1%3a5.25-2ubuntu1.1_amd64.deb 
> /var/cache/apt/archives/libisc-export160_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb
>  
> /var/cache/apt/archives/libdns-export162_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb
>  /var/cache/apt/archives/isc-dhcp-client_4.3.3-5ubuntu12.9_amd64.deb 
> /var/cache/apt/archives/isc-dhcp-common_4.3.3-5ubuntu12.9_amd64.deb 
> /var/cache/apt/archives/libidn11_1.32-3ubuntu1.2_amd64.deb 
> /var/cache/apt/archives/libpng12-0_1.2.54-1ubuntu1.1_amd64.deb 
> /var/cache/apt/archives/libtasn1-6_4.7-3ubuntu0.16.04.3_amd64.deb 
> /var/cache/apt/archives/libapparmor-perl_2.10.95-0ubuntu2.10_amd64.deb 
> /var/cache/apt/archives/apparmor_2.10.95-0ubuntu2.10_amd64.deb 
> /var/cache/apt/archives/curl_7.47.0-1ubuntu2.11_amd64.deb 
> /var/cache/apt/archives/libgssapi-krb5-2_1.13.2+dfsg-5ubuntu2.1_amd64.deb 
> /var/cache/apt/archives/libkrb5-3_1.13.2+dfsg-5ubuntu2.1_amd64.deb 
> /var/cache/apt/archives/libkrb5support0_1.13.2+dfsg-5ubuntu2.1_amd64.deb 
> /var/cache/apt/archives/libk5crypto3_1.13.2+dfsg-5ubuntu2.1_amd64.deb 
> /var/cache/apt/archives/libcurl3-gnutls_7.47.0-1ubuntu2.11_amd64.deb 
> /var/cache/apt/archives/apt-transport-https_1.2.29ubuntu0.1_amd64.deb 
> /var/cache/apt/archives/libicu55_55.1-7ubuntu0.4_amd64.deb 
> /var/cache/apt/archives/libxml2_2.9.3+dfsg1-1ubuntu0.6_amd64.deb 
> /var/cache/apt/archives/bind9-host_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
> /var/cache/apt/archives/dnsutils_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
> /var/cache/apt/archives/libisc160_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
> /var/cache/apt/archives/libdns162_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
> /var/cache/apt/archives/libisccc140_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
> /var/cache/apt/archives/libisccfg140_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
> /var/cache/apt/archives/liblwres141_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
> /var/cache/apt/archives/libbind9-140_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
> /var/cache/apt/archives/openssl_1.0.2g-1ubuntu4.14_amd64.deb 
> /var/cache/apt/archives/ca-certificates_20170717~16.04.1_all.deb 
> /var/cache/apt/archives/libasprintf0v5_0.19.7-2ubuntu3.1_amd64.deb 
> /var/cache/apt/archives/gettext-base_0.19.7-2ubuntu3.1_amd64.deb 
> /var/cache/apt/archives/krb5-locales_1.13.2+dfsg-5ubuntu2.1_all.deb 
> /var/cache/apt/archives/libelf1_0.165-3ubuntu1.1_amd64.deb 
> /var/cache/apt/archives/libglib2.0-data_2.48.2-0ubuntu4

[jira] [Assigned] (IMPALA-5212) consider switching to pread by default

2019-01-28 Thread Philip Zeyliger (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger reassigned IMPALA-5212:
---

Assignee: Sahil Takiar  (was: Philip Zeyliger)

> consider switching to pread by default
> --
>
> Key: IMPALA-5212
> URL: https://issues.apache.org/jira/browse/IMPALA-5212
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Silvius Rus
>Assignee: Sahil Takiar
>Priority: Major
>
> 1) Review the current HDFS tests. Validate that we have sufficient coverage, 
> add coverage if need be.
> 2) Switch to pread as default.
> 3) Consider deprecating the old read path.






[jira] [Assigned] (IMPALA-5212) consider switching to pread by default

2019-01-28 Thread Philip Zeyliger (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger reassigned IMPALA-5212:
---

Assignee: Philip Zeyliger

> consider switching to pread by default
> --
>
> Key: IMPALA-5212
> URL: https://issues.apache.org/jira/browse/IMPALA-5212
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Silvius Rus
>Assignee: Philip Zeyliger
>Priority: Major
>
> 1) Review the current HDFS tests. Validate that we have sufficient coverage, 
> add coverage if need be.
> 2) Switch to pread as default.
> 3) Consider deprecating the old read path.






[jira] [Commented] (IMPALA-8120) Build Failure: Undetermined data load error for tpcds for Kudu

2019-01-28 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754399#comment-16754399
 ] 

Philip Zeyliger commented on IMPALA-8120:
-

As Joe mentioned, this smells a lot like IMPALA-8091. 

> Build Failure: Undetermined data load error for tpcds for Kudu
> --
>
> Key: IMPALA-8120
> URL: https://issues.apache.org/jira/browse/IMPALA-8120
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Paul Rogers
>Assignee: Lenisha Gandhi
>Priority: Blocker
>  Labels: build-failure
> Fix For: Impala 3.1.0
>
>
> The build for the latest master failed with an unspecified error when loading 
> data:
> {noformat}
> 03:07:58 Loading TPC-DS data (logging to 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/logs/data_loading/load-tpcds.log)...
>  
> 03:07:58 + echo 'Log for command '\''load-data' tpcds 'core'\'''
> 03:07:58 + START_TIME=38
> 03:07:58 + load-data tpcds core
> 03:11:31 + ELAPSED_TIME=213
> 03:11:31 + echo 'FAILED (Took: 3 min 33 sec)'
> 03:11:31 FAILED (Took: 3 min 33 sec)
> {noformat}
> Also:
> {noformat}
> 03:17:43 + echo '  Loading workload '\''tpcds'\'' using exploration strategy 
> '\''core'\'' OK (Took: 9 min 45 sec)'
> 03:17:43   Loading workload 'tpcds' using exploration strategy 'core' OK 
> (Took: 9 min 45 sec)
> 03:19:39 + ELAPSED_TIME=701
> 03:19:39 + echo 'FAILED (Took: 11 min 41 sec)'
> 03:19:39 FAILED (Took: 11 min 41 sec)
> 03:19:39 + echo ''\''load-data' functional-query 'exhaustive'\'' failed. 
> Tail of log:'
> {noformat}
> Tail of log:
> {noformat}
> 03:19:39 03:19:39 Error executing impala SQL: 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/logs/data_loading/sql/functional/create-functional-query-exhaustive-impala-generated-kudu-none-none.sql
>  See: 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/logs/data_loading/sql/functional/create-functional-query-exhaustive-impala-generated-kudu-none-none.sql.log
> ...
> 03:19:39 ++ report_build_error 85
> 03:19:39 +++ cd 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala
> 03:19:39 +++ awk 'NR == 85' 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/testdata/bin/create-load-data.sh
> 03:19:39 ++ ERROR_MSG='-cm_host)'
> 03:19:39 +++ basename -- 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/testdata/bin/create-load-data.sh
> 03:19:39 ++ FILENAME=create-load-data.sh
> 03:19:39 ++ echo ERROR in 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/testdata/bin/create-load-data.sh
>  at line 85: '-cm_host)'
> ...
> 03:19:39 + [[ RELEASE = \A\S\A\N ]]
> {noformat}
> From the catalogd log:
> {noformat}
> I0125 03:11:31.808512  9906 jni-util.cc:256] 
> org.apache.impala.common.ImpalaRuntimeException: Error creating Kudu table 
> 'impala::tpch_kudu.lineitem'
> at 
> org.apache.impala.service.KuduCatalogOpExecutor.createManagedTable(KuduCatalogOpExecutor.java:94)
> at 
> org.apache.impala.service.CatalogOpExecutor.createKuduTable(CatalogOpExecutor.java:1768)
> at 
> org.apache.impala.service.CatalogOpExecutor.createTable(CatalogOpExecutor.java:1686)
> at 
> org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:287)
> at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:154)
> Caused by: org.apache.kudu.client.NonRecoverableException: can not complete 
> before timeout: KuduRpc(method=ListTables, tablet=null, attempt=95, 
> DeadlineTracker(timeout=18, elapsed=178163), Traces: [0ms] querying 
> master, [41ms] Sub rpc: ConnectToMaster sending RPC to server 
> master-localhost:7051, [91ms] Sub rpc: ConnectToMaster received from server 
> master-localhost:7051 response Network error: java.net.ConnectException: 
> Connection refused: localhost/127.0.0.1:7051, [98ms] delaying RPC due to 
> Service unavailable: Master config (localhost:7051) has no leader. Exceptions 
> received: org.apache.kudu.client.RecoverableException: 
> java.net.ConnectException: Connection refused: localhost/127.0.0.1:7051, 
> [118ms] querying master, [121ms] Sub rpc: ConnectToMaster sending RPC to 
> server master-localhost:7051, [122ms] Sub rpc: ConnectToMaster received from 
> server master-localhost:7051 response Network error: 
> java.net.ConnectException: Connection refused: localhost/127.0.0.1:7051, 
> [122ms] delaying RPC due to Service unavailable: Master config 
> (localhost:7051) has no leader. Exceptions received: 
> org.apache.kudu.client.RecoverableException: java.net.ConnectException: 
> Connection refused: localhost/127.0.0.1:7051, [138ms] querying 

[jira] [Resolved] (IMPALA-6664) Tag log statements with query-ids

2019-01-25 Thread Philip Zeyliger (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger resolved IMPALA-6664.
-
Resolution: Fixed

> Tag log statements with query-ids
> -
>
> Key: IMPALA-6664
> URL: https://issues.apache.org/jira/browse/IMPALA-6664
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend, Frontend
>Affects Versions: Impala 2.12.0
>Reporter: bharath v
>Assignee: Philip Zeyliger
>Priority: Major
>  Labels: supportability
>
> The idea is to include query-id in each logged statement in 
> coordinator/finsts/planner. This helps to collect query specific logs by just 
> grep'ing through the logs with the query-id. 
> This is hopefully easier to implement, now that we have the query debug info 
> available in the thread context [1].
> [1] https://issues.apache.org/jira/browse/IMPALA-3703






[jira] [Commented] (IMPALA-8055) run-tests.py reports tests as passed even if the did not

2019-01-23 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16750438#comment-16750438
 ] 

Philip Zeyliger commented on IMPALA-8055:
-

I think this is the key:

bq. That is, run-tests.py when run with only this one test, produces both lines 
of output.

run-tests.py calls pytest multiple times, and you only sort of see the last 
"=== 2 passed ===" output. The other one is probably there, but it's easily 
missed. To give you an example from a recent run:
{code}
$ curl --silent 'https://jenkins.impala.io/job/gerrit-verify-dryrun/3660/consoleText' | grep -i '===.*in.*seconds'
] === 70 skipped, 140 xfailed in 98.21 seconds ===
] === 1 failed, 2225 passed, 87 skipped, 51 xfailed in 2527.44 seconds ===
] === 2 passed in 0.05 seconds ===
{code}

I followed in your footsteps, and I think what you're seeing is the multiple 
runs of py.test within run-tests.py, with the "metrics" tests failing. The 
following patch identifies the failing tests at the end by collecting them, but 
I'm not entirely sure that's what we want. You can also pass {{-x}} to 
run-tests.py to have it fail at the first failure, which makes things easier to 
spot.

{code}
diff --git testdata/workloads/functional-query/queries/QueryTest/explain-level1.test testdata/workloads/functional-query/queries/QueryTest/explain-level1.test
index 9a6dea3..23365ae 100644
--- testdata/workloads/functional-query/queries/QueryTest/explain-level1.test
+++ testdata/workloads/functional-query/queries/QueryTest/explain-level1.test
@@ -14,7 +14,7 @@ row_regex:.*Per-Host Resource Estimates: Memory=[0-9.]*MB.*
 '|'
 '02:HASH JOIN [INNER JOIN, BROADCAST]'
 '|  hash predicates: l_orderkey = o_orderkey'
-'|  runtime filters: RF000 <- o_orderkey'
+'|  runtime filters: RF000 <- bogus'
 row_regex:.*row-size=.* cardinality=.*
 '|'
 '|--03:EXCHANGE [BROADCAST]'
diff --git tests/run-tests.py tests/run-tests.py
index 0de9ce9..8ea9b0c 100755
--- tests/run-tests.py
+++ tests/run-tests.py
@@ -83,6 +83,7 @@ class TestCounterPlugin(object):
   def __init__(self):
     self.tests_collected = set()
     self.tests_executed = set()
+    self.failed_tests = []

   # pytest hook to handle test collection when xdist is used (parallel tests)
   # https://github.com/pytest-dev/pytest-xdist/pull/35/commits (No official documentation available)
@@ -100,11 +101,14 @@
   def pytest_runtest_logreport(self, report):
     if report.passed:
       self.tests_executed.add(report.nodeid)
+    if report.failed:
+      self.failed_tests.append(report.nodeid)

 class TestExecutor(object):
   def __init__(self, exit_on_error=True):
     self._exit_on_error = exit_on_error
     self.tests_failed = False
+    self.failed_tests = []
     self.total_executed = 0

   def run_tests(self, args):
@@ -121,10 +125,13 @@
       print(test)

     self.total_executed += len(testcounterplugin.tests_executed)
+    self.failed_tests.extend(testcounterplugin.failed_tests)

     if 0 < pytest_exit_code < EXIT_NOTESTSCOLLECTED and self._exit_on_error:
       sys.exit(pytest_exit_code)
     self.tests_failed = 0 < pytest_exit_code < EXIT_NOTESTSCOLLECTED or self.tests_failed
+    if len(self.failed_tests) > 0:
+      assert self.tests_failed

 def build_test_args(base_name, valid_dirs=VALID_TEST_DIRS):
   """
@@ -305,4 +312,7 @@ if __name__ == "__main__":
     run(args)

   if test_executor.tests_failed:
+    print "Failed tests:"
+    for t in test_executor.failed_tests:
+      print "  " + t
     sys.exit(1)
{code}

> run-tests.py reports tests as passed even if the did not
> 
>
> Key: IMPALA-8055
> URL: https://issues.apache.org/jira/browse/IMPALA-8055
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Paul Rogers
>Priority: Minor
>
> Been mucking about with the EXPLAIN output format which required rebasing a 
> bunch of tests on the new format. PlannerTest is fine: it clearly fails when 
> the expected ".test" files don't match the new "actual" files.
> When run on Jenkins in "pre-review" mode, the build does fail if a Python 
> end-to-end test fails. But, the job seems to give up at that point, not 
> running other tests and finding more problems. (There were three separate 
> test cases that needed fixing; took multiple runs to find them.)
> When run on my dev box, I get the following (highly abbreviated) output:
> {noformat}
> '|  in pipelines: 00(GETNEXT)' != '|  row-size=402B cardinality=5.76M'
> ...
> [gw3] PASSED 
> metadata/test_explain.py::TestExplain::test_explain_level0[protocol: beeswax 
> | exec_option: {'batch_size': 0, 'num_nodes': 

[jira] [Commented] (IMPALA-8064) test_min_max_filters is flaky

2019-01-18 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746534#comment-16746534
 ] 

Philip Zeyliger commented on IMPALA-8064:
-

I think runtime_filter_wait_time_ms is the maximum time we'll wait, i.e., it's 
the deadline for the filters to show up. If they don't show up by then, the 
query will proceed without the benefit of the runtime filters.
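
In other words, it's a bounded wait, roughly like this sketch (Python, just 
modeling the semantics; the real implementation is in Impala's C++ backend):
{code:python}
import threading

def wait_for_filters(filters_arrived, wait_time_ms):
  """Block until the runtime filters arrive or the deadline passes; the
  scan proceeds either way, just unfiltered in the timeout case."""
  got_them = filters_arrived.wait(timeout=wait_time_ms / 1000.0)
  if not got_them:
    print("deadline hit; scanning without runtime filters")
  return got_them

# Example: filters arriving 50ms in, well inside a 1000ms deadline.
arrived = threading.Event()
threading.Timer(0.05, arrived.set).start()
print(wait_for_filters(arrived, 1000))  # True: filters arrived in time
{code}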

> test_min_max_filters is flaky 
> --
>
> Key: IMPALA-8064
> URL: https://issues.apache.org/jira/browse/IMPALA-8064
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Pooja Nilangekar
>Assignee: Janaki Lahorani
>Priority: Blocker
>  Labels: broken-build, flaky-test
> Attachments: profile.txt
>
>
> The following configuration of the test_min_max_filters:
> {code:java}
> query_test.test_runtime_filters.TestMinMaxFilters.test_min_max_filters[protocol:
>  beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
> 0} | table_format: kudu/none]{code}
> It produces a higher aggregation of sum over the proberows than expected:
> {code:java}
> query_test/test_runtime_filters.py:113: in test_min_max_filters 
> self.run_test_case('QueryTest/min_max_filters', vector) 
> common/impala_test_suite.py:518: in run_test_case 
> update_section=pytest.config.option.update_results) 
> common/test_result_verifier.py:612: in verify_runtime_profile % 
> (function, field, expected_value, actual_value, actual)) 
> E   AssertionError: Aggregation of SUM over ProbeRows did not match expected 
> results. 
> E   EXPECTED VALUE: 619
> E   ACTUAL VALUE: 652
> {code}
> This test was introduced in the patch for IMPALA-6533. The failure occurred 
> during an ASAN build. 






[jira] [Assigned] (IMPALA-8089) Sporadic upstream jenkins failures with "ERROR in bin/run-all-tests.sh at line 237: pkill -P $TIMEOUT_PID"

2019-01-18 Thread Philip Zeyliger (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger reassigned IMPALA-8089:
---

Assignee: Philip Zeyliger  (was: Bikramjeet Vig)

> Sporadic upstream jenkins failures with "ERROR in bin/run-all-tests.sh at 
> line 237: pkill -P $TIMEOUT_PID"
> --
>
> Key: IMPALA-8089
> URL: https://issues.apache.org/jira/browse/IMPALA-8089
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.2.0
>Reporter: David Knupp
>Assignee: Philip Zeyliger
>Priority: Critical
>
> Example failure at:
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/4113/consoleFull
> {noformat}
> 04:32:09 ERROR in bin/run-all-tests.sh at line 237: pkill -P $TIMEOUT_PID
> 04:32:09 Generated: 
> /home/ubuntu/Impala/logs/extra_junit_xml_logs/generate_junitxml.buildall.run-all-tests.20190115_04_32_09.xml
> 04:32:09 + RET_CODE=1
> {noformat}
> Still looking, but I don't see any other obvious issues right now.






[jira] [Resolved] (IMPALA-8089) Sporadic upstream jenkins failures with "ERROR in bin/run-all-tests.sh at line 237: pkill -P $TIMEOUT_PID"

2019-01-18 Thread Philip Zeyliger (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger resolved IMPALA-8089.
-
Resolution: Fixed

> Sporadic upstream jenkins failures with "ERROR in bin/run-all-tests.sh at 
> line 237: pkill -P $TIMEOUT_PID"
> --
>
> Key: IMPALA-8089
> URL: https://issues.apache.org/jira/browse/IMPALA-8089
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.2.0
>Reporter: David Knupp
>Assignee: Philip Zeyliger
>Priority: Critical
>
> Example failure at:
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/4113/consoleFull
> {noformat}
> 04:32:09 ERROR in bin/run-all-tests.sh at line 237: pkill -P $TIMEOUT_PID
> 04:32:09 Generated: 
> /home/ubuntu/Impala/logs/extra_junit_xml_logs/generate_junitxml.buildall.run-all-tests.20190115_04_32_09.xml
> 04:32:09 + RET_CODE=1
> {noformat}
> Still looking, but I don't see any other obvious issues right now.






[jira] [Commented] (IMPALA-8089) Sporadic upstream jenkins failures with "ERROR in bin/run-all-tests.sh at line 237: pkill -P $TIMEOUT_PID"

2019-01-17 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16745587#comment-16745587
 ] 

Philip Zeyliger commented on IMPALA-8089:
-

See https://gerrit.cloudera.org/#/c/12230/ . It's not a legitimate timeout, I 
believe.

> Sporadic upstream jenkins failures with "ERROR in bin/run-all-tests.sh at 
> line 237: pkill -P $TIMEOUT_PID"
> --
>
> Key: IMPALA-8089
> URL: https://issues.apache.org/jira/browse/IMPALA-8089
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.2.0
>Reporter: David Knupp
>Assignee: Bikramjeet Vig
>Priority: Critical
>
> Example failure at:
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/4113/consoleFull
> {noformat}
> 04:32:09 ERROR in bin/run-all-tests.sh at line 237: pkill -P $TIMEOUT_PID
> 04:32:09 Generated: 
> /home/ubuntu/Impala/logs/extra_junit_xml_logs/generate_junitxml.buildall.run-all-tests.20190115_04_32_09.xml
> 04:32:09 + RET_CODE=1
> {noformat}
> Still looking, but I don't see any other obvious issues right now.






[jira] [Commented] (IMPALA-7992) test_decimal_fuzz.py/test_decimal_ops failing in exhaustive runs

2019-01-08 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737441#comment-16737441
 ] 

Philip Zeyliger commented on IMPALA-7992:
-

Then how can it be fixed by reducing the number of iterations?

> test_decimal_fuzz.py/test_decimal_ops failing in exhaustive runs
> 
>
> Key: IMPALA-7992
> URL: https://issues.apache.org/jira/browse/IMPALA-7992
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: bharath v
>Assignee: Csaba Ringhofer
>Priority: Blocker
>  Labels: broken-build
>
> Error Message
> {noformat}
> query_test/test_decimal_fuzz.py:251: in test_decimal_ops 
> self.execute_one_decimal_op() query_test/test_decimal_fuzz.py:247: in 
> execute_one_decimal_op assert self.result_equals(expected_result, result) E 
> assert  >(Decimal('-0.80'), 
> None) E + where  > = 
> .result_equals
> {noformat}
> Stacktrace
> {noformat}
> query_test/test_decimal_fuzz.py:251: in test_decimal_ops 
> self.execute_one_decimal_op() query_test/test_decimal_fuzz.py:247: in 
> execute_one_decimal_op assert self.result_equals(expected_result, result) E 
> assert  >(Decimal('-0.80'), 
> None) E + where  > = 
> .result_equals
> {noformat}
> stderr
> {noformat}
> -- 2018-12-16 00:10:48,905 INFO MainThread: Started query 
> aa4b44ad5b34c3fb:24d18385
> SET decimal_v2=true;
> -- executing against localhost:21000
> select cast(-879550566.24 as decimal(11,2)) % 
> cast(-100.000 as decimal(28,5));
> -- 2018-12-16 00:10:48,979 INFO MainThread: Started query 
> b24acf22b1607dc6:4f287530
> SET decimal_v2=true;
> -- executing against localhost:21000
> select cast(17179869.184 as decimal(19,7)) / 
> cast(-87808593158000679814.7939232649738916 as decimal(38,17));
> -- 2018-12-16 00:10:49,054 INFO MainThread: Started query 
> 38435f02022e590a:18f7e97
> SET decimal_v2=true;
> -- executing against localhost:21000
> select cast(99 as decimal(32,2)) - 
> cast(-519203.671959101313 as decimal(18,12));
> -- 2018-12-16 00:10:49,132 INFO MainThread: Started query 
> 504edbac7ecb32ce:bfbbbe93
> ~ Stack of  (140061483271936) 
> ~
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/infra/python/env/lib/python2.6/site-packages/execnet/gateway_base.py",
>  line 277, in _perform_spawn
> reply.run()
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/infra/python/env/lib/python2.6/site-packages/execnet/gateway_base.py",
>  line 213, in run
> self._result = func(*args, **kwargs)
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/infra/python/env/lib/python2.6/site-packages/execnet/gateway_base.py",
>  line 954, in _thread_receiver
> msg = Message.from_io(io)
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/infra/python/env/lib/python2.6/site-packages/execnet/gateway_base.py",
>  line 418, in from_io
> header = io.read(9)  # type 1, channel 4, payload 4
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/infra/python/env/lib/python2.6/site-packages/execnet/gateway_base.py",
>  line 386, in read
> data = self._read(numbytes-len(buf))
> {noformat}






[jira] [Commented] (IMPALA-8055) run-tests.py reports tests as passed even if the did not

2019-01-07 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16736703#comment-16736703
 ] 

Philip Zeyliger commented on IMPALA-8055:
-

Are you just running into {{$MAX_PYTEST_FAILURES}} or something else? That's an 
option that configures when to give up. The thorny issue is that sometimes you 
want to run all the tests, but sometimes, if a test crashes Impala, the 
gazillion test failures after the crash aren't particularly useful.

It seems like a test shouldn't have said "passed" if it "failed", and, if you 
know how to reproduce that, let's figure that out. Do you have a repro handy?

> run-tests.py reports tests as passed even if the did not
> 
>
> Key: IMPALA-8055
> URL: https://issues.apache.org/jira/browse/IMPALA-8055
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Paul Rogers
>Priority: Minor
>
> Been mucking about with the EXPLAIN output format which required rebasing a 
> bunch of tests on the new format. PlannerTest is fine: it clearly fails when 
> the expected ".test" files don't match the new "actual" files.
> When run on Jenkins in "pre-review" mode, the build does fail if a Python 
> end-to-end test fails. But, the job seems to give up at that point, not 
> running other tests and finding more problems. (There were three separate 
> test cases that needed fixing; took multiple runs to find them.)
> When run on my dev box, I get the following (highly abbreviated) output:
> {noformat}
> '|  in pipelines: 00(GETNEXT)' != '|  row-size=402B cardinality=5.76M'
> ...
> [gw3] PASSED 
> metadata/test_explain.py::TestExplain::test_explain_level0[protocol: beeswax 
> | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
> 0} | table_format: text/none] 
> ...
>  6 passed in 68.63 seconds =
> {noformat}
> I've learned that "passed" means "maybe failed" and to go back and inspect 
> the actual output to figure out if the test did, indeed, fail. I suspect 
> "passed" means "didn't crash" rather than "tests worked."
> Would be very helpful to plumb the failure through to the summary line so it 
> said "3 passed, 3 failed" or whatever. Would be a huge time-saver.






[jira] [Commented] (IMPALA-7992) test_decimal_fuzz.py/test_decimal_ops failing in exhaustive runs

2019-01-07 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16736521#comment-16736521
 ] 

Philip Zeyliger commented on IMPALA-7992:
-

I think that means our randomness isn't seeded? That means the change to reduce 
iterations lost coverage (as opposed to getting coverage over time), which 
wasn't quite the intent.
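
For illustration, the trade-off looks like this (hypothetical sketch, not the 
actual test code):
{code:python}
import random

# With a fixed seed, every run replays the same cases, so cutting the
# iteration count permanently drops the cases that are never reached.
random.seed(42)
print([random.randint(0, 99) for _ in range(5)])  # identical on every run

# Unseeded (seeded from system entropy), each run draws fresh cases, so
# coverage accumulates across runs even with fewer iterations per run.
random.seed()
print([random.randint(0, 99) for _ in range(5)])  # differs run to run
{code}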



> test_decimal_fuzz.py/test_decimal_ops failing in exhaustive runs
> 
>
> Key: IMPALA-7992
> URL: https://issues.apache.org/jira/browse/IMPALA-7992
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: bharath v
>Assignee: Csaba Ringhofer
>Priority: Blocker
>  Labels: broken-build
>
> Error Message
> {noformat}
> query_test/test_decimal_fuzz.py:251: in test_decimal_ops 
> self.execute_one_decimal_op() query_test/test_decimal_fuzz.py:247: in 
> execute_one_decimal_op assert self.result_equals(expected_result, result) E 
> assert  >(Decimal('-0.80'), 
> None) E + where  > = 
> .result_equals
> {noformat}
> Stacktrace
> {noformat}
> query_test/test_decimal_fuzz.py:251: in test_decimal_ops 
> self.execute_one_decimal_op() query_test/test_decimal_fuzz.py:247: in 
> execute_one_decimal_op assert self.result_equals(expected_result, result) E 
> assert  >(Decimal('-0.80'), 
> None) E + where  > = 
> .result_equals
> {noformat}
> stderr
> {noformat}
> -- 2018-12-16 00:10:48,905 INFO MainThread: Started query 
> aa4b44ad5b34c3fb:24d18385
> SET decimal_v2=true;
> -- executing against localhost:21000
> select cast(-879550566.24 as decimal(11,2)) % 
> cast(-100.000 as decimal(28,5));
> -- 2018-12-16 00:10:48,979 INFO MainThread: Started query 
> b24acf22b1607dc6:4f287530
> SET decimal_v2=true;
> -- executing against localhost:21000
> select cast(17179869.184 as decimal(19,7)) / 
> cast(-87808593158000679814.7939232649738916 as decimal(38,17));
> -- 2018-12-16 00:10:49,054 INFO MainThread: Started query 
> 38435f02022e590a:18f7e97
> SET decimal_v2=true;
> -- executing against localhost:21000
> select cast(99 as decimal(32,2)) - 
> cast(-519203.671959101313 as decimal(18,12));
> -- 2018-12-16 00:10:49,132 INFO MainThread: Started query 
> 504edbac7ecb32ce:bfbbbe93
> ~ Stack of  (140061483271936) 
> ~
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/infra/python/env/lib/python2.6/site-packages/execnet/gateway_base.py",
>  line 277, in _perform_spawn
> reply.run()
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/infra/python/env/lib/python2.6/site-packages/execnet/gateway_base.py",
>  line 213, in run
> self._result = func(*args, **kwargs)
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/infra/python/env/lib/python2.6/site-packages/execnet/gateway_base.py",
>  line 954, in _thread_receiver
> msg = Message.from_io(io)
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/infra/python/env/lib/python2.6/site-packages/execnet/gateway_base.py",
>  line 418, in from_io
> header = io.read(9)  # type 1, channel 4, payload 4
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/infra/python/env/lib/python2.6/site-packages/execnet/gateway_base.py",
>  line 386, in read
> data = self._read(numbytes-len(buf))
> {noformat}






[jira] [Commented] (IMPALA-8025) End-to-end tests sometimes unhelpfully truncate error output

2018-12-28 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730545#comment-16730545
 ] 

Philip Zeyliger commented on IMPALA-8025:
-

{{run-tests.py}} is a wrapper around {{py.test}}. It's likely you can run the 
test directly with {{impala-py.test tests/metadata/test_explain.py -vv -s}} and 
get a bit more of what you want. 

If we want to always pass {{-vv}} to py.test, the invocation is at and around 
https://github.com/apache/impala/blob/master/tests/run-tests.py#L294 . 
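
For illustration, a rough sketch of what that could look like (the wrapper and 
its name are my invention; {{pytest.main}} is the only real API assumed):
{code:python}
import pytest

def run_pytest_verbose(base_args):
  """Hypothetical wrapper: force -vv so assertion diffs aren't truncated,
  then hand off to py.test the way run-tests.py does."""
  args = list(base_args)
  if "-vv" not in args:
    args.append("-vv")
  return pytest.main(args)

# e.g. run_pytest_verbose(["tests/metadata/test_explain.py", "-s"])
{code}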

> End-to-end tests sometimes unhelpfully truncate error output
> 
>
> Key: IMPALA-8025
> URL: https://issues.apache.org/jira/browse/IMPALA-8025
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Paul Rogers
>Priority: Minor
>
> Made a change to the DESCRIBE output per IMPALA-8021. This required 
> adjustment to {{metadata/test_explain.py}} to account for the change. The 
> test encodes a "golden" version in a .test file using a specialized syntax.
> But, when running the test, the output shows the first few lines (which do 
> match), then elides the rest:
> {noformat}
> E row_regex:.*mem-estimate=[0-9.]*[A-Z]*B mem-reservation=[0-9.]*[A-Z]*B 
> thread-reservation=0 == '|  mem-estimate=0B mem-reservation=0B 
> thread-reservation=0'
> E Detailed information truncated (45 more lines), use "-vv" to show
> {noformat}
> As it turns out, passing "-vv" to {{tests/run-tests.py}} does not seem to 
> pass it to the test program, so that did not work.
> The .xml file for the test contains the same message: output is truncated. 
> Same in the .log file.
> So, the question is, how is a developer to figure out the issue if we can't 
> see the actual error lines? This is the kind of thing that converts a simple 
> task into a multi-hour ordeal.
> Right now, the only solution is to rerun the tests with {{--update_results}} 
> flag to {{run-tests.py}}, then hunt down the generated output file.
> Better would be to output the n lines before the error, rather than the first 
> n lines.






[jira] [Closed] (IMPALA-8016) UDFs unable to use other functions defined in same jar

2018-12-27 Thread Philip Zeyliger (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger closed IMPALA-8016.
---
Resolution: Fixed

> UDFs unable to use other functions defined in same jar
> --
>
> Key: IMPALA-8016
> URL: https://issues.apache.org/jira/browse/IMPALA-8016
> Project: IMPALA
>  Issue Type: Task
>  Components: Frontend
>Reporter: Philip Zeyliger
>Assignee: Philip Zeyliger
>Priority: Critical
>
> The fix for IMPALA-7668 introduced a bug wherein a UDF calling other 
> functions within the same jar would fail with errors like:
> {code}
> WARNINGS: UDF WARNING: Hive UDF 
> path=hdfs://localhost:20500/test-warehouse/impala-hive-udfs2.jar 
> class=org.apache.impala.ImportsNearbyClassesUdf failed due to: 
> ImpalaRuntimeException: UDF::evaluate() ran into a problem.
> CAUSED BY: ImpalaRuntimeException: UDF failed to evaluate
> CAUSED BY: InvocationTargetException: null
> CAUSED BY: NoClassDefFoundError: org/apache/impala/UtilForUdf
> CAUSED BY: ClassNotFoundException: org.apache.impala.UtilForUdf
> {code}
> I believe this is caused by over-eagerly closing the associated class loader, 
> which was introduced recently in IMPALA-7668.






[jira] [Commented] (IMPALA-6544) Lack of S3 consistency leads to rare test failures

2018-12-26 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16729196#comment-16729196
 ] 

Philip Zeyliger commented on IMPALA-6544:
-

Perhaps depressingly, {{hadoop fs -put}} triggers 24 HTTP requests to S3 to 
upload a small (in this case, 29-byte) file:
{code}
[root@philip-bb-3 ~]# HADOOP_ROOT_LOGGER=TRACE,console hadoop fs -put z  
s3a:///test$(date +%s) |& grep 'http.wire.*>>.*HTTP/1.1' | grep -n .
1:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-0 >> "HEAD / 
HTTP/1.1[\r][\n]"
2:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD / 
HTTP/1.1[\r][\n]"
3:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824 
HTTP/1.1[\r][\n]"
4:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824/ 
HTTP/1.1[\r][\n]"
5:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "GET 
/?list-type=2&delimiter=%2F&max-keys=1&prefix=test1545858824%2F&fetch-owner=false
 HTTP/1.1[\r][\n]"
6:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "GET 
/?list-type=2&delimiter=%2F&max-keys=1&prefix=&fetch-owner=false 
HTTP/1.1[\r][\n]"
7:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD 
/test1545858824._COPYING_ HTTP/1.1[\r][\n]"
8:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD 
/test1545858824._COPYING_/ HTTP/1.1[\r][\n]"
9:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "GET 
/?list-type=2&delimiter=%2F&max-keys=1&prefix=test1545858824._COPYING_%2F&fetch-owner=false
 HTTP/1.1[\r][\n]"
10:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD 
/test1545858824._COPYING_ HTTP/1.1[\r][\n]"
11:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD 
/test1545858824._COPYING_/ HTTP/1.1[\r][\n]"
12:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "GET 
/?list-type=2&delimiter=%2F&max-keys=1&prefix=test1545858824._COPYING_%2F&fetch-owner=false
 HTTP/1.1[\r][\n]"
13:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD 
/test1545858824._COPYING_ HTTP/1.1[\r][\n]"
14:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD 
/test1545858824._COPYING_/ HTTP/1.1[\r][\n]"
15:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "GET 
/?list-type=2&delimiter=%2F&max-keys=1&prefix=test1545858824._COPYING_%2F&fetch-owner=false
 HTTP/1.1[\r][\n]"
16:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "PUT 
/test1545858824._COPYING_ HTTP/1.1[\r][\n]"
17:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD 
/test1545858824._COPYING_ HTTP/1.1[\r][\n]"
18:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824 
HTTP/1.1[\r][\n]"
19:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824/ 
HTTP/1.1[\r][\n]"
20:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "GET 
/?list-type=2&delimiter=%2F&max-keys=1&prefix=test1545858824%2F&fetch-owner=false
 HTTP/1.1[\r][\n]"
21:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD 
/test1545858824._COPYING_ HTTP/1.1[\r][\n]"
22:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD 
/test1545858824._COPYING_ HTTP/1.1[\r][\n]"
23:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "PUT /test1545858824 
HTTP/1.1[\r][\n]"
24:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "DELETE 
/test1545858824._COPYING_ HTTP/1.1[\r][\n]"
{code}

This includes the "HEAD before PUT" behavior that blows away read-after-write 
consistency. We definitely have some tests that use {{hdfs put}}. We have even 
more that use Impala to write, though, and it's less clear if this is going on 
there. (I've had some trouble getting these http wire logs out.)
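
For reference, the problematic pattern boils down to the following (a boto3 
sketch; the bucket and key names are made up):
{code:python}
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
bucket, key = "some-test-bucket", "test-object"  # hypothetical names

# "HEAD before PUT": probing for existence first downgrades the subsequent
# PUT from read-after-write to eventual consistency, per the S3
# consistency model documented at the time (2018).
try:
  s3.head_object(Bucket=bucket, Key=key)
except ClientError:
  pass  # 404: the key doesn't exist yet
s3.put_object(Bucket=bucket, Key=key, Body=b"tiny file")
# A GET of the key here may now transiently 404 even though the PUT
# succeeded, which is exactly the flakiness this ticket describes.
{code}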

> Lack of S3 consistency leads to rare test failures
> --
>
> Key: IMPALA-6544
> URL: https://issues.apache.org/jira/browse/IMPALA-6544
> Project: IMPALA
>  Issue Type: Task
>  Components: Frontend
>Affects Versions: Impala 2.8.0
>Reporter: Sailesh Mukil
>Priority: Major
>  Labels: S3, broken-build, consistency, flaky, test-framework
>
> Every now and then, we hit a flaky test on S3 runs due to files missing when 
> they should be present, and vice versa. We could consider running our tests 
> (or a subset of our tests) with S3Guard to avoid these problems, however rare 
> they are.






[jira] [Created] (IMPALA-8016) UDFs unable to use other functions defined in same jar

2018-12-26 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-8016:
---

 Summary: UDFs unable to use other functions defined in same jar
 Key: IMPALA-8016
 URL: https://issues.apache.org/jira/browse/IMPALA-8016
 Project: IMPALA
  Issue Type: Task
  Components: Frontend
Reporter: Philip Zeyliger
Assignee: Philip Zeyliger


The fix for IMPALA-7668 introduced a bug wherein a UDF calling other functions 
within the same jar would fail with errors like:
{code}
WARNINGS: UDF WARNING: Hive UDF 
path=hdfs://localhost:20500/test-warehouse/impala-hive-udfs2.jar 
class=org.apache.impala.ImportsNearbyClassesUdf failed due to: 
ImpalaRuntimeException: UDF::evaluate() ran into a problem.
CAUSED BY: ImpalaRuntimeException: UDF failed to evaluate
CAUSED BY: InvocationTargetException: null
CAUSED BY: NoClassDefFoundError: org/apache/impala/UtilForUdf
CAUSED BY: ClassNotFoundException: org.apache.impala.UtilForUdf
{code}

I believe this is caused by the associated class loader being closed too eagerly, a change introduced in the fix for IMPALA-7668.






[jira] [Commented] (IMPALA-6544) Lack of S3 consistency leads to rare test failures

2018-12-21 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16727092#comment-16727092
 ] 

Philip Zeyliger commented on IMPALA-6544:
-

Per 
https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html#ConsistencyModel,
 we should be experiencing read-after-write consistency:
{quote}
Amazon S3 provides read-after-write consistency for PUTS of new objects in your 
S3 bucket in all regions with one caveat. The caveat is that if you make a HEAD 
or GET request to the key name (to find if the object exists) before creating 
the object, Amazon S3 provides eventual consistency for read-after-write.
{quote}

Perhaps we're re-using filenames across runs. Then we'd have effectively issued one of those HEAD/GET requests before the PUT.

Or perhaps HDFS is doing a GET before writing a file. 

I've not yet traced through these paths to figure out if we're hitting that.

> Lack of S3 consistency leads to rare test failures
> --
>
> Key: IMPALA-6544
> URL: https://issues.apache.org/jira/browse/IMPALA-6544
> Project: IMPALA
>  Issue Type: Task
>  Components: Frontend
>Affects Versions: Impala 2.8.0
>Reporter: Sailesh Mukil
>Priority: Major
>  Labels: S3, broken-build, consistency, flaky, test-framework
>
> Every now and then, we hit a flaky test on S3 runs due to files missing when 
> they should be present, and vice versa. We could consider running our tests 
> (or a subset of our tests) with S3Guard to avoid these problems, however rare 
> they are.






[jira] [Created] (IMPALA-7980) High system CPU time usage (and waste) when runtime filters filter out files

2018-12-14 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-7980:
---

 Summary: High system CPU time usage (and waste) when runtime 
filters filter out files
 Key: IMPALA-7980
 URL: https://issues.apache.org/jira/browse/IMPALA-7980
 Project: IMPALA
  Issue Type: Task
Reporter: Philip Zeyliger


When running TPC-DS query 1 on scale factor 10,000 (10TB) on a 140-node cluster 
with {{replica_preference=remote}}, we observed really high system CPU usage 
for some of the scan nodes:
{code}
HDFS_SCAN_NODE (id=6):(Total: 59s107ms, non-child: 59s107ms, % non-
child: 100.00%
- BytesRead: 80.50 MB (84408563)
- ScannerThreadsSysTime: 36m17s
{code}
Using {{perf}}, we discovered a lot of time spent in {{futex_wait}}, {{pthread_cond_wait}}, and so on. (We also used perf to record context switches and cycles.) Interestingly, watching in {{top}}, we saw the really high system CPU usage spike some time into the query.

We believe what's going on is that we start many ScannerThread instances, which wait first until initial ranges have been issued and then grab data using {{impala::io::ScanRange::GetNext()}}. They do this in a loop that acquires two locks per pass, until the query is done or there are no unqueued files left. If {{num_unqueued_files_}} stays above zero, these threads just spin through the two lock acquisitions and do nothing else. We believe this hot loop is eating system CPU aggressively.

It's a bit interesting that this is exacerbated in the case with more remote 
reads. Our best guess is that some of the reads take significantly longer in 
this case, and a single outlier can extend this period of waste.
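
To make the loop shape concrete, here's a minimal sketch (hypothetical names, not Impala's actual scanner code) of the busy-poll pattern described above, next to the condition-variable wait you'd use to avoid it:
{code}
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the scanner-thread loop; not the real code.
struct ScanState {
  std::mutex lock_;
  std::condition_variable range_ready_;
  int num_unqueued_files_ = 0;
  bool done_ = false;

  // Busy variant: two lock acquisitions per pass and nothing else while
  // num_unqueued_files_ stays above zero. This is the hot loop.
  bool PollForRange() {
    while (true) {
      {
        std::lock_guard<std::mutex> l1(lock_);  // lock #1
        if (done_) return false;
      }
      {
        std::lock_guard<std::mutex> l2(lock_);  // lock #2
        if (num_unqueued_files_ == 0) return true;  // all ranges issued
      }
      // Neither condition held: go around again immediately, churning futexes.
    }
  }

  // Blocking variant: park the thread until the state actually changes.
  // Whoever updates done_ or num_unqueued_files_ must notify under lock_.
  bool WaitForRange() {
    std::unique_lock<std::mutex> l(lock_);
    range_ready_.wait(l, [this] { return done_ || num_unqueued_files_ == 0; });
    return !done_;
  }
};
{code}
The busy variant is what we believe we're seeing: each pass takes the locks, observes that nothing has changed, and goes around again, which shows up as futex traffic and system CPU.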










[jira] [Commented] (IMPALA-7962) Figure out how to handle localtime in docker minicluster containers

2018-12-12 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16719294#comment-16719294
 ] 

Philip Zeyliger commented on IMPALA-7962:
-

See the {{test-with-docker.py}} source code (grep for {{localtime}}) for one 
approach. There are some "interesting" behaviors here.

> Figure out how to handle localtime in docker minicluster containers
> ---
>
> Key: IMPALA-7962
> URL: https://issues.apache.org/jira/browse/IMPALA-7962
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>
> The timezone from the host is not inherited by the container - it is instead 
> mapped to /usr/share/zoneinfo/Etc/UTC
> We should figure out if we need to fix this.






[jira] [Commented] (IMPALA-4555) Don't cancel query for failed ReportExecStatus (done=false) RPC

2018-12-10 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16715567#comment-16715567
 ] 

Philip Zeyliger commented on IMPALA-4555:
-

I ran into {{ReportExecStatus request on impala.ControlService from 
172.26.26.40:59030 dropped due to backpressure. The service queue is full; it 
has 2147483647 items.}} recently, which this would have helped with.

If we're saying that {{ReportExecStatus(done=false)}} is less important than 
{{ReportExecStatus(done=true)}}, should we give them separate queues on the 
coordinator side?
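
A toy sketch of the two-queue idea (illustrative names only; this is not the actual KRPC service queue):
{code}
#include <deque>
#include <mutex>

// Final (done=true) reports get their own bounded queue so periodic
// (done=false) reports can't crowd them out.
struct Report { bool done; /* payload elided */ };

class ReportQueues {
 public:
  bool Offer(Report r) {
    std::lock_guard<std::mutex> l(lock_);
    std::deque<Report>& q = r.done ? final_ : periodic_;
    const size_t cap = r.done ? kFinalCap : kPeriodicCap;
    if (q.size() >= cap) return false;  // sender backs off and retries
    q.push_back(r);
    return true;
  }

  // Drain final reports first; they gate query completion.
  bool Take(Report* out) {
    std::lock_guard<std::mutex> l(lock_);
    std::deque<Report>& q = !final_.empty() ? final_ : periodic_;
    if (q.empty()) return false;
    *out = q.front();
    q.pop_front();
    return true;
  }

 private:
  static constexpr size_t kFinalCap = 4096;    // illustrative capacities
  static constexpr size_t kPeriodicCap = 1024;
  std::mutex lock_;
  std::deque<Report> final_, periodic_;
};
{code}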

> Don't cancel query for failed ReportExecStatus (done=false) RPC
> ---
>
> Key: IMPALA-4555
> URL: https://issues.apache.org/jira/browse/IMPALA-4555
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 2.7.0
>Reporter: Sailesh Mukil
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>
> We currently try to send the ReportExecStatus RPC up to 3 times if the first 
> 2 times are unsuccessful - due to high network load or a network partition. 
> If all 3 attempts fail, we cancel the fragment instance and hence the query.
> However, we do not need to cancel the fragment instance if sending the report 
> with _done=false_ failed. We can just skip this turn and try again the next 
> time.
> We could probably skip sending the report up to 2 times (if we're unable to 
> send due to high network load and if done=false) before succumbing to the 
> current behavior, which is to cancel the fragment instance. The point is to 
> try at a later time when the network load may be lower rather than try 
> quickly again. The chance that the network load would reduce in 100 ms is 
> less than in 5s.
> Also, we probably do not need to have the retry logic unless we've already 
> skipped twice or if done=true.
> This could help reduce the network load on the coordinator for highly 
> concurrent workloads.
> The only drawback I see now is that the QueryExecSummary might be stale for a 
> while (which it would have anyway because the RPCs would have failed to send)
> P.S: This above proposed solution may need to change if we go ahead with 
> IMPALA-2990.






[jira] [Commented] (IMPALA-7928) Investigate consistent placement of remote scan ranges

2018-12-05 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16710544#comment-16710544
 ] 

Philip Zeyliger commented on IMPALA-7928:
-

I'm interested in the results even in the currently common case of the number 
of nodes not changing, but I agree that we'll eventually want more stability 
than that. 

> Investigate consistent placement of remote scan ranges
> --
>
> Key: IMPALA-7928
> URL: https://issues.apache.org/jira/browse/IMPALA-7928
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Joe McDonnell
>Priority: Major
>
> With the file handle cache, it is useful for repeated scans of the same file 
> to go to the same node, as that node will already have a file handle cached.
> When scheduling remote ranges, the scheduler introduces randomness that can 
> spread reads across all of the nodes. Repeated executions of queries on the 
> same set of files will not schedule the remote reads on the same nodes. This 
> causes a large amount of duplication across file handle caches on different 
> nodes. This reduces the efficiency of the cache significantly.
> It may be useful for the scheduler to introduce some determinism in 
> scheduling remote reads to take advantage of the file handle cache. This is a 
> variation on the well-known tradeoff between skew and locality.
>  






[jira] [Commented] (IMPALA-7928) Investigate consistent placement of remote scan ranges

2018-12-04 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709460#comment-16709460
 ] 

Philip Zeyliger commented on IMPALA-7928:
-

We're roughly proposing to change the following bit so that, instead of picking the least-used executor from the heap, we pick the least-loaded of, say, five executors, determined by five distinct hashes of the filename (see the sketch after the code). That should have limited skew (though we've not worked out a model for it) while still letting the file handle cache work well as the cluster grows.

{code}
const IpAddr* Scheduler::AssignmentCtx::SelectRemoteExecutor() {
  const IpAddr* candidate_ip;
  if (HasUnusedExecutors()) {
    // Pick next unused executor.
    candidate_ip = GetNextUnusedExecutorAndIncrement();
  } else {
    // Pick next executor from assignment_heap. All executors must have been
    // inserted into the heap at this point.
    DCHECK_GT(executors_config_.NumBackends(), 0);
    DCHECK_EQ(executors_config_.NumBackends(), assignment_heap_.size());
    candidate_ip = &(assignment_heap_.top().ip);
  }
  DCHECK(candidate_ip != nullptr);
  return candidate_ip;
}
{code}
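
A rough sketch of the proposed selection (the {{assigned_bytes}} callback and all names here are hypothetical; the real change would reuse the scheduler's existing hashing and assignment bookkeeping):
{code}
#include <cstdint>
#include <functional>
#include <string>

// "Least-loaded of k hash-determined candidates" for one filename.
int SelectRemoteExecutorForFile(const std::string& filename, int num_executors,
    const std::function<int64_t(int)>& assigned_bytes, int k = 5) {
  int best = -1;
  int64_t best_load = INT64_MAX;
  for (int i = 0; i < k; ++i) {
    // k distinct hashes of the filename, each naming one candidate executor.
    size_t h = std::hash<std::string>()(filename + '#' + std::to_string(i));
    int candidate = static_cast<int>(h % num_executors);
    int64_t load = assigned_bytes(candidate);
    if (load < best_load) {
      best_load = load;
      best = candidate;
    }
  }
  return best;
}
{code}
This is essentially the power-of-d-choices trick: the candidate set is stable for a given filename, so repeated scans of a file mostly land on the same executor and its cached handle, while taking the least-loaded of the five keeps skew bounded.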

> Investigate consistent placement of remote scan ranges
> --
>
> Key: IMPALA-7928
> URL: https://issues.apache.org/jira/browse/IMPALA-7928
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Joe McDonnell
>Priority: Major
>
> With the file handle cache, it is useful for repeated scans of the same file 
> to go to the same node, as that node will already have a file handle cached.
> When scheduling remote ranges, the scheduler introduces randomness that can 
> spread reads across all of the nodes. Repeated executions of queries on the 
> same set of files will not schedule the remote reads on the same nodes. This 
> causes a large amount of duplication across file handle caches on different 
> nodes. This reduces the efficiency of the cache significantly.
> It may be useful for the scheduler to introduce some determinism in 
> scheduling remote reads to take advantage of the file handle cache. This is a 
> variation on the well-known tradeoff between skew and locality.
>  






[jira] [Commented] (IMPALA-7825) Upgrade Thrift version to 0.11.0

2018-12-03 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16707816#comment-16707816
 ] 

Philip Zeyliger commented on IMPALA-7825:
-

We can revisit this as, "Let's provide Thrift 0.11 Python generated code as well." That shouldn't encounter as much version resistance.

> Upgrade Thrift version to 0.11.0
> 
>
> Key: IMPALA-7825
> URL: https://issues.apache.org/jira/browse/IMPALA-7825
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: Lars Volker
>Assignee: Sahil Takiar
>Priority: Major
>  Labels: performance
>
> Thrift has added performance improvements to its Python deserialization code. 
> We should upgrade to 0.11.0 to make use of those.






[jira] [Commented] (IMPALA-6293) Shell commands run by Impala can fail when using the Java debugger

2018-11-08 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16680024#comment-16680024
 ] 

Philip Zeyliger commented on IMPALA-6293:
-

Be aware that the way we initialize our JVM is via HDFS's libhdfs code, and {{JAVA_TOOL_OPTIONS}} is (perhaps implicitly) picked up from there. For our own subprocesses, we should probably be scrubbing the environment.
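
For instance, a sketch of scrubbing before running one of the helper commands (illustrative only; a real fix should save and restore the variable, or build an explicit child environment for posix_spawn, instead of mutating the process env):
{code}
#include <cstdlib>
#include <string>

// Runs a helper command with JVM debug options stripped, so a child JVM
// won't try to bind the same debugger port as our embedded JVM.
int RunCommandScrubbed(const std::string& cmd) {
  unsetenv("JAVA_TOOL_OPTIONS");  // children of system() inherit our env
  return std::system(cmd.c_str());
}
{code}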

> Shell commands run by Impala can fail when using the Java debugger
> --
>
> Key: IMPALA-6293
> URL: https://issues.apache.org/jira/browse/IMPALA-6293
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Joe McDonnell
>Priority: Major
>
> Impala has several parameters that specify shell commands for Impala to run:
> s3a_access_key_cmd
> s3a_secret_key_cmd
> ssl_private_key_password_cmd
> webserver_private_key_password_cmd
> When debugging the JVM inside the Impala process, it is useful to specify 
> JAVA_TOOL_OPTIONS to run the Java debugger on a particular port. However, 
> JAVA_TOOL_OPTIONS remains in the environment, so it is passed to these shell 
> commands. If any of these shell commands run java, then that JVM will attempt 
> to use the JAVA_TOOL_OPTIONS specified and thus try to bind to the same port. 
> The Impala process JVM is already bound to that port, so this will fail. 
> Several of these commands run at startup, so Impala will fail to startup with 
> the Java debugger.
> Impala should be careful about the environment variables that get passed to 
> these shell programs. In particular, JAVA_TOOL_OPTIONS should be scrubbed of 
> any Java debugger configuration to avoid these port conflicts. It might be 
> best to simply null out JAVA_TOOL_OPTIONS for these commands.






[jira] [Commented] (IMPALA-7799) Store periodic snapshots of Impalad metrics

2018-11-01 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672296#comment-16672296
 ] 

Philip Zeyliger commented on IMPALA-7799:
-

Great; makes sense.

> Store periodic snapshots of Impalad metrics
> ---
>
> Key: IMPALA-7799
> URL: https://issues.apache.org/jira/browse/IMPALA-7799
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.0, Impala 2.12.0, Impala 3.1.0
>Reporter: Michael Ho
>Priority: Critical
>
> Currently, each Impala demon exposes a set of metrics exposed via the debug 
> webpage metrics page. While this may be very helpful for development, there 
> are many incidents in which one may want to record these metrics to do 
> postmortem analysis for various issues.
> We should consider taking periodic snapshots of these Impalad metrics and 
> archive them for a certain retention period to allow for postmortem analysis. 
> We need to be mindful of the space usage concern (e.g. using compressed json 
> ?)
> To enable easier analysis, one may need to build a tool (or use some existing 
> off-the-shell libraries) to show the collected snapshots as time series and 
> calculate various statistics (e.g. mean, median, min, max etc).






[jira] [Commented] (IMPALA-7799) Store periodic snapshots of Impalad metrics

2018-11-01 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672291#comment-16672291
 ] 

Philip Zeyliger commented on IMPALA-7799:
-

I've used a handful of systems that dump these to logs. In the absence of fancier infrastructure, I suspect a thread that dumps the metrics JSON to the log file every minute would be basically OK.
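
Something like this sketch, where {{RenderMetricsJson()}} is a hypothetical stand-in for whatever already backs the /metrics debug page:
{code}
#include <chrono>
#include <fstream>
#include <string>
#include <thread>

std::string RenderMetricsJson();  // assumed to exist elsewhere

// Append one JSON snapshot per line, once a minute. One line per snapshot
// keeps the output grep-able and easy to turn into a time series later.
void MetricsSnapshotLoop(const std::string& log_path) {
  while (true) {
    std::this_thread::sleep_for(std::chrono::minutes(1));
    std::ofstream log(log_path, std::ios::app);
    log << RenderMetricsJson() << '\n';
  }
}

// e.g. at startup:
// std::thread(MetricsSnapshotLoop, "/var/log/impalad/metrics.jsonl").detach();
{code}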

> Store periodic snapshots of Impalad metrics
> ---
>
> Key: IMPALA-7799
> URL: https://issues.apache.org/jira/browse/IMPALA-7799
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.0, Impala 2.12.0, Impala 3.1.0
>Reporter: Michael Ho
>Priority: Critical
>
> Currently, each Impala demon exposes a set of metrics exposed via the debug 
> webpage metrics page. While this may be very helpful for development, there 
> are many incidents in which one may want to record these metrics to do 
> postmortem analysis for various issues.
> We should consider taking periodic snapshots of these Impalad metrics and 
> archive them for a certain retention period to allow for postmortem analysis. 
> We need to be mindful of the space usage concern (e.g. using compressed json 
> ?)
> To enable easier analysis, one may need to build a tool (or use some existing 
> off-the-shell libraries) to show the collected snapshots as time series and 
> calculate various statistics (e.g. mean, median, min, max etc).






[jira] [Commented] (IMPALA-7795) Add a command to refresh authorization data

2018-11-01 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672276#comment-16672276
 ] 

Philip Zeyliger commented on IMPALA-7795:
-

As an explicit mechanism, I'm fine with adding this. That said, if the case is 
very common, we need a mechanism that doesn't require manual intervention, no?

> Add a command to refresh authorization data
> ---
>
> Key: IMPALA-7795
> URL: https://issues.apache.org/jira/browse/IMPALA-7795
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Catalog, Frontend
>Reporter: Fredy Wijaya
>Priority: Major
>
> There is currently no way to refresh authorization data (privileges and 
> principals) without calling "invalidate metadata" (which is very costly) and 
> increasing the polling time. There needs to be a way to explicitly refresh 
> the authorization data similar to "refresh functions".






[jira] [Commented] (IMPALA-7795) Add a command to refresh authorization data

2018-10-31 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670501#comment-16670501
 ] 

Philip Zeyliger commented on IMPALA-7795:
-

Note that we expose a lot of these already (refresh, invalidate, etc.) and 
they're hugely confusing.

> Add a command to refresh authorization data
> ---
>
> Key: IMPALA-7795
> URL: https://issues.apache.org/jira/browse/IMPALA-7795
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Catalog, Frontend
>Reporter: Fredy Wijaya
>Priority: Major
>
> There is currently no way to refresh authorization data (privileges and 
> principals) without calling "invalidate metadata" (which is very costly) and 
> increasing the polling time. There needs to be a way to explicitly refresh 
> the authorization data similar to "refresh functions".






[jira] [Commented] (IMPALA-7772) Print message when open file limit restricts size of file handle cache

2018-10-30 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669428#comment-16669428
 ] 

Philip Zeyliger commented on IMPALA-7772:
-

It seems like Impala will open files (e.g., jars, log files, libraries) outside of the file handle cache, so we should reserve some portion of the OS limit for those as well.
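
A sketch of the sizing with such a reserve (the constant is made up, not a tuned recommendation):
{code}
#include <sys/resource.h>

#include <algorithm>
#include <cstdint>

// Size the cache against RLIMIT_NOFILE, holding back a reserve for
// everything else that needs a descriptor (jars, logs, .so files, sockets).
int64_t ComputeFileHandleCacheCapacity(int64_t max_cached_file_handles) {
  constexpr int64_t kReservedHandles = 512;  // illustrative
  rlimit lim;
  if (getrlimit(RLIMIT_NOFILE, &lim) != 0) return max_cached_file_handles;
  int64_t os_limit = static_cast<int64_t>(lim.rlim_cur);
  return std::min(max_cached_file_handles,
                  std::max<int64_t>(0, os_limit - kReservedHandles));
}
{code}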

> Print message when open file limit restricts size of file handle cache
> --
>
> Key: IMPALA-7772
> URL: https://issues.apache.org/jira/browse/IMPALA-7772
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
>
> The size of the file handle cache is determined by the 
> min(max_cached_file_handles, OS limit on number of open files). Right now, 
> there is no message printed if the OS limit on the number of open files is 
> less than max_cached_file_handles. This is confusing, because the file handle 
> cache will be smaller than expected. We should print a message on startup in 
> this case.






[jira] [Commented] (IMPALA-7785) GROUP BY clause not analyzed prior to rewrite step

2018-10-30 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669417#comment-16669417
 ] 

Philip Zeyliger commented on IMPALA-7785:
-

Yep, that makes sense, and it's definitely a bug. {{GROUP BY 1}} does seem to work, which isn't totally surprising.

If you have access to customer behavior, it'd be interesting to see whether people set "EXPR_REWRITES=0" to work around these issues; that would be a barometer of priority.

> GROUP BY clause not analyzed prior to rewrite step
> --
>
> Key: IMPALA-7785
> URL: https://issues.apache.org/jira/browse/IMPALA-7785
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Priority: Minor
>
> The FE fails to analyze a {{GROUP BY}} clause prior to invoking the rewrite 
> rules, causing the rules to fail to do any rewrites.
> For the {{SELECT}} list, the analyzer processes each expression and marks it 
> as analyzed.
> The rewrite rules, however, tend to skip unanalyzed nodes. (And, according to 
> IMPALA-7754, often are not re-analyzed after a rewrite.)
> Consider this simple query:
> {code:sql}
> SELECT case when string_col is not null then string_col else 'foo' end
> 
> FROM functional.alltypestiny 
> GROUP BY case when string_col is not null then string_col else 'foo' end  
>
> {code}
> This query works. Now, using the new feature in IMPALA-7655 with a query that 
> will be rewritten to the above:
> {code:sql}
> SELECT coalesce(string_col, 'foo')
> FROM functional.alltypes  
> GROUP BY coalesce(string_col, 'foo') 
> {code}
> The above is rewritten using the new conditional function rewrite rules. 
> Result:
> {noformat}
> org.apache.impala.common.AnalysisException:
>   select list expression not produced by aggregation output
>   (missing from GROUP BY clause?):
>   CASE WHEN string_col IS NOT NULL THEN string_col ELSE 'foo' END
> {noformat}
> The reason is the check used in multiple rewrite rules:
> {code:java}
>   public Expr apply(Expr expr, Analyzer analyzer) throws AnalysisException {  
> 
> if (!expr.isAnalyzed()) return expr;  
> 
> {code}
> Step though the code. The {{coalesce()}} expression in the {{SELECT}} clause 
> is analyzed, the one in the {{GROUP BY}} is not. This creates a problem 
> because SQL semantics require the identical expression in both clause for 
> them to match. (It also means no other rewrite rules, at least not those with 
> this check, are invoked, leading to an unintended code path.)
> This query makes it a bit clearer:
> {code:sql}
> SELECT 1 + 2
> FROM functional.alltypestiny
> GROUP BY 1 + 2
> {code}
> This works. But, if we use test code to inspect the "rewritten" {{GROUP BY}}, 
> we find that it is still at "1 + 2" while the {{SELECT}} expression has been 
> rewritten to "3".
> Seems that, when working with rewrites, we must be very careful because, as 
> the code currently is written, we rewrite some clauses but not others. Then, 
> we have to know when it is safe to have the SELECT clause differ from the 
> GROUP BY clause. (Looks like it is OK for constants to differ, but not for 
> functions...)
> VERY confusing, would be better to just fix the darn thing.






[jira] [Commented] (IMPALA-7787) python26-incompatibility-check failed because of docker 503 Service Unavailable

2018-10-30 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669344#comment-16669344
 ] 

Philip Zeyliger commented on IMPALA-7787:
-

I added a retry loop (5 retries; sleep 30 seconds between) and separated out 
the "docker pull". We're dependent on the Docker public repos here, but perhaps 
their windows of failure are small enough that this will help.

> python26-incompatibility-check failed because of docker 503 Service 
> Unavailable
> ---
>
> Key: IMPALA-7787
> URL: https://issues.apache.org/jira/browse/IMPALA-7787
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Tim Armstrong
>Assignee: Philip Zeyliger
>Priority: Major
>  Labels: flaky
>
> https://jenkins.impala.io/job/python26-incompatibility-check/529
> https://jenkins.impala.io/job/python26-incompatibility-check/528
> {noformat}
> 15:50:37 Initialized empty Git repository in /tmp/tmp.MKJUMZ3SBi/.git/
> 15:50:37 + git fetch http://gerrit.cloudera.org:8080/Impala-ASF 
> refs/changes/00/11800/5
> 15:50:54 From http://gerrit.cloudera.org:8080/Impala-ASF
> 15:50:54  * branch refs/changes/00/11800/5 -> FETCH_HEAD
> 15:50:54 + git archive --prefix=impala/ -o /tmp/impala.tar FETCH_HEAD
> 15:50:54 + docker run -u nobody -v /tmp/impala.tar:/tmp/impala.tar centos:6 
> bash -o pipefail -c 'cd /tmp; python -c '\''import 
> tarfile;tarfile.TarFile("/tmp/impala.tar").extractall()'\''; python -m 
> compileall /tmp/impala'
> 15:50:54 Unable to find image 'centos:6' locally
> 15:50:55 docker: Error response from daemon: Get 
> https://registry-1.docker.io/v2/library/centos/manifests/6: received 
> unexpected HTTP status: 503 Service Unavailable.
> 15:50:55 See 'docker run --help'.
> 15:50:55 Build step 'Execute shell' marked build as failure
> 15:50:55 Set build name.
> 15:50:55 New build name is '#529 refs/changes/00/11800/5'
> 15:50:57 Finished: FAILURE
> {noformat}
> This happened a couple of times. Looks like flakiness but unsure if it was 
> just a transient infra issue or something we're doing wrong.






[jira] [Commented] (IMPALA-7785) GROUP BY clause cannot contain a CASE statement

2018-10-30 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669216#comment-16669216
 ] 

Philip Zeyliger commented on IMPALA-7785:
-

Hi Paul!

I tried this on a recent (two weeks old?) build and didn't have any trouble:
{code}
[localhost:21000] default> select CASE WHEN string_col IS NOT NULL THEN 
string_col ELSE 'foo' END from functional.alltypes group by CASE WHEN 
string_col IS NOT NULL THEN string_col ELSE 'foo' END limit 1;
Query: select CASE WHEN string_col IS NOT NULL THEN string_col ELSE 'foo' END 
from functional.alltypes group by CASE WHEN string_col IS NOT NULL THEN 
string_col ELSE 'foo' END limit 1
Query submitted at: 2018-10-30 12:10:49 (Coordinator: http://12987a4df8d5:25000)
Query progress can be monitored at: 
http://12987a4df8d5:25000/query_plan?query_id=6346330a763f4efb:87736383
+------------------------------------------------------------------+
| case when string_col is not null then string_col else 'foo' end  |
+------------------------------------------------------------------+
| 8                                                                |
+------------------------------------------------------------------+
Fetched 1 row(s) in 5.06s
{code}

Are you sure this isn't happening because of a change you have on your branch?

> GROUP BY clause cannot contain a CASE statement
> ---
>
> Key: IMPALA-7785
> URL: https://issues.apache.org/jira/browse/IMPALA-7785
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Priority: Minor
>
> The FE cannot handle a {{CASE}} statement in a {{GROUP BY}} clause. As a 
> result, the change in IMPALA-7655 cannot be applied to queries with such a 
> clause for fear of ending up in the situation shown later.
> Consider this simple query:
> {code:sql}
> SELECT coalesce(string_col, 'foo')
> FROM functional.alltypes  
> GROUP BY coalesce(string_col, 'foo') 
> {code}
> The above will fail with the following:
> {noformat}
>  org.apache.impala.common.AnalysisException:
>  select list expression not produced by aggregation output
>  (missing from GROUP BY clause?):
>  CASE WHEN string_col IS NOT NULL THEN string_col ELSE 'foo' END
> {noformat}
> This then causes the rewrites in IMPALA-7655 to fail:
> {code:sql}
> SELECT coalesce(string_col, 'foo')
> FROM functional.alltypes  
> GROUP BY coalesce(string_col, 'foo') 
> {code}
> The above is rewritten using the new conditional function rewrite rules. 
> Result:
> {noformat}
> org.apache.impala.common.AnalysisException:
>   select list expression not produced by aggregation output
>   (missing from GROUP BY clause?):
>   CASE WHEN string_col IS NOT NULL THEN string_col ELSE 'foo' END
> {noformat}






[jira] [Updated] (IMPALA-7730) Improve ORC File Format Timezone issues

2018-10-23 Thread Philip Zeyliger (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger updated IMPALA-7730:

Attachment: orc.zip

> Improve ORC File Format Timezone issues
> ---
>
> Key: IMPALA-7730
> URL: https://issues.apache.org/jira/browse/IMPALA-7730
> Project: IMPALA
>  Issue Type: Task
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Philip Zeyliger
>Priority: Major
> Attachments: orc.zip
>
>
> As pointed out in https://gerrit.cloudera.org/#/c/11731 by [~csringhofer], 
> our support for the ORC file format doesn't follow the same timezone 
> conventions as the rest of Impala.
> {quote}
> tldr: ORC's timezone handling is likely to be broken in Impala so we should 
> patch it in the toolchain
> The ORC library implements its own IANA timezone handling to convert stored 
> timestamps from UTC to local time + do something similar for min/max stats. 
> The writer's timezone can be also stored in .orc files and used instead of 
> local timezone.
> Impala's and ORC library's timezone can be different for several reasons:
> - ORC's timezone is not overridden by env var TZ or the query option timezone
> - ORC uses a simpler way to detect the local timezone, which may not work on some Linux distros (see TimezoneDatabase::LocalZoneName in Impala vs LOCAL_TIMEZONE in Orc)
> - .orc files can use any time zone as the writer's timezone, and we cannot be sure that it will exist on the reader machine
> My suggestion is to patch the ORC library in the toolchain and remove 
> timezone handling (e.g. by always using UTC, maybe depending on a flag), as 
> the way it is currently working is likely to be broken and is surely not 
> consistent with the rest of Impala.
> I am not sure how timezones could be handled correctly in Orc + Impala. If 
> someone plans to work on it, I would gladly help in the integration to Impala.
> {quote}






[jira] [Commented] (IMPALA-7730) Improve ORC File Format Timezone issues

2018-10-23 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660743#comment-16660743
 ] 

Philip Zeyliger commented on IMPALA-7730:
-

I've attached the {{alltypes_orc_def}} directory from HDFS which exhibits this 
problem for me.

> Improve ORC File Format Timezone issues
> ---
>
> Key: IMPALA-7730
> URL: https://issues.apache.org/jira/browse/IMPALA-7730
> Project: IMPALA
>  Issue Type: Task
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Philip Zeyliger
>Priority: Major
> Attachments: orc.zip
>
>
> As pointed out in https://gerrit.cloudera.org/#/c/11731 by [~csringhofer], 
> our support for the ORC file format doesn't follow the same timezone 
> conventions as the rest of Impala.
> {quote}
> tldr: ORC's timezone handling is likely to be broken in Impala so we should 
> patch it in the toolchain
> The ORC library implements its own IANA timezone handling to convert stored 
> timestamps from UTC to local time + do something similar for min/max stats. 
> The writer's timezone can be also stored in .orc files and used instead of 
> local timezone.
> Impala's and ORC library's timezone can be different for several reasons:
> - ORC's timezone is not overridden by env var TZ or the query option timezone
> - ORC uses a simpler way to detect the local timezone, which may not work on some Linux distros (see TimezoneDatabase::LocalZoneName in Impala vs LOCAL_TIMEZONE in Orc)
> - .orc files can use any time zone as the writer's timezone, and we cannot be sure that it will exist on the reader machine
> My suggestion is to patch the ORC library in the toolchain and remove 
> timezone handling (e.g. by always using UTC, maybe depending on a flag), as 
> the way it is currently working is likely to be broken and is surely not 
> consistent with the rest of Impala.
> I am not sure how timezones could be handled correctly in Orc + Impala. If 
> someone plans to work on it, I would gladly help in the integration to Impala.
> {quote}






[jira] [Commented] (IMPALA-7655) Codegen output for conditional functions (if,isnull, coalesce) is very suboptimal

2018-10-22 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16659783#comment-16659783
 ] 

Philip Zeyliger commented on IMPALA-7655:
-

[~Paul.Rogers],

That seems like a long list! Are we confident that, for queries like the stats one, converting IF to CASE is going to be a performance win? You might also want to break this out into smaller reviews for your reviewers' sanity.

> Codegen output for conditional functions (if,isnull, coalesce) is very 
> suboptimal
> -
>
> Key: IMPALA-7655
> URL: https://issues.apache.org/jira/browse/IMPALA-7655
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: codegen, perf, performance
>
> https://gerrit.cloudera.org/#/c/11565/ provided a clue that an aggregation 
> involving an if() function was very slow, 10x slower than the equivalent 
> version using a case:
> {noformat}
> [localhost:21000] default> set num_nodes=1; set mt_dop=1; select count(case 
> when l_orderkey is NULL then 1 else NULL end) from 
> tpch10_parquet.lineitem;summary;
> NUM_NODES set to 1
> MT_DOP set to 1
> Query: select count(case when l_orderkey is NULL then 1 else NULL end) from 
> tpch10_parquet.lineitem
> Query submitted at: 2018-10-04 11:17:31 (Coordinator: 
> http://tarmstrong-box:25000)
> Query progress can be monitored at: 
> http://tarmstrong-box:25000/query_plan?query_id=274b2a6f35cefe31:95a19642
> +--+
> | count(case when l_orderkey is null then 1 else null end) |
> +--+
> | 0|
> +--+
> Fetched 1 row(s) in 0.51s
> +--++--+--+++--+---+-+
> | Operator | #Hosts | Avg Time | Max Time | #Rows  | Est. #Rows | Peak 
> Mem | Est. Peak Mem | Detail  |
> +--++--+--+++--+---+-+
> | 01:AGGREGATE | 1  | 44.03ms  | 44.03ms  | 1  | 1  | 25.00 
> KB | 10.00 MB  | FINALIZE|
> | 00:SCAN HDFS | 1  | 411.57ms | 411.57ms | 59.99M | -1 | 16.61 
> MB | 88.00 MB  | tpch10_parquet.lineitem |
> +--++--+--+++--+---+-+
> [localhost:21000] default> set num_nodes=1; set mt_dop=1; select 
> count(if(l_orderkey is NULL, 1, NULL)) from tpch10_parquet.lineitem;summary;
> NUM_NODES set to 1
> MT_DOP set to 1
> Query: select count(if(l_orderkey is NULL, 1, NULL)) from 
> tpch10_parquet.lineitem
> Query submitted at: 2018-10-04 11:23:07 (Coordinator: 
> http://tarmstrong-box:25000)
> Query progress can be monitored at: 
> http://tarmstrong-box:25000/query_plan?query_id=8e46ab1b84c4dbff:2786ca26
> ++
> | count(if(l_orderkey is null, 1, null)) |
> ++
> | 0  |
> ++
> Fetched 1 row(s) in 1.01s
> +--++--+--+++--+---+-+
> | Operator | #Hosts | Avg Time | Max Time | #Rows  | Est. #Rows | Peak 
> Mem | Est. Peak Mem | Detail  |
> +--++--+--+++--+---+-+
> | 01:AGGREGATE | 1  | 422.07ms | 422.07ms | 1  | 1  | 25.00 
> KB | 10.00 MB  | FINALIZE|
> | 00:SCAN HDFS | 1  | 511.13ms | 511.13ms | 59.99M | -1 | 16.61 
> MB | 88.00 MB  | tpch10_parquet.lineitem |
> +--++--+--+++--+---+-+
> {noformat}
> It turns out that this is because we don't have good codegen support for 
> ConditionalFunction, and just fall back to emitting a call to the interpreted 
> path: 
> https://github.com/apache/impala/blob/master/be/src/exprs/conditional-functions.cc#L28
> See CaseExpr for an example of much better codegen support: 
> https://github.com/apache/impala/blob/master/be/src/exprs/case-expr.cc#L178






[jira] [Commented] (IMPALA-7738) Implement timeouts for HDFS calls

2018-10-22 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16659622#comment-16659622
 ] 

Philip Zeyliger commented on IMPALA-7738:
-

Hadoop should be exposing settings for all the timeouts. 
{{ipc.client.rpc-timeout.ms}} is one such. See also IMPALA-6189 for a watchdog 
approach.

(I whole-heartedly agree that we should be timing out.)
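
For reference, the generic watchdog-wrapper shape looks something like this sketch (a hypothetical helper, not existing Impala code; the known caveat is that a timed-out call keeps occupying its worker thread):
{code}
#include <chrono>
#include <future>
#include <memory>
#include <stdexcept>
#include <thread>

// Run fn() on a detached thread so the caller can give up without joining
// it. The stuck call itself keeps running; this only unblocks the caller so
// the query can fail instead of hanging.
template <typename Result, typename Fn>
Result CallWithTimeout(Fn fn, std::chrono::milliseconds timeout) {
  auto task = std::make_shared<std::packaged_task<Result()>>(std::move(fn));
  std::future<Result> fut = task->get_future();
  std::thread([task] { (*task)(); }).detach();
  if (fut.wait_for(timeout) == std::future_status::timeout) {
    throw std::runtime_error("call timed out");
  }
  return fut.get();
}

// e.g.: auto n = CallWithTimeout<int>([&] { return ReadFromHdfs(buf); },
//     std::chrono::seconds(30));  // ReadFromHdfs is a made-up stand-in
{code}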

> Implement timeouts for HDFS calls
> -
>
> Key: IMPALA-7738
> URL: https://issues.apache.org/jira/browse/IMPALA-7738
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, 
> Impala 2.11.0, Impala 3.0, Impala 2.12.0
>Reporter: Michael Ho
>Priority: Critical
>
> Currently, there is no timeout with the various HDFS calls (e.g. hdfsOpen(), 
> hdfsRead()) we made in libhdfs.so in either the disk-io-mgr thread or scanner 
> thread context. Various users of Impala have complaint in the past about hung 
> queries which eventually boiled down to stuck hdfs calls. HDFS maintainers 
> have been slow to find the root cause of those hangs. To make this kind of 
> stuck queries problem easier to identify in the future, we should just 
> enforce a timeout in various hdfs calls so the queries will fail when certain 
> HDFS calls take longer than a designated timeout period.
> There may be multiple layers which this timeout can be enforced:
>  * at Impala level, we can have a fixed sized thread pool which handles all 
> hdfs calls. The existing hdfs calls will be a wrapper with a timeout.
>  * at libhdfs.so, enforce a timeout at places in the HDFS client code which 
> may block forever.
> The second option is probably beyond the charter of Apache Impala project.
> cc'ing [~tarmstr...@cloudera.com], [~joemcdonnell]






[jira] [Commented] (IMPALA-7655) Codegen output for conditional functions (if,isnull, coalesce) is very suboptimal

2018-10-19 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657540#comment-16657540
 ] 

Philip Zeyliger commented on IMPALA-7655:
-

I think I steered you a bit wrong: codegen is probably enabled in debug builds 
(though there's a query option to disable it which sometimes kicks in!). The 
thing I was remembering is this:
{code}
#ifndef NDEBUG
  // For debug builds, don't generate JIT compiled optimized assembly.
  // This takes a non-neglible amount of time (~.5 ms per function) and
  // blows up the fe tests (which take ~10-20 ms each).
  opt_level = llvm::CodeGenOpt::None;
#endif
{code}
That is to say, we skip LLVM optimization in debug builds.

> Codegen output for conditional functions (if,isnull, coalesce) is very 
> suboptimal
> -
>
> Key: IMPALA-7655
> URL: https://issues.apache.org/jira/browse/IMPALA-7655
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: codegen, perf, performance
>
> https://gerrit.cloudera.org/#/c/11565/ provided a clue that an aggregation 
> involving an if() function was very slow, 10x slower than the equivalent 
> version using a case:
> {noformat}
> [localhost:21000] default> set num_nodes=1; set mt_dop=1; select count(case 
> when l_orderkey is NULL then 1 else NULL end) from 
> tpch10_parquet.lineitem;summary;
> NUM_NODES set to 1
> MT_DOP set to 1
> Query: select count(case when l_orderkey is NULL then 1 else NULL end) from 
> tpch10_parquet.lineitem
> Query submitted at: 2018-10-04 11:17:31 (Coordinator: 
> http://tarmstrong-box:25000)
> Query progress can be monitored at: 
> http://tarmstrong-box:25000/query_plan?query_id=274b2a6f35cefe31:95a19642
> +--+
> | count(case when l_orderkey is null then 1 else null end) |
> +--+
> | 0|
> +--+
> Fetched 1 row(s) in 0.51s
> +--++--+--+++--+---+-+
> | Operator | #Hosts | Avg Time | Max Time | #Rows  | Est. #Rows | Peak 
> Mem | Est. Peak Mem | Detail  |
> +--++--+--+++--+---+-+
> | 01:AGGREGATE | 1  | 44.03ms  | 44.03ms  | 1  | 1  | 25.00 
> KB | 10.00 MB  | FINALIZE|
> | 00:SCAN HDFS | 1  | 411.57ms | 411.57ms | 59.99M | -1 | 16.61 
> MB | 88.00 MB  | tpch10_parquet.lineitem |
> +--++--+--+++--+---+-+
> [localhost:21000] default> set num_nodes=1; set mt_dop=1; select 
> count(if(l_orderkey is NULL, 1, NULL)) from tpch10_parquet.lineitem;summary;
> NUM_NODES set to 1
> MT_DOP set to 1
> Query: select count(if(l_orderkey is NULL, 1, NULL)) from 
> tpch10_parquet.lineitem
> Query submitted at: 2018-10-04 11:23:07 (Coordinator: 
> http://tarmstrong-box:25000)
> Query progress can be monitored at: 
> http://tarmstrong-box:25000/query_plan?query_id=8e46ab1b84c4dbff:2786ca26
> ++
> | count(if(l_orderkey is null, 1, null)) |
> ++
> | 0  |
> ++
> Fetched 1 row(s) in 1.01s
> +--++--+--+++--+---+-+
> | Operator | #Hosts | Avg Time | Max Time | #Rows  | Est. #Rows | Peak 
> Mem | Est. Peak Mem | Detail  |
> +--++--+--+++--+---+-+
> | 01:AGGREGATE | 1  | 422.07ms | 422.07ms | 1  | 1  | 25.00 
> KB | 10.00 MB  | FINALIZE|
> | 00:SCAN HDFS | 1  | 511.13ms | 511.13ms | 59.99M | -1 | 16.61 
> MB | 88.00 MB  | tpch10_parquet.lineitem |
> +--++--+--+++--+---+-+
> {noformat}
> It turns out that this is because we don't have good codegen support for 
> ConditionalFunction, and just fall back to emitting a call to the interpreted 
> path: 
> https://github.com/apache/impala/blob/master/be/src/exprs/conditional-functions.cc#L28
> See CaseExpr for an example of much better codegen support: 
> https://github.com/apache/impala/blob/master/be/src/exprs/case-expr.cc#L178

[jira] [Commented] (IMPALA-7732) Check / Implement resource limits documented in IMPALA-5605

2018-10-19 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657296#comment-16657296
 ] 

Philip Zeyliger commented on IMPALA-7732:
-

If we tackle this, I think it makes sense to dump {{/proc/self/limits}} into 
the log at startup time. It's small enough that it'd be handy regardless.
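
A sketch of what that startup dump plus a minimal check could look like (logging simplified to stderr; the threshold is illustrative rather than the IMPALA-5605 value):
{code}
#include <sys/resource.h>

#include <fstream>
#include <iostream>
#include <string>

void LogAndCheckResourceLimits() {
  // Dump /proc/self/limits wholesale; it's a dozen lines and priceless in a
  // postmortem.
  std::ifstream limits("/proc/self/limits");
  for (std::string line; std::getline(limits, line);) {
    std::cerr << line << '\n';
  }
  // Warn if a limit looks too low for a large deployment.
  rlimit nofile;
  if (getrlimit(RLIMIT_NOFILE, &nofile) == 0 && nofile.rlim_cur < 32768) {
    std::cerr << "WARNING: RLIMIT_NOFILE is " << nofile.rlim_cur
              << "; consider raising it (see IMPALA-5605)\n";
  }
}
{code}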

> Check / Implement resource limits documented in IMPALA-5605
> ---
>
> Key: IMPALA-7732
> URL: https://issues.apache.org/jira/browse/IMPALA-7732
> Project: IMPALA
>  Issue Type: Task
>  Components: Backend
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Michael Ho
>Priority: Critical
>
> IMPALA-5605 documents a list of recommended bump in system resource limits 
> which may be necessary when running Impala at scale. We may consider checking 
> those limits at startup with {{getrlimit()}} and potentially setting them 
> with {{setrlimit()}} if possible. At the minimum, may be helpful to log a 
> warning message if the limit is below certain threshold.






[jira] [Commented] (IMPALA-7730) Improve ORC File Format Timezone issues

2018-10-19 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657059#comment-16657059
 ] 

Philip Zeyliger commented on IMPALA-7730:
-

[~stiga-huang]: you might be interested in this.

> Improve ORC File Format Timezone issues
> ---
>
> Key: IMPALA-7730
> URL: https://issues.apache.org/jira/browse/IMPALA-7730
> Project: IMPALA
>  Issue Type: Task
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Philip Zeyliger
>Priority: Major
>
> As pointed out in https://gerrit.cloudera.org/#/c/11731 by [~csringhofer], 
> our support for the ORC file format doesn't follow the same timezone 
> conventions as the rest of Impala.
> {quote}
> tldr: ORC's timezone handling is likely to be broken in Impala so we should 
> patch it in the toolchain
> The ORC library implements its own IANA timezone handling to convert stored 
> timestamps from UTC to local time + do something similar for min/max stats. 
> The writer's timezone can be also stored in .orc files and used instead of 
> local timezone.
> Impala's and ORC library's timezone can be different for several reasons:
> - ORC's timezone is not overridden by env var TZ or the query option timezone
> - ORC uses a simpler way to detect the local timezone, which may not work on some Linux distros (see TimezoneDatabase::LocalZoneName in Impala vs LOCAL_TIMEZONE in Orc)
> - .orc files can use any time zone as the writer's timezone, and we cannot be sure that it will exist on the reader machine
> My suggestion is to patch the ORC library in the toolchain and remove 
> timezone handling (e.g. by always using UTC, maybe depending on a flag), as 
> the way it is currently working is likely to be broken and is surely not 
> consistent with the rest of Impala.
> I am not sure how timezones could be handled correctly in Orc + Impala. If 
> someone plans to work on it, I would gladly help in the integration to Impala.
> {quote}






[jira] [Created] (IMPALA-7730) Improve ORC File Format Timezone issues

2018-10-19 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-7730:
---

 Summary: Improve ORC File Format Timezone issues
 Key: IMPALA-7730
 URL: https://issues.apache.org/jira/browse/IMPALA-7730
 Project: IMPALA
  Issue Type: Task
  Components: Backend
Affects Versions: Impala 3.0
Reporter: Philip Zeyliger


As pointed out in https://gerrit.cloudera.org/#/c/11731 by [~csringhofer], our 
support for the ORC file format doesn't follow the same timezone conventions as 
the rest of Impala.

{quote}
tldr: ORC's timezone handling is likely to be broken in Impala so we should 
patch it in the toolchain

The ORC library implements its own IANA timezone handling to convert stored 
timestamps from UTC to local time + do something similar for min/max stats. The 
writer's timezone can be also stored in .orc files and used instead of local 
timezone.

Impala's and ORC library's timezone can be different for several reasons:

- ORC's timezone is not overridden by env var TZ or the query option timezone
- ORC uses a simpler way to detect the local timezone, which may not work on some Linux distros (see TimezoneDatabase::LocalZoneName in Impala vs LOCAL_TIMEZONE in Orc)
- .orc files can use any time zone as the writer's timezone, and we cannot be sure that it will exist on the reader machine
My suggestion is to patch the ORC library in the toolchain and remove timezone 
handling (e.g. by always using UTC, maybe depending on a flag), as the way it 
is currently working is likely to be broken and is surely not consistent with 
the rest of Impala.

I am not sure how timezones could be handled correctly in Orc + Impala. If 
someone plans to work on it, I would gladly help in the integration to Impala.
{quote}








[jira] [Commented] (IMPALA-7655) Codegen output for conditional functions (if,isnull, coalesce) is very suboptimal

2018-10-18 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16656010#comment-16656010
 ] 

Philip Zeyliger commented on IMPALA-7655:
-

codegen might be disabled on debug builds in your case.

> Codegen output for conditional functions (if,isnull, coalesce) is very 
> suboptimal
> -
>
> Key: IMPALA-7655
> URL: https://issues.apache.org/jira/browse/IMPALA-7655
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: codegen, perf, performance
>
> https://gerrit.cloudera.org/#/c/11565/ provided a clue that an aggregation 
> involving an if() function was very slow, 10x slower than the equivalent 
> version using a case:
> {noformat}
> [localhost:21000] default> set num_nodes=1; set mt_dop=1; select count(case 
> when l_orderkey is NULL then 1 else NULL end) from 
> tpch10_parquet.lineitem;summary;
> NUM_NODES set to 1
> MT_DOP set to 1
> Query: select count(case when l_orderkey is NULL then 1 else NULL end) from 
> tpch10_parquet.lineitem
> Query submitted at: 2018-10-04 11:17:31 (Coordinator: 
> http://tarmstrong-box:25000)
> Query progress can be monitored at: 
> http://tarmstrong-box:25000/query_plan?query_id=274b2a6f35cefe31:95a19642
> +--+
> | count(case when l_orderkey is null then 1 else null end) |
> +--+
> | 0|
> +--+
> Fetched 1 row(s) in 0.51s
> +--++--+--+++--+---+-+
> | Operator | #Hosts | Avg Time | Max Time | #Rows  | Est. #Rows | Peak 
> Mem | Est. Peak Mem | Detail  |
> +--++--+--+++--+---+-+
> | 01:AGGREGATE | 1  | 44.03ms  | 44.03ms  | 1  | 1  | 25.00 
> KB | 10.00 MB  | FINALIZE|
> | 00:SCAN HDFS | 1  | 411.57ms | 411.57ms | 59.99M | -1 | 16.61 
> MB | 88.00 MB  | tpch10_parquet.lineitem |
> +--++--+--+++--+---+-+
> [localhost:21000] default> set num_nodes=1; set mt_dop=1; select 
> count(if(l_orderkey is NULL, 1, NULL)) from tpch10_parquet.lineitem;summary;
> NUM_NODES set to 1
> MT_DOP set to 1
> Query: select count(if(l_orderkey is NULL, 1, NULL)) from 
> tpch10_parquet.lineitem
> Query submitted at: 2018-10-04 11:23:07 (Coordinator: 
> http://tarmstrong-box:25000)
> Query progress can be monitored at: 
> http://tarmstrong-box:25000/query_plan?query_id=8e46ab1b84c4dbff:2786ca26
> ++
> | count(if(l_orderkey is null, 1, null)) |
> ++
> | 0  |
> ++
> Fetched 1 row(s) in 1.01s
> +--++--+--+++--+---+-+
> | Operator | #Hosts | Avg Time | Max Time | #Rows  | Est. #Rows | Peak 
> Mem | Est. Peak Mem | Detail  |
> +--++--+--+++--+---+-+
> | 01:AGGREGATE | 1  | 422.07ms | 422.07ms | 1  | 1  | 25.00 
> KB | 10.00 MB  | FINALIZE|
> | 00:SCAN HDFS | 1  | 511.13ms | 511.13ms | 59.99M | -1 | 16.61 
> MB | 88.00 MB  | tpch10_parquet.lineitem |
> +--++--+--+++--+---+-+
> {noformat}
> It turns out that this is because we don't have good codegen support for 
> ConditionalFunction, and just fall back to emitting a call to the interpreted 
> path: 
> https://github.com/apache/impala/blob/master/be/src/exprs/conditional-functions.cc#L28
> See CaseExpr for an example of much better codegen support: 
> https://github.com/apache/impala/blob/master/be/src/exprs/case-expr.cc#L178






[jira] [Commented] (IMPALA-7655) Codegen output for conditional functions (if,isnull, coalesce) is very suboptimal

2018-10-15 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650939#comment-16650939
 ] 

Philip Zeyliger commented on IMPALA-7655:
-

I'm not sure if your questions are rhetorical. 
{{common/function-registry/impala_functions.py}} is helpful for tracing how 
functions are implemented. 

I think the re-write here is similar to {{BetweenToCompoundRule}} in spirit. 
Case is implemented in {{be/src/exprs/case-expr.cc}} and has manual codegen, 
probably because case can have a variable number of expressions. If is 
implemented at {{be/src/exprs/conditional-functions-ir.cc}}; look for 
{{IfExpr}}. I'd have to dig a little deeper to understand whether or not that's getting executed correctly or if there's a wrapper, as indicated by Tim at https://github.com/apache/impala/blob/master/be/src/exprs/conditional-functions.cc#L28. Tim's always right, but that's the interesting bit.

> Codegen output for conditional functions (if,isnull, coalesce) is very 
> suboptimal
> -
>
> Key: IMPALA-7655
> URL: https://issues.apache.org/jira/browse/IMPALA-7655
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: codegen, perf, performance
>
> https://gerrit.cloudera.org/#/c/11565/ provided a clue that an aggregation 
> involving an if() function was very slow, 10x slower than the equivalent 
> version using a case:
> {noformat}
> [localhost:21000] default> set num_nodes=1; set mt_dop=1; select count(case 
> when l_orderkey is NULL then 1 else NULL end) from 
> tpch10_parquet.lineitem;summary;
> NUM_NODES set to 1
> MT_DOP set to 1
> Query: select count(case when l_orderkey is NULL then 1 else NULL end) from 
> tpch10_parquet.lineitem
> Query submitted at: 2018-10-04 11:17:31 (Coordinator: 
> http://tarmstrong-box:25000)
> Query progress can be monitored at: 
> http://tarmstrong-box:25000/query_plan?query_id=274b2a6f35cefe31:95a19642
> +----------------------------------------------------------+
> | count(case when l_orderkey is null then 1 else null end) |
> +----------------------------------------------------------+
> | 0                                                        |
> +----------------------------------------------------------+
> Fetched 1 row(s) in 0.51s
> +--------------+--------+----------+----------+--------+------------+----------+---------------+-------------------------+
> | Operator     | #Hosts | Avg Time | Max Time | #Rows  | Est. #Rows | Peak Mem | Est. Peak Mem | Detail                  |
> +--------------+--------+----------+----------+--------+------------+----------+---------------+-------------------------+
> | 01:AGGREGATE | 1      | 44.03ms  | 44.03ms  | 1      | 1          | 25.00 KB | 10.00 MB      | FINALIZE                |
> | 00:SCAN HDFS | 1      | 411.57ms | 411.57ms | 59.99M | -1         | 16.61 MB | 88.00 MB      | tpch10_parquet.lineitem |
> +--------------+--------+----------+----------+--------+------------+----------+---------------+-------------------------+
> [localhost:21000] default> set num_nodes=1; set mt_dop=1; select 
> count(if(l_orderkey is NULL, 1, NULL)) from tpch10_parquet.lineitem;summary;
> NUM_NODES set to 1
> MT_DOP set to 1
> Query: select count(if(l_orderkey is NULL, 1, NULL)) from 
> tpch10_parquet.lineitem
> Query submitted at: 2018-10-04 11:23:07 (Coordinator: 
> http://tarmstrong-box:25000)
> Query progress can be monitored at: 
> http://tarmstrong-box:25000/query_plan?query_id=8e46ab1b84c4dbff:2786ca26
> +----------------------------------------+
> | count(if(l_orderkey is null, 1, null)) |
> +----------------------------------------+
> | 0                                      |
> +----------------------------------------+
> Fetched 1 row(s) in 1.01s
> +--------------+--------+----------+----------+--------+------------+----------+---------------+-------------------------+
> | Operator     | #Hosts | Avg Time | Max Time | #Rows  | Est. #Rows | Peak Mem | Est. Peak Mem | Detail                  |
> +--------------+--------+----------+----------+--------+------------+----------+---------------+-------------------------+
> | 01:AGGREGATE | 1      | 422.07ms | 422.07ms | 1      | 1          | 25.00 KB | 10.00 MB      | FINALIZE                |
> | 00:SCAN HDFS | 1      | 511.13ms | 511.13ms | 59.99M | -1         | 16.61 MB | 88.00 MB      | tpch10_parquet.lineitem |
> +--------------+--------+----------+----------+--------+------------+----------+---------------+-------------------------+
> {noformat}
> It turns out that this is because we don't have good codegen support for 
> ConditionalFunction, and just fall back to emitting a call to the interpreted 
> path: 
> https://github.com/apache/impala/blob/master/be/src/exprs/conditional-functions.cc#L28

[jira] [Created] (IMPALA-7698) Add centos/redhat 6/7 support to bootstrap_system.sh

2018-10-11 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-7698:
---

 Summary: Add centos/redhat 6/7 support to bootstrap_system.sh
 Key: IMPALA-7698
 URL: https://issues.apache.org/jira/browse/IMPALA-7698
 Project: IMPALA
  Issue Type: Task
  Components: Infrastructure
Reporter: Philip Zeyliger
Assignee: Philip Zeyliger


{{bootstrap_system.sh}} currently only works on Ubuntu. Making it work on 
CentOS/Redhat would open the door to running automated tests on those platforms 
more readily, including using {{test-with-docker}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7691) test_web_pages not being run

2018-10-10 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645407#comment-16645407
 ] 

Philip Zeyliger commented on IMPALA-7691:
-

[~stakiar_impala_496e]'s in-progress change 
https://gerrit.cloudera.org/#/c/11410/ for IMPALA-6249 happens to fix this, fyi.

> test_web_pages not being run
> 
>
> Key: IMPALA-7691
> URL: https://issues.apache.org/jira/browse/IMPALA-7691
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Blocker
>
> test_web_pages.py is not being run by test/run-tests.py because the 
> 'webserver' directory is missing from VALID_TEST_DIRS



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7666) Profiles should contain the xDBC driver version

2018-10-05 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640309#comment-16640309
 ] 

Philip Zeyliger commented on IMPALA-7666:
-

I agree. Note that it's unclear where to stuff this in the current HS2 protocol.

> Profiles should contain the xDBC driver version
> ---
>
> Key: IMPALA-7666
> URL: https://issues.apache.org/jira/browse/IMPALA-7666
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend, Clients
>Affects Versions: Impala 3.1.0
>Reporter: Lars Volker
>Priority: Major
>  Labels: supportability
>
> Profiles should contain the version of the xDBC driver that clients use to 
> connect.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7564) Conservative FK/PK join type detection with complex equi-join conjuncts

2018-10-02 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16635978#comment-16635978
 ] 

Philip Zeyliger commented on IMPALA-7564:
-

See HIVE-13290. We'd have to dig out the details, but it looks like Hive has 
added, or is adding, ways to hint at these. In our world, there's no constraint 
validation, but the hint is still a useful one. (I've also heard it requested 
for purposes of exporting schemas to tools that visualize and help you build 
joins.)

> Conservative FK/PK join type detection with complex equi-join conjuncts
> ---
>
> Key: IMPALA-7564
> URL: https://issues.apache.org/jira/browse/IMPALA-7564
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.12.0, Impala 2.13.0, Impala 3.1.0
>Reporter: bharath v
>Priority: Major
>
> With IMPALA-5547, we predict whether a join is an FK/PK join as follows.
> {noformat}
> // Iterate over all groups of conjuncts that belong to the same joined tuple id pair.
> // For each group, we compute the join NDV of the rhs slots and compare it to the
> // number of rows in the rhs table.
> for (List<EqJoinConjunctScanSlots> fkPkCandidate: scanSlotsByJoinedTids.values()) {
>   double jointNdv = 1.0;
>   for (EqJoinConjunctScanSlots slots: fkPkCandidate) jointNdv *= slots.rhsNdv();
>   double rhsNumRows = fkPkCandidate.get(0).rhsNumRows();
>   if (jointNdv >= Math.round(rhsNumRows * (1.0 - FK_PK_MAX_STATS_DELTA_PERC))) {
>     // We cannot disprove that the RHS is a PK.
>     if (result == null) result = Lists.newArrayList();
>     result.addAll(fkPkCandidate);
>   }
> }
> {noformat}
> We iterate through all the "simple" equi-join conjuncts on the RHS, multiply 
> their NDVs, and check if the product is close to rhsNumRows. The issue here is 
> that this can result in conservative FK/PK detection if the equi-join 
> conjuncts are not simple (of the form <SlotRef> = <SlotRef>).
> {noformat}
> /**
>  * Returns a new EqJoinConjunctScanSlots for the given equi-join conjunct or null if
>  * the given conjunct is not of the form <SlotRef> = <SlotRef> or if the underlying
>  * table/column of at least one side is missing stats.
>  */
> public static EqJoinConjunctScanSlots create(Expr eqJoinConjunct) {
>   if (!Expr.IS_EQ_BINARY_PREDICATE.apply(eqJoinConjunct)) return null;
>   SlotDescriptor lhsScanSlot = eqJoinConjunct.getChild(0).findSrcScanSlot();
>   if (lhsScanSlot == null || !hasNumRowsAndNdvStats(lhsScanSlot)) return null;
>   SlotDescriptor rhsScanSlot = eqJoinConjunct.getChild(1).findSrcScanSlot();
> {noformat}
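> To make the arithmetic of that check concrete, here is a self-contained toy 
> version (the threshold constant and all stats are invented for illustration):
> {code:java}
> // Toy model of the FK/PK check: multiply the NDVs of the rhs join slots and
> // compare the product against the rhs row count, allowing a small stats delta.
> public class FkPkCheckDemo {
>   static final double FK_PK_MAX_STATS_DELTA_PERC = 0.05;  // assumed value
> 
>   static boolean cannotDisproveFkPk(double[] rhsNdvs, double rhsNumRows) {
>     double jointNdv = 1.0;
>     for (double ndv : rhsNdvs) jointNdv *= ndv;
>     return jointNdv >= Math.round(rhsNumRows * (1.0 - FK_PK_MAX_STATS_DELTA_PERC));
>   }
> 
>   public static void main(String[] args) {
>     // rhs has 1M rows, NDV(r.c1) = NDV(r.c2) = 1000: 1000 * 1000 >= 950000.
>     System.out.println(cannotDisproveFkPk(new double[] {1000, 1000}, 1e6));  // true
>     // Dropping a complex conjunct's slots from the product (the bug described
>     // above) shrinks the joint NDV, so the same join fails the check:
>     System.out.println(cannotDisproveFkPk(new double[] {1000}, 1e6));        // false
>   }
> }
> {code}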
> For example, the following query contains a complex equi-join conjunct 
> {{substr(l.c3, 1, 6) = substr(r.c3, 1,6)}}, so while detecting if the left 
> outer join is an FK/PK, we just check if 
> {{NDVs(r.c1) * NDVs(r.c2) ~ r.numRows()}} which is incorrect. (This happens 
> because EqJoinConjunctScanSlots.create() returns null for any non-simple 
> predicates which are not considered later).
> {noformat}
> [localhost:21000]> explain select * from test_left l left outer join 
> test_right r on l.c1 = r.c1 and l.c2 = r.c2 and substr(l.c3, 1, 6) = 
> substr(r.c3, 1,6);
> Query: explain select * from test_left l left outer join test_right r on l.c1 
> = r.c1 and l.c2 = r.c2 and substr(l.c3, 1, 6) = substr(r.c3, 1,6)
> +--------------------------------------------------------------------------------+
> | Explain String                                                                 |
> +--------------------------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=1.95MB Threads=5                     |
> | Per-Host Resource Estimates: Memory=66MB                                       |
> |                                                                                |
> | F02:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1                          |
> | |  Per-Host Resources: mem-estimate=0B mem-reservation=0B thread-reservation=1 |
> | PLAN-ROOT SINK                                                                 |
> | |  mem-estimate=0B mem-reservation=0B thread-reservation=0                     |
> | |                                                                              |
> | 04:EXCHANGE [UNPARTITIONED]                                                    |
> | |  mem-estimate=0B mem-reservation=0B thread-reservation=0                     |
> | |  tuple-ids=0,1N row-size=94B cardinality=49334767023                         |
> | |  in pipelines: 00(GETNEXT)                                                   |

[jira] [Resolved] (IMPALA-7629) TestClientSsl tests seem to be disabled on non-legacy platforms

2018-10-02 Thread Philip Zeyliger (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger resolved IMPALA-7629.
-
Resolution: Fixed

> TestClientSsl tests seem to be disabled on non-legacy platforms
> ---
>
> Key: IMPALA-7629
> URL: https://issues.apache.org/jira/browse/IMPALA-7629
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
> Environment: Ubuntu 16.04, Python 2.7.14
>Reporter: Tim Armstrong
>Assignee: Philip Zeyliger
>Priority: Blocker
>
> I noticed that when I ran some of these tests on Ubuntu 16.04 they are 
> skipped:
> {noformat}
> $ impala-py.test tests/custom_cluster/test_client_ssl.py -k ecdh
> ...
> tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_ecdh[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: text/none] SKIPPED
> {noformat}
> I don't think this is intended. The logic in IMPALA-6990 looks backwards in 
> that HAS_LEGACY_OPENSSL is a non-None integer (i.e. truthy) when the version 
> field exists.
> Assigning to Phil since he reviewed the patch and probably has some context.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-7596) Expose JvmPauseMonitor and GC Metrics to Impala's metrics infrastructure

2018-09-27 Thread Philip Zeyliger (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger resolved IMPALA-7596.
-
Resolution: Fixed

> Expose JvmPauseMonitor and GC Metrics to Impala's metrics infrastructure
> 
>
> Key: IMPALA-7596
> URL: https://issues.apache.org/jira/browse/IMPALA-7596
> Project: IMPALA
>  Issue Type: Task
>  Components: Infrastructure
>Reporter: Philip Zeyliger
>Assignee: Philip Zeyliger
>Priority: Major
>
> In IMPALA-6857 we added a thread that periodically checks for GC pauses. To 
> allow monitoring tools to pick up on the fact that pauses are happening, it's 
> useful to promote those to full-fledged metrics.
> It turns out we were also collecting those metrics by doing a lot of round 
> trips to the Java side of the house. This JIRA may choose to address that as 
> well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-7624) test-with-docker sometimes hangs creating docker containers

2018-09-27 Thread Philip Zeyliger (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger resolved IMPALA-7624.
-
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> test-with-docker sometimes hangs creating docker containers
> ---
>
> Key: IMPALA-7624
> URL: https://issues.apache.org/jira/browse/IMPALA-7624
> Project: IMPALA
>  Issue Type: Task
>Reporter: Philip Zeyliger
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> I've seen the test-with-docker executions hang, or sort of hang, in threads 
> doing {{docker create}}. I think this is ultimately a Docker or kernel bug, 
> but we can work around it by serializing our "docker create" invocations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7501) Slim down metastore Partition objects in LocalCatalog cache

2018-09-26 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629450#comment-16629450
 ] 

Philip Zeyliger commented on IMPALA-7501:
-

I think Todd's immediate suggestion here is to null out the Thrift stuff. Note 
that I think we first retrieve it in {{catalogd}}, but it eventually makes its 
way into {{impalad}} and is presumably Thrift-serialized on the way. It may be 
useful to null it out in {{catalogd}} since memory there is also valuable, but 
you'll have to work out the details.
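
A minimal sketch of the "null it out" idea, using the Thrift-generated HMS 
{{Partition}}/{{StorageDescriptor}} beans (whether the planner truly never 
needs partition-level {{FieldSchema}}s is exactly the detail to verify):
{code:java}
import java.util.Collections;

import org.apache.hadoop.hive.metastore.api.Partition;

public class PartitionSlimmer {
  // Drops the per-partition column descriptors before the object is cached
  // (or Thrift-serialized to impalad). Per the heap dump above, FieldSchema
  // lists dominate the retained memory of Partition objects.
  public static Partition slim(Partition p) {
    if (p.getSd() != null) {
      p.getSd().setCols(Collections.emptyList());
    }
    return p;
  }
}
{code}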

> Slim down metastore Partition objects in LocalCatalog cache
> ---
>
> Key: IMPALA-7501
> URL: https://issues.apache.org/jira/browse/IMPALA-7501
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Todd Lipcon
>Priority: Minor
>
> I took a heap dump of an impalad running in LocalCatalog mode with a 2G limit 
> after running a production workload simulation for a couple hours. It had 
> 38.5M objects and 2.02GB heap (the vast majority of the heap is, as expected, 
> in the LocalCatalog cache). Of this total footprint, 1.78GB and 34.6M objects 
> are retained by 'Partition' objects. Drilling into those, 1.29GB and 33.6M 
> objects are retained by FieldSchema, which, as far as I remember, are ignored 
> on the partition level by the Impala planner. So, with a bit of slimming down 
> of these objects, we could make a huge dent in effective cache capacity given 
> a fixed budget. Reducing object count should also have the effect of improved 
> GC performance (old gen GC is more closely tied to object count than size)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7581) Hang in buffer-pool-test

2018-09-26 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629351#comment-16629351
 ] 

Philip Zeyliger commented on IMPALA-7581:
-

We could also wrap the test runners in {{/usr/bin/timeout}} to make debugging 
this slightly more pleasant. 
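
For example, in the same spirit as {{/usr/bin/timeout}}, a Java-side sketch of 
a runner that kills a hung test binary after a deadline (the binary path is 
made up, and exit code 124 just mimics coreutils):
{code:java}
import java.util.concurrent.TimeUnit;

public class TimeoutRunner {
  public static void main(String[] args) throws Exception {
    // Launch the test binary and forcibly kill it if it exceeds 10 minutes.
    Process p = new ProcessBuilder("be/build/debug/runtime/buffer-pool-test")
        .inheritIO().start();
    if (!p.waitFor(10, TimeUnit.MINUTES)) {
      p.destroyForcibly();
      System.err.println("test timed out; killed to keep the build moving");
      System.exit(124);
    }
    System.exit(p.exitValue());
  }
}
{code}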

I think it's reasonable to skip them under ASAN. We already skip death tests 
with release builds:
{code}
// Gtest's ASSERT_DEBUG_DEATH macro has peculiar semantics where in debug builds it
// executes the code in a forked process, so it has no visible side-effects, but in
// release builds it executes the code as normal. This makes it difficult to write
// death tests that work in both debug and release builds. To avoid this problem, update
// our wrapper macro to simply omit the death test expression in release builds, where we
// can't actually test DCHECKs anyway.
#define IMPALA_ASSERT_DEBUG_DEATH(fn, msg)
#endif
{code}

> Hang in buffer-pool-test
> 
>
> Key: IMPALA-7581
> URL: https://issues.apache.org/jira/browse/IMPALA-7581
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: broken-build, flaky
> Attachments: gdb.txt
>
>
> We have observed a hang in buffer-pool-test an ASAN build. Unfortunately, no 
> logs were generated with any info about what might have happened.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7624) test-with-docker sometimes hangs creating docker containers

2018-09-25 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-7624:
---

 Summary: test-with-docker sometimes hangs creating docker 
containers
 Key: IMPALA-7624
 URL: https://issues.apache.org/jira/browse/IMPALA-7624
 Project: IMPALA
  Issue Type: Task
Reporter: Philip Zeyliger


I've seen the test-with-docker executions hang, or sort of hang, in threads 
doing {{docker create}}. I think this is ultimately a Docker or kernel bug, but 
we can work around it by serializing our "docker create" invocations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7605) AnalysisException when first accessing Hive-create table on pristine HMS

2018-09-24 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16626120#comment-16626120
 ] 

Philip Zeyliger commented on IMPALA-7605:
-

To belabor this: you're likely seeing a race. I wonder if CDH-73078 is similar. 
It was filed recently after https://review.infra.cloudera.com/r/97460/ added a 
sleep() somewhere.

> AnalysisException when first accessing Hive-create table on pristine HMS
> 
>
> Key: IMPALA-7605
> URL: https://issues.apache.org/jira/browse/IMPALA-7605
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Michael Brown
>Assignee: bharath v
>Priority: Blocker
>  Labels: regression
> Attachments: 3.0-logs.tar.gz, 3.1-logs.tar.gz, metadata-bug.sh
>
>
> This is a corner case encountered when loading test data from Hive on a 
> pristine/new cluster. As we tend to keep bigger clusters around and upgrade 
> them, as opposed to refreshing them, and our data load doesn't hit this 
> either, this was tough to spot.
> The procedure in general is to start with a pristine HMS, create a table in 
> Hive, and use Impala to access the table. Upon the first access, 
> AnalysisException is raised. Subsequent accesses work.
> This is a P1 in the sense that, while a user can try again and succeed, test 
> automation is going to increasingly load data via Hive and then access it in 
> Impala. This case needs to work. The other thing making this a P1 is that 
> it's a change in behavior relative to both 2.12 and 3.0 and thus a regression.
> Here's what catalogd.INFO looks like in the successful case (3.0):
> {noformat}
> I0921 10:28:23.697592 47879 CatalogServiceCatalog.java:1102] Invalidating all 
> metadata. Version: 0
> I0921 10:28:23.810739 47879 CatalogServiceCatalog.java:914] Loading native 
> functions for database: default
> I0921 10:28:23.811686 47879 CatalogServiceCatalog.java:930] Loaded native 
> functions for database: default
> I0921 10:28:23.811738 47879 CatalogServiceCatalog.java:941] Loading Java 
> functions for database: default
> I0921 10:28:23.811772 47879 CatalogServiceCatalog.java:952] Loaded Java 
> functions for database: default
> I0921 10:28:23.853292 47879 CatalogServiceCatalog.java:1170] Invalidated all 
> metadata.
> I0921 10:28:23.860013 47879 statestore-subscriber.cc:190] Starting statestore 
> subscriber
> I0921 10:28:23.861377 47879 thrift-server.cc:452] ThriftServer 
> 'StatestoreSubscriber' started on port: 23020
> I0921 10:28:23.861383 47879 statestore-subscriber.cc:217] Registering with 
> statestore
> I0921 10:28:23.861979 47879 statestore-subscriber.cc:174] Subscriber 
> registration ID: 624d03c923c15c12:f9204d9fb14054a9
> I0921 10:28:23.861989 47879 statestore-subscriber.cc:221] statestore 
> registration successful
> I0921 10:28:23.868679 48041 catalog-server.cc:490] Collected update: 
> DATABASE:default, version=2, original size=156
> I0921 10:28:23.870929 48041 catalog-server.cc:490] Collected deletion: 
> DATABASE:_impala_builtins, version=3, original size=140
> I0921 10:28:23.872802 48041 catalog-server.cc:490] Collected update: 
> CATALOG_SERVICE_ID, version=3, original size=49
> I0921 10:28:23.875424 47879 thrift-server.cc:452] ThriftServer 
> 'CatalogService' started on port: 26000
> I0921 10:28:23.875434 47879 catalogd-main.cc:111] CatalogService started on 
> port: 26000
> I0921 10:28:26.776692 48047 catalog-server.cc:245] A catalog update with 3 
> entries is assembled. Catalog version: 3 Last sent catalog version: 0
> I0921 10:28:53.571209 48924 CatalogServiceCatalog.java:1102] Invalidating all 
> metadata. Version: 3
> I0921 10:28:53.608983 48924 CatalogServiceCatalog.java:914] Loading native 
> functions for database: default
> I0921 10:28:53.609027 48924 CatalogServiceCatalog.java:930] Loaded native 
> functions for database: default
> I0921 10:28:53.609058 48924 CatalogServiceCatalog.java:941] Loading Java 
> functions for database: default
> I0921 10:28:53.609087 48924 CatalogServiceCatalog.java:952] Loaded Java 
> functions for database: default
> I0921 10:28:53.614903 48924 CatalogServiceCatalog.java:914] Loading native 
> functions for database: foo1537550878
> I0921 10:28:53.614946 48924 CatalogServiceCatalog.java:930] Loaded native 
> functions for database: foo1537550878
> I0921 10:28:53.614977 48924 CatalogServiceCatalog.java:941] Loading Java 
> functions for database: foo1537550878
> I0921 10:28:53.615005 48924 CatalogServiceCatalog.java:952] Loaded Java 
> functions for database: foo1537550878
> I0921 10:28:53.632726 48924 CatalogServiceCatalog.java:1170] Invalidated all 
> metadata.
> I0921 10:28:54.782857 48041 catalog-server.cc:490] Collected update: 
> DATABASE:default, version=5, ori

[jira] [Comment Edited] (IMPALA-7605) AnalysisException when first accessing Hive-create table on pristine HMS

2018-09-24 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16626120#comment-16626120
 ] 

Philip Zeyliger edited comment on IMPALA-7605 at 9/24/18 5:01 PM:
--

To belabor this: you're likely seeing a race.


was (Author: philip):
To belabor this: you're likely seeing a race. I wonder if CDH-73078 is similar. 
It was filed recently after https://review.infra.cloudera.com/r/97460/ added a 
sleep() somewhere.

> AnalysisException when first accessing Hive-create table on pristine HMS
> 
>
> Key: IMPALA-7605
> URL: https://issues.apache.org/jira/browse/IMPALA-7605
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Michael Brown
>Assignee: bharath v
>Priority: Blocker
>  Labels: regression
> Attachments: 3.0-logs.tar.gz, 3.1-logs.tar.gz, metadata-bug.sh
>
>
> This is a corner case encountered when loading test data from Hive on a 
> pristine/new cluster. As we tend to keep bigger clusters around and upgrade 
> them, as opposed to refreshing them, and our data load doesn't hit this 
> either, this was tough to spot.
> The procedure in general is to start with a pristine HMS, create a table in 
> Hive, and use Impala to access the table. Upon the first access, 
> AnalysisException is raised. Subsequent accesses work.
> This is a P1 in the sense that, while a user can try again and succeed, test 
> automation is going to increasingly load data via Hive and then access it in 
> Impala. This case needs to work. The other thing making this a P1 is that 
> it's a change in behavior relative to both 2.12 and 3.0 and thus a regression.
> Here's what catalogd.INFO looks like in the successful case (3.0):
> {noformat}
> I0921 10:28:23.697592 47879 CatalogServiceCatalog.java:1102] Invalidating all 
> metadata. Version: 0
> I0921 10:28:23.810739 47879 CatalogServiceCatalog.java:914] Loading native 
> functions for database: default
> I0921 10:28:23.811686 47879 CatalogServiceCatalog.java:930] Loaded native 
> functions for database: default
> I0921 10:28:23.811738 47879 CatalogServiceCatalog.java:941] Loading Java 
> functions for database: default
> I0921 10:28:23.811772 47879 CatalogServiceCatalog.java:952] Loaded Java 
> functions for database: default
> I0921 10:28:23.853292 47879 CatalogServiceCatalog.java:1170] Invalidated all 
> metadata.
> I0921 10:28:23.860013 47879 statestore-subscriber.cc:190] Starting statestore 
> subscriber
> I0921 10:28:23.861377 47879 thrift-server.cc:452] ThriftServer 
> 'StatestoreSubscriber' started on port: 23020
> I0921 10:28:23.861383 47879 statestore-subscriber.cc:217] Registering with 
> statestore
> I0921 10:28:23.861979 47879 statestore-subscriber.cc:174] Subscriber 
> registration ID: 624d03c923c15c12:f9204d9fb14054a9
> I0921 10:28:23.861989 47879 statestore-subscriber.cc:221] statestore 
> registration successful
> I0921 10:28:23.868679 48041 catalog-server.cc:490] Collected update: 
> DATABASE:default, version=2, original size=156
> I0921 10:28:23.870929 48041 catalog-server.cc:490] Collected deletion: 
> DATABASE:_impala_builtins, version=3, original size=140
> I0921 10:28:23.872802 48041 catalog-server.cc:490] Collected update: 
> CATALOG_SERVICE_ID, version=3, original size=49
> I0921 10:28:23.875424 47879 thrift-server.cc:452] ThriftServer 
> 'CatalogService' started on port: 26000
> I0921 10:28:23.875434 47879 catalogd-main.cc:111] CatalogService started on 
> port: 26000
> I0921 10:28:26.776692 48047 catalog-server.cc:245] A catalog update with 3 
> entries is assembled. Catalog version: 3 Last sent catalog version: 0
> I0921 10:28:53.571209 48924 CatalogServiceCatalog.java:1102] Invalidating all 
> metadata. Version: 3
> I0921 10:28:53.608983 48924 CatalogServiceCatalog.java:914] Loading native 
> functions for database: default
> I0921 10:28:53.609027 48924 CatalogServiceCatalog.java:930] Loaded native 
> functions for database: default
> I0921 10:28:53.609058 48924 CatalogServiceCatalog.java:941] Loading Java 
> functions for database: default
> I0921 10:28:53.609087 48924 CatalogServiceCatalog.java:952] Loaded Java 
> functions for database: default
> I0921 10:28:53.614903 48924 CatalogServiceCatalog.java:914] Loading native 
> functions for database: foo1537550878
> I0921 10:28:53.614946 48924 CatalogServiceCatalog.java:930] Loaded native 
> functions for database: foo1537550878
> I0921 10:28:53.614977 48924 CatalogServiceCatalog.java:941] Loading Java 
> functions for database: foo1537550878
> I0921 10:28:53.615005 48924 CatalogServiceCatalog.java:952] Loaded Java 
> functions for database: foo1537550878
> I0921 10:28:53.632726 48924 CatalogServiceCatalog.java:1170] Invalidated all 

[jira] [Commented] (IMPALA-7310) Compute Stats not computing NULLs as a distinct value causing wrong estimates

2018-09-19 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16621472#comment-16621472
 ] 

Philip Zeyliger commented on IMPALA-7310:
-

I've not looked at how that test is hooked up, but it's entirely possible that 
constant folding isn't happening in the test but would be happening in real 
life for the constant expression.

> Compute Stats not computing NULLs as a distinct value causing wrong estimates
> -
>
> Key: IMPALA-7310
> URL: https://issues.apache.org/jira/browse/IMPALA-7310
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, 
> Impala 2.11.0, Impala 3.0, Impala 2.12.0
>Reporter: Zsombor Fedor
>Assignee: Paul Rogers
>Priority: Major
>
> As in other DBMSs,
> {code:java}
> NDV(col){code}
> does not count NULL as a distinct value. The same also applies to
> {code:java}
> COUNT(DISTINCT col){code}
> This is working as intended, but when computing column statistics it can 
> cause some anomalies (e.g. bad join order) as compute stats uses NDV() to 
> determine column NDVs.
>  
> For example when aggregating more columns, the estimated cardinality is 
> [counted as the product of the columns' number of distinct 
> values.|https://github.com/cloudera/Impala/blob/64cd0bb0c3529efa0ab5452c4e9e2a04fd815b4f/fe/src/main/java/org/apache/impala/analysis/Expr.java#L669]
>  If there is a column full of NULLs the whole product will be 0.
>  
> There are two possible fixes for this.
> Either we should count NULLs as a distinct value when computing stats in the 
> query:
> {code:java}
> SELECT NDV(a) + COUNT(DISTINCT CASE WHEN a IS NULL THEN 1 END) AS a, CAST(-1 
> as BIGINT), 4, CAST(4 as DOUBLE) FROM test;{code}
> instead of
> {code:java}
> SELECT NDV(a) AS a, CAST(-1 as BIGINT), 4, CAST(4 as DOUBLE) FROM test;{code}
>  
>  
> Or we should change the planner 
> [function|https://github.com/cloudera/Impala/blob/2d2579cb31edda24457d33ff5176d79b7c0432c5/fe/src/main/java/org/apache/impala/planner/AggregationNode.java#L169]
>  to take care of this bug.
>  
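> To see the product-of-NDVs failure mode in isolation (toy numbers, not 
> Impala's actual planner code):
> {code:java}
> // The grouping-cardinality estimate multiplies per-column NDVs. A column
> // that is entirely NULL has NDV 0, which zeroes the whole product no matter
> // how many distinct values the other columns have.
> public class NdvProductDemo {
>   static long product(long[] ndvs) {
>     long result = 1;
>     for (long ndv : ndvs) result *= ndv;
>     return result;
>   }
> 
>   public static void main(String[] args) {
>     System.out.println(product(new long[] {1000, 500}));     // 500000
>     System.out.println(product(new long[] {1000, 500, 0}));  // 0
>   }
> }
> {code}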



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7596) Expose JvmPauseMonitor and GC Metrics to Impala's metrics infrastructure

2018-09-19 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-7596:
---

 Summary: Expose JvmPauseMonitor and GC Metrics to Impala's metrics 
infrastructure
 Key: IMPALA-7596
 URL: https://issues.apache.org/jira/browse/IMPALA-7596
 Project: IMPALA
  Issue Type: Task
  Components: Infrastructure
Reporter: Philip Zeyliger
Assignee: Philip Zeyliger


In IMPALA-6857 we added a thread that periodically checks for GC pauses. To 
allow monitoring tools to pick up on the fact that pauses are happening, it's 
useful to promote those to full-fledged metrics.

It turns out we were also collecting those metrics by doing a lot of round 
trips to the Java side of the house. This JIRA may choose to address that as 
well.
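
For reference, the raw counters are already available from the JVM via 
standard MXBeans; a minimal standalone poller (independent of Impala's metrics 
plumbing) looks like this:
{code:java}
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcMetricsProbe {
  public static void main(String[] args) {
    // One bean per collector, e.g. "PS Scavenge" and "PS MarkSweep" on JDK 8.
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
      System.out.printf("%s: collections=%d total-time-ms=%d%n",
          gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
    }
  }
}
{code}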



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7488) TestShellCommandLine::test_cancellation hangs occasionally

2018-09-05 Thread Philip Zeyliger (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger updated IMPALA-7488:

Priority: Critical  (was: Major)

> TestShellCommandLine::test_cancellation hangs occasionally
> --
>
> Key: IMPALA-7488
> URL: https://issues.apache.org/jira/browse/IMPALA-7488
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>  Labels: broken-build
> Attachments: psauxf.txt
>
>
> We've seen a couple of hung builds with no queries running on the cluster. I 
> got "ps auxf" output and it looks like an impala-shell process is hanging 
> around.
> I'm guessing the IMPALA-7407 fix somehow relates to this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7348) PlannerTest.testKuduSelectivity failing due to missing Cardinality information

2018-09-04 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16603236#comment-16603236
 ] 

Philip Zeyliger commented on IMPALA-7348:
-

Saw this again. The two tests were 
org.apache.impala.planner.PlannerTest.testMinMaxRuntimeFilters and 
org.apache.impala.planner.PlannerTest.testKudu.

Here are the plan differences:
{code}
hash predicates: a.int_col = b.tinyint_col + 1, a.string_col = b.string_col
vs
hash predicates: a.string_col = b.string_col, a.int_col = b.tinyint_col + 1
{code}

and 

{code}
predicates: bigint_col IN (NULL), bool_col IN (NULL), double_col IN (NULL), 
float_col IN (NULL), smallint_col IN (NULL), string_col IN (NULL), tinyint_col 
IN (NULL), id IN (1, NULL)
vs
predicates: id IN (1, NULL), bigint_col IN (NULL), bool_col IN (NULL), 
double_col IN (NULL), float_col IN (NULL), smallint_col IN (NULL), string_col 
IN (NULL), tinyint_col IN (NULL)
{code}

Curiously, this is all happening only within Kudu. The verbose plans indicate 
that cardinality is unavailable, which makes me think that 
{{orderConjunctsByCost}} is stable, but we're missing cardinality information.



> PlannerTest.testKuduSelectivity failing due to missing Cardinality information
> --
>
> Key: IMPALA-7348
> URL: https://issues.apache.org/jira/browse/IMPALA-7348
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: nithya
>Assignee: Vuk Ercegovac
>Priority: Blocker
>  Labels: broken-build
>
> PlannerTest.testKuduSelectivity failed in the recent run. It is an assertion 
> failure to unavailable cardinality information.
> Assertion failure as follows
> {code:java}
> Actual does not match expected result:
> F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> Per-Host Resources: mem-estimate=0B mem-reservation=0B
>   PLAN-ROOT SINK
>   |  mem-estimate=0B mem-reservation=0B
>   |
>   00:SCAN KUDU [functional_kudu.zipcode_incomes]
>      kudu predicates: id = '860US00601'
>      mem-estimate=0B mem-reservation=0B
>      tuple-ids=0 row-size=68B cardinality=unavailable
> ^
> Expected:
> F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> Per-Host Resources: mem-estimate=0B mem-reservation=0B
>   PLAN-ROOT SINK
>   |  mem-estimate=0B mem-reservation=0B
>   |
>   00:SCAN KUDU [functional_kudu.zipcode_incomes]
>      kudu predicates: id = '860US00601'
>      mem-estimate=0B mem-reservation=0B
>      tuple-ids=0 row-size=124B cardinality=1 {code}
> Verbose plan
> {code:java}
> Verbose plan:
> F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> Per-Host Resources: mem-estimate=0B mem-reservation=0B
>   PLAN-ROOT SINK
>   |  mem-estimate=0B mem-reservation=0B
>   |
>   00:SCAN KUDU [functional_kudu.zipcode_incomes]
>      kudu predicates: id = '860US00601'
>      mem-estimate=0B mem-reservation=0B
>      tuple-ids=0 row-size=68B cardinality=unavailable
> Section DISTRIBUTEDPLAN of query:
> select * from functional_kudu.zipcode_incomes where id = '860US00601'
> Actual does not match expected result:
> F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> Per-Host Resources: mem-estimate=0B mem-reservation=0B
>   PLAN-ROOT SINK
>   |  mem-estimate=0B mem-reservation=0B
>   |
>   01:EXCHANGE [UNPARTITIONED]
>      mem-estimate=0B mem-reservation=0B
>      tuple-ids=0 row-size=68B cardinality=unavailable
> ^
> F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
> Per-Host Resources: mem-estimate=0B mem-reservation=0B
>   DATASTREAM SINK [FRAGMENT=F01, EXCHANGE=01, UNPARTITIONED]
>   |  mem-estimate=0B mem-reservation=0B
>   00:SCAN KUDU [functional_kudu.zipcode_incomes]
>      kudu predicates: id = '860US00601'
>      mem-estimate=0B mem-reservation=0B
>      tuple-ids=0 row-size=68B cardinality=unavailable
> Expected:
> F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> Per-Host Resources: mem-estimate=0B mem-reservation=0B
>   PLAN-ROOT SINK
>   |  mem-estimate=0B mem-reservation=0B
>   |
>   01:EXCHANGE [UNPARTITIONED]
>      mem-estimate=0B mem-reservation=0B
>      tuple-ids=0 row-size=124B cardinality=1
> F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
> Per-Host Resources: mem-estimate=0B mem-reservation=0B
>   DATASTREAM SINK [FRAGMENT=F01, EXCHANGE=01, UNPARTITIONED]
>   |  mem-estimate=0B mem-reservation=0B
>   00:SCAN KUDU [functional_kudu.zipcode_incomes]
>      kudu predicates: id = '860US00601'
>      mem-estimate=0B mem-reservation=0B
>      tuple-ids=0 row-size=124B cardinality=1
> Verbose plan:
> F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> Per-Host Resources: mem-estimate=0B mem-reservation=0B
>   PLAN-ROOT SINK
>   |  mem-estimate=0B mem-r

[jira] [Created] (IMPALA-7523) Planner Test failing with "Failed to assign regions to servers after 60000 millis."

2018-09-04 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-7523:
---

 Summary: Planner Test failing with "Failed to assign regions to 
servers after 60000 millis."
 Key: IMPALA-7523
 URL: https://issues.apache.org/jira/browse/IMPALA-7523
 Project: IMPALA
  Issue Type: Task
  Components: Frontend
Reporter: Philip Zeyliger


I've seen 
{{org.apache.impala.planner.PlannerTest.org.apache.impala.planner.PlannerTest}} 
fail with the following trace:
{code}
java.lang.IllegalStateException: Failed to assign regions to servers after 
60000 millis.
at 
org.apache.impala.datagenerator.HBaseTestDataRegionAssignment.performAssignment(HBaseTestDataRegionAssignment.java:153)
at 
org.apache.impala.planner.PlannerTestBase.setUp(PlannerTestBase.java:120)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
{code}

I think we've seen it before as indicated in IMPALA-7061.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7498) impalad should wait for catalogd during start up

2018-08-29 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596754#comment-16596754
 ] 

Philip Zeyliger commented on IMPALA-7498:
-

There's a similar issue with the catalog, which fails (in an unpleasant, 
core-dump-y way) if it can't talk to the Hive metastore. To be consistent, we 
should wait a certain amount of time in all cases. (No obligation to fix this 
together; just mentioning it if you end up in the same code.)
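
The general shape of the fix is a deadline-bounded retry loop rather than a 
fixed number of attempts; a generic sketch (names are hypothetical, not 
Impala's actual startup code):
{code:java}
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class StartupWait {
  // Polls an availability probe until it succeeds or the deadline passes.
  static boolean waitFor(Supplier<Boolean> probe, long timeoutSec, long pollSec)
      throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(timeoutSec);
    while (System.nanoTime() < deadline) {
      if (probe.get()) return true;
      TimeUnit.SECONDS.sleep(pollSec);
    }
    return false;
  }
}
{code}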

> impalad should wait for catalogd during start up
> 
>
> Key: IMPALA-7498
> URL: https://issues.apache.org/jira/browse/IMPALA-7498
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Todd Lipcon
>Priority: Major
>
> If you start all daemons simultaneously, impalad with --use_local_catalog 
> enabled will retry three times in a tight loop trying to fetch the DB names, 
> and then exit. Instead it should loop for some amount of time waiting for the 
> catalog to be ready in the same way that the existing implementation does.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7479) parquet versions not harmonized in testdata

2018-08-23 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-7479:
---

 Summary: parquet versions not harmonized in testdata
 Key: IMPALA-7479
 URL: https://issues.apache.org/jira/browse/IMPALA-7479
 Project: IMPALA
  Issue Type: Task
Reporter: Philip Zeyliger


The testdata pom uses an older version of parquet than elsewhere.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-7455) log4j2 can sneak into classpath and break logging

2018-08-16 Thread Philip Zeyliger (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger resolved IMPALA-7455.
-
Resolution: Fixed

> log4j2 can sneak into classpath and break logging
> -
>
> Key: IMPALA-7455
> URL: https://issues.apache.org/jira/browse/IMPALA-7455
> Project: IMPALA
>  Issue Type: Task
>Reporter: Philip Zeyliger
>Priority: Major
>
> {{bin/set-classpath}} sets the classpath based on {{fe/target/dependency/*}} 
> and uses a shell glob. These globs are sorted according to your locale 
> settings. The sorting differs between the C locale and the {{en_US.UTF-8}} 
> locale:
> {code}
> impdev@phiblip:~/Impala$ LC_ALL=en_US.UTF-8 bash -c 'echo 
> fe/target/dependency/*' | tr ' ' '\n' | grep log4j
> fe/target/dependency/log4j-1.2.17.jar
> fe/target/dependency/log4j-1.2-api-2.8.2.jar
> fe/target/dependency/log4j-api-2.8.2.jar
> fe/target/dependency/log4j-core-2.8.2.jar
> fe/target/dependency/log4j-web-2.8.2.jar
> fe/target/dependency/slf4j-log4j12-1.7.25.jar
> impdev@phiblip:~/Impala$ LC_ALL=C bash -c 'echo fe/target/dependency/*' | tr 
> ' ' '\n' | grep log4j
> fe/target/dependency/log4j-1.2-api-2.8.2.jar
> fe/target/dependency/log4j-1.2.17.jar
> fe/target/dependency/log4j-api-2.8.2.jar
> fe/target/dependency/log4j-core-2.8.2.jar
> fe/target/dependency/log4j-web-2.8.2.jar
> fe/target/dependency/slf4j-log4j12-1.7.25.jar
> {code}
> When the {{LC_ALL=C}} locale is in play, you get logs like the following in 
> custom cluster tests:
> {code}
> $cat 
> impalad.philip-dev.gce.cloudera.com.philip.log.ERROR.20180815-093713.22141
> Log file created at: 2018/08/15 09:37:13
> Running on machine: philip-dev.gce.cloudera.com
> Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
> E0815 09:37:13.841073 22141 logging.cc:121] stderr will be logged to this 
> file.
> ERROR StatusLogger No log4j2 configuration file found. Using default 
> configuration: logging only errors to the console. Set system property 
> 'org.apache.logging.log4j.simplelog.StatusLogger.level' to TRACE to show 
> Log4j2 internal initialization logging.
> {code}
> What I think is going on is the following dependency pulled in via 
> {{hive-exec}}:
> {code}
> [INFO] +- org.apache.hive:hive-exec:jar:2.1.1-cdh6.x-SNAPSHOT:compile
> [INFO] |  +- org.apache.logging.log4j:log4j-1.2-api:jar:2.8.2:compile
> {code}
> Impala configures log4j manually in {{GlogAppender.Install()}} that uses 
> log4j-1.2 (and not log4j2) configuration.
> I'll be trying to rip out this dependency and see what happens.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7455) log4j2 can sneak into classpath and break logging

2018-08-15 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-7455:
---

 Summary: log4j2 can sneak into classpath and break logging
 Key: IMPALA-7455
 URL: https://issues.apache.org/jira/browse/IMPALA-7455
 Project: IMPALA
  Issue Type: Task
Reporter: Philip Zeyliger


{{bin/set-classpath}} sets the classpath based on {{fe/target/dependency/*}} 
and uses a shell glob. These globs are sorted according to your locale 
settings. The sorting differs between the C locale and the {{en_US.UTF-8}} 
locale:
{code}
impdev@phiblip:~/Impala$ LC_ALL=en_US.UTF-8 bash -c 'echo 
fe/target/dependency/*' | tr ' ' '\n' | grep log4j
fe/target/dependency/log4j-1.2.17.jar
fe/target/dependency/log4j-1.2-api-2.8.2.jar
fe/target/dependency/log4j-api-2.8.2.jar
fe/target/dependency/log4j-core-2.8.2.jar
fe/target/dependency/log4j-web-2.8.2.jar
fe/target/dependency/slf4j-log4j12-1.7.25.jar
impdev@phiblip:~/Impala$ LC_ALL=C bash -c 'echo fe/target/dependency/*' | tr ' 
' '\n' | grep log4j
fe/target/dependency/log4j-1.2-api-2.8.2.jar
fe/target/dependency/log4j-1.2.17.jar
fe/target/dependency/log4j-api-2.8.2.jar
fe/target/dependency/log4j-core-2.8.2.jar
fe/target/dependency/log4j-web-2.8.2.jar
fe/target/dependency/slf4j-log4j12-1.7.25.jar
{code}

When the {{LC_ALL=C}} locale is in play, you get logs like the following in 
custom cluster tests:
{code}
$cat impalad.philip-dev.gce.cloudera.com.philip.log.ERROR.20180815-093713.22141
Log file created at: 2018/08/15 09:37:13
Running on machine: philip-dev.gce.cloudera.com
Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
E0815 09:37:13.841073 22141 logging.cc:121] stderr will be logged to this file.
ERROR StatusLogger No log4j2 configuration file found. Using default 
configuration: logging only errors to the console. Set system property 
'org.apache.logging.log4j.simplelog.StatusLogger.level' to TRACE to show Log4j2 
internal initialization logging.
{code}

What I think is going on is the following dependency pulled in via 
{{hive-exec}}:
{code}
[INFO] +- org.apache.hive:hive-exec:jar:2.1.1-cdh6.x-SNAPSHOT:compile
[INFO] |  +- org.apache.logging.log4j:log4j-1.2-api:jar:2.8.2:compile
{code}
Impala configures log4j manually in {{GlogAppender.Install()}} that uses 
log4j-1.2 (and not log4j2) configuration.

I'll be trying to rip out this dependency and see what happens.
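
The same locale sensitivity is easy to reproduce in a few lines of Java (JDK 
collation rules stand in for glibc's here, so treat the exact ordering as 
approximate):
{code:java}
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

public class LocaleSortDemo {
  public static void main(String[] args) {
    String[] jars = {"log4j-1.2.17.jar", "log4j-1.2-api-2.8.2.jar"};

    // Code-point order, like the C locale: '-' (0x2d) sorts before '.' (0x2e),
    // so log4j-1.2-api comes first.
    String[] c = jars.clone();
    Arrays.sort(c);
    System.out.println("C-like: " + Arrays.toString(c));

    // en_US collation treats punctuation differences as secondary, so
    // log4j-1.2.17.jar sorts first, matching the shell output above.
    String[] en = jars.clone();
    Arrays.sort(en, Collator.getInstance(Locale.US));
    System.out.println("en_US:  " + Arrays.toString(en));
  }
}
{code}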



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7441) Java logging happens before Java logging is configured, leading to INFOs printed at ERROR

2018-08-14 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-7441:
---

 Summary: Java logging happens before Java logging is configured, 
leading to INFOs printed at ERROR
 Key: IMPALA-7441
 URL: https://issues.apache.org/jira/browse/IMPALA-7441
 Project: IMPALA
  Issue Type: Task
Reporter: Philip Zeyliger


The following is pretty surprising:
{code}
catalogd.impala.log.ERROR.20180813-040536.9529:18/08/13 04:05:37 INFO 
util.JvmPauseMonitor: Starting JVM pause monitor
{code}
Namely, we've got an INFO-level log message appearing in the ERROR file. 

What's going on (I think) is that {{GlogAppender.Install()}} is being called 
during the constructors of Frontend and Catalog objects:
{code}
$git grep GlogAppender.Install
fe/src/main/java/org/apache/impala/service/JniCatalog.java:
GlogAppender.Install(TLogLevel.values()[cfg.impala_log_lvl],
fe/src/main/java/org/apache/impala/service/JniFrontend.java:
GlogAppender.Install(TLogLevel.values()[cfg.impala_log_lvl],
{code}
Meanwhile, the pause monitor is initialized earlier:
{code}
$git grep -C5 JniUtil::InitJvmPauseMonitor | head
be/src/common/init.cc-  if (!fs_cache_init_status.ok()) 
CLEAN_EXIT_WITH_ERROR(fs_cache_init_status.GetDetail());
be/src/common/init.cc-
be/src/common/init.cc-  if (init_jvm) {
be/src/common/init.cc-ABORT_IF_ERROR(JniUtil::Init());
be/src/common/init.cc-InitJvmLoggingSupport();
be/src/common/init.cc:ABORT_IF_ERROR(JniUtil::InitJvmPauseMonitor());
be/src/common/init.cc-ZipUtil::InitJvm();
be/src/common/init.cc-  }
{code}

This is largely cosmetic, but it was surprising to me.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7410) HDFS Datanodes unable to start

2018-08-08 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573399#comment-16573399
 ] 

Philip Zeyliger commented on IMPALA-7410:
-

The build of Hadoop we used here was built on centos7 (I checked this) and the 
native libraries are refusing to work on a centos6 machine (I didn't check 
this, but that's what this error usually means). We need to update the build 
we're using.

> HDFS Datanodes unable to start
> --
>
> Key: IMPALA-7410
> URL: https://issues.apache.org/jira/browse/IMPALA-7410
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Vuk Ercegovac
>Priority: Blocker
>  Labels: broken-build
>
> A recent test job err'd out when HDFS could not be setup. From the console:
> {noformat}
> ...
> 14:59:31 Stopping hdfs
> 14:59:33 Starting hdfs (Web UI - http://localhost:5070)
> 14:59:38 Failed to start hdfs-datanode. The end of the log 
> (/data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/testdata/cluster/cdh6/node-3/var/log/hdfs-datanode.out)
>  is:
> 14:59:39 WARNING: 
> /data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/testdata/cluster/cdh6/node-3/var/log/hadoop-hdfs
>  does not exist. Creating.
> 14:59:39 Failed to start hdfs-datanode. The end of the log 
> (/data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/testdata/cluster/cdh6/node-2/var/log/hdfs-datanode.out)
>  is:
> 14:59:39 WARNING: 
> /data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/testdata/cluster/cdh6/node-2/var/log/hadoop-hdfs
>  does not exist. Creating.
> 14:59:39 Failed to start hdfs-datanode. The end of the log 
> (/data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/testdata/cluster/cdh6/node-1/var/log/hdfs-datanode.out)
>  is:
> 14:59:39 WARNING: 
> /data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/testdata/cluster/cdh6/node-1/var/log/hadoop-hdfs
>  does not exist. Creating.
> 14:59:47 Namenode started
> 14:59:47 Error in 
> /data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/testdata/bin/run-mini-dfs.sh
>  at line 41: $IMPALA_HOME/testdata/cluster/admin start_cluster
> 14:59:48 Error in 
> /data/jenkins/workspace/impala-asf-master-exhaustive-centos6/repos/Impala/testdata/bin/run-all.sh
>  at line 44: tee ${IMPALA_CLUSTER_LOGS_DIR}/run-mini-dfs.log
> ...{noformat}
> From one of the datanodes that could not start:
> {noformat}
> ...
> 2018-08-07 14:59:38,561 ERROR 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
> java.lang.RuntimeException: Cannot start datanode because the configured max 
> locked memory size (dfs.datanode.max.locked.memory) is greater than zero and 
> native code is not available.
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1365)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.(DataNode.java:497)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2778)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2681)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2728)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2872)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2896)
> 2018-08-07 14:59:38,568 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: java.lang.RuntimeException: Cannot start datanode because the 
> configured max locked memory size (dfs.datanode.max.locked.memory) is greater 
> than zero and native code is not available.
> 2018-08-07 14:59:38,575 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> SHUTDOWN_MSG:{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7390) rpc-mgr-kerberized-test fails inside of test-with-docker for hostname resolution reasons

2018-08-02 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-7390:
---

 Summary: rpc-mgr-kerberized-test fails inside of test-with-docker 
for hostname resolution reasons
 Key: IMPALA-7390
 URL: https://issues.apache.org/jira/browse/IMPALA-7390
 Project: IMPALA
  Issue Type: Task
Reporter: Philip Zeyliger


When running inside of test-with-docker, the test fails like so:
{code}
2018-08-02 15:11:38.846042 [==] Running 1 test from 1 test case.
2018-08-02 15:11:38.846073 [--] Global test environment set-up.
2018-08-02 15:11:38.846114 [--] 1 test from 
KerberosOnAndOff/RpcMgrKerberizedTest
2018-08-02 15:11:38.846159 [ RUN  ] 
KerberosOnAndOff/RpcMgrKerberizedTest.MultipleServicesTls/0
2018-08-02 15:11:38.846284 Aug 02 22:11:38 i-20180802-132959 
krb5kdc[13121](info): AS_REQ (2 etypes {17 16}) 127.0.0.1: ISSUE: authtime 
1533247898, etypes {rep=17 tkt=17 ses=17}, 
impala-test/i-20180802-132...@krbtest.com for krbtgt/krbtest@krbtest.com
2018-08-02 15:11:38.846417 Aug 02 22:11:38 i-20180802-132959 
krb5kdc[13121](info): TGS_REQ (2 etypes {17 16}) 127.0.0.1: LOOKING_UP_SERVER: 
authtime 0,  impala-test/i-20180802-132...@krbtest.com for 
impala-test/localh...@krbtest.com, Server not found in Kerberos database
2018-08-02 15:11:38.846555 Aug 02 22:11:38 i-20180802-132959 
krb5kdc[13121](info): TGS_REQ (2 etypes {17 16}) 127.0.0.1: LOOKING_UP_SERVER: 
authtime 0,  impala-test/i-20180802-132...@krbtest.com for 
impala-test/localh...@krbtest.com, Server not found in Kerberos database
2018-08-02 15:11:38.846599 
/home/impdev/Impala/be/src/rpc/rpc-mgr-kerberized-test.cc:72: Failure
2018-08-02 15:11:38.846619 Value of: status_.ok()
2018-08-02 15:11:38.846635   Actual: false
2018-08-02 15:11:38.846651 Expected: true
2018-08-02 15:11:38.846763 Error: unable to execute ScanMem() RPC.: Not 
authorized: Client connection negotiation failed: client connection to 
127.0.0.1:53900: Server impala-test/localh...@krbtest.com not found in Kerberos 
database
{code}

The issue is that in this context the hostname forward-resolves to 127.0.0.1, 
which then reverse-resolves to localhost.

The workaround is pretty straight-forward; patch forthcoming.
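
For the record, the forward/reverse mismatch is visible with a few lines of 
Java (diagnosis only, not the patch):
{code:java}
import java.net.InetAddress;

public class HostnameProbe {
  public static void main(String[] args) throws Exception {
    InetAddress local = InetAddress.getLocalHost();
    // Forward lookup: container hostname -> 127.0.0.1 (via /etc/hosts).
    System.out.println(local.getHostName() + " -> " + local.getHostAddress());
    // Reverse lookup: 127.0.0.1 -> localhost, which produces the
    // impala-test/localhost@... principal the KDC cannot find.
    System.out.println("127.0.0.1 -> "
        + InetAddress.getByName("127.0.0.1").getCanonicalHostName());
  }
}
{code}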



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7385) several timezone conversion tests are failing within test-with-docker context

2018-08-01 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-7385:
---

 Summary: several timezone conversion tests are failing within 
test-with-docker context
 Key: IMPALA-7385
 URL: https://issues.apache.org/jira/browse/IMPALA-7385
 Project: IMPALA
  Issue Type: Task
Reporter: Philip Zeyliger


The following tests tend to fail when using test-with-docker. What's common 
about them is that they're doing a time zone conversion.

{code}
 
metadata.test_show_create_table.TestShowCreateTable.test_show_create_table[table_format:
 text/none]1 min 23 sec19
 query_test.test_queries.TestHdfsQueries.test_hdfs_scan_node[exec_option: 
{'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
'exec_single_node_rows_threshold': 0} | table_format: orc/def/block]  4.1 
sec 58
 query_test.test_scanners.TestOrc.test_type_conversions[exec_option: 
{'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
'exec_single_node_rows_threshold': 0} | table_format: orc/def/block]   16 sec  
58
 ExprTest.TimestampFunctions
{code}

It turns out that people parse {{readlink(/etc/localtime)}} to find out the
current time zone. This became noticeable when the ORC change landed (though
I've seen that one come and go), and then very frequently when we changed time
zone DBs. (Or so I think.)
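For reference, a minimal sketch of that convention (assuming
{{/etc/localtime}} is a symlink into the zoneinfo database, which not every
distro guarantees):
{code:java}
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Derive the zone name from the /etc/localtime symlink target, e.g.
// /usr/share/zoneinfo/America/Los_Angeles -> America/Los_Angeles.
// Throws if /etc/localtime is a regular file rather than a symlink.
public class LocalTimeZone {
  public static void main(String[] args) throws Exception {
    Path target = Files.readSymbolicLink(Paths.get("/etc/localtime"));
    String s = target.toString();
    int i = s.indexOf("zoneinfo/");
    System.out.println(i >= 0 ? s.substring(i + "zoneinfo/".length()) : s);
  }
}
{code}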

I have a simple change forthcoming. 






[jira] [Commented] (IMPALA-7335) Assertion Failure - test_corrupt_files

2018-07-31 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564464#comment-16564464
 ] 

Philip Zeyliger commented on IMPALA-7335:
-

https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/2822/testReport/junit/query_test.test_scanners/TestParquet/test_parquet_exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___debug_action___None___exec_single_node_rows_threshold___0table_format__parquet_none_/
 seems similar too. We expect an error that doesn't show up.

> Assertion Failure - test_corrupt_files
> --
>
> Key: IMPALA-7335
> URL: https://issues.apache.org/jira/browse/IMPALA-7335
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.1.0
>Reporter: nithya
>Assignee: Pooja Nilangekar
>Priority: Critical
>  Labels: broken-build
>
> test_corrupt_files fails 
>  
> query_test.test_scanners.TestParquet.test_corrupt_files[exec_option: 
> \\{'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (from 
> pytest)
>  
> {code:java}
> Error Message
> query_test/test_scanners.py:300: in test_corrupt_files     
> self.run_test_case('QueryTest/parquet-abort-on-error', vector) 
> common/impala_test_suite.py:420: in run_test_case     assert False, "Expected 
> exception: %s" % expected_str E   AssertionError: Expected exception: Column 
> metadata states there are 11 values, but read 10 values from column id.
> STACKTRACE
> query_test/test_scanners.py:300: in test_corrupt_files
>     self.run_test_case('QueryTest/parquet-abort-on-error', vector)
> common/impala_test_suite.py:420: in run_test_case
>     assert False, "Expected exception: %s" % expected_str
> E   AssertionError: Expected exception: Column metadata states there are 11 
> values, but read 10 values from column id.
> Standard Error
> -- executing against localhost:21000
> use functional_parquet;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=0;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> set num_nodes=1;
> -- executing against localhost:21000
> set num_scanner_threads=1;
> -- executing against localhost:21000
> select id, cnt from bad_column_metadata t, (select count(*) cnt from 
> t.int_array) v;
> -- executing against localhost:21000
> SET NUM_NODES="0";
> -- executing against localhost:21000
> SET NUM_SCANNER_THREADS="0";
> -- executing against localhost:21000
> set num_nodes=1;
> -- executing against localhost:21000
> set num_scanner_threads=1;
> -- executing against localhost:21000
> select id from bad_column_metadata;
> -- executing against localhost:21000
> SET NUM_NODES="0";
> -- executing against localhost:21000
> SET NUM_SCANNER_THREADS="0";
> -- executing against localhost:21000
> SELECT * from bad_parquet_strings_negative_len;
> -- executing against localhost:21000
> SELECT * from bad_parquet_strings_out_of_bounds;
> -- executing against localhost:21000
> use functional_parquet;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> set num_nodes=1;
> -- executing against localhost:21000
> set num_scanner_threads=1;
> -- executing against localhost:21000
> select id, cnt from bad_column_metadata t, (select count(*) cnt from 
> t.int_array) v;
> -- executing against localhost:21000
> SET NUM_NODES="0";
> -- executing against localhost:21000
> SET NUM_SCANNER_THREADS="0";
> -- executing against localhost:21000
> set num_nodes=1;
> -- executing against localhost:21000
> set num_scanner_threads=1;
> -- executing against localhost:21000
> select id from bad_column_metadata;
> -- executing against localhost:21000
> SET NUM_NODES="0";
> -- executing against localhost:21000
> SET NUM_SCANNER_THREADS="0";
> {code}
>  
>  






[jira] [Commented] (IMPALA-7338) Test coverage for codegen with CHAR type

2018-07-30 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16562295#comment-16562295
 ] 

Philip Zeyliger commented on IMPALA-7338:
-

For the bail out path, [~joemcdonnell] recently got code coverage to work on 
our code base. That may be an effective way to see how wide the current testing 
gaps are.

> Test coverage for codegen with CHAR type
> 
>
> Key: IMPALA-7338
> URL: https://issues.apache.org/jira/browse/IMPALA-7338
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Michael Ho
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: codegen
>
> Until IMPALA-3207 is fixed, we will disable codegen for operators if CHAR is 
> involved. Our test coverage for these cases is not very comprehensive. For 
> instance, we still hit cases such as IMPALA-7032, IMPALA-7288. This Jira aims 
> to track the effort to boost the test coverage for these cases.
> Some ideas:
>  * augment existing test cases to make sure we cover all built-in expressions 
> which can take char type as argument or return type. This should help catch 
> cases such as IMPALA-7032
>  * exercise the bail out path in codegen functions (e.g. hash tables, exec 
> nodes, expressions) by injecting faults in expression codegen functions to 
> fail codegen. This should help catch cases such as IMPALA-7288.
>  * Make sure our query generator will generate all possible combination of 
> built-in expressions (with CHAR type) and all exec nodes which may codegen.
> cc'ing [~mikesbrown] for query generator idea






[jira] [Commented] (IMPALA-6453) Test Python 2.6 in pre-merge testing

2018-07-19 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16549582#comment-16549582
 ] 

Philip Zeyliger commented on IMPALA-6453:
-

I wrote https://jenkins.impala.io/job/python26-incompatibility-check/ . It's 
"working" except that it found a bug being addressed in 
https://gerrit.cloudera.org/#/c/10993/ . After it starts being green, I'll 
shove this into the parallel test runner.

I used a t2.small instance and it seems to work end to end in under a minute,
so I'm not worrying about improving performance.

> Test Python 2.6 in pre-merge testing
> 
>
> Key: IMPALA-6453
> URL: https://issues.apache.org/jira/browse/IMPALA-6453
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Jim Apple
>Priority: Minor
>
> IMPALA-6447 got past 
> [https://jenkins.impala.io/view/Gerrit/job/gerrit-verify-dryrun/] because 
> that runs Impala 2.7, but the test failed under Python 2.6.
> If possible, it might make sense to run tests under both version of Python in 
> pre-merge testing.






[jira] [Commented] (IMPALA-7274) Add a check to require Java 8

2018-07-10 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16539287#comment-16539287
 ] 

Philip Zeyliger commented on IMPALA-7274:
-

There are a variety of ways of doing this, but it's configured for Maven here: 
https://github.com/apache/impala/blob/c01efd09679faaacfd5488fc7f4c1526a1af2f35/fe/pom.xml#L370

Specifically:
{code}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <version>3.3</version>
  <configuration>
    <source>1.7</source>
    <target>1.7</target>
  </configuration>
</plugin>
{code}

> Add a check to require Java 8
> -
>
> Key: IMPALA-7274
> URL: https://issues.apache.org/jira/browse/IMPALA-7274
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Lars Volker
>Priority: Major
>
> In [this email 
> thread|https://lists.apache.org/thread.html/7a18a69fd687ed0279566dc58b60fd14c39d7d956ddda7636e5b2822@%3Cdev.impala.apache.org%3E]
>  we achieved lazy consensus that we should deprecate support for Java 7 and 
> should require Java 8 going forward.
> We should introduce a pre-compile check for the minimum supported Java 
> version. This will help us prevent opaque errors during compilation from 
> eating up our time.






[jira] [Commented] (IMPALA-7250) AnalysisException in select datediff() with group by

2018-07-05 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16534266#comment-16534266
 ] 

Philip Zeyliger commented on IMPALA-7250:
-

I've not checked in a debugger, but it's likely that {{now()}} is getting
rewritten into a constant, and that it gets different values in different
parts of the query.
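A sketch of the suspected mechanism (this isn't Impala's rewriter, just the
shape of the bug):
{code:java}
// If the select-list and GROUP BY copies of now() are constant-folded
// independently, each fold captures its own timestamp, so the two
// "constants" differ and analysis no longer sees matching expressions.
public class FoldTwice {
  public static void main(String[] args) {
    long selectListCopy = System.nanoTime(); // fold #1
    long groupByCopy = System.nanoTime();    // fold #2, an instant later
    System.out.println(selectListCopy == groupByCopy); // almost certainly false
  }
}
{code}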

> AnalysisException in select datediff() with group by
> 
>
> Key: IMPALA-7250
> URL: https://issues.apache.org/jira/browse/IMPALA-7250
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Priority: Major
>
> {noformat}
> [localhost:21000] default> select datediff(day,now()) from test_table where 
> day>=(now() - interval 5 days) group by datediff(day,now());
> Query: select datediff(day,now()) from test_table where day>=(now() - 
> interval 5 days) group by datediff(day,now())
> Query submitted at: 2018-06-05 21:11:23 (Coordinator: http://impala-dev:25000)
> ERROR: AnalysisException: select list expression not produced by aggregation 
> output (missing from GROUP BY clause?): datediff(day, TIMESTAMP '2018-06-05 
> 21:11:23.320564000')
> [localhost:21000] default> set ENABLE_EXPR_REWRITES=0;
> ENABLE_EXPR_REWRITES set to 0
> [localhost:21000] default> select datediff(day,now()) from test_table where 
> day>=(now() - interval 5 days) group by datediff(day,now());
> Query: select datediff(day,now()) from test_table where day>=(now() - 
> interval 5 days) group by datediff(day,now())
> Query submitted at: 2018-06-05 21:11:31 (Coordinator: http://impala-dev:25000)
> Query progress can be monitored at: 
> http://impala-dev:25000/query_plan?query_id=b64c8eabcedc58e5:da9edc86
> Fetched 0 row(s) in 0.21s
> {noformat}






[jira] [Commented] (IMPALA-7164) Define public API for RuntimeProfile

2018-06-12 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510337#comment-16510337
 ] 

Philip Zeyliger commented on IMPALA-7164:
-

Sure.

> Define public API for RuntimeProfile
> 
>
> Key: IMPALA-7164
> URL: https://issues.apache.org/jira/browse/IMPALA-7164
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: resource-management, usability
>
> Currently the only public API for the runtime profile is the Thrift 
> definition. The exact layout and counters are all implementation-dependent 
> and subject to change. People are trying to build tools that consume runtime 
> profiles and process them, but that's hard since the counters can change from 
> version to version and sometimes the semantics change.
> I think we need to figure out which things are part of the public API, 
> validate that they make sense and document them clearly. We should also 
> clearly document that things outside of this public API are subject to change 
> without notice. I don't think the public API necessarily needs to be 
> "porcelain", but we should generally try to avoid unnecessary changes and 
> mention any changes in release notes etc.
> We could start simple and just collect "public" counter names in a module and 
> comment each of them.






[jira] [Commented] (IMPALA-7164) Define public API for RuntimeProfile

2018-06-12 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510312#comment-16510312
 ] 

Philip Zeyliger commented on IMPALA-7164:
-

I think instead of a map of counters keyed by counter name, we should strive to 
put those names in the Thrift schema definition itself. That will allow 
evolution and be considerably more obvious.

> Define public API for RuntimeProfile
> 
>
> Key: IMPALA-7164
> URL: https://issues.apache.org/jira/browse/IMPALA-7164
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: resource-management, usability
>
> Currently the only public API for the runtime profile is the Thrift 
> definition. The exact layout and counters are all implementation-dependent 
> and subject to change. People are trying to build tools that consume runtime 
> profiles and process them, but that's hard since the counters can change from 
> version to version and sometimes the semantics change.
> I think we need to figure out which things are part of the public API, 
> validate that they make sense and document them clearly. We should also 
> clearly document that things outside of this public API are subject to change 
> without notice. I don't think the public API necessarily needs to be 
> "porcelain", but we should generally try to avoid unnecessary changes and 
> mention any changes in release notes etc.
> We could start simple and just collect "public" counter names in a module and 
> comment each of them.






[jira] [Commented] (IMPALA-7161) Bootstrap's handling of JAVA_HOME needs improvement

2018-06-12 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510039#comment-16510039
 ] 

Philip Zeyliger commented on IMPALA-7161:
-

This all sounds good to me. I think if there's a working java on the path, we 
should use it.

I found the following works for me:
{code}
export JAVA_HOME="$(jrunscript -e 
'java.lang.System.out.println(java.lang.System.getProperty("java.home"));')"
{code}

I think the {{readlink}} approach is good too.

> Bootstrap's handling of JAVA_HOME needs improvement
> ---
>
> Key: IMPALA-7161
> URL: https://issues.apache.org/jira/browse/IMPALA-7161
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Joe McDonnell
>Priority: Major
>
> bin/bootstrap_system.sh installs the Java SDK and sets JAVA_HOME in the 
> current shell. It also adds a command to the bin/impala-config-local.sh to 
> export JAVA_HOME there. This doesn't do the job.
> bin/impala-config.sh tests for JAVA_HOME at the very start of the script, 
> before it has sourced bin/impala-config-local.sh. So, the user doesn't have a 
> way of developing over the long term without manually setting up JAVA_HOME.
> bin/impala-config.sh also doesn't detect the system JAVA_HOME. For Ubuntu 
> 16.04, this is fairly simple and if a developer has their system JDK set up 
> appropriately, it would make sense to use it. For example:
>  
> {noformat}
> # If javac exists, then the system has a Java SDK (JRE does not have javac).
> # Follow the symbolic links and use this to determine the system's JAVA_HOME.
> if [ -L /usr/bin/javac ]; then
>   SYSTEM_JAVA_HOME=$(readlink -f /usr/bin/javac | sed "s:bin/javac::")
> fi
> export JAVA_HOME="${JAVA_HOME:-${SYSTEM_JAVA_HOME}}"{noformat}
>  






[jira] [Commented] (IMPALA-7155) Create a way to designate a large number of arbitrary tests for targeted test runs

2018-06-08 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506638#comment-16506638
 ] 

Philip Zeyliger commented on IMPALA-7155:
-

If you're asking for votes, I think pytest marks are the more pleasant 
implementation. An external file seems burdensome.

In terms of de-duping, I think if you produce a spreadsheet of tests and which
scenarios they run in, we could split them up and reason about whether those
scenarios make sense or not.

> Create a way to designate a large number of arbitrary tests for targeted test 
> runs
> --
>
> Key: IMPALA-7155
> URL: https://issues.apache.org/jira/browse/IMPALA-7155
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.0
>Reporter: David Knupp
>Priority: Major
>
> It's already possible to specify an arbitrary list of test modules, test 
> classes, and/or test functions as command line arguments when running the 
> Impala mini-cluster tests. It's also possible to opt-out of running specific 
> tests by applying any of a variety of skipif markers.
> What we don't have is a comprehensive way for tests to be opted-in to a 
> targeted test run, other than by naming it as a command line argument. This 
> becomes extremely unwieldy beyond a certain number of tests. In fact, we 
> don't have a general concept of targeted test runs at all. The approach to 
> date has been to always run as many tests as possible, except for those tests 
> specifically marked for skipping. This is an OK way to make sure tests don't 
> get overlooked, but it also results in many tests frequently being run in 
> contexts in which they don't necessarily apply, e.g. against S3, or against 
> actual deployed clusters, which can lead to false negatives.
> There are different ways that we could group together a disparate array of 
> tests into a targeted run. We could come up with a permanent series of new 
> pytest markers/decorators for opting-in, as opposed to opting-out, of a given 
> test run. An initial pass would then need to be made to apply the new 
> decorators as needed to all of the existing tests. One could then invoke 
> something like "impala-pytest -m cluster_tests" as needed.
> Another approach might be to define test runs in special files (probably 
> yaml). The file would include a list of which tests to run, possibly along 
> with other test parameters, e.g. "run this list of tests, but only on 
> parquet, and skip tests that require LZO compression."






[jira] [Commented] (IMPALA-7069) Java UDF tests can trigger a crash in Java ClassLoader

2018-06-07 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505657#comment-16505657
 ] 

Philip Zeyliger commented on IMPALA-7069:
-

Depending on how we do class loaders, it's entirely possible that we leak Java 
memory (as distinct from malloc() memory) too. The easiest way to check that is 
to run with some {{JAVA_TOOL_OPTIONS=-verbose:gc}} flags and see what comes 
out. (Or configure a GC log file.)
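For illustration only (the jar path and class name below are made up), this is
the kind of loader churn that would show up that way:
{code:java}
import java.net.URL;
import java.net.URLClassLoader;

// If a fresh URLClassLoader is created per UDF invocation and never closed,
// every loaded class pins metaspace until the loader itself is collected;
// -verbose:gc would show post-GC usage creeping up across invocations.
public class LoaderChurn {
  public static void main(String[] args) throws Exception {
    URL jar = new URL("file:///tmp/udf.jar"); // hypothetical jar
    for (int i = 0; i < 10_000; i++) {
      URLClassLoader loader = new URLClassLoader(new URL[] {jar});
      // loader.loadClass("com.example.SomeUdf"); // hypothetical class
      // Missing loader.close() is the leak.
    }
  }
}
{code}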

> Java UDF tests can trigger a crash in Java ClassLoader
> --
>
> Key: IMPALA-7069
> URL: https://issues.apache.org/jira/browse/IMPALA-7069
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Taras Bobrovytsky
>Priority: Blocker
>  Labels: broken-build, crash, flaky
> Attachments: hs_err_pid22764.log, hs_err_pid29246.log, 
> hs_err_pid8975.log, hs_err_pid9694.log
>
>
> I hit this crash on a GVO, but was able to reproduce it on master on my 
> desktop.
> Repro steps:
> {code}
> git checkout c1362afb9a072e49df470d9068d44cdbdf5cdec5
> ./buildall.sh -debug -noclean -notests -skiptests -ninja
> start-impala-cluster.py
> while impala-py.test tests/query_test/test_udfs.py -k 'hive or java or jar' 
> -n4 --verbose; do date; done
> {code}
> I generally hit the crash within a hour of looping the test.
> {noformat}
> Stack: [0x7fa04791f000,0x7fa04812],  sp=0x7fa04811aff0,  free 
> space=8175k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> V  [libjvm.so+0x8a8107]
> V  [libjvm.so+0x96cf5f]
> v  ~RuntimeStub::_complete_monitor_locking_Java
> J 2758 C2 
> java.util.concurrent.ConcurrentHashMap.putVal(Ljava/lang/Object;Ljava/lang/Object;Z)Ljava/lang/Object;
>  (362 bytes) @ 0x7fa0c73637d4 [0x7fa0c7362d00+0xad4]
> J 2311 C2 
> java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class; (122 
> bytes) @ 0x7fa0c70a09a8 [0x7fa0c70a08e0+0xc8]
> J 3953 C2 
> java.net.FactoryURLClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;
>  (40 bytes) @ 0x7fa0c71ce0f0 [0x7fa0c71ce0a0+0x50]
> J 2987 C2 
> java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class; (7 
> bytes) @ 0x7fa0c72ddb64 [0x7fa0c72ddb20+0x44]
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6648eb]
> V  [libjvm.so+0x661ec4]
> V  [libjvm.so+0x662523]
> V  [libjvm.so+0x9e398d]
> V  [libjvm.so+0x9e2326]
> V  [libjvm.so+0x9e2b50]
> V  [libjvm.so+0x42c099]
> V  [libjvm.so+0x9dc786]
> V  [libjvm.so+0x6a5edf]
> V  [libjvm.so+0x6a70cb]  JVM_DefineClass+0xbb
> V  [libjvm.so+0xa31ea5]
> V  [libjvm.so+0xa37ea7]
> J 4842  
> sun.misc.Unsafe.defineClass(Ljava/lang/String;[BIILjava/lang/ClassLoader;Ljava/security/ProtectionDomain;)Ljava/lang/Class;
>  (0 bytes) @ 0x7fa0c7af120b [0x7fa0c7af1100+0x10b]
> J 13229 C2 sun.reflect.MethodAccessorGenerator$1.run()Ljava/lang/Object; (5 
> bytes) @ 0x7fa0c8cf2a74 [0x7fa0c8cf2940+0x134]
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6648eb]
> V  [libjvm.so+0x6b5949]  JVM_DoPrivileged+0x429
> J 1035  
> java.security.AccessController.doPrivileged(Ljava/security/PrivilegedAction;)Ljava/lang/Object;
>  (0 bytes) @ 0x7fa0c7220c7f [0x7fa0c7220bc0+0xbf]
> J 20421 C2 
> sun.reflect.MethodAccessorGenerator.generate(Ljava/lang/Class;Ljava/lang/String;[Ljava/lang/Class;Ljava/lang/Class;[Ljava/lang/Class;IZZLjava/lang/Class;)Lsun/reflect/MagicAccessorImpl;
>  (762 bytes) @ 0x7fa0c89bb848 [0x7fa0c89b9da0+0x1aa8]
> J 4163 C2 
> sun.reflect.NativeMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
>  (104 bytes) @ 0x7fa0c789cca8 [0x7fa0c789c8c0+0x3e8]
> J 2379 C2 org.apache.impala.hive.executor.UdfExecutor.evaluate()V (396 bytes) 
> @ 0x7fa0c711c638 [0x7fa0c711c400+0x238]
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6648eb]
> V  [libjvm.so+0x6822d7]
> V  [libjvm.so+0x6862c9]
> C  [impalad+0x2a004fa]  JNIEnv_::CallNonvirtualVoidMethodA(_jobject*, 
> _jclass*, _jmethodID*, jvalue const*)+0x40
> C  [impalad+0x29fe4ff]  
> impala::HiveUdfCall::Evaluate(impala::ScalarExprEvaluator*, impala::TupleRow 
> const*) const+0x44b
> C  [impalad+0x29ffde9]  
> impala::HiveUdfCall::GetSmallIntVal(impala::ScalarExprEvaluator*, 
> impala::TupleRow const*) const+0xbb
> C  [impalad+0x2a0948a]  
> impala::ScalarExprEvaluator::GetValue(impala::ScalarExpr const&, 
> impala::TupleRow const*)+0x14c
> C  [impalad+0x2a48eb1]  
> impala::ScalarFnCall::EvaluateNonConstantChildren(impala::ScalarExprEvaluator*,
>  impala::TupleRow const*) const+0x9d
> C  [impalad+0x2a4abba]  impala_udf::BooleanVal 
> impala::ScalarFnCall::InterpretEval(impala::ScalarExprEvaluator*,
>  impala::TupleRow const*) const+0x18c
> C  [impalad+0x2a

[jira] [Commented] (IMPALA-7122) Data load failure: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try

2018-06-07 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505369#comment-16505369
 ] 

Philip Zeyliger commented on IMPALA-7122:
-

I've seen issues with Amazon disks. Joe tells me this isn't on EBS, but for
EBS, people do interesting "warming" things as well as "sparse" volumes: 
[https://aws.amazon.com/blogs/apn/how-to-build-sparse-ebs-volumes-for-fun-and-easy-snapshotting/]

> Data load failure: Failed to replace a bad datanode on the existing pipeline 
> due to no more good datanodes being available to try
> -
>
> Key: IMPALA-7122
> URL: https://issues.apache.org/jira/browse/IMPALA-7122
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Joe McDonnell
>Priority: Critical
>  Labels: flaky
> Attachments: data-load-functional-exhaustive.log, hdfs-logs.tar.gz, 
> impalad.ec2-m2-4xlarge-centos-6-4-0570.vpc.cloudera.com.jenkins.log.INFO.20180604-205755.5587,
>  load-functional-query.log
>
>
> {noformat}
> 20:58:29 Started Loading functional-query data in background; pid 6813.
> 20:58:29 Loading functional-query data (logging to 
> /data/jenkins/workspace/impala-asf-master-core-data-load/repos/Impala/logs/data_loading/load-functional-query.log)...
>  
> 20:58:29 Started Loading TPC-H data in background; pid 6814.
> 20:58:29 Loading TPC-H data (logging to 
> /data/jenkins/workspace/impala-asf-master-core-data-load/repos/Impala/logs/data_loading/load-tpch.log)...
>  
> 20:58:29 Started Loading TPC-DS data in background; pid 6815.
> 20:58:29 Loading TPC-DS data (logging to 
> /data/jenkins/workspace/impala-asf-master-core-data-load/repos/Impala/logs/data_loading/load-tpcds.log)...
>  
> 21:35:26 FAILED (Took: 36 min 57 sec)
> 21:35:26 'load-data functional-query exhaustive' failed. Tail of log:
> 21:35:26  at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:213)
> 21:35:26  at 
> org.apache.hadoop.hdfs.DataStreamer$ResponseProcessor.run(DataStreamer.java:1086)
> 21:35:26 18/06/04 21:20:29 WARN hdfs.DataStreamer: Error Recovery for 
> BP-1407206351-127.0.0.1-1528170335185:blk_1073743620_2799 in pipeline 
> [DatanodeInfoWithStorage[127.0.0.1:31000,DS-37cfc57c-ab39-443c-80c9-e440cb18b63d,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:31001,DS-2bc41558-4f2c-460f-ae87-5d1a6acbf42f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:31002,DS-4ba4d3a0-af31-4eaf-b43d-89b408231481,DISK]]:
>  datanode 
> 0(DatanodeInfoWithStorage[127.0.0.1:31000,DS-37cfc57c-ab39-443c-80c9-e440cb18b63d,DISK])
>  is bad.
> 21:35:26 18/06/04 21:21:29 INFO hdfs.DataStreamer: Exception in 
> createBlockOutputStream blk_1073743620_2799
> 21:35:26 java.io.IOException: Got error, status=ERROR, status message , ack 
> with firstBadLink as 127.0.0.1:31002
> 21:35:26  at 
> org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:134)
> 21:35:26  at 
> org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:110)
> 21:35:26  at 
> org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1778)
> 21:35:26  at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1507)
> 21:35:26  at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1481)
> 21:35:26  at 
> org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1256)
> 21:35:26  at 
> org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:667)
> 21:35:26 18/06/04 21:21:29 WARN hdfs.DataStreamer: Error Recovery for 
> BP-1407206351-127.0.0.1-1528170335185:blk_1073743620_2799 in pipeline 
> [DatanodeInfoWithStorage[127.0.0.1:31001,DS-2bc41558-4f2c-460f-ae87-5d1a6acbf42f,DISK],
>  
> DatanodeInfoWithStorage[127.0.0.1:31002,DS-4ba4d3a0-af31-4eaf-b43d-89b408231481,DISK]]:
>  datanode 
> 1(DatanodeInfoWithStorage[127.0.0.1:31002,DS-4ba4d3a0-af31-4eaf-b43d-89b408231481,DISK])
>  is bad.
> 21:35:26 18/06/04 21:21:29 WARN hdfs.DataStreamer: DataStreamer Exception
> 21:35:26 java.io.IOException: Failed to replace a bad datanode on the 
> existing pipeline due to no more good datanodes being available to try. 
> (Nodes: 
> current=[DatanodeInfoWithStorage[127.0.0.1:31001,DS-2bc41558-4f2c-460f-ae87-5d1a6acbf42f,DISK]],
>  
> original=[DatanodeInfoWithStorage[127.0.0.1:31001,DS-2bc41558-4f2c-460f-ae87-5d1a6acbf42f,DISK]]).
>  The current failed datanode replacement policy is DEFAULT, and a client may 
> configure this via 
> 'dfs.client.block.write.replace-data

[jira] [Commented] (IMPALA-7131) Support external data sources without catalogd

2018-06-06 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503633#comment-16503633
 ] 

Philip Zeyliger commented on IMPALA-7131:
-

When I was looking to remove them, I found that they were actually stored 
specially in the metastore. [https://gerrit.cloudera.org/#/c/9192/] was the 
review. The relevant bit from the commit message was as follows, and there are 
actually more properties to determine the class. So, I think they are persisted.
{code:java}
[pannier.ca.cloudera.com:21000] > create table t (x int) stored as textfile 
tblproperties('__IMPALA_DATA_SOURCE_NAME'='V1');
{code}

> Support external data sources without catalogd
> --
>
> Key: IMPALA-7131
> URL: https://issues.apache.org/jira/browse/IMPALA-7131
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog, Frontend
>Reporter: Todd Lipcon
>Priority: Minor
>
> Currently it seems that external data sources are not persisted except in 
> memory on the catalogd. This means that it will be somewhat more difficult to 
> support this feature in the design of impalad without a catalogd.
> This JIRA is to eventually figure out a way to support this feature -- either 
> by supporting in-memory on a per-impalad basis, or perhaps by figuring out a 
> way to register them persistently in a file system directory, etc.






[jira] [Commented] (IMPALA-7129) Can't start catalogd in tests under UBSAN

2018-06-06 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503606#comment-16503606
 ] 

Philip Zeyliger commented on IMPALA-7129:
-

I couldn't spot anything obvious, so I'm reproducing. (I'm stumped about how 
custom cluster tests may be clearing their environment, since 
start-impala-cluster.py certainly seems to depend on $IMPALA_HOME all over the 
place, but reproduction will hopefully tell me.) If you're blocked, I'd be 
happy to prepare a partial revert. The other workaround is to set UBSAN_OPTIONS 
explicitly.

> Can't start catalogd in tests under UBSAN 
> --
>
> Key: IMPALA-7129
> URL: https://issues.apache.org/jira/browse/IMPALA-7129
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0
>Reporter: Jim Apple
>Assignee: Philip Zeyliger
>Priority: Major
>
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/2377/testReport/junit/custom_cluster.test_admission_controller/TestAdmissionController/test_require_user/
> This custom cluster test failed
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/2377/artifact/Impala/logs_static/logs/custom_cluster_tests/catalogd-error.log/*view*/
> {{UndefinedBehaviorSanitizer: failed to read suppressions file 
> '/home/ubuntu/Impala/be/build/debug/service/./bin/ubsan-suppressions.txt'}}
> A number of other tests failed, too, and I suspect it's 
> https://github.com/apache/impala/commit/48625335d220566a1d69e65fb34bfca9a7dc3cff
>  that broke it. I guess maybe {{IMPALA_HOME}} is not set in some tests?






[jira] [Commented] (IMPALA-7120) GVD failed talking to oss.sonatype.org "Bad Gateway"

2018-06-04 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16501236#comment-16501236
 ] 

Philip Zeyliger commented on IMPALA-7120:
-

I've had the mixed fortune of running into a similar thing in the past. 
Somewhere in our dependency chain, we depend on {{org.glassfish:javax.el}} but
either don't specify a version or specify a version range. This causes Maven
to iterate through all of the defined {{<repository>}} entries to look for
versions that would match the restrictions. If any of the remote repositories
is down, it dies. In a previous incarnation, Impala would reach out to
Cloudera's private repositories, which had been erroneously specified in
Sentry's poms.

I thought I had fixed this both by fixing Sentry and by pinning the version in:
{code:java}
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.glassfish</groupId>
      <artifactId>javax.el</artifactId>
      <version>3.0.1-b08</version>
    </dependency>
  </dependencies>
</dependencyManagement>
{code}
Based on debug output (running maven with {{-X}}), this takes effect, but it 
looks like it still talks to remote servers, though I wasn't able to see the 
exact URL you saw downloading.

I'd say let's track this one for a little bit longer. There are heavier-weight
ways to tackle this too, like setting up a Maven mirror, but it's kind of
unpleasant.

 

> GVD failed talking to oss.sonatype.org "Bad Gateway"
> 
>
> Key: IMPALA-7120
> URL: https://issues.apache.org/jira/browse/IMPALA-7120
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: broken-build, flaky
>
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/2368/
> I'm not sure what would cause this.
> {noformat}
> 22:56:20 ] [WARNING] Could not transfer metadata 
> com.cloudera.cdh:cdh-root:6.x-SNAPSHOT/maven-metadata.xml from/to 
> ${distMgmtSnapshotsId} (${distMgmtSnapshotsUrl}): Cannot access 
> ${distMgmtSnapshotsUrl} with type default using the available connector 
> factories: BasicRepositoryConnectorFactory
> 22:56:20 ] [WARNING] Could not transfer metadata 
> org.glassfish:javax.el/maven-metadata.xml from/to sonatype-nexus-snapshots 
> (https://oss.sonatype.org/content/repositories/snapshots): Failed to transfer 
> file: 
> https://oss.sonatype.org/content/repositories/snapshots/org/glassfish/javax.el/maven-metadata.xml.
>  Return code is: 502 , ReasonPhrase:Bad Gateway.
> 22:56:20 ]
>
>
>
> [WARNING] Could not transfer metadata 
> org.glassfish:javax.el:3.0.1-b06-SNAPSHOT/maven-metadata.xml from/to 
> sonatype-nexus-snapshots 
> (https://oss.sonatype.org/content/repositories/snapshots): Failed to transfer 
> file: 
> https://oss.sonatype.org/content/repositories/snapshots/org/glassfish/javax.el/3.0.1-b06-SNAPSHOT/maven-metadata.xml.
>  Return code is: 502 , ReasonPhrase:Bad Gateway.
> 22:56:20 ] [WARNING] Failure to transfer 
> org.glassfish:javax.el:3.0.1-b06-SNAPSHOT/maven-metadata.xml from 
> https://oss.sonatype.org/content/repositories/snapshots was cached in the 
> local repository, resolution will not be reattempted until the update 
> interval of sonatype-nexus-snapshots has elapsed or updates are forced. 
> Original error: Could not transfer metadata 
> org.glassfish:javax.el:3.0.1-b06-SNAPSHOT/maven-metadata.xml from/to 
> sonatype-nexus-snapshots 
> (https://oss.sonatype.org/content/repositories/snapshots): Failed to transfer 
> file: 
> https://oss.sonatype.org/content/repositories/snapshots/org/glassfish/javax.el/3.0.1-b06-SNAPSHOT/maven-metadata.xml.
>  Return code is: 502 , ReasonPhrase:Bad Gateway.
> 22:56:20 ]
>
>
>
> [WARNING] Could not transfer metadata 
> org.glassfish:javax.el:3.0.1-b07-SNAPSHOT/maven-metadata.xml from/to 
> sonatype-nexus-snapshots 
> (https://oss.sonatype.org/content/repositories/snapshots): Failed to transfer 
> file: 
> https://oss.sonatype.org/content/repositories/snapshots/org/glassfish/javax.el/3.0.1-b07-SNAPSHOT/maven-metadata.xml.
>  Return code is: 502 , ReasonPhrase:Bad Gateway.
> 22:56:20 ] [WARNING] Failure to transfer 
> org.glassfish:javax.el:3.0.1-b07-SNAPSHOT/maven-metadata.xml from 
> https://oss.sonatype.org/content/repositories/snapshots was cached in the 
> local repository, resolution will not be reattempted until the update 
> interval of sonatype-nexus-snapshots has elapsed or updates are forced. 
> Original error: Could not transfer metadata 
> org.glassfish:javax.el:3.0.1-b07-SNAPSHOT/maven-metadata.xml from/to 
> sonatype-nexus-snapshots 
> (https://oss.sonatype.org/content/repositories/snapshots): Failed to transfer 
> file: 
> https://oss.sonatype.org/content/repositories/snapshots/org/glassfish/javax.el/3.0.1-b07-SNAPSHOT/maven-metadata.xml.
>  Return code is: 502 , ReasonPhrase:Bad Gateway.

[jira] [Commented] (IMPALA-7115) Set a default THREAD_RESERVATION_LIMIT value

2018-06-01 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498680#comment-16498680
 ] 

Philip Zeyliger commented on IMPALA-7115:
-

Though it's hard to enumerate all the possibilities, logging and comparing the
limits we know about (like {{/proc/self/limits}}), especially on centos6/7,
seems quite useful.
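Even something this small (a sketch, not a proposed patch) would capture the
numbers worth comparing:
{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Log the limits this process actually sees; "Max processes" is the row
// most relevant to thread reservations.
public class DumpLimits {
  public static void main(String[] args) throws IOException {
    Files.lines(Paths.get("/proc/self/limits")).forEach(System.out::println);
  }
}
{code}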

> Set a default THREAD_RESERVATION_LIMIT value
> 
>
> Key: IMPALA-7115
> URL: https://issues.apache.org/jira/browse/IMPALA-7115
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: resource-management
>
> As a follow on to IMPALA-6035, we should set a default value that actually 
> will help protect again insanely complex queries.
> Motivating discussion is here: 
> https://gerrit.cloudera.org/#/c/10365/9/common/thrift/ImpalaInternalService.thrift
> {quote}
> Tim Armstrong
> 1:11 PM
> Dan suggested setting a default here. I started doing some experiments to see 
> what our current practical limits are.
> On stock Ubuntu 16.04 I start getting thread_resource_error at around 8000 
> reserved threads. I'm not sure that the config reflects what people would use 
> on production systems so continuing to investigate.
> Dan Hecht
> 1:31 PM
> We could also consider choosing a default dynamically based on the OS's 
> setting, if that's necessary.
> Tim Armstrong
> 3:45 PM
> I increased some of the configs (I think I was limited by 
> /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.max == 12288) and now it 
> got oom-killed at ~26000 threads.
> I think unfortunately there are a lot of different OS knobs that impact this 
> and they seem to evolve over time, so it's probably not feasible with a 
> reasonable amount of effort to get it working on all common Linux distros.
> I was thinking ~5000, since 1000-2000 plan nodes is the most I've seen for a 
> query running successfully in production.
> Maybe I should do this in a follow-on change, since we probably also want to 
> add a test query at or near this limit.
> {quote}






[jira] [Commented] (IMPALA-6990) TestClientSsl.test_tls_v12 failing due to Python SSL error

2018-05-29 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494139#comment-16494139
 ] 

Philip Zeyliger commented on IMPALA-6990:
-

Is this user-visible? Let's say that a user had impala-shell working on RH6 or 
RH7 before. Does it still work? Does it work when using the same 
{{ssl-minimum-version}} and {{ssl-cipher-list}} flags?

I think this test is saying that these flags don't work for the Python shipped 
in RH7. I suspect they didn't work before either: did they somehow work before? 
Surely before the Thrift change, we were using the same RH image?

Once we've figured this out, I think the easier thing to do is to disable the 
test when using a too-old version of Python. We already have a "skip if legacy 
SSL" flag on the test; this is just one more skip if. We still want to run the 
test for Ubuntu16 or whatever. I think we can assume that the Python running 
the test and the python running impala-shell are the same for our purposes.

Is there a weaker test that we'd want to add?

> TestClientSsl.test_tls_v12 failing due to Python SSL error
> --
>
> Key: IMPALA-6990
> URL: https://issues.apache.org/jira/browse/IMPALA-6990
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.0
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Blocker
>  Labels: broken-build, flaky
>
> We've seen quite a few jobs fail with the following error:
> *_ssl.c:504: EOF occurred in violation of protocol*
> {code:java}
> custom_cluster/test_client_ssl.py:128: in test_tls_v12
> self._validate_positive_cases("%s/server-cert.pem" % self.CERT_DIR)
> custom_cluster/test_client_ssl.py:181: in _validate_positive_cases
> result = run_impala_shell_cmd(shell_options)
> shell/util.py:97: in run_impala_shell_cmd
> result.stderr)
> E   AssertionError: Cmd --ssl -q 'select 1 + 2' was expected to succeed: 
> Starting Impala Shell without Kerberos authentication
> E   SSL is enabled. Impala server certificates will NOT be verified (set 
> --ca_cert to change)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 3th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 4th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80:
>  DeprecationWarning: 5th positional argument is deprecated. Use keyward 
> argument insteand.
> E DeprecationWarning)
> E   
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:216:
>  DeprecationWarning: validate is deprecated. Use cert_reqs=ssl.CERT_NONE 
> instead
> E DeprecationWarning)
> E   No handlers could be found for logger "thrift.transport.TSSLSocket"
> E   Error connecting: TTransportException, Could not connect to 
> localhost:21000: [Errno 8] _ssl.c:504: EOF occurred in violation of protocol
> E   Not connected to Impala, could not execute queries.
> {code}
> We need to investigate why this is happening and fix it.






[jira] [Created] (IMPALA-7091) Occasional errors with failure.test_failpoints.TestFailpoints.test_failpoints in hbase tests

2018-05-29 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-7091:
---

 Summary: Occasional errors with 
failure.test_failpoints.TestFailpoints.test_failpoints in hbase tests
 Key: IMPALA-7091
 URL: https://issues.apache.org/jira/browse/IMPALA-7091
 Project: IMPALA
  Issue Type: Task
  Components: Frontend
Reporter: Philip Zeyliger
Assignee: Philip Zeyliger


When running the following test with "test-with-docker", I sometimes (but not 
always) see it fail.
{code:java}
failure.test_failpoints.TestFailpoints.test_failpoints[table_format: hbase/none 
| exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
0} | mt_dop: 4 | location: OPEN | action: MEM_LIMIT_EXCEEDED | query: select * 
from alltypessmall union all select * from alltypessmall]
{code}
 
The error I see is an NPE, and, correlating some logs, I think it's this:
{code}
26420:I0524 14:30:14.696190 12271 jni-util.cc:230] 
java.lang.NullPointerException
26421-  at 
org.apache.impala.catalog.HBaseTable.getRegionSize(HBaseTable.java:652)
26422-  at 
org.apache.impala.catalog.HBaseTable.getEstimatedRowStatsForRegion(HBaseTable.java:520)
26423-  at 
org.apache.impala.catalog.HBaseTable.getEstimatedRowStats(HBaseTable.java:605)
26424-  at 
org.apache.impala.planner.HBaseScanNode.computeStats(HBaseScanNode.java:203)
26425-  at org.apache.impala.planner.HBaseScanNode.init(HBaseScanNode.java:127)
26426-  at 
org.apache.impala.planner.SingleNodePlanner.createScanNode(SingleNodePlanner.java:1344)
26427-  at 
org.apache.impala.planner.SingleNodePlanner.createTableRefNode(SingleNodePlanner.java:1514)
26428-  at 
org.apache.impala.planner.SingleNodePlanner.createTableRefsPlan(SingleNodePlanner.java:776)
26429-  at 
org.apache.impala.planner.SingleNodePlanner.createSelectPlan(SingleNodePlanner.java:614)
26430-  at 
org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:257)
26431-  at 
org.apache.impala.planner.SingleNodePlanner.createUnionPlan(SingleNodePlanner.java:1563)
26432-  at 
org.apache.impala.planner.SingleNodePlanner.createUnionPlan(SingleNodePlanner.java:1630)
26433-  at 
org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:275)
26434-  at 
org.apache.impala.planner.SingleNodePlanner.createSingleNodePlan(SingleNodePlanner.java:147)
26435-  at org.apache.impala.planner.Planner.createPlan(Planner.java:101)
26436-  at 
org.apache.impala.planner.Planner.createParallelPlans(Planner.java:230)
26437-  at 
org.apache.impala.service.Frontend.createExecRequest(Frontend.java:938)
26438-  at 
org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1062)
26439-  at 
org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:156)
26440:I0524 14:30:14.796514 12271 status.cc:125] NullPointerException: null
26441-@  0x1891839  impala::Status::Status()
{code}

The test-with-docker setup starts HBase at run time, independently of data
load, in a way that our other tests don't, and I suspect HBase simply hasn't
finished loading the tables yet.

I have a change forthcoming to address this.






[jira] [Resolved] (IMPALA-7063) Miniprofile 2 compilation broken on trunk

2018-05-24 Thread Philip Zeyliger (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger resolved IMPALA-7063.
-
Resolution: Fixed
  Assignee: Philip Zeyliger

> Miniprofile 2 compilation broken on trunk
> -
>
> Key: IMPALA-7063
> URL: https://issues.apache.org/jira/browse/IMPALA-7063
> Project: IMPALA
>  Issue Type: Task
>  Components: Frontend
>Reporter: Philip Zeyliger
>Assignee: Philip Zeyliger
>Priority: Major
>
> The commit for IMPALA-7019 used {{FileStatus.isErasureCoded()}} which doesn't 
> exist in Hadoop 2.






[jira] [Commented] (IMPALA-6119) Inconsistent file metadata updates when multiple partitions point to the same path

2018-05-24 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16489375#comment-16489375
 ] 

Philip Zeyliger commented on IMPALA-6119:
-

Does it make sense to be allowing two partitions to have the same location? Is 
Impala's behavior consistent with Hive and Spark when this happens?

> Inconsistent file metadata updates when multiple partitions point to the same 
> path
> --
>
> Key: IMPALA-6119
> URL: https://issues.apache.org/jira/browse/IMPALA-6119
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
>Reporter: bharath v
>Assignee: Gabor Kaszab
>Priority: Critical
>  Labels: correctness, ramp-up
>
> Following steps can give inconsistent results.
> {noformat}
> // Create a partitioned table
> create table test(a int) partitioned by (b int);
> // Create two partitions b=1/b=2 mapped to the same HDFS location.
> insert into test partition(b=1) values (1);
> alter table test add partition (b=2) location 
> 'hdfs://localhost:20500/test-warehouse/test/b=1/' 
> [localhost:21000] > show partitions test;
> Query: show partitions test
> +---+---++--+--+---++---++
> | b | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | 
> Incremental stats | Location   |
> +---+---++--+--+---++---++
> | 1 | -1| 1  | 2B   | NOT CACHED   | NOT CACHED| TEXT   | 
> false | hdfs://localhost:20500/test-warehouse/test/b=1 |
> | 2 | -1| 1  | 2B   | NOT CACHED   | NOT CACHED| TEXT   | 
> false | hdfs://localhost:20500/test-warehouse/test/b=1 |
> | Total | -1| 2  | 4B   | 0B   |   || 
>   ||
> +---+---++--+--+---++---++
> // Insert new data into one of the partitions
> insert into test partition(b=1) values (2);
> // Newly added file is reflected only in the added partition files. 
> show files in test;
> Query: show files in test
> ++--+---+
> | Path
>| Size | Partition |
> ++--+---+
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=2   |
> ++--+---+
> invalidate metadata test;
>  show files in test;
> // After invalidation, the newly added file now shows up in both the 
> partitions.
> Query: show files in test
> ++--+---+
> | Path
>| Size | Partition |
> ++--+---+
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=1   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/2e44cd49e8c3d30d-572fc978_627280230_data.0.
>  | 2B   | b=2   |
> | 
> hdfs://localhost:20500/test-warehouse/test/b=1/e44245ad5c0ef020-a08716d_1244237483_data.0.
>  | 2B   | b=2   |
> ++--+---+
> {noformat}
> So, depending whether the user invalidates the table, they can see different 
> results. The bug is in the following code.
> {noformat}
> private FileMetadataLoadStats resetAndLoadFileMetadata(
>   Path partD

[jira] [Created] (IMPALA-7063) Miniprofile 2 compilation broken on trunk

2018-05-23 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-7063:
---

 Summary: Miniprofile 2 compilation broken on trunk
 Key: IMPALA-7063
 URL: https://issues.apache.org/jira/browse/IMPALA-7063
 Project: IMPALA
  Issue Type: Task
  Components: Frontend
Reporter: Philip Zeyliger


The commit for IMPALA-7019 used {{FileStatus.isErasureCoded()}} which doesn't 
exist in Hadoop 2.






[jira] [Resolved] (IMPALA-7051) Concurrent Maven invocations can break build

2018-05-21 Thread Philip Zeyliger (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger resolved IMPALA-7051.
-
Resolution: Fixed

> Concurrent Maven invocations can break build
> 
>
> Key: IMPALA-7051
> URL: https://issues.apache.org/jira/browse/IMPALA-7051
> Project: IMPALA
>  Issue Type: Task
>  Components: Infrastructure
>Reporter: Philip Zeyliger
>Assignee: Philip Zeyliger
>Priority: Major
>
> Rarely I've seen our build fail when executing two Maven targets 
> simultaneously. Maven isn't really safe for concurrent execution (e.g., 
> ~/.m2/repository has no locking).






[jira] [Commented] (IMPALA-7054) "Top-25 tables with highest memory requirements" sorts incorrectly

2018-05-21 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16482829#comment-16482829
 ] 

Philip Zeyliger commented on IMPALA-7054:
-

Is this a dupe of:
{code:java}

commit ea4715fd76d6dba0c3777146989c2bf020efabdd
Author: stiga-huang 
Date: Thu May 3 06:44:42 2018 -0700

IMPALA-6966: sort table memory by size in catalogd web UI

This patch fix the sorting order in "Top-K Tables with Highest
Memory Requirements" in which "Estimated memory" column is sorted
as strings.

Values got from the catalog-server are changed from pretty-printed
strings to bytes numbers. So the web UI is able to sort and render
them correctly.

Change-Id: I60dc253f862f5fde6fa96147f114d8765bb31a85
Reviewed-on: http://gerrit.cloudera.org:8080/10292
Reviewed-by: Dimitris Tsirogiannis 
Tested-by: Impala Public Jenkins
{code}
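For context, the underlying bug is plain lexicographic comparison of the
pretty-printed sizes:
{code:java}
import java.util.Arrays;

// String sort puts "2.07 GB" before "23.65 MB" because '.' < '3',
// even though 2.07 GB is the larger value.
public class StringSort {
  public static void main(String[] args) {
    String[] sizes = {"23.65 MB", "2.07 GB"};
    Arrays.sort(sizes);
    System.out.println(Arrays.toString(sizes)); // [2.07 GB, 23.65 MB]
  }
}
{code}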

> "Top-25 tables with highest memory requirements" sorts incorrectly
> --
>
> Key: IMPALA-7054
> URL: https://issues.apache.org/jira/browse/IMPALA-7054
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.12.0
>Reporter: Todd Lipcon
>Priority: Minor
>
> The table on catalogd:25020/catalog has an "estimated memory" column which 
> sorts based on the stringified value. For example, "2.07 GB" sorts below 
> "23.65 MB".






[jira] [Created] (IMPALA-7051) Concurrent Maven invocations can break build

2018-05-18 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-7051:
---

 Summary: Concurrent Maven invocations can break build
 Key: IMPALA-7051
 URL: https://issues.apache.org/jira/browse/IMPALA-7051
 Project: IMPALA
  Issue Type: Task
  Components: Infrastructure
Reporter: Philip Zeyliger
Assignee: Philip Zeyliger


Rarely I've seen our build fail when executing two Maven targets 
simultaneously. Maven isn't really safe for concurrent execution (e.g., 
~/.m2/repository has no locking).






[jira] [Resolved] (IMPALA-7035) Impala HDFS Encryption tests failing after OpenJDK update

2018-05-17 Thread Philip Zeyliger (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger resolved IMPALA-7035.
-
Resolution: Fixed
  Assignee: Philip Zeyliger

> Impala HDFS Encryption tests failing after OpenJDK update
> -
>
> Key: IMPALA-7035
> URL: https://issues.apache.org/jira/browse/IMPALA-7035
> Project: IMPALA
>  Issue Type: Task
>Reporter: Philip Zeyliger
>Assignee: Philip Zeyliger
>Priority: Major
>
> I have seen {{impala-py.test tests/metadata/test_hdfs_encryption.py}} fail 
> with the following error:
> {{E AssertionError: Error creating encryption zone: RemoteException: Can't 
> recover key for testkey1 from keystore 
> file:/home/impdev/Impala/testdata/cluster/cdh6/node-1/data/kms.keystore}}
> I believe what's going on is described in 
> https://issues.apache.org/jira/browse/HDFS-13494. In short, the JDK now 
> applies a special whitelist when deserializing keystore keys, as a result of 
> a security vulnerability.
> A workaround in the KMS init script to configure $HADOOP_OPTS seems to do the 
> trick.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7035) Impala HDFS Encryption tests failing after OpenJDK update

2018-05-15 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created IMPALA-7035:
---

 Summary: Impala HDFS Encryption tests failing after OpenJDK update
 Key: IMPALA-7035
 URL: https://issues.apache.org/jira/browse/IMPALA-7035
 Project: IMPALA
  Issue Type: Task
Reporter: Philip Zeyliger


I have seen {{impala-py.test tests/metadata/test_hdfs_encryption.py}} fail with 
the following error:

{{E AssertionError: Error creating encryption zone: RemoteException: Can't 
recover key for testkey1 from keystore 
file:/home/impdev/Impala/testdata/cluster/cdh6/node-1/data/kms.keystore}}

I believe what's going on is described in 
https://issues.apache.org/jira/browse/HDFS-13494. In short, the JDK now applies 
a special whitelist when deserializing keystore keys, as a result of a security 
vulnerability.

A workaround in the KMS init script to configure $HADOOP_OPTS seems to do the 
trick.
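
A sketch of what such a workaround could look like, assuming the 
jceks.key.serialFilter property described in HDFS-13494 and Hadoop's 
JavaKeyStoreProvider$KeyMetadata class; the exact filter value is illustrative, 
not the committed change:
{noformat}
# Hypothetical KMS init-script snippet: widen the JDK's JCEKS key serial
# filter so the keystore can deserialize Hadoop's key-metadata class.
export HADOOP_OPTS="${HADOOP_OPTS} -Djceks.key.serialFilter=java.lang.Enum;java.security.KeyRep;java.security.KeyRep\$Type;javax.crypto.spec.SecretKeySpec;org.apache.hadoop.crypto.key.JavaKeyStoreProvider\$KeyMetadata;!*"
{noformat}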

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7014) Disable stacktrace symbolisation by default

2018-05-11 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472424#comment-16472424
 ] 

Philip Zeyliger commented on IMPALA-7014:
-

Yep, fair enough.

> Disable stacktrace symbolisation by default
> ---
>
> Key: IMPALA-7014
> URL: https://issues.apache.org/jira/browse/IMPALA-7014
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Not Applicable
>Reporter: Tim Armstrong
>Assignee: Joe McDonnell
>Priority: Critical
>
> We got burned by the cost of producing stacktraces again with IMPALA-6996. I 
> did a quick investigation into this, based on the hypothesis that the 
> symbolisation was the expensive part, rather than getting the addresses. I 
> added a stopwatch to GetStackTrace() to measure the time in nanoseconds and 
> ran a test that produces a backtrace.
> The first experiment was 
> {noformat}
> $ start-impala-cluster.py --impalad_args='--symbolize_stacktrace=true' && 
> impala-py.test tests/query_test/test_scanners.py -k codec
> I0511 09:45:11.897944 30904 debug-util.cc:283] stacktrace time: 75175573
> I0511 09:45:11.897956 30904 status.cc:125] File 
> 'hdfs://localhost:20500/test-warehouse/test_bad_compression_codec_308108.db/bad_codec/bad_codec.parquet'
>  uses an unsupported compression: 5000 for column 'id'.
> @  0x18782ef  impala::Status::Status()
> @  0x2cbe96f  
> impala::ParquetMetadataUtils::ValidateRowGroupColumn()
> @  0x205f597  impala::BaseScalarColumnReader::Reset()
> @  0x1feebe6  impala::HdfsParquetScanner::InitScalarColumns()
> @  0x1fe6ff3  impala::HdfsParquetScanner::NextRowGroup()
> @  0x1fe58d8  impala::HdfsParquetScanner::GetNextInternal()
> @  0x1fe3eea  impala::HdfsParquetScanner::ProcessSplit()
> @  0x1f6ba36  impala::HdfsScanNode::ProcessSplit()
> @  0x1f6adc4  impala::HdfsScanNode::ScannerThread()
> @  0x1f6a1c4  
> _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
> @  0x1f6c2a6  
> _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
> @  0x1bd3b1a  boost::function0<>::operator()()
> @  0x1ebecd5  impala::Thread::SuperviseThread()
> @  0x1ec6e71  boost::_bi::list5<>::operator()<>()
> @  0x1ec6d95  boost::_bi::bind_t<>::operator()()
> @  0x1ec6d58  boost::detail::thread_data<>::run()
> @  0x31b3ada  thread_proxy
> @ 0x7f9be67d36ba  start_thread
> @ 0x7f9be650941d  clone
> {noformat}
> The stacktrace took 75ms, which is pretty bad! It would be worse on a 
> production system with more memory maps.
> The next experiment was to disable it:
> {noformat}
> start-impala-cluster.py --impalad_args='--symbolize_stacktrace=false' && 
> impala-py.test tests/query_test/test_scanners.py -k codec
> I0511 09:43:47.574185 29514 debug-util.cc:283] stacktrace time: 29528
> I0511 09:43:47.574193 29514 status.cc:125] File 
> 'hdfs://localhost:20500/test-warehouse/test_bad_compression_codec_cb5d0225.db/bad_codec/bad_codec.parquet'
>  uses an unsupported compression: 5000 for column 'id'.
> @  0x18782ef
> @  0x2cbe96f
> @  0x205f597
> @  0x1feebe6
> @  0x1fe6ff3
> @  0x1fe58d8
> @  0x1fe3eea
> @  0x1f6ba36
> @  0x1f6adc4
> @  0x1f6a1c4
> @  0x1f6c2a6
> @  0x1bd3b1a
> @  0x1ebecd5
> @  0x1ec6e71
> @  0x1ec6d95
> @  0x1ec6d58
> @  0x31b3ada
> @ 0x7fbdcbdef6ba
> @ 0x7fbdcbb2541d
> {noformat}
> That's 2545x faster! If the addresses are in the statically linked binary, we 
> can use addr2line to get back the line numbers:
> {noformat}
> $ addr2line -e be/build/latest/service/impalad 0x2cbe96f
> /home/tarmstrong/Impala/incubator-impala/be/src/exec/parquet-metadata-utils.cc:166
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7014) Disable stacktrace symbolisation by default

2018-05-11 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472292#comment-16472292
 ] 

Philip Zeyliger commented on IMPALA-7014:
-

There's a supportability trade-off in that these log lines will now be much 
harder to look at: we'll always need the exact build to decipher a log file. 
With some effort, we could cache ~10,000 addresses and probably get most 
of the benefit.

The other approach is to avoid printing similar log lines more than once per N 
minutes.

Anyway, I'm open to turning it off, and that's clearly pretty safe, but we 
should keep an eye out to see if we regret it.
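
For what it's worth, the deciphering step can be scripted offline against the 
matching binary. A rough sketch, assuming GNU binutils and the unsymbolised 
"@ 0x..." frame format from the description:
{noformat}
# Hypothetical helper: pull the raw frame addresses out of a log and
# symbolise them with the exact impalad binary that produced them.
grep -oE '0x[0-9a-f]+' impalad.INFO \
  | addr2line -f -C -e be/build/latest/service/impalad
{noformat}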

> Disable stacktrace symbolisation by default
> ---
>
> Key: IMPALA-7014
> URL: https://issues.apache.org/jira/browse/IMPALA-7014
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Not Applicable
>Reporter: Tim Armstrong
>Assignee: Joe McDonnell
>Priority: Critical
>
> We got burned by the cost of producing stacktraces again with IMPALA-6996. I 
> did a quick investigation into this, based on the hypothesis that the 
> symbolisation was the expensive part, rather than getting the addresses. I 
> added a stopwatch to GetStackTrace() to measure the time in nanoseconds and 
> ran a test that produces a backtrace.
> The first experiment was 
> {noformat}
> $ start-impala-cluster.py --impalad_args='--symbolize_stacktrace=true' && 
> impala-py.test tests/query_test/test_scanners.py -k codec
> I0511 09:45:11.897944 30904 debug-util.cc:283] stacktrace time: 75175573
> I0511 09:45:11.897956 30904 status.cc:125] File 
> 'hdfs://localhost:20500/test-warehouse/test_bad_compression_codec_308108.db/bad_codec/bad_codec.parquet'
>  uses an unsupported compression: 5000 for column 'id'.
> @  0x18782ef  impala::Status::Status()
> @  0x2cbe96f  
> impala::ParquetMetadataUtils::ValidateRowGroupColumn()
> @  0x205f597  impala::BaseScalarColumnReader::Reset()
> @  0x1feebe6  impala::HdfsParquetScanner::InitScalarColumns()
> @  0x1fe6ff3  impala::HdfsParquetScanner::NextRowGroup()
> @  0x1fe58d8  impala::HdfsParquetScanner::GetNextInternal()
> @  0x1fe3eea  impala::HdfsParquetScanner::ProcessSplit()
> @  0x1f6ba36  impala::HdfsScanNode::ProcessSplit()
> @  0x1f6adc4  impala::HdfsScanNode::ScannerThread()
> @  0x1f6a1c4  
> _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
> @  0x1f6c2a6  
> _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
> @  0x1bd3b1a  boost::function0<>::operator()()
> @  0x1ebecd5  impala::Thread::SuperviseThread()
> @  0x1ec6e71  boost::_bi::list5<>::operator()<>()
> @  0x1ec6d95  boost::_bi::bind_t<>::operator()()
> @  0x1ec6d58  boost::detail::thread_data<>::run()
> @  0x31b3ada  thread_proxy
> @ 0x7f9be67d36ba  start_thread
> @ 0x7f9be650941d  clone
> {noformat}
> The stacktrace took 75ms, which is pretty bad! It would be worse on a 
> production system with more memory maps.
> The next experiment was to disable it:
> {noformat}
> start-impala-cluster.py --impalad_args='--symbolize_stacktrace=false' && 
> impala-py.test tests/query_test/test_scanners.py -k codec
> I0511 09:43:47.574185 29514 debug-util.cc:283] stacktrace time: 29528
> I0511 09:43:47.574193 29514 status.cc:125] File 
> 'hdfs://localhost:20500/test-warehouse/test_bad_compression_codec_cb5d0225.db/bad_codec/bad_codec.parquet'
>  uses an unsupported compression: 5000 for column 'id'.
> @  0x18782ef
> @  0x2cbe96f
> @  0x205f597
> @  0x1feebe6
> @  0x1fe6ff3
> @  0x1fe58d8
> @  0x1fe3eea
> @  0x1f6ba36
> @  0x1f6adc4
> @  0x1f6a1c4
> @  0x1f6c2a6
> @  0x1bd3b1a
> @  0x1ebecd5
> @  0x1ec6e71
> @  0x1ec6d95
> @  0x1ec6d58
> @  0x31b3ada
> @ 0x7fbdcbdef6ba
> @ 0x7fbdcbb2541d
> {noformat}
> That's 2545x faster! If the addresses are in the statically linked binary, we 
> can use addr2line to get back the line numbers:
> {noformat}
> $ addr2line -e be/build/latest/service/impalad 0x2cbe96f
> /home/tarmstrong/Impala/incubator-impala/be/src/exec/parquet-metadata-utils.cc:166
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
