[jira] [Created] (IMPALA-7188) Consider changing the output of AVG() to decimal

2018-06-19 Thread Taras Bobrovytsky (JIRA)
Taras Bobrovytsky created IMPALA-7188:
-

 Summary: Consider changing the output of AVG() to decimal
 Key: IMPALA-7188
 URL: https://issues.apache.org/jira/browse/IMPALA-7188
 Project: IMPALA
  Issue Type: Task
  Components: Frontend
Affects Versions: Impala 3.0
Reporter: Taras Bobrovytsky


Currently AVG() returns a DOUBLE (under both Decimal V1 and V2). It 
might be a good idea to change the return type to a DECIMAL instead.
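
As a rough illustration of why the return type matters, the following Python sketch (not Impala code; the sample values and function names are made up for this example) compares an average computed in binary floating point, analogous to DOUBLE, with one computed in exact decimal arithmetic, analogous to DECIMAL:

```python
from decimal import Decimal


def avg_double(values):
    """Average using binary floating point, analogous to a DOUBLE result."""
    return sum(float(v) for v in values) / len(values)


def avg_decimal(values):
    """Average using exact decimal arithmetic, analogous to a DECIMAL result."""
    return sum(values, Decimal(0)) / len(values)


# Hypothetical column values, chosen because 0.1 + 0.2 is inexact in binary.
values = [Decimal("0.1"), Decimal("0.2")]
double_result = avg_double(values)
decimal_result = avg_decimal(values)
```

The decimal average is exactly 0.15, while the floating-point average carries a small rounding error. Whether a DECIMAL result is preferable in Impala also depends on precision and scale rules that this sketch does not model.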



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Work started] (IMPALA-7187) Skip test_group_impersonation when running inside Docker

2018-06-19 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7187 started by Fredy Wijaya.

> Skip test_group_impersonation when running inside Docker
> 
>
> Key: IMPALA-7187
> URL: https://issues.apache.org/jira/browse/IMPALA-7187
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Major
>  Labels: broken-build
>
> test_group_impersonation runs fine inside a standard Docker container, but it 
> fails when running with test-with-docker.py. There may be something a bit 
> special with the way test-with-docker.py runs.






[jira] [Created] (IMPALA-7187) Skip test_group_impersonation when running inside Docker

2018-06-19 Thread Fredy Wijaya (JIRA)
Fredy Wijaya created IMPALA-7187:


 Summary: Skip test_group_impersonation when running inside Docker
 Key: IMPALA-7187
 URL: https://issues.apache.org/jira/browse/IMPALA-7187
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 2.12.0, Impala 3.0
Reporter: Fredy Wijaya
Assignee: Fredy Wijaya


test_group_impersonation runs fine inside a standard Docker container, but it 
fails when running with test-with-docker.py. There may be something a bit 
special with the way test-with-docker.py runs.






[jira] [Resolved] (IMPALA-5335) Review consistency of ADLS python client used for Impala testing

2018-06-19 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil resolved IMPALA-5335.
---
   Resolution: Fixed
Fix Version/s: Impala 2.12.0

The ADLS client has been fixed upstream so this is not an issue anymore. See 
HADOOP-14450.

> Review consistency of ADLS python client used for Impala testing
> 
>
> Key: IMPALA-5335
> URL: https://issues.apache.org/jira/browse/IMPALA-5335
> Project: IMPALA
>  Issue Type: Task
>  Components: Infrastructure
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Major
>  Labels: infrastructure
> Fix For: Impala 2.12.0
>
>
> The ADLS Python client seems to have consistency issues even though ADLS 
> claims to be strongly consistent.
> Some of our tests are skipped because of this issue, with the tag 
> SkipIfADLS.slow_client.
> The documentation for the Python client doesn't seem to state or address this 
> as a known issue. It is, however, a pre-release client.
> This JIRA is meant to track this issue on the Impala side, and close it once 
> it's addressed by ADLS.






[jira] [Updated] (IMPALA-7186) Docs for kudu_read_mode

2018-06-19 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-7186:

Issue Type: Task  (was: Improvement)

> Docs for kudu_read_mode
> ---
>
> Key: IMPALA-7186
> URL: https://issues.apache.org/jira/browse/IMPALA-7186
> Project: IMPALA
>  Issue Type: Task
>  Components: Docs
>Affects Versions: Impala 3.1.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: docs, future_release_doc
>
> IMPALA-6812 added a new query option, KUDU_READ_MODE, which should be 
> documented with something like:
> KUDU_READ_MODE Query Option
> This query option allows users to set a desired consistency level for scans 
> of Kudu tables. Possible values are DEFAULT, READ_LATEST, and 
> READ_AT_SNAPSHOT. If DEFAULT is specified, the value of the startup flag 
> '--kudu_read_mode' will be used.
> READ_LATEST
> Kudu provides no consistency guarantees for this mode, except that all 
> returned rows were committed at some point, sometimes known as 'Read 
> Committed' isolation.
> READ_AT_SNAPSHOT
> Kudu will take a snapshot of the current state of the data and perform the 
> scan over the snapshot, possibly after briefly waiting for ongoing writes to 
> complete. This provides "Read Your Writes" consistency within a single Impala 
> session, except in the case of a Kudu leader change. See the Kudu 
> documentation for more details.
> Type: string
> Default: DEFAULT
> Added in: Impala 3.1






[jira] [Assigned] (IMPALA-7186) Docs for kudu_read_mode

2018-06-19 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni reassigned IMPALA-7186:
---

Assignee: Alex Rodoni

> Docs for kudu_read_mode
> ---
>
> Key: IMPALA-7186
> URL: https://issues.apache.org/jira/browse/IMPALA-7186
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Docs
>Affects Versions: Impala 3.1.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: docs
>
> IMPALA-6812 added a new query option, KUDU_READ_MODE, which should be 
> documented with something like:
> KUDU_READ_MODE Query Option
> This query option allows users to set a desired consistency level for scans 
> of Kudu tables. Possible values are DEFAULT, READ_LATEST, and 
> READ_AT_SNAPSHOT. If DEFAULT is specified, the value of the startup flag 
> '--kudu_read_mode' will be used.
> READ_LATEST
> Kudu provides no consistency guarantees for this mode, except that all 
> returned rows were committed at some point, sometimes known as 'Read 
> Committed' isolation.
> READ_AT_SNAPSHOT
> Kudu will take a snapshot of the current state of the data and perform the 
> scan over the snapshot, possibly after briefly waiting for ongoing writes to 
> complete. This provides "Read Your Writes" consistency within a single Impala 
> session, except in the case of a Kudu leader change. See the Kudu 
> documentation for more details.
> Type: string
> Default: DEFAULT
> Added in: Impala 3.1






[jira] [Created] (IMPALA-7186) Docs for kudu_read_mode

2018-06-19 Thread Thomas Tauber-Marshall (JIRA)
Thomas Tauber-Marshall created IMPALA-7186:
--

 Summary: Docs for kudu_read_mode
 Key: IMPALA-7186
 URL: https://issues.apache.org/jira/browse/IMPALA-7186
 Project: IMPALA
  Issue Type: Improvement
  Components: Docs
Affects Versions: Impala 3.1.0
Reporter: Thomas Tauber-Marshall


IMPALA-6812 added a new query option, KUDU_READ_MODE, which should be 
documented with something like:

KUDU_READ_MODE Query Option

This query option allows users to set a desired consistency level for scans of 
Kudu tables. Possible values are DEFAULT, READ_LATEST, and READ_AT_SNAPSHOT. If 
DEFAULT is specified, the value of the startup flag '--kudu_read_mode' will be 
used.

READ_LATEST
Kudu provides no consistency guarantees for this mode, except that all returned 
rows were committed at some point, sometimes known as 'Read Committed' 
isolation.

READ_AT_SNAPSHOT
Kudu will take a snapshot of the current state of the data and perform the scan 
over the snapshot, possibly after briefly waiting for ongoing writes to 
complete. This provides "Read Your Writes" consistency within a single Impala 
session, except in the case of a Kudu leader change. See the Kudu documentation 
for more details.

Type: string
Default: DEFAULT
Added in: Impala 3.1
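
The fallback rule described above (DEFAULT defers to the '--kudu_read_mode' startup flag) can be sketched in Python as follows. This is only an illustration of the documented semantics, not Impala's implementation; the helper name and the case-insensitive matching are assumptions.

```python
VALID_READ_MODES = {"DEFAULT", "READ_LATEST", "READ_AT_SNAPSHOT"}


def resolve_kudu_read_mode(query_option, startup_flag):
    """Return the effective read mode for a scan (hypothetical helper).

    query_option: value of the KUDU_READ_MODE query option.
    startup_flag: value of the --kudu_read_mode startup flag.
    """
    option = query_option.upper()  # assumption: values are case-insensitive
    if option not in VALID_READ_MODES:
        raise ValueError("invalid KUDU_READ_MODE: %s" % query_option)
    if option == "DEFAULT":
        # DEFAULT means "use whatever the startup flag says".
        return startup_flag.upper()
    return option


effective = resolve_kudu_read_mode("DEFAULT", "READ_LATEST")
```

An explicit READ_LATEST or READ_AT_SNAPSHOT on the query takes precedence over the startup flag in this sketch.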






[jira] [Assigned] (IMPALA-3825) Distribute runtime filter aggregation across cluster

2018-06-19 Thread Sailesh Mukil (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sailesh Mukil reassigned IMPALA-3825:
-

Assignee: Rahul Shivu Mahadev  (was: Sailesh Mukil)

> Distribute runtime filter aggregation across cluster
> 
>
> Key: IMPALA-3825
> URL: https://issues.apache.org/jira/browse/IMPALA-3825
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 2.6.0
>Reporter: Henry Robinson
>Assignee: Rahul Shivu Mahadev
>Priority: Major
>  Labels: runtime-filters
>
> Runtime filters can be tens of MB or more, and incasting all filters from all 
> shuffle joins to the coordinator can put a lot of memory pressure on that 
> node. To alleviate this we should consider spreading out the aggregation 
> operation across the cluster, so that a different node aggregates each 
> runtime filter.
> This still restricts aggregation to #runtime-filters nodes, which will 
> usually be less than the cluster size. If we want to smooth that out further 
> we could use tree-based aggregation, but let's measure the benefits of simply 
> distributing the aggregation work first.
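
One way to picture the proposal is a round-robin assignment of filters to nodes. The Python sketch below is purely illustrative; the function name, the node names, and the round-robin policy are assumptions, not something the issue prescribes.

```python
def assign_aggregators(filter_ids, nodes):
    """Map each runtime filter ID to the node that aggregates it.

    Filters are spread round-robin, so no single node (such as the
    coordinator) has to receive every filter.
    """
    if not nodes:
        raise ValueError("need at least one node")
    ordered = sorted(filter_ids)
    return {fid: nodes[i % len(nodes)] for i, fid in enumerate(ordered)}


# Hypothetical cluster of four nodes and three runtime filters.
assignment = assign_aggregators([0, 1, 2], ["node1", "node2", "node3", "node4"])
```

With three filters and four nodes, only three nodes do aggregation work, which is the "#runtime-filters nodes" limit mentioned in the description.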






[jira] [Commented] (IMPALA-7181) Fix flaky test shell/test_shell_commandline.py::TestImpalaShell::test_socket_opening

2018-06-19 Thread Vincent Tran (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517553#comment-16517553
 ] 

Vincent Tran commented on IMPALA-7181:
--

IMPALA-7181: Fix flaky test 
shell/test_shell_commandline.py::TestImpalaShell::test_socket_opening

test_shell_commandline.py::TestImpalaShell::test_socket_opening
uses netcat to listen to an ephemeral port to verify the expected
socket opening behavior of impala-shell.

This port number is fixed to 42000. When this port happens to
be used by another outbound socket, this test will fail.

This change refactors the test to use socket.bind(). The port used
in this test is no longer fixed and will be picked automatically.
This change also adds the proper cleanup logic to the various
subprocess.Popen objects used in the test.

Change-Id: Idd64632ded936d49fc404bcac75588dd7886be44
Reviewed-on: http://gerrit.cloudera.org:8080/10747
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
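
The general technique the commit message describes, letting the operating system pick a free port instead of hardcoding one, looks roughly like this in Python. This is a minimal sketch, not the code from the actual change; the helper name is made up.

```python
import socket


def bind_ephemeral_port(host="127.0.0.1"):
    """Bind a listening TCP socket to an OS-assigned port.

    Passing port 0 to bind() asks the kernel for any free port, which
    avoids collisions with a fixed number such as 42000.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))
    sock.listen(1)
    port = sock.getsockname()[1]  # the port the kernel actually chose
    return sock, port


sock, port = bind_ephemeral_port()
try:
    chosen_port = port
finally:
    sock.close()  # always release the socket, mirroring the cleanup fix
```

Keeping the socket open until the client connects avoids a window in which another process could take the port.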

> Fix flaky test 
> shell/test_shell_commandline.py::TestImpalaShell::test_socket_opening 
> -
>
> Key: IMPALA-7181
> URL: https://issues.apache.org/jira/browse/IMPALA-7181
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Vincent Tran
>Assignee: Vincent Tran
>Priority: Blocker
>  Labels: broken-build
>
> shell/test_shell_commandline.py::TestImpalaShell::test_socket_opening uses 
> netcat to listen to an ephemeral port to verify impala-shell socket opening 
> behavior.
> The port is hardcoded to 42000 which can fail the test if this port is used 
> by an outgoing socket.






[jira] [Resolved] (IMPALA-6873) Crash in Expr::GetConstVal() due to NULL dereference

2018-06-19 Thread bharath v (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bharath v resolved IMPALA-6873.
---
Resolution: Fixed

[~tarmstrong] Yes.

Fixed via: 
https://github.com/apache/impala/commit/b38d9826d7ef9bc0ecff548626d30690f935e9c3#diff-7140ed1301fa7a470056719186b1d646

> Crash in Expr::GetConstVal() due to NULL dereference
> 
>
> Key: IMPALA-6873
> URL: https://issues.apache.org/jira/browse/IMPALA-6873
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.8.0, Impala 2.9.0
>Reporter: bharath v
>Priority: Blocker
>  Labels: crash
> Fix For: Impala 2.10.0
>
>
> Log file crashing frame
> {noformat}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00357f88980b, pid=564763, tid=0x7f7b0386c700
> #
> # JRE version: Java(TM) SE Runtime Environment (8.0_162-b12) (build 
> 1.8.0_162-b12)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.162-b12 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [libc.so.6+0x8980b]  memcpy+0x15b
> {noformat}
> Crashing stack, extracted from core dump
> {noformat}
> #10 0x7f4d8eaadbe7 in os::print_location(outputStream*, long, bool) () 
> from /root/usr/java/latest/jre/lib/amd64/server/libjvm.so
> #11 0x7f4d8eabcaf5 in os::print_register_info(outputStream*, void*) () 
> from /root/usr/java/latest/jre/lib/amd64/server/libjvm.so
> #12 0x7f4d8ec595a3 in VMError::report(outputStream*) () from 
> /root/usr/java/latest/jre/lib/amd64/server/libjvm.so
> #13 0x7f4d8ec5ab2a in VMError::report_and_die() () from 
> /root/usr/java/latest/jre/lib/amd64/server/libjvm.so
> #14 0x7f4d8eabd22f in JVM_handle_linux_signal () from 
> /root/usr/java/latest/jre/lib/amd64/server/libjvm.so
> #15 0x7f4d8eab3253 in signalHandler(int, siginfo*, void*) () from 
> /root/usr/java/latest/jre/lib/amd64/server/libjvm.so
> #16 
> #17 0x003b4d089750 in memcpy () from /lib64/libc.so.6
> #18 0x00845578 in impala::Expr::GetConstVal (this=0x7f430831f400, 
> state=0x7f4cdc91b750, context=0xe331540, const_val=Unhandled dwarf expression 
> opcode 0xf3
> ) at /usr/src/debug/impala-2.9.0-cdh5.12.2/be/src/exprs/expr.cc:577
> #19 0x008909b9 in impala::ScalarFnCall::Open (this=0x7f430831e600, 
> state=0x7f4cdc91b750, ctx=0xe331540, 
> scope=impala_udf::FunctionContext::FRAGMENT_LOCAL)
>     at 
> /usr/src/debug/impala-2.9.0-cdh5.12.2/be/src/exprs/scalar-fn-call.cc:189
> #20 0x0084af8c in impala::ExprContext::Open (this=Unhandled dwarf 
> expression opcode 0xf3
> ) at /usr/src/debug/impala-2.9.0-cdh5.12.2/be/src/exprs/expr-context.cc:70
> #21 0x00ab2a3f in 
> Java_org_apache_impala_service_FeSupport_NativeEvalExprsWithoutRow 
> (env=0xcca31f8, caller_class=Unhandled dwarf expression opcode 0xf3
> ) at /usr/src/debug/impala-2.9.0-cdh5.12.2/be/src/service/fe-support.cc:142
> #22 0x7f4d7b284dad in ?? ()
> #23 0x00059cabbe18 in ?? ()
> #24 0x00059cabfcd8 in ?? ()
> #25 0xb395702563a2136b in ?? ()
> #26 0x806394b0 in ?? ()
> #27 0xb39570120002 in ?? ()
> #28 0x00059cab8090 in ?? ()
> #29 0x802f3c08 in ?? ()
> #30 0x00059beef118 in ?? ()
> #31 0x7f4cdc91bf70 in ?? ()
> #32 0x7f4d7b28033c in ?? ()
> #33 0x00059cab8438 in ?? ()
> #34 0x8d567eb0 in ?? ()
> #35 0x00059cab8588 in ?? ()
> #36 0x00059cab8308 in ?? ()
> #37 0x00059cab85a0 in ?? ()
> #38 0x00059cab85d0 in ?? ()
> #39 0x001811aad009 in ?? ()
> #40 0x0008 in ?? ()
> {noformat}
>  
> Missing frames are from the JVM and are below (extracted from hs_err_pid file)
> {noformat}
> J 12167  
> org.apache.impala.service.FeSupport.NativeEvalExprsWithoutRow([B[B)[B (0 
> bytes) @ 0x7f7bad2e1cf3 [0x7f7bad2e1c80+0x73]
> J 12158 C1 
> org.apache.impala.service.FeSupport.EvalExprWithoutRow(Lorg/apache/impala/analysis/Expr;Lorg/apache/impala/thrift/TQueryCtx;)Lorg/apache/impala/thrift/TColumnValue;
>  (170 bytes) @ 0x7f7bad307bf4 [0x7f7bad305be0+0x2014]
> J 12206 C1 
> org.apache.impala.service.FeSupport.EvalPredicate(Lorg/apache/impala/analysis/Expr;Lorg/apache/impala/thrift/TQueryCtx;)Z
>  (60 bytes) @ 0x7f7bad32daac [0x7f7bad32d180+0x92c]
> J 12207 C1 
> org.apache.impala.analysis.Analyzer.isTrueWithNullSlots(Lorg/apache/impala/analysis/Expr;)Z
>  (137 bytes) @ 0x7f7bad331c54 [0x7f7bad32fe40+0x1e14]
> j  
> org.apache.impala.planner.HdfsScanNode.computeDictionaryFilterConjuncts(Lorg/apache/impala/analysis/Analyzer;)V+135
> j  
> org.apache.impala.planner.HdfsScanNode.init(Lorg/apache/impala/analysis/Analyzer;)V+22
> j  
> 

[jira] [Assigned] (IMPALA-3956) Impala shell variable substitution should ignore comments embedded in query.

2018-06-19 Thread Adam Holley (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Holley reassigned IMPALA-3956:
---

Assignee: Adam Holley

> Impala shell variable substitution should ignore comments embedded in query.
> 
>
> Key: IMPALA-3956
> URL: https://issues.apache.org/jira/browse/IMPALA-3956
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 2.6.0
>Reporter: Huaisi Xu
>Assignee: Adam Holley
>Priority: Minor
>  Labels: regression, usability
>
> {code:java}
> -- WHERE a >= "${start_date}"
> select
> *
> from --fda;
> test
> -- WHERE a >= "${start_date}"
> {code}
> {code:java}
> Starting Impala Shell without Kerberos authentication
> Connected to localhost:21000
> Server version: impalad version 2.7.0-cdh5-INTERNAL DEBUG (build 
> ebd65a142d3f3f4087eb1c9aaf25d53d2045a4cb)
> Error: Unknown substitution syntax (START_DATE). Use ${VAR:var_name}.
> Could not execute command: -- WHERE a >= "${start_date}"
> select
> *
> from --fda;
> test
> -- WHERE a >= "${start_date}"
> {code}
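
The requested behaviour can be sketched in Python as follows: substitute `${var:name}` references only in the part of each line that precedes a `--` comment. This is a simplified illustration, not impala-shell's code; the function name is hypothetical, and the sketch does not handle `--` inside string literals or `/* */` block comments.

```python
import re

# Matches the documented ${VAR:var_name} form, case-insensitively.
VAR_REF = re.compile(r"\$\{var:(\w+)\}", re.IGNORECASE)


def substitute_outside_comments(query, variables):
    """Replace ${var:name} references, leaving '--' comments untouched."""
    out = []
    for line in query.split("\n"):
        # Everything after the first '--' is treated as a comment.
        code, sep, comment = line.partition("--")
        code = VAR_REF.sub(lambda m: variables[m.group(1).lower()], code)
        out.append(code + sep + comment)
    return "\n".join(out)


query = '-- WHERE a >= "${start_date}"\nselect * from test where b = ${var:limit_b}'
result = substitute_outside_comments(query, {"limit_b": "1"})
```

The commented-out `${start_date}` is passed through unchanged instead of triggering an "Unknown substitution syntax" error.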






[jira] [Resolved] (IMPALA-6969) Profile doesn't include the reason that a query couldn't be dequeued from admission controller

2018-06-19 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-6969.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

IMPALA-6969: add AC last queued reason to profile

The reason is updated during initial admission and when the query is at
the head of the queue but can't be admitted. It is not updated while
the query is in the middle of the queue.

Together with the async admission change, this makes it possible to
determine from the profile why the query has not been admitted yet.

Testing:
Added admission control tests that check that the
string is set for queries queued based both on the
query count and the max memory.

Looped the tests overnight to confirm non-flakiness.

Change-Id: Ida9b75dc50dfb7a27f59deda91bad6ac838130a1
Reviewed-on: http://gerrit.cloudera.org:8080/10731
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
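
The update rule in the resolution note (set the reason at initial admission, refresh it only for the query at the head of the queue) can be pictured with this Python sketch. The class, field names, and reason strings are hypothetical and are not taken from the Impala source.

```python
from collections import deque


class QueuedQuery:
    """A queued query that remembers why it was queued."""

    def __init__(self, query_id, initial_reason):
        self.query_id = query_id
        self.initial_queued_reason = initial_reason
        # Starts out equal to the initial reason.
        self.last_queued_reason = initial_reason


def record_head_of_queue_reason(queue, reason):
    """Refresh the reason for the head of the queue only."""
    if queue:
        queue[0].last_queued_reason = reason


queue = deque([
    QueuedQuery("q1", "queue is not empty"),
    QueuedQuery("q2", "queue is not empty"),
])
record_head_of_queue_reason(queue, "not enough memory available")
```

Queries in the middle of the queue keep their initial reason until they reach the head.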

> Profile doesn't include the reason that a query couldn't be dequeued from 
> admission controller
> --
>
> Key: IMPALA-6969
> URL: https://issues.apache.org/jira/browse/IMPALA-6969
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: admission-control, observability
> Fix For: Impala 3.1.0
>
>
> I noticed this while playing around on a local minicluster with AC enabled.
> The admission controller adds the reason for initial queuing to the profile, 
> but does not expose why the query couldn't execute when it got to the head of 
> the line. E.g. if it was initially queued because the queue was non-empty but 
> then couldn't execute once it got to the head of the line because of memory.
> {noformat}
> Request Pool: root.queueA
> Admission result: Admitted (queued)
> Admission queue details: waited 1130 ms, reason: queue is not empty (size 
> 4); queued queries are executed first
> {noformat}
> We should still include the initial reason for queuing, but also include the 
> most recent reason for queuing once it got to the head of the line. It's probably 
> most useful to keep the profile updated with the latest reason at all times 
> (since the details can change while the query is at the head of the line).






[jira] [Closed] (IMPALA-7056) Changing Text Delimiter Does Not Work

2018-06-19 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni closed IMPALA-7056.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0
   Impala 2.13.0

> Changing Text Delimiter Does Not Work
> -
>
> Key: IMPALA-7056
> URL: https://issues.apache.org/jira/browse/IMPALA-7056
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog, Docs
>Affects Versions: Impala 2.12.0
>Reporter: Alan Jackoway
>Assignee: Alex Rodoni
>Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> The wording on 
> https://impala.apache.org/docs/build/html/topics/impala_alter_table.html 
> makes it seem like you can change the delimiter of text tables after they are 
> created.
> I did the following to simulate a table that needed to switch between comma 
> and pipe delimited:
> {code}
> hadoop fs -mkdir /user/alanj
> hadoop fs -mkdir /user/alanj/test_delim
> echo "A,B|C" > delim.txt
> hadoop fs -put delim.txt /user/alanj/test_delim
> {code}
> Then created in impala and tried to change delimiters:
> {code:sql}
> > create external table default.alanj_test_delim(A string, B string) ROW 
> > FORMAT DELIMITED FIELDS TERMINATED BY "," LOCATION '/user/alanj/test_delim';
> > select * from default.alanj_test_delim;
> Query: select * from default.alanj_test_delim
> +---+-----+
> | a | b   |
> +---+-----+
> | A | B|C |
> +---+-----+
> > alter table default.alanj_test_delim set SERDEPROPERTIES 
> > ('serialization.format'='|', 'field.delim'='|');
> > select * from default.alanj_test_delim;
> +---+-----+
> | a | b   |
> +---+-----+
> | A | B|C |
> +---+-----+
> > show create table default.alanj_test_delim;
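> +----------------------------------------------------------------------------------------------------------------------+
> | result                                                                                                               |
> +----------------------------------------------------------------------------------------------------------------------+
> | CREATE EXTERNAL TABLE default.alanj_test_delim (                                                                     |
> |   a STRING,                                                                                                          |
> |   b STRING                                                                                                           |
> | )                                                                                                                    |
> | ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'                                                                        |
> | WITH SERDEPROPERTIES ('field.delim'='|', 'serialization.format'='|')                                                 |
> | STORED AS TEXTFILE                                                                                                   |
> | LOCATION 'hdfs://namenode:8020/user/alanj/test_delim'                                                                |
> | TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='false', 'numFiles'='0', 'numRows'='-1', 'rawDataSize'='-1', 'totalSize'='0') |
> +----------------------------------------------------------------------------------------------------------------------+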
> {code}
> So it shows the right serdeproperties, but impala doesn't actually use them 
> to read the data.
> If you then insert data (as the docs suggest), it writes that data with the 
> new delimiter:
> {code:sql}
> > insert into default.alanj_test_delim values('D', 'E,F');
> > select * from alanj_test_delim;
> +-----+-----+
> | a   | b   |
> +-----+-----+
> | A,B | C   |
> | D   | E,F |
> +-----+-----+
> # hadoop fs -cat 
> /user/alanj/test_delim/a54bb0ec14646492-a7388114_1498283208_data.0.
> D|E,F
> {code}
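
The two query results in the report correspond to splitting the same raw line on different delimiters. The Python sketch below reproduces that with plain string operations; it is only an illustration, not how Impala's text scanner is implemented, and the function names are made up.

```python
def parse_text_row(line, delimiter):
    """Split one line of a delimited text file into column values."""
    return line.split(delimiter)


def format_text_row(values, delimiter):
    """Join column values into one line using the given delimiter."""
    return delimiter.join(values)


raw_line = "A,B|C"  # the line written to delim.txt in the report
as_comma = parse_text_row(raw_line, ",")
as_pipe = parse_text_row(raw_line, "|")
written = format_text_row(["D", "E,F"], "|")
```

Reading with "," yields the columns A and B|C, reading with "|" yields A,B and C, and a row written with "|" comes out as D|E,F, matching the file contents shown above.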






[jira] [Assigned] (IMPALA-5202) Debug action WAIT in PREPARE leads to hung query that cannot be cancelled.

2018-06-19 Thread Dan Hecht (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Hecht reassigned IMPALA-5202:
-

Assignee: Dan Hecht

> Debug action WAIT in PREPARE leads to hung query that cannot be cancelled.
> --
>
> Key: IMPALA-5202
> URL: https://issues.apache.org/jira/browse/IMPALA-5202
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend, Infrastructure
>Affects Versions: Impala 2.8.0
>Reporter: Alexander Behm
>Assignee: Dan Hecht
>Priority: Trivial
> Attachments: stacks.txt.gz
>
>
> I believe recent changes to coordination and distributed execution have 
> broken the WAIT debug action when called in some phases, e.g. PREPARE.
> The following repro leads to a hung query that cannot be cancelled. Impala's 
> WebUI hangs, so cannot cancel from there either.
> {code}
> set debug_action="0:PREPARE:WAIT";
> select 1 from functional.alltypes;
> {code}
> I tried WAIT in PREPARE with other simple queries, targeting other exec nodes 
> (e.g., top-n) with the same result.
> I am not sure why our test_failpoints.py or test_cancellation.py did not 
> catch this.
> Attached:
> I ran an experiment with a single impalad. I ran the above sequence and then 
> issued a ctrl+c from the impala shell to cancel the query. At that point, I 
> collected the stacks of all threads.
> Interesting stacks:
> {code}
> Thread 3 (Thread 0x7fda1238c700 (LWP 8872)):
> #0  0x7fda9b97183d in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
> #1  0x7fda9b9716dc in sleep () from /lib/x86_64-linux-gnu/libc.so.6
> #2  0x016ab891 in impala::ExecNode::ExecDebugAction (this=0xc965800, 
> phase=impala::TExecNodePhase::PREPARE, state=0xc965100) at 
> /home/abehm/impala/be/src/exec/exec-node.cc:430
> #3  0x016a8c4c in impala::ExecNode::Prepare (this=0xc965800, 
> state=0xc965100) at /home/abehm/impala/be/src/exec/exec-node.cc:148
> #4  0x017ee510 in impala::ScanNode::Prepare (this=0xc965800, 
> state=0xc965100) at /home/abehm/impala/be/src/exec/scan-node.cc:51
> #5  0x016dc14b in impala::HdfsScanNodeBase::Prepare (this=0xc965800, 
> state=0xc965100) at /home/abehm/impala/be/src/exec/hdfs-scan-node-base.cc:175
> #6  0x016d3516 in impala::HdfsScanNode::Prepare (this=0xc965800, 
> state=0xc965100) at /home/abehm/impala/be/src/exec/hdfs-scan-node.cc:167
> #7  0x01a716d1 in impala::PlanFragmentExecutor::PrepareInternal 
> (this=0xc9645d0, qs=0x9382800, tdesc_tbl=..., fragment_ctx=..., 
> instance_ctx=...) at 
> /home/abehm/impala/be/src/runtime/plan-fragment-executor.cc:215
> #8  0x01a6fd69 in impala::PlanFragmentExecutor::Prepare 
> (this=0xc9645d0, query_state=0x9382800, desc_tbl=..., fragment_ctx=..., 
> instance_ctx=...) at 
> /home/abehm/impala/be/src/runtime/plan-fragment-executor.cc:99
> #9  0x01a6cce5 in impala::FragmentInstanceState::Exec 
> (this=0xc964300) at 
> /home/abehm/impala/be/src/runtime/fragment-instance-state.cc:64
> #10 0x01a783d1 in impala::QueryExecMgr::ExecFInstance 
> (this=0xb870ba0, fis=0xc964300) at 
> /home/abehm/impala/be/src/runtime/query-exec-mgr.cc:110
> #11 0x01a7b1fa in boost::_mfi::mf1 impala::FragmentInstanceState*>::operator() (this=0xac8ce60, p=0xb870ba0, 
> a1=0xc964300) at 
> /home/abehm/impala/toolchain/boost-1.57.0-p1/include/boost/bind/mem_fn_template.hpp:165
> #12 0x01a7b083 in 
> boost::_bi::list2, 
> boost::_bi::value 
> >::operator() impala::FragmentInstanceState*>, boost::_bi::list0> (this=0xac8ce70, f=..., 
> a=...) at 
> /home/abehm/impala/toolchain/boost-1.57.0-p1/include/boost/bind/bind.hpp:313
> {code}
> {code}
> Thread 2 (Thread 0x7fda10b89700 (LWP 8874)):
> #0  0x7fda9bc7cd84 in pthread_cond_wait@@GLIBC_2.3.2 () from 
> /lib/x86_64-linux-gnu/libpthread.so.0
> #1  0x011c1f6d in boost::condition_variable::wait (this=0xc962be0, 
> m=...) at 
> /home/abehm/impala/toolchain/boost-1.57.0-p1/include/boost/thread/pthread/condition_variable.hpp:73
> #2  0x0133caf7 in impala::Promise::Get 
> (this=0xc962be0) at /home/abehm/impala/be/src/util/promise.h:67
> #3  0x01a6ff70 in impala::PlanFragmentExecutor::WaitForOpen 
> (this=0xc9629d0) at 
> /home/abehm/impala/be/src/runtime/plan-fragment-executor.cc:108
> #4  0x01a38e2f in impala::Coordinator::Wait (this=0xbe72d00) at 
> /home/abehm/impala/be/src/runtime/coordinator.cc:1063
> #5  0x0152be3c in impala::ImpalaServer::QueryExecState::WaitInternal 
> (this=0x972ac00) at /home/abehm/impala/be/src/service/query-exec-state.cc:666
> #6  0x0152b960 in impala::ImpalaServer::QueryExecState::Wait 
> (this=0x972ac00) at /home/abehm/impala/be/src/service/query-exec-state.cc:634
> #7  0x01547643 in boost::_mfi::mf0 

[jira] [Assigned] (IMPALA-3891) send authentication error messages back to coordinator

2018-06-19 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-3891:
-

Assignee: (was: Huaisi Xu)

> send authentication error messages back to coordinator
> --
>
> Key: IMPALA-3891
> URL: https://issues.apache.org/jira/browse/IMPALA-3891
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Security
>Affects Versions: Impala 2.6.0
>Reporter: Huaisi Xu
>Priority: Major
>
> In a Kerberos cluster, when authentication fails, Impala only prints error 
> messages in the logs, where they are hard to find. This can be extremely 
> frustrating when Impala hangs or the connection times out (after IMPALA-3875) 
> at the same time.
> Instead, these messages should be sent back to the client.






[jira] [Assigned] (IMPALA-3841) Avoid materializing nested collections if top-level predicates already disqualify the row.

2018-06-19 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-3841:
-

Assignee: (was: Chris Channing)

> Avoid materializing nested collections if top-level predicates already 
> disqualify the row.
> --
>
> Key: IMPALA-3841
> URL: https://issues.apache.org/jira/browse/IMPALA-3841
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.5.0, Impala 2.6.0
>Reporter: Alexander Behm
>Priority: Minor
>  Labels: nested_types, parquet, performance
>
> Today, we fully materialize a row before evaluating the top-level conjuncts 
> when scanning Parquet. This includes materializing nested collections. We 
> should avoid materializing nested collections if top-level conjuncts already 
> discard the row. Our recent move to column-wise materialization makes this 
> improvement feasible (IMPALA-2736).
> To illustrate the problem, consider this query:
> {code}
> select * from customer c, c.orders o where c.id = 10
> {code}
> Even though we have a very selective predicate on the top-level customer, our 
> scanner will still fully materialize all orders of all customers. The 
> non-matches will be filtered, but we still pay the cost of materializing the 
> orders.
> The proposed improvement is to avoid materializing the orders of 
> non-qualifying customers.
> The improvement will require several things:
> * Analyze and separate the top-level conjuncts into those that can be 
> evaluated before materializing the nested collections and those that require 
> nested collections to be materialized. In particular, we need to be careful 
> with our auto-generated !empty() predicates on nested collections.
> * Add a new SkipValues() or similar interface to the Parquet column readers 
> to advance the scanner without actually materializing values. If possible, 
> we should skip entire blocks.
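The proposal above can be illustrated with a minimal sketch, which is not Impala's scanner code. It assumes a hypothetical LazyCustomer row whose nested orders are decoded only when load_orders() is called, and it evaluates the top-level predicate first so that non-qualifying rows never pay the materialization cost. All names here are illustrative.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Tuple


@dataclass
class LazyCustomer:
    """Top-level row; the nested collection is decoded only on demand."""
    id: int
    load_orders: Callable[[], List[str]]  # hypothetical stand-in for decoding nested values


def scan(rows: Iterable[LazyCustomer],
         top_level_pred: Callable[[LazyCustomer], bool]) -> Iterator[Tuple[int, str]]:
    """Evaluate the top-level predicate before touching nested data."""
    for row in rows:
        if not top_level_pred(row):
            continue  # analogous to skipping values instead of materializing them
        for order in row.load_orders():
            yield row.id, order


materialized: List[int] = []  # records which customers had their orders decoded


def make_loader(cid: int) -> Callable[[], List[str]]:
    def load() -> List[str]:
        materialized.append(cid)
        return [f"order-{cid}-{i}" for i in range(2)]
    return load


customers = [LazyCustomer(cid, make_loader(cid)) for cid in (9, 10, 11)]
result = list(scan(customers, lambda c: c.id == 10))
```

In this sketch only customer 10 has its orders decoded, which mirrors the intent of the query `select * from customer c, c.orders o where c.id = 10`.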






[jira] [Assigned] (IMPALA-1758) Impala ODBC Driver returns incorrect SQLState for "Table does not exist"

2018-06-19 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-1758:
-

Assignee: (was: Justin Erickson)

> Impala ODBC Driver returns incorrect SQLState for "Table does not exist"
> 
>
> Key: IMPALA-1758
> URL: https://issues.apache.org/jira/browse/IMPALA-1758
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 2.1.1
>Reporter: Viji
>Priority: Minor
>  Labels: jdbc, odbc, sql-language, usability
>
> The ODBC specification states that an ODBC driver should return SQLState 42S02 
> for a "Table does not exist" condition, but the Impala ODBC driver (version 
> 2.5.23 on Windows 7) is currently returning HY000 instead. Applications check 
> for specific SQLStates to make better code choices and rely on those 
> SQLStates to be accurate. The HY000 SQLState only indicates "General error".
> This can be reproduced with any ODBC application that displays detailed error 
> messages by simply executing:
> SELECT * FROM tablethatdoesnotexist
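To show why the distinction matters to applications, here is a small sketch of client-side error handling that branches on the SQLState. The function name and messages are hypothetical; only the two SQLState codes come from the report above.

```python
SQLSTATE_TABLE_NOT_FOUND = "42S02"  # ODBC: base table or view not found
SQLSTATE_GENERAL_ERROR = "HY000"    # ODBC: general error


def describe_failure(sqlstate: str, message: str) -> str:
    """Hypothetical application logic that depends on an accurate SQLState."""
    if sqlstate == SQLSTATE_TABLE_NOT_FOUND:
        return "missing table: " + message
    if sqlstate == SQLSTATE_GENERAL_ERROR:
        return "unclassified error: " + message
    return f"error {sqlstate}: {message}"
```

With a driver that reports HY000 for a missing table, an application like this one cannot take the "missing table" branch and falls back to generic handling.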






[jira] [Assigned] (IMPALA-1792) ImpalaODBC: Can not get the value in the SQLGetData(m-x th column) after the SQLBindCol(m th column)

2018-06-19 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-1792:
-

Assignee: (was: Justin Erickson)

> ImpalaODBC: Can not get the value in the SQLGetData(m-x th column) after the 
> SQLBindCol(m th column)
> 
>
> Key: IMPALA-1792
> URL: https://issues.apache.org/jira/browse/IMPALA-1792
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 2.1
> Environment: OS: CentOS release 6.4 (Final)
> Impala: 2.1.0
> ImpalaODBC: 2.5.23
>Reporter: Mitsuhiro Koga
>Priority: Critical
>  Labels: correctness, downgraded, impala, odbc
>
> Steps to reproduce
> # Create table and insert data in impala.
> {code:sql}
> create table t (c1 string, c2 string, c3 string);
> insert into t (c1, c2, c3) values ('AAA', 'BBB', 'CCC');
> {code}
> # Query the t table with the following code.
> {code}
> /* The header names were lost in this archived copy; these are the ones the code below needs. */
> #include <stdio.h>
> #include <sql.h>
> #include <sqlext.h>
> int main() {
>     SQLHENV env;
>     SQLHDBC dbc;
>     SQLHSTMT stmt;
>     SQLRETURN ret;
>     SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env);
>     SQLSetEnvAttr(env, SQL_ATTR_ODBC_VERSION, (void *) SQL_OV_ODBC3, 0);
>     SQLAllocHandle(SQL_HANDLE_DBC, env, &dbc);
>     SQLDriverConnect(dbc, (SQLCHAR*)NULL, (SQLCHAR*)"DSN=Impala", SQL_NTS,
>                      NULL, 0, NULL, SQL_DRIVER_COMPLETE);
>     SQLAllocHandle(SQL_HANDLE_STMT, dbc, &stmt);
>     SQLExecDirect(stmt, (SQLCHAR*)"select c1, c2, c3 from t", SQL_NTS);
>     char szCol1[11];
>     char szCol2[11];
>     char szCol3[11];
>     SQLLEN nColLen1;
>     SQLLEN nColLen2;
>     SQLLEN nColLen3;
>     /* Bind 2nd column */
>     SQLBindCol(stmt, 2, SQL_C_CHAR, szCol2, sizeof(szCol2), &nColLen2);
>     ret = SQLFetch(stmt);
>     if (ret == SQL_SUCCESS || ret == SQL_SUCCESS_WITH_INFO) {
>         /* Get 1st column (unbound, lower index than the bound column) */
>         ret = SQLGetData(stmt, 1, SQL_C_CHAR, szCol1, sizeof(szCol1),
>                          &nColLen1);
>         if (ret == SQL_SUCCESS || ret == SQL_SUCCESS_WITH_INFO) {
>             printf("c1: %s\n", szCol1);
>         } else {
>             printf("no data\n");
>         }
>         printf("c2: %s\n", szCol2);
>         /* Get 3rd column (unbound, higher index than the bound column) */
>         ret = SQLGetData(stmt, 3, SQL_C_CHAR, szCol3, sizeof(szCol3),
>                          &nColLen3);
>         if (ret == SQL_SUCCESS || ret == SQL_SUCCESS_WITH_INFO) {
>             printf("c3: %s\n", szCol3);
>         } else {
>             printf("no data\n");
>         }
>     } else {
>         printf("no row\n");
>     }
>     return 0;
> }
> {code}
> Expected result
> {code}
> c1: AAA
> c2: BBB
> c3: CCC
> {code}
> Actual result
> {code}
> no data
> c2: BBB
> c3: CCC
> {code}






[jira] [Resolved] (IMPALA-3948) Impala JDBC driver blocks concurrent queries when UseNativeQuery=0

2018-06-19 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-3948.
---
Resolution: Invalid

The driver isn't part of Apache Impala

> Impala JDBC driver blocks concurrent queries when UseNativeQuery=0
> --
>
> Key: IMPALA-3948
> URL: https://issues.apache.org/jira/browse/IMPALA-3948
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 2.5.0
>Reporter: Julian Eberius
>Assignee: Syed A. Hashmi
>Priority: Minor
>
> When using the Impala JDBC driver version 2.5.32 and a standard connection 
> pool in a Java application to run concurrent queries, I discovered that all 
> application threads executing Impala queries were blocked except for one. 
> All others were hanging in 
> HiveJDBCQueryExecutorWithLimitZeroPreparedStatementMetadata#execute(ExecutionContexts,
>  IWarningListener). I further investigated the decompiled code of this method 
> and found a synchronized() statement, which locks on a global singleton of 
> type IQueryTranslator and encloses an executeQuery() call. This effectively 
> makes the whole driver single-threaded, which should not be the case for 
> standards-conforming JDBC drivers as far as I know. 
> When using the parameter UseNativeQuery=1, the driver uses the class 
> HiveJDBCNativeQueryExecutor instead of 
> HiveJDBCQueryExecutorWithLimitZeroPreparedStatementMetadata, which does not 
> have this synchronized statement. In this case, all queries were submitted to 
> Impala in parallel, which is the expected behavior.






[jira] [Assigned] (IMPALA-1856) Missing datatypes from JDBC DBMD.getTypeInfo() call

2018-06-19 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-1856:
-

Assignee: (was: Syed A. Hashmi)

> Missing datatypes from JDBC DBMD.getTypeInfo() call
> ---
>
> Key: IMPALA-1856
> URL: https://issues.apache.org/jira/browse/IMPALA-1856
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.1
> Environment: Impala 2.1 
> C5.3
> hive-jdbc 0.13
>Reporter: Jonathan Seidman
>Priority: Minor
>  Labels: jdbc, odbc, supportability
>
> Filing this here since it seems to be an Impala defect as opposed to a driver 
> defect based on observed behavior. The following is seen from 
> DBMD.getTypeInfo() call against Hive:
> VOID 
> BOOLEAN 
> TINYINT 
> SMALLINT 
> INT 
> BIGINT 
> FLOAT 
> DOUBLE 
> STRING 
> CHAR 
> VARCHAR 
> DATE 
> TIMESTAMP 
> BINARY 
> DECIMAL 
> ARRAY 
> MAP 
> STRUCT 
> UNIONTYPE 
> USER_DEFINED 
> And against Impala (2.1):
> NULL_TYPE 
> BOOLEAN 
> TINYINT 
> SMALLINT 
> INT 
> BIGINT 
> FLOAT 
> DOUBLE 
> TIMESTAMP 
> STRING 
> BINARY
> Note the missing newer datatypes (e.g. decimal, varchar).
> This is with the Hive 0.13 driver. Please let me know if there's something 
> I'm missing and this should be filed against the driver.






[jira] [Resolved] (IMPALA-4075) Test kudu connection failed due to env KUDU_IS_SUPPORTED is set to false

2018-06-19 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-4075.
---
Resolution: Won't Fix

> Test kudu connection failed due to env KUDU_IS_SUPPORTED is set to false
> 
>
> Key: IMPALA-4075
> URL: https://issues.apache.org/jira/browse/IMPALA-4075
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.7.0
> Environment: LSB Version: 
> :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
> Distributor ID:   CentOS
> Description:  CentOS release 6.7 (Final)
> Release:  6.7
> Codename: Final
>Reporter: hewenting
>Assignee: hewenting
>Priority: Minor
>  Labels: kudu, test
>
> Executing the tests in tests/conftest.py fails, as shown below:
> {noformat}
> Traceback (most recent call last):
>   File "conftest.py", line 21, in <module>
> from kudu import connect as kudu_connect
> ImportError: No module named kudu
> {noformat}
> We should check the environment variable "KUDU_IS_SUPPORTED" first.
> Only if this variable is set to true can we use "from kudu import connect as 
> kudu_connect".
> *Workaround*
> * temporarily export KUDU_IS_SUPPORTED=true
> * run bin/bootstrap_toolchain.py
> * set KUDU_IS_SUPPORTED back to original value
> * tests should run now
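The check suggested above could look roughly like the following sketch. It is not the actual conftest.py change; the helper name load_kudu_connect is hypothetical, and treating an unset variable as "false" is an assumption.

```python
import importlib
import os


def load_kudu_connect(env=None):
    """Return kudu.connect only when KUDU_IS_SUPPORTED is 'true', else None.

    Assumption: an unset variable is treated the same as 'false'.
    """
    env = os.environ if env is None else env
    if env.get("KUDU_IS_SUPPORTED", "false").lower() != "true":
        return None
    # Equivalent to "from kudu import connect", but deferred until it is known to be safe.
    return importlib.import_module("kudu").connect
```

Code that needs Kudu would then check whether the returned value is None before using it, instead of failing with an ImportError when the module is loaded.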






[jira] [Resolved] (IMPALA-3308) Get expr-test passing on PPC64LE

2018-06-19 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-3308.
---
Resolution: Fixed

commit 9cee2b5f1376ad6286ed65edffe6d152a0012cf1
Author: segelyang 
Date:   Wed Aug 31 18:13:17 2016 +0800

IMPALA-3308: Get expr-test passing on PPC64LE

When using gcc 5+ (which introduced a new library ABI that includes new
implementations of std::string) to build Impala, the copy of the class
ExprValue(std::string) would be unsafe as string_val.ptr will not be
updated to point to the relocated string_data.data().

In order to solve this issue, we need to change how we initialize value_
so that it is initialized in-place, rather than created as a temporary
on the stack and then copied.

Change-Id: I4504ee6a52a085f530aadfcfa009bacb83c64787
Reviewed-on: http://gerrit.cloudera.org:8080/4186
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins


> Get expr-test passing on PPC64LE
> 
>
> Key: IMPALA-3308
> URL: https://issues.apache.org/jira/browse/IMPALA-3308
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.5.0
> Environment: Ubuntu 15.10 , architecture: ppc64le
>Reporter: Nishidha
>Assignee: Yang Zhi Zhi
>Priority: Minor
> Attachments: 
> 0001-Fix-exprs-string-test-garbage-characters-with-gcc-5.patch
>
>
> On ppc64le, expr-test fails.
> Attached is the log. The log contains a few additional statements logged.
> Query: select 'test' is giving incorrect results as can be seen in logs. 
> Looks like some string related issue.






[jira] [Resolved] (IMPALA-3644) PlannerTest.java fails when run with Java 8

2018-06-19 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-3644.
---
Resolution: Duplicate

> PlannerTest.java fails when run with Java 8
> ---
>
> Key: IMPALA-3644
> URL: https://issues.apache.org/jira/browse/IMPALA-3644
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.6.0
>Reporter: Lars Volker
>Assignee: Alexander Behm
>Priority: Minor
> Attachments: PlannerTest8.log, PlannerTestJava7.log, 
> com.cloudera.impala.planner.PlannerTest.txt
>
>
> [~twmarshall] - I picked you as this looks like a predicate ordering issue, 
> thinking you might have an idea what’s going on here; feel free to find 
> another person or assign back to me if you're swamped.
> {{PlannerTest.java:178 - testTpchNested}} fails on my local machine with a 
> mismatch in the returned plan. The lines in error are the following:
> Returned:
> {noformat}
> 01:SCAN HDFS [tpch_nested_parquet.customer c]
>partitions=1/1 files=4 size=577.87MB
>predicates: !empty(c.c_orders)
> *  predicates on o: !empty(o.o_lineitems), o_orderstatus = 'F'
>predicates on l1: l1.l_receiptdate > l1.l_commitdate
>predicates on l3: l3.l_receiptdate > l3.l_commitdate
> {noformat}
> Expected:
> {noformat}
> 01:SCAN HDFS [tpch_nested_parquet.customer c]
>partitions=1/1 files=4 size=577.87MB
>predicates: !empty(c.c_orders)
> *  predicates on o: o_orderstatus = 'F', !empty(o.o_lineitems)
>predicates on l1: l1.l_receiptdate > l1.l_commitdate
>predicates on l3: l3.l_receiptdate > l3.l_commitdate
> {noformat}
> It looks like the predicate order is reversed. I also noticed IMPALA-3643 on 
> the same machine, which looks to me like it could be related to Java HashMap 
> iterating over elements in an unexpected order on my machine/JVM. Could this 
> also be the case here? I couldn't figure out where the order of the 
> predicates is determined. There are multiple tests failing in 
> {{PlannerTest.java}} for the same reason.
> My HEAD is at {{* 7167950 - Remove redundant test in 
> test_avro_schema_resolution.py (3 days ago) }}.






[jira] [Assigned] (IMPALA-4337) Wrap long lines in explain plans

2018-06-19 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-4337:
-

Assignee: (was: Henry Robinson)

> Wrap long lines in explain plans
> 
>
> Key: IMPALA-4337
> URL: https://issues.apache.org/jira/browse/IMPALA-4337
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 2.8.0
>Reporter: Henry Robinson
>Priority: Minor
>  Labels: newbie
>
> Explain plans can have very long lines, particularly when printing lists of 
> expressions. It should be possible to wrap, and still correctly indent, those 
> lines.
> This is trickier than it sounds because they have to be wrapped in the 
> context of their place in the plan (i.e. with appropriate prefixes etc). It's 
> a good opportunity to split out explain plan generation from presentation, 
> centralizing the logic so that this kind of change is easy to make.
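As a rough illustration of wrapping in the context of the plan, the sketch below breaks a list of expressions only at expression boundaries and repeats the plan prefix on continuation lines. The function name and the "|  " prefix are assumptions made for the example, not Impala's explain-plan code.

```python
from typing import List


def wrap_exprs(prefix: str, label: str, exprs: List[str], width: int = 80) -> List[str]:
    """Wrap a comma-separated expression list, aligning continuations under the first expression."""
    indent = prefix + " " * len(label)  # continuation lines keep the plan prefix
    lines: List[str] = []
    current = prefix + label
    line_has_expr = False
    for i, expr in enumerate(exprs):
        piece = expr + ("," if i < len(exprs) - 1 else "")
        candidate = current + (" " if line_has_expr else "") + piece
        if line_has_expr and len(candidate) > width:
            lines.append(current)      # emit the full line and start a new one
            current = indent + piece
        else:
            current = candidate
        line_has_expr = True
    lines.append(current)
    return lines


wrapped = wrap_exprs("|  ", "predicates: ", ["a = 1", "b = 2", "c = 3"], width=30)
```

A single expression longer than the width is left on its own line rather than split, which is one of the presentation decisions a centralized formatter would have to make.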






[jira] [Resolved] (IMPALA-3116) Reuse session for executing queries (Hive on Spark)

2018-06-19 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-3116.
---
Resolution: Won't Fix

Feel free to reopen if needed.

> Reuse session for executing queries (Hive on Spark)
> ---
>
> Key: IMPALA-3116
> URL: https://issues.apache.org/jira/browse/IMPALA-3116
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.5.0
>Reporter: Kapil Rastogi
>Assignee: Kapil Rastogi
>Priority: Minor
>
> Workload runner infrastructure opens a connection to HS2, executes a query 
> and then tears down the connection.
> For Hive on Spark we want to make sure that containers are allocated 
> (pre-warming) before we measure performance. To enable this, we need to reuse 
> the session while executing queries.






[jira] [Assigned] (IMPALA-2361) Using AVX intrinsic to accelerate the sort operation

2018-06-19 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-2361:
-

Assignee: (was: Youwei Wang)

> Using AVX intrinsic to accelerate the sort operation
> 
>
> Key: IMPALA-2361
> URL: https://issues.apache.org/jira/browse/IMPALA-2361
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.2.4
>Reporter: Youwei Wang
>Priority: Minor
>  Labels: performance
>
> Using AVX intrinsic to accelerate the sort operation






[jira] [Assigned] (IMPALA-2949) Use -Werror in ASAN and release builds

2018-06-19 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-2949:
-

Assignee: (was: Alexander Behm)

> Use -Werror in ASAN and release builds
> --
>
> Key: IMPALA-2949
> URL: https://issues.apache.org/jira/browse/IMPALA-2949
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 2.5.0
>Reporter: Tim Armstrong
>Priority: Minor
>  Labels: test-infra
>
> Compiler warnings should be treated as errors to avoid commits that 
> introduce new warnings.






[jira] [Assigned] (IMPALA-2263) Insert into table always fails at 3 row

2018-06-19 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-2263:
-

Assignee: (was: Syed A. Hashmi)

> Insert into table always fails at 3 row
> ---
>
> Key: IMPALA-2263
> URL: https://issues.apache.org/jira/browse/IMPALA-2263
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 2.2
> Environment: v2.2.0-cdh5.4.2 (b7f0e80) built on Tue May 19 16:45:28 
> PDT 2015
> Cloudera_ImpalaJDBC4_2.5.24.zip
>Reporter: Dzmitry Stsiapanau
>Priority: Minor
> Attachments: TestSimba.java, console_out.txt
>
>
> The insert statement always fails at the third row. I tried different data, 
> different column types, and different statements, with both batch and single 
> updates. The result is always the same: the first two rows are OK and the 
> third is broken.






[jira] [Assigned] (IMPALA-2638) Retry queries that fail during scheduling

2018-06-19 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-2638:
-

Assignee: (was: Henry Robinson)

> Retry queries that fail during scheduling
> -
>
> Key: IMPALA-2638
> URL: https://issues.apache.org/jira/browse/IMPALA-2638
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 2.3.0
>Reporter: Henry Robinson
>Priority: Minor
>  Labels: scalability
>
> An important building block for node-decommissioning is the ability to retry 
> queries if they fail during scheduling for some recoverable reason (e.g. RPC 
> failed due to unreachable host, fragment could not be started due to memory 
> pressure). 
> To do this we can detect failures during {{Coordinator::Exec()}}, cancel the 
> running query and then re-start from somewhere in 
> {{QueryExecState::ExecQueryOrDmlRequest()}} - updating a local blacklist of 
> nodes so that we know to avoid those that have caused failures.
> There are some subtleties though:
> * Queries shouldn't be retried more than a small number of times, in case 
> they *cause* the outage (there might be a good way to figure that out at the 
> time)
> * If the query is restarted from the scheduling step (rather than completely 
> restarting), some care will have to be taken to ensure that none of the old 
> query's fragments that are being cancelled can affect the new query's 
> operation in any way (there are several ways to do this). 
> Eventually the failures will propagate to the rest of the cluster via the 
> statestore - this mechanism allows queries to recover and continue while the 
> statestore detects the failure. 
> This JIRA doesn't address restarting queries that have suffered failures 
> part-way through execution, because that's strictly harder and not (as) 
> needed for decommissioning.
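The retry idea described above can be sketched as follows. This is a simplified model, not the Coordinator or QueryExecState code: SchedulingError, run_with_retries and the host names are all hypothetical, and the retry bound reflects the first subtlety in the list.

```python
from typing import Callable, List, Set


class SchedulingError(Exception):
    """Hypothetical recoverable failure that names the host that caused it."""

    def __init__(self, host: str):
        super().__init__(f"scheduling failed on {host}")
        self.host = host


def run_with_retries(hosts: List[str],
                     schedule: Callable[[List[str]], List[str]],
                     max_retries: int = 2) -> List[str]:
    """Retry scheduling a bounded number of times, avoiding hosts that failed."""
    blacklist: Set[str] = set()  # local only; the statestore catches up later
    for _ in range(max_retries + 1):
        candidates = [h for h in hosts if h not in blacklist]
        if not candidates:
            break
        try:
            return schedule(candidates)
        except SchedulingError as err:
            blacklist.add(err.host)
    raise RuntimeError("query could not be scheduled within the retry limit")


def flaky_schedule(candidates: List[str]) -> List[str]:
    # Simulates an unreachable host: scheduling fails while host-b is a candidate.
    if "host-b" in candidates:
        raise SchedulingError("host-b")
    return candidates


placed = run_with_retries(["host-a", "host-b", "host-c"], flaky_schedule)
```

The sketch deliberately ignores the second subtlety, namely isolating fragments of the cancelled attempt from the new one.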






[jira] [Work started] (IMPALA-6918) Implement COMMENT ON COLUMN

2018-06-19 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-6918 started by Fredy Wijaya.

> Implement COMMENT ON COLUMN
> ---
>
> Key: IMPALA-6918
> URL: https://issues.apache.org/jira/browse/IMPALA-6918
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Minor
>
> Syntax:
> {noformat}
> COMMENT ON COLUMN my_table.my_column IS 'Employee ID Number';{noformat}
>  
>  






[jira] [Commented] (IMPALA-7101) Builds are timing out/hanging

2018-06-19 Thread Balazs Jeszenszky (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517051#comment-16517051
 ] 

Balazs Jeszenszky commented on IMPALA-7101:
---

[~twmarshall] / [~dhecht], what versions does this issue affect? Seems like 
it's been around for a while.

> Builds are timing out/hanging
> -
>
> Key: IMPALA-7101
> URL: https://issues.apache.org/jira/browse/IMPALA-7101
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Thomas Tauber-Marshall
>Assignee: Dan Hecht
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> We've seen a large number of builds in the last week or two that appear to 
> have hung and gotten killed after a 24-hour timeout.
> Exactly where the hang is occurring is different in each build, but I 
> suspect it has something to do with cancellation not working correctly.






[jira] [Updated] (IMPALA-7064) Local memory admitted metric does not get updated immediately

2018-06-19 Thread Bikramjeet Vig (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-7064:
---
Description: 
Due to a delay in updating the memory reserved locally on a host in Impala 
(host_mem_reserved_[coord_id]), multiple queries submitted in succession, each 
after the previous one succeeds, can be queued momentarily because not enough 
memory appears to be available. This happens because, unlike pool stats, which 
update local metrics immediately, the host-level memory reserved metric is only 
updated on the next statestore update. This delay can cause the admission 
controller to believe that memory reserved for a previous query has not been 
released.

Repro steps:

start an impala cluster using:
 start-impala-cluster.py  --impalad_args="--default_pool_max_requests=1 
--queue_wait_timeout_ms=100 --mem_limit=5G --default_pool_mem_limit=10G 
--queue_wait_timeout_ms=1"

run queries in succession like this:

./bin/impala-shell.sh -q "set mem_limit=5G;set num_nodes=1;select * from 
functional.alltypesagg, (select 1) B limit 1;select * from 
functional.alltypesagg, (select 1) B limit 1;"

Due to queue wait timeout set to 1ms, you will notice that the second query 
fails.

  was:
Due to a delay in updating memory admitted locally on a host in impala, 
multiple queries submitted in succession after every previous succeeds can 
result in queries being queued momentarily due to not enough memory being 
available. This happens because unlike pool stats that immediately update local 
metrics, the host level memory admitted metric is only updated on the next 
statestore update. This delay can cause the admission controller to believe 
that memory admitted for a previous query has not been released.

Repro steps:

start an impala cluster using:
start-impala-cluster.py  --impalad_args="--default_pool_max_requests=1 
--queue_wait_timeout_ms=100 --mem_limit=5G --default_pool_mem_limit=10G 
--queue_wait_timeout_ms=1"

run queries in succession like this:

./bin/impala-shell.sh -q "set mem_limit=5G;set num_nodes=1;select * from 
functional.alltypesagg, (select 1) B limit 1;select * from 
functional.alltypesagg, (select 1) B limit 1;"

Due to queue wait timeout set to 1ms, you will notice that the second query 
fails.


> Local memory admitted metric does not get updated immediately 
> --
>
> Key: IMPALA-7064
> URL: https://issues.apache.org/jira/browse/IMPALA-7064
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Bikramjeet Vig
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: admission-control
>
> Due to a delay in updating the memory reserved locally on a host in Impala 
> (host_mem_reserved_[coord_id]), multiple queries submitted in succession, each 
> after the previous one succeeds, can be queued momentarily because not enough 
> memory appears to be available. This happens because, unlike pool stats, which 
> update local metrics immediately, the host-level memory reserved metric is 
> only updated on the next statestore update. This delay can cause the 
> admission controller to believe that memory reserved for a previous query has 
> not been released.
> Repro steps:
> start an impala cluster using:
>  start-impala-cluster.py  --impalad_args="--default_pool_max_requests=1 
> --queue_wait_timeout_ms=100 --mem_limit=5G --default_pool_mem_limit=10G 
> --queue_wait_timeout_ms=1"
> run queries in succession like this:
> ./bin/impala-shell.sh -q "set mem_limit=5G;set num_nodes=1;select * from 
> functional.alltypesagg, (select 1) B limit 1;select * from 
> functional.alltypesagg, (select 1) B limit 1;"
> Due to queue wait timeout set to 1ms, you will notice that the second query 
> fails.
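The effect of the stale metric can be modelled with a toy sketch. This is a simplified assumption about the behaviour described above, not the admission controller's code; the class and method names are hypothetical.

```python
class HostMemTracker:
    """Toy model: admission decisions use a published value that lags the truth."""

    def __init__(self, mem_limit: int):
        self.mem_limit = mem_limit
        self.actual_reserved = 0     # what is really reserved on the host
        self.published_reserved = 0  # what the admission controller sees

    def admit(self, mem: int) -> bool:
        if self.published_reserved + mem > self.mem_limit:
            return False  # the query would be queued
        self.actual_reserved += mem
        self.published_reserved += mem
        return True

    def release(self, mem: int) -> None:
        # The published value is NOT refreshed here, which models the delay.
        self.actual_reserved -= mem

    def on_statestore_update(self) -> None:
        self.published_reserved = self.actual_reserved


tracker = HostMemTracker(mem_limit=5)
first = tracker.admit(5)          # first query is admitted
tracker.release(5)                # first query finishes
second = tracker.admit(5)         # rejected: the published value is stale
tracker.on_statestore_update()
third = tracker.admit(5)          # admitted once the metric catches up
```

With a very short queue wait timeout, as in the repro steps, the rejected second query times out before the next update arrives.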






[jira] [Assigned] (IMPALA-6169) Implement DATE type

2018-06-19 Thread Attila Jeges (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges reassigned IMPALA-6169:


Assignee: Attila Jeges

> Implement DATE type
> ---
>
> Key: IMPALA-6169
> URL: https://issues.apache.org/jira/browse/IMPALA-6169
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Reporter: Quanlong Huang
>Assignee: Attila Jeges
>Priority: Major
>
> In Hive, the Date type describes a particular year/month/day, in the form 
> YYYY-MM-DD.
> Hive has supported the Date type in Parquet since Hive 1.2.0, released two 
> years ago. (See https://issues.apache.org/jira/browse/HIVE-8119 and 
> https://cwiki.apache.org/confluence/display/Hive/Parquet#Parquet-VersionsandLimitations.)
> We should add support for the Date type too.
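For context, the Parquet format defines its DATE logical type as an int32 holding the number of days since the Unix epoch (1970-01-01). The sketch below shows that conversion for YYYY-MM-DD strings. It only illustrates the encoding and makes no claim about how Impala will implement the type; the function names are hypothetical.

```python
from datetime import date

EPOCH = date(1970, 1, 1)


def date_to_days(text: str) -> int:
    """Convert a YYYY-MM-DD string to days since the Unix epoch."""
    return (date.fromisoformat(text) - EPOCH).days


def days_to_date(days: int) -> str:
    """Convert days since the Unix epoch back to a YYYY-MM-DD string."""
    return date.fromordinal(EPOCH.toordinal() + days).isoformat()
```

Dates before 1970 map to negative day counts, which both functions handle.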


