[jira] [Commented] (IMPALA-9870) summary and profile command in impala-shell should show both original and retried info

2020-09-18 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198604#comment-17198604
 ] 

Sahil Takiar commented on IMPALA-9870:
--

The 'profile' part of this was done in IMPALA-9229; we still need support for 
the 'summary' command.

> summary and profile command in impala-shell should show both original and 
> retried info
> --
>
> Key: IMPALA-9870
> URL: https://issues.apache.org/jira/browse/IMPALA-9870
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Major
>
> If a query is retried, impala-shell still uses the original query handle 
> containing the original query id. Subsequent "summary" and "profile" commands 
> will return results of the original query. We should consider returning both 
> the original and retried information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-9229) Link failed and retried runtime profiles

2020-09-18 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-9229.
--
Fix Version/s: Impala 4.0
   Resolution: Fixed

Marking as resolved. The Web UI improvements are tracked in a separate JIRA.

> Link failed and retried runtime profiles
> 
>
> Key: IMPALA-9229
> URL: https://issues.apache.org/jira/browse/IMPALA-9229
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Critical
> Fix For: Impala 4.0
>
>
> There should be a way for clients to link the runtime profiles from failed 
> queries to all retry attempts (whether successful or not), and vice versa.
> There are a few ways to do this:
>  * The simplest way would be to include the query id of the retried query in 
> the runtime profile of the failed query, and vice versa; users could then 
> manually create a chain of runtime profiles in order to fetch all failed / 
> successful attempts
>  * Extend TGetRuntimeProfileReq to include an option to fetch all runtime 
> profiles for the given query id + all retry attempts (or add a new Thrift 
> call TGetRetryQueryIds(TQueryId) which returns a list of retried ids for a 
> given query id)
>  * The Impala debug UI should include a simple way to view all the runtime 
> profiles of a query (the failed attempts + all retry attempts) side by side 
> (perhaps the query_profile?query_id profile should include tabs to easily 
> switch between the runtime profiles of each attempt)
> These are not mutually exclusive, and it might be good to stage these changes.
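The first option in the list above (following query ids from one profile to the next) can be sketched roughly as follows. This is only an illustration: the `retried_query_id` field and the in-memory `profiles` dict are hypothetical stand-ins, not the actual profile format or Thrift API.

```python
def profile_chain(profiles, query_id):
    """Return the query ids from the given attempt through every later retry.

    `profiles` maps a query id to a dict; the optional "retried_query_id"
    key (a hypothetical name) points at the attempt that replaced it.
    """
    chain = []
    seen = set()
    current = query_id
    # Stop on a missing profile, the end of the chain, or an accidental cycle.
    while current is not None and current in profiles and current not in seen:
        seen.add(current)
        chain.append(current)
        current = profiles[current].get("retried_query_id")
    return chain


# Hypothetical data: q1 failed and was retried as q2, which was retried as q3.
profiles = {
    "q1": {"status": "RETRIED", "retried_query_id": "q2"},
    "q2": {"status": "RETRIED", "retried_query_id": "q3"},
    "q3": {"status": "FINISHED"},
}
```

Walking in the other direction (from a retry back to the original) would need the reverse link stored in each profile as well, which is what "and vice versa" in the description asks for.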






[jira] [Created] (IMPALA-10180) Add average size of fetch requests in runtime profile

2020-09-18 Thread Sahil Takiar (Jira)
Sahil Takiar created IMPALA-10180:
-

 Summary: Add average size of fetch requests in runtime profile
 Key: IMPALA-10180
 URL: https://issues.apache.org/jira/browse/IMPALA-10180
 Project: IMPALA
  Issue Type: Improvement
  Components: Clients
Reporter: Sahil Takiar


For queries with a high {{ClientFetchWaitTimer}}, it would be useful to know 
the average number of rows requested by the client per fetch request. This can 
help determine if setting a higher fetch size would help improve fetch 
performance where the network RTT between the client and Impala is high.
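A minimal sketch of the bookkeeping this would need, assuming the server can see the row count requested by each fetch call. The class and method names are hypothetical and do not correspond to existing Impala counters.

```python
class FetchStats:
    """Tracks how many rows a client asks for per fetch request."""

    def __init__(self):
        self.num_requests = 0
        self.rows_requested = 0

    def record_fetch(self, requested_rows):
        # Called once per client fetch request.
        self.num_requests += 1
        self.rows_requested += requested_rows

    def average_fetch_size(self):
        # Avoid dividing by zero when the client never fetched.
        if self.num_requests == 0:
            return 0.0
        return self.rows_requested / self.num_requests
```

A small average combined with a large {{ClientFetchWaitTimer}} would suggest that the client pays one round trip for few rows, which is the case where raising the fetch size is most likely to help.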






[jira] [Commented] (IMPALA-10171) Create query options for convert_legacy_hive_parquet_utc_timestamps and use_local_tz_for_unix_timestamp_conversions

2020-09-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198584#comment-17198584
 ] 

ASF subversion and git services commented on IMPALA-10171:
--

Commit 5dccf8024b17148e3736473d03426588f434af48 in impala's branch 
refs/heads/master from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5dccf80 ]

IMPALA-10171: Create query options for local time related flags

convert_legacy_hive_parquet_utc_timestamps and
use_local_tz_for_unix_timestamp_conversions were controllable only by
flags until now. After this change the old flags are only used on the
Coordinator to set the defaults for the query options.

If default_query_options also sets these query options, it will
take precedence over the old flags.

Possible follow up work:
- the old flags could be deprecated, as default_query_options
  can be used to set this on a server level
- testing these functionalities no longer needs custom cluster
  tests - rewriting the existing tests could speed up test
  execution

Testing:
- expr-test was rewritten to use the query option instead of the flag
- extended the custom cluster tests for the old flags to also use the
  query options
- ran related tests

Change-Id: I5c4252d1c8f8e224c1d0e8234f09374bcc0c6f68
Reviewed-on: http://gerrit.cloudera.org:8080/16469
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
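The precedence described in the commit message can be sketched as below. The function is hypothetical and only models the lookup order; the assumption that a per-query setting overrides both defaults is the usual query option behaviour but is not stated in the commit message.

```python
OPTION = "convert_legacy_hive_parquet_utc_timestamps"


def effective_option(name, startup_flags, default_query_options, query_options):
    """Resolve an option value on the coordinator.

    Lookup order (most specific first):
      1. value set for this query or session (assumed, see note above)
      2. default_query_options
      3. the old startup flag, used only as the default
    """
    if name in query_options:
        return query_options[name]
    if name in default_query_options:
        return default_query_options[name]
    return startup_flags.get(name, False)
```

With this ordering, executors never need to consult their own flag values, which is how the inconsistency described in the issue is avoided.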


> Create query options for convert_legacy_hive_parquet_utc_timestamps and 
> use_local_tz_for_unix_timestamp_conversions
> ---
>
> Key: IMPALA-10171
> URL: https://issues.apache.org/jira/browse/IMPALA-10171
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Csaba Ringhofer
>Priority: Major
>
> convert_legacy_hive_parquet_utc_timestamps and 
> use_local_tz_for_unix_timestamp_conversions are flags that can be set on all 
> coordinators and executors. Possible inconsistencies could be avoided by 
> always using the flag's value on the coordinator, or by adding query options 
> for these settings.






[jira] [Commented] (IMPALA-10051) impala-shell exits with ValueError with WITH clauses

2020-09-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198583#comment-17198583
 ] 

ASF subversion and git services commented on IMPALA-10051:
--

Commit 3ef77566286c0077b89c0b8ce529ea9985018dd6 in impala's branch 
refs/heads/master from Tamas Mate
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=3ef7756 ]

IMPALA-10051: impala-shell exits with ValueError with WITH clauses

When a query contains a WITH clause, impala-shell tries to identify whether
it is a DML query or not, so that later it can provide appropriate
result messages. Earlier, shlex was used to create tokens and assess the
query type based on that. However, shlex can misinterpret some query
strings where whitespace characters are mixed with quotes, because it
splits the string based on whitespace characters. In some scenarios a
'ValueError: No closing quotation' error can occur.

This change moves the tokenization from shlex to sqlparse.

Testing:
 - Added unit test to cover queries that contain mixed whitespaces
   and strings

Change-Id: I442d3bc65b90a55c73c847948d5179a8586d71ad
Reviewed-on: http://gerrit.cloudera.org:8080/16389
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
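The failure mode can be reproduced with the standard library alone. The sketch below calls shlex the same way the old do_with code did, and contrasts it with a small quote-aware regex tokenizer. That tokenizer is a hypothetical stand-in used to keep the example dependency-free; it is not sqlparse and not the code from the fix.

```python
import re
import shlex

# The two queries from the issue description.
WORKING = 'with select regexp_replace(column_name, "[a-zA-Z]", "+ ");'
FAILING = 'with select regexp_replace(column_name,"[a-zA-Z]","+ ");'


def try_shlex(query):
    """Tokenize like the old code; return the tokens or the ValueError raised."""
    try:
        return shlex.split(query, posix=False)
    except ValueError as err:
        return err


# Matches a double-quoted string, a single-quoted string, a word, or any
# single punctuation character, in that order of preference.
_TOKEN_RE = re.compile(r'''"[^"]*"|'[^']*'|\w+|[^\w\s]''')


def quote_aware_tokens(query):
    """Split a query without treating whitespace as the only separator."""
    return _TOKEN_RE.findall(query)
```

In the failing query there is no whitespace before the first quote, so shlex treats everything up to `"+` as one word. The next token then starts at the closing quote of `"+ "`, which shlex reads as an opening quote that is never closed.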


> impala-shell exits with ValueError with WITH clauses
> 
>
> Key: IMPALA-10051
> URL: https://issues.apache.org/jira/browse/IMPALA-10051
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 4.0
>Reporter: Tamas Mate
>Assignee: Tamas Mate
>Priority: Major
>
> Some strings can cause shlex to throw an exception in WITH clauses, for 
> example in a regexp_replace. This should be handled more gracefully and 
> correctly.
> Working query (impala-shell forwards the query for analysis):
> {code:java}
> impala-shell.sh -q 'with select regexp_replace(column_name, "[a-zA-Z]", "+ 
> ");'
> {code}
> The same query fails with ValueError when the spaces are removed from the 
> arguments of regexp_replace:
> {code:java}
> tmate@tmate-box:~/Projects/Impala$ impala-shell.sh -q 'with select 
> regexp_replace(column_name,"[a-zA-Z]","+ ");'
> Starting Impala Shell with no authentication using Python 2.7.16
> Warning: live_progress only applies to interactive shell sessions, and is 
> being skipped for now.
> Opened TCP connection to localhost:21000
> Connected to localhost:21000
> Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build 
> b29cb4ca82a4f05ea7dc0eadc330a64fbe685ef0)
> Traceback (most recent call last):
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1973, in 
> <module>
> impala_shell_main()
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1927, in 
> impala_shell_main
> if execute_queries_non_interactive_mode(options, query_options):
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1731, in 
> execute_queries_non_interactive_mode
> shell.execute_query_list(queries))
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1564, in 
> execute_query_list
> if self.onecmd(q) is CmdStatus.ERROR:
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 675, in 
> onecmd
> return func(arg)
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1276, in 
> do_with
> tokens = shlex.split(strip_comments(query.lstrip()), posix=False)
>   File 
> "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py",
>  line 279, in split
> return list(lex)
>   File 
> "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py",
>  line 269, in next
> token = self.get_token()
>   File 
> "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py",
>  line 96, in get_token
> raw = self.read_token()
>   File 
> "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py",
>  line 172, in read_token
> raise ValueError, "No closing quotation"
> ValueError: No closing quotation
> {code}






[jira] [Commented] (IMPALA-10179) After inverting a join's inputs the join's parallelism does not get reset

2020-09-18 Thread Aman Sinha (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198580#comment-17198580
 ] 

Aman Sinha commented on IMPALA-10179:
-

This is related to IMPALA-5612, although that issue was fixing the costing to 
account for the parallelism when deciding to invert or not.
Based on an initial look, it seems the fix for this issue is simply 
re-computing the stats after the inversion is done.
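A toy model of the problem and the proposed fix, under the assumption (taken from the issue description) that the join should inherit the parallelism of its probe input. The classes below are hypothetical and do not mirror the planner's actual Java classes.

```python
from dataclasses import dataclass


@dataclass
class Input:
    name: str
    cardinality: int
    num_instances: int


@dataclass
class HashJoin:
    probe: Input
    build: Input
    num_instances: int = 0

    def compute_stats(self):
        # Assumption for this sketch: the join runs with the parallelism
        # of its probe-side input.
        self.num_instances = self.probe.num_instances

    def invert(self, recompute=True):
        # Swap the inputs. Without recomputing, num_instances still
        # reflects the old probe side.
        self.probe, self.build = self.build, self.probe
        if recompute:
            self.compute_stats()


# Numbers taken from the plan in the issue: store_returns is scanned by one
# instance, store_sales by three.
store_returns = Input("store_returns", 287510, 1)
store_sales = Input("store_sales", 90630, 3)
```

Calling `invert(recompute=False)` leaves the join at one instance, which is the behaviour reported here; recomputing afterwards yields three.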

> After inverting a join's inputs the join's parallelism does not get reset
> -
>
> Key: IMPALA-10179
> URL: https://issues.apache.org/jira/browse/IMPALA-10179
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.4.0
>Reporter: Aman Sinha
>Assignee: Aman Sinha
>Priority: Major
>
> In the following query, the left semi join gets flipped to a right semi join 
> due to the cardinality of the tables but the parallelism of the HashJoin 
> fragment (see Fragment F01) remains as hosts=1, instances=1.  The right 
> behavior should be to inherit the parallelism of the new probe input table 
> store_sales, so it should be hosts=3, instances=3 to avoid 
> under-parallelizing the HashJoin.
> {noformat}
> [localhost:21000] default> set explain_level=2;
> EXPLAIN_LEVEL set to 2
> [localhost:21000] default> use tpcds_parquet;
> Query: use tpcds_parquet
> [localhost:21000] tpcds_parquet> explain select count(*) from store_returns 
> where sr_customer_sk in (select ss_customer_sk from store_sales);
> Query: explain select count(*) from store_returns where sr_customer_sk in 
> (select ss_customer_sk from store_sales)
> Max Per-Host Resource Reservation: Memory=10.31MB Threads=6
> Per-Host Resource Estimates: Memory=85MB
> Analyzed query: SELECT count(*) FROM tpcds_parquet.store_returns LEFT SEMI 
> JOIN
> (SELECT ss_customer_sk FROM tpcds_parquet.store_sales) `$a$1` (`$c$1`) ON
> sr_customer_sk = `$a$1`.`$c$1`
> ""
> F03:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> |  Per-Host Resources: mem-estimate=10.02MB mem-reservation=0B 
> thread-reservation=1
> PLAN-ROOT SINK
> |  output exprs: count(*)
> |  mem-estimate=0B mem-reservation=0B thread-reservation=0
> |
> 09:AGGREGATE [FINALIZE]
> |  output: count:merge(*)
> |  mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB 
> thread-reservation=0
> |  tuple-ids=3 row-size=8B cardinality=1
> |  in pipelines: 09(GETNEXT), 04(OPEN)
> |
> 08:EXCHANGE [UNPARTITIONED]
> |  mem-estimate=16.00KB mem-reservation=0B thread-reservation=0
> |  tuple-ids=3 row-size=8B cardinality=1
> |  in pipelines: 04(GETNEXT)
> |
> F01:PLAN FRAGMENT [HASH(tpcds_parquet.store_sales.ss_customer_sk)] hosts=1 
> instances=1
> Per-Host Resources: mem-estimate=23.88MB mem-reservation=5.81MB 
> thread-reservation=1 runtime-filters-memory=1.00MB
> 04:AGGREGATE
> |  output: count(*)
> |  mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB 
> thread-reservation=0
> |  tuple-ids=3 row-size=8B cardinality=1
> |  in pipelines: 04(GETNEXT), 06(OPEN)
> |
> 03:HASH JOIN [RIGHT SEMI JOIN, PARTITIONED]
> |  hash predicates: tpcds_parquet.store_sales.ss_customer_sk = sr_customer_sk
> |  runtime filters: RF000[bloom] <- sr_customer_sk
> |  mem-estimate=2.88MB mem-reservation=2.88MB spill-buffer=128.00KB 
> thread-reservation=0
> |  tuple-ids=0 row-size=4B cardinality=287.51K
> |  in pipelines: 06(GETNEXT), 00(OPEN)
> |
> |--07:EXCHANGE [HASH(sr_customer_sk)]
> |  |  mem-estimate=1.10MB mem-reservation=0B thread-reservation=0
> |  |  tuple-ids=0 row-size=4B cardinality=287.51K
> |  |  in pipelines: 00(GETNEXT)
> |  |
> |  F02:PLAN FRAGMENT [RANDOM] hosts=1 instances=1
> |  Per-Host Resources: mem-estimate=24.00MB mem-reservation=1.00MB 
> thread-reservation=2
> |  00:SCAN HDFS [tpcds_parquet.store_returns, RANDOM]
> | HDFS partitions=1/1 files=1 size=15.43MB
> | stored statistics:
> |   table: rows=287.51K size=15.43MB
> |   columns: all
> | extrapolated-rows=disabled max-scan-range-rows=287.51K
> | mem-estimate=24.00MB mem-reservation=1.00MB thread-reservation=1
> | tuple-ids=0 row-size=4B cardinality=287.51K
> | in pipelines: 00(GETNEXT)
> |
> 06:AGGREGATE [FINALIZE]
> |  group by: tpcds_parquet.store_sales.ss_customer_sk
> |  mem-estimate=10.00MB mem-reservation=1.94MB spill-buffer=64.00KB 
> thread-reservation=0
> |  tuple-ids=6 row-size=4B cardinality=90.63K
> |  in pipelines: 06(GETNEXT), 01(OPEN)
> |
> 05:EXCHANGE [HASH(tpcds_parquet.store_sales.ss_customer_sk)]
> |  mem-estimate=142.01KB mem-reservation=0B thread-reservation=0
> |  tuple-ids=6 row-size=4B cardinality=90.63K
> |  in pipelines: 01(GETNEXT)
> |
> F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
> Per-Host Resources: mem-estimate=27.00MB mem-reservation=3.50MB 
> thread-reservation=2 

[jira] [Created] (IMPALA-10179) After inverting a join's inputs the join's parallelism does not get reset

2020-09-18 Thread Aman Sinha (Jira)
Aman Sinha created IMPALA-10179:
---

 Summary: After inverting a join's inputs the join's parallelism 
does not get reset
 Key: IMPALA-10179
 URL: https://issues.apache.org/jira/browse/IMPALA-10179
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.4.0
Reporter: Aman Sinha
Assignee: Aman Sinha


In the following query, the left semi join gets flipped to a right semi join 
due to the cardinality of the tables but the parallelism of the HashJoin 
fragment (see Fragment F01) remains as hosts=1, instances=1.  The right 
behavior should be to inherit the parallelism of the new probe input table 
store_sales, so it should be hosts=3, instances=3 to avoid under-parallelizing 
the HashJoin.

{noformat}
[localhost:21000] default> set explain_level=2;
EXPLAIN_LEVEL set to 2
[localhost:21000] default> use tpcds_parquet;
Query: use tpcds_parquet
[localhost:21000] tpcds_parquet> explain select count(*) from store_returns 
where sr_customer_sk in (select ss_customer_sk from store_sales);
Query: explain select count(*) from store_returns where sr_customer_sk in 
(select ss_customer_sk from store_sales)
Max Per-Host Resource Reservation: Memory=10.31MB Threads=6
Per-Host Resource Estimates: Memory=85MB
Analyzed query: SELECT count(*) FROM tpcds_parquet.store_returns LEFT SEMI JOIN
(SELECT ss_customer_sk FROM tpcds_parquet.store_sales) `$a$1` (`$c$1`) ON
sr_customer_sk = `$a$1`.`$c$1`
""
F03:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=10.02MB mem-reservation=0B 
thread-reservation=1
PLAN-ROOT SINK
|  output exprs: count(*)
|  mem-estimate=0B mem-reservation=0B thread-reservation=0
|
09:AGGREGATE [FINALIZE]
|  output: count:merge(*)
|  mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB 
thread-reservation=0
|  tuple-ids=3 row-size=8B cardinality=1
|  in pipelines: 09(GETNEXT), 04(OPEN)
|
08:EXCHANGE [UNPARTITIONED]
|  mem-estimate=16.00KB mem-reservation=0B thread-reservation=0
|  tuple-ids=3 row-size=8B cardinality=1
|  in pipelines: 04(GETNEXT)
|
F01:PLAN FRAGMENT [HASH(tpcds_parquet.store_sales.ss_customer_sk)] hosts=1 
instances=1
Per-Host Resources: mem-estimate=23.88MB mem-reservation=5.81MB 
thread-reservation=1 runtime-filters-memory=1.00MB
04:AGGREGATE
|  output: count(*)
|  mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB 
thread-reservation=0
|  tuple-ids=3 row-size=8B cardinality=1
|  in pipelines: 04(GETNEXT), 06(OPEN)
|
03:HASH JOIN [RIGHT SEMI JOIN, PARTITIONED]
|  hash predicates: tpcds_parquet.store_sales.ss_customer_sk = sr_customer_sk
|  runtime filters: RF000[bloom] <- sr_customer_sk
|  mem-estimate=2.88MB mem-reservation=2.88MB spill-buffer=128.00KB 
thread-reservation=0
|  tuple-ids=0 row-size=4B cardinality=287.51K
|  in pipelines: 06(GETNEXT), 00(OPEN)
|
|--07:EXCHANGE [HASH(sr_customer_sk)]
|  |  mem-estimate=1.10MB mem-reservation=0B thread-reservation=0
|  |  tuple-ids=0 row-size=4B cardinality=287.51K
|  |  in pipelines: 00(GETNEXT)
|  |
|  F02:PLAN FRAGMENT [RANDOM] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=24.00MB mem-reservation=1.00MB 
thread-reservation=2
|  00:SCAN HDFS [tpcds_parquet.store_returns, RANDOM]
| HDFS partitions=1/1 files=1 size=15.43MB
| stored statistics:
|   table: rows=287.51K size=15.43MB
|   columns: all
| extrapolated-rows=disabled max-scan-range-rows=287.51K
| mem-estimate=24.00MB mem-reservation=1.00MB thread-reservation=1
| tuple-ids=0 row-size=4B cardinality=287.51K
| in pipelines: 00(GETNEXT)
|
06:AGGREGATE [FINALIZE]
|  group by: tpcds_parquet.store_sales.ss_customer_sk
|  mem-estimate=10.00MB mem-reservation=1.94MB spill-buffer=64.00KB 
thread-reservation=0
|  tuple-ids=6 row-size=4B cardinality=90.63K
|  in pipelines: 06(GETNEXT), 01(OPEN)
|
05:EXCHANGE [HASH(tpcds_parquet.store_sales.ss_customer_sk)]
|  mem-estimate=142.01KB mem-reservation=0B thread-reservation=0
|  tuple-ids=6 row-size=4B cardinality=90.63K
|  in pipelines: 01(GETNEXT)
|
F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
Per-Host Resources: mem-estimate=27.00MB mem-reservation=3.50MB 
thread-reservation=2 runtime-filters-memory=1.00MB
02:AGGREGATE [STREAMING]
|  group by: tpcds_parquet.store_sales.ss_customer_sk
|  mem-estimate=10.00MB mem-reservation=2.00MB spill-buffer=64.00KB 
thread-reservation=0
|  tuple-ids=6 row-size=4B cardinality=90.63K
|  in pipelines: 01(GETNEXT)
|
01:SCAN HDFS [tpcds_parquet.store_sales, RANDOM]
   HDFS partitions=1824/1824 files=1824 size=200.95MB
   runtime filters: RF000[bloom] -> tpcds_parquet.store_sales.ss_customer_sk
   stored statistics:
 table: rows=2.88M size=200.95MB
 partitions: 1824/1824 rows=2.88M
 columns: all
   extrapolated-rows=disabled max-scan-range-rows=130.09K
   mem-estimate=16.00MB mem-reservation=512.00KB thread-reservation=1
   tuple-ids=1 row-size=4B 

[jira] [Commented] (IMPALA-6686) Change the DESCRIBE DATABASE output to look more like Hive output

2020-09-18 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-6686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198421#comment-17198421
 ] 

Tim Armstrong commented on IMPALA-6686:
---

[~csringhofer] I think it's OK to change. We can keep the "formatted" output in 
the row-oriented format, maybe, as a fallback option (Hive doesn't have that).

{noformat}
[localhost.EXAMPLE.COM:21000] default> describe database functional;
Query: describe database functional
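+------------+------------------------------------------------------+---------+
| name       | location                                             | comment |
+------------+------------------------------------------------------+---------+
| functional | hdfs://172.19.0.1:20500/test-warehouse/functional.db |         |
+------------+------------------------------------------------------+---------+
Fetched 1 row(s) in 0.01s
[localhost.EXAMPLE.COM:21000] default> describe database extended functional;
Query: describe database extended functional
+------------+------------------------------------------------------+---------+
| name       | location                                             | comment |
+------------+------------------------------------------------------+---------+
| functional | hdfs://172.19.0.1:20500/test-warehouse/functional.db |         |
| Owner:     |                                                      |         |
|            | tarmstrong                                           | USER    |
+------------+------------------------------------------------------+---------+
Fetched 3 row(s) in 0.01s
[localhost.EXAMPLE.COM:21000] default> describe database formatted functional;
Query: describe database formatted functional
+------------+------------------------------------------------------+---------+
| name       | location                                             | comment |
+------------+------------------------------------------------------+---------+
| functional | hdfs://172.19.0.1:20500/test-warehouse/functional.db |         |
| Owner:     |                                                      |         |
|            | tarmstrong                                           | USER    |
+------------+------------------------------------------------------+---------+
Fetched 3 row(s) in 0.01s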
{noformat}

> Change the DESCRIBE DATABASE output to look more like Hive output
> -
>
> Key: IMPALA-6686
> URL: https://issues.apache.org/jira/browse/IMPALA-6686
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Fredy Wijaya
>Priority: Minor
>  Labels: compatibility, incompatibility
>
> In Hive:
> {noformat}
> describe database functional;
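> +---------+---------+-----------------------------------------------+------------+------------+------------+
> | db_name | comment | location                                      | owner_name | owner_type | parameters |
> +---------+---------+-----------------------------------------------+------------+------------+------------+
> | tpch    |         | hdfs://localhost:20500/test-warehouse/tpch.db | foo        | USER       |            |
> +---------+---------+-----------------------------------------------+------------+------------+------------+{noformat}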
> In Impala:
> {noformat}
> describe database extended functional;
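> +--------+-----------------------------------------------+---------+
> | name   | location                                      | comment |
> +--------+-----------------------------------------------+---------+
> | tpch   | hdfs://localhost:20500/test-warehouse/tpch.db |         |
> | Owner: |                                               |         |
> |        | foo                                           | USER    |
> +--------+-----------------------------------------------+---------+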
> {noformat}






[jira] [Comment Edited] (IMPALA-10153) Support time travel for Iceberg tables

2020-09-18 Thread Shant Hovsepian (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198142#comment-17198142
 ] 

Shant Hovsepian edited comment on IMPALA-10153 at 9/18/20, 3:59 PM:


A design doc for time travel might be a good idea. Typically the "AS OF" clause 
is used for this; there was a patch for Impala support with Kudu time 
travel, as an example: https://gerrit.cloudera.org/c/13342/

[~gaborkaszab] for history, I believe Iceberg has the syntax SELECT * from 
table.history in other systems. This approach with a select statement is 
convenient in some more advanced use cases, as it lets you put the metadata 
history query into a subquery so you could do something like

{{AS OF (select snapshot_id from table.history where x=y)}}
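What resolving a timestamp-based "AS OF" against a table's history might look like is sketched below, purely to illustrate the semantics. The `history` list and its field names are hypothetical and are not the Iceberg API or Impala's implementation.

```python
from datetime import datetime


def snapshot_as_of(history, as_of):
    """Return the id of the latest snapshot committed at or before `as_of`."""
    candidates = [entry for entry in history if entry["committed_at"] <= as_of]
    if not candidates:
        raise ValueError("no snapshot exists at or before %s" % as_of)
    newest = max(candidates, key=lambda entry: entry["committed_at"])
    return newest["snapshot_id"]


# Hypothetical history rows, similar in spirit to "select * from table.history".
history = [
    {"snapshot_id": 101, "committed_at": datetime(2020, 9, 1, 12, 0)},
    {"snapshot_id": 102, "committed_at": datetime(2020, 9, 10, 8, 30)},
    {"snapshot_id": 103, "committed_at": datetime(2020, 9, 17, 23, 15)},
]
```

Exposing the history as a queryable relation lets the same lookup be written in SQL as a subquery, which is the advantage described above.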


was (Author: superdupershant):
A design doc for time travel might be a good idea. Typically the "AS OF" clause 
is used for this, there had been a patch for Impala support with Kudu time 
travel as an example https://issues.apache.org/jira/browse/KUDU-3177

[~gaborkaszab] for history I believe Iceberg has the syntax SELECT * from 
table.history in other systems. This approach with a select statement 
convenient in some more advanced use cases as it lets you put the metadata 
history query into a subquery so you could do something like

{{AS OF (select snapshot_id from table.history where x=y)}}

> Support time travel for Iceberg tables
> --
>
> Key: IMPALA-10153
> URL: https://issues.apache.org/jira/browse/IMPALA-10153
> Project: IMPALA
>  Issue Type: New Feature
>Reporter: Zoltán Borók-Nagy
>Assignee: Gabor Kaszab
>Priority: Major
>  Labels: impala-iceberg
>
> Iceberg tables support snapshots/data versioning/time travel.
> It means we can query an older version of the table.
> Probably we'll need to extend Impala's SQL syntax to support such queries 
> (Hive will also support such queries, so we should use the same syntax).






[jira] [Commented] (IMPALA-10153) Support time travel for Iceberg tables

2020-09-18 Thread Shant Hovsepian (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198416#comment-17198416
 ] 

Shant Hovsepian commented on IMPALA-10153:
--

I think it depends on how much we want semantics and behavior similar to 
how Iceberg is used elsewhere, e.g. 
https://iceberg.apache.org/spark/#time-travel

> Support time travel for Iceberg tables
> --
>
> Key: IMPALA-10153
> URL: https://issues.apache.org/jira/browse/IMPALA-10153
> Project: IMPALA
>  Issue Type: New Feature
>Reporter: Zoltán Borók-Nagy
>Assignee: Gabor Kaszab
>Priority: Major
>  Labels: impala-iceberg
>
> Iceberg tables support snapshots/data versioning/time travel.
> It means we can query an older version of the table.
> Probably we'll need to extend Impala's SQL syntax to support such queries 
> (Hive will also support such queries, so we should use the same syntax).






[jira] [Assigned] (IMPALA-10178) Run-time profile shall report skews

2020-09-18 Thread Qifan Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qifan Chen reassigned IMPALA-10178:
---

Assignee: Qifan Chen

> Run-time profile shall report skews
> ---
>
> Key: IMPALA-10178
> URL: https://issues.apache.org/jira/browse/IMPALA-10178
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Qifan Chen
>Assignee: Qifan Chen
>Priority: Minor
>
> Currently, per fragment, only an average profile is provided in addition to 
> the fragment instance profiles. One has to go over the fragment instance 
> profiles to figure out whether skew exists. By skew we mean that certain 
> operator instances process many more rows than others for the same fragment. 
> It will be useful if skew information can be provided in the run-time 
> profile. 






[jira] [Created] (IMPALA-10178) Run-time profile shall report skews

2020-09-18 Thread Qifan Chen (Jira)
Qifan Chen created IMPALA-10178:
---

 Summary: Run-time profile shall report skews
 Key: IMPALA-10178
 URL: https://issues.apache.org/jira/browse/IMPALA-10178
 Project: IMPALA
  Issue Type: Improvement
Reporter: Qifan Chen


Currently, per fragment, only an average profile is provided in addition to 
the fragment instance profiles. One has to go over the fragment instance 
profiles to figure out whether skew exists. By skew we mean that certain 
operator instances process many more rows than others for the same fragment. 

It will be useful if skew information can be provided in the run-time profile. 
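One possible way to quantify skew from per-instance row counts is sketched below. The max-to-average ratio and the threshold of 2.0 are assumptions chosen for illustration; the issue does not specify a metric.

```python
def skew_ratio(rows_per_instance):
    """Return max/average rows across instances; 1.0 means perfectly even."""
    if not rows_per_instance:
        return 0.0
    average = sum(rows_per_instance) / len(rows_per_instance)
    if average == 0:
        return 0.0
    return max(rows_per_instance) / average


def is_skewed(rows_per_instance, threshold=2.0):
    """Flag an operator whose busiest instance far exceeds the average."""
    return skew_ratio(rows_per_instance) > threshold
```

For example, three instances processing 10, 10 and 280 rows have an average of 100 rows, which on its own hides that one instance did almost all of the work.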






[jira] [Updated] (IMPALA-10176) test_describe_formatted is broken

2020-09-18 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-10176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy updated IMPALA-10176:
---
Description: 
test_describe_formatted hasn't been doing anything in a while since the 
hive-jdbc jar is not in the classpath, hence bin/run-jdbc-client.sh always 
produces empty STDOUT. So exec_and_compare_hive_and_impala_hs2 always compares 
two empty result sets and always succeeds.

I found the issue while working on [https://gerrit.cloudera.org/#/c/16450/], 
which sets the classpath for run-jdbc-client.sh correctly. But this change 
request failed during the pre-commit tests because it activated 
test_describe_formatted, which turned out to be failing.

Seems like the problem is that Hive outputs some extra table statistics under 
"Table parameters", e.g.:

'','numFiles ','24 ' 
'','numPartitions ','24 ' 
'','numRows ','0 ' 
'','rawDataSize ','0 ' 
'','totalSize ','489934 '
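A comparison helper can guard against the "two empty result sets" failure mode by refusing to pass when either side returned nothing. The function below is a hypothetical sketch and not the existing test helper.

```python
def compare_result_sets(hive_rows, impala_rows):
    """Compare two result sets, refusing to pass when either side is empty."""
    if not hive_rows or not impala_rows:
        # An empty side most likely means the client failed to run at all.
        raise AssertionError("empty result set: the client probably failed to run")
    if hive_rows != impala_rows:
        raise AssertionError("result sets differ")
    return True
```

Such a guard would have turned the missing hive-jdbc jar into a visible failure instead of a silently passing test.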

  was:
test_describe_formatted hasn't been doing anything in a while since the 
hive-jdbc jar is not in the classpath, hence bin/run-jdbc-client.sh always 
produces empty STDOUT. So exec_and_compare_hive_and_impala_hs2 always compare 
two empty result sets and always succeed.

I found the issue during [https://gerrit.cloudera.org/#/c/16450/] This sets the 
classpath for run-jdbc-client.sh correctly. But this change request failed 
during the pre-commit tests because it activated test_describe_formatted and 
turned out it is failing.


> test_describe_formatted is broken
> -
>
> Key: IMPALA-10176
> URL: https://issues.apache.org/jira/browse/IMPALA-10176
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Zoltán Borók-Nagy
>Priority: Major
>
> test_describe_formatted hasn't been doing anything in a while since the 
> hive-jdbc jar is not in the classpath, hence bin/run-jdbc-client.sh always 
> produces empty STDOUT. So exec_and_compare_hive_and_impala_hs2 always 
> compares two empty result sets and always succeeds.
> I found the issue while working on [https://gerrit.cloudera.org/#/c/16450/], 
> which sets the classpath for run-jdbc-client.sh correctly. But this change 
> request failed during the pre-commit tests because it activated 
> test_describe_formatted, which turned out to be failing.
> Seems like the problem is that Hive outputs some extra table statistics under 
> "Table parameters", e.g.:
> '','numFiles ','24 ' 
> '','numPartitions ','24 ' 
> '','numRows ','0 ' 
> '','rawDataSize ','0 ' 
> '','totalSize ','489934 '






[jira] [Commented] (IMPALA-10153) Support time travel for Iceberg tables

2020-09-18 Thread Gabor Kaszab (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198321#comment-17198321
 ] 

Gabor Kaszab commented on IMPALA-10153:
---

I figured that "SHOW TABLE HISTORY" is self-explanatory and straightforward 
to use, but I didn't think about the subquery use case you showed above.
Let's then give "SELECT * from table.history" a try and see how complicated it 
is to implement.

Thanks for your input, [~superdupershant]!

> Support time travel for Iceberg tables
> --
>
> Key: IMPALA-10153
> URL: https://issues.apache.org/jira/browse/IMPALA-10153
> Project: IMPALA
>  Issue Type: New Feature
>Reporter: Zoltán Borók-Nagy
>Assignee: Gabor Kaszab
>Priority: Major
>  Labels: impala-iceberg
>
> Iceberg tables support snapshots/data versioning/time travel.
> It means we can query an older version of the table.
> Probably we'll need to extend Impala's SQL syntax to support such queries 
> (Hive will also support such queries, so we should use the same syntax).






[jira] [Assigned] (IMPALA-10165) Support all partition transforms for Iceberg

2020-09-18 Thread Gabor Kaszab (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab reassigned IMPALA-10165:
-

Assignee: Gabor Kaszab

> Support all partition transforms for Iceberg
> 
>
> Key: IMPALA-10165
> URL: https://issues.apache.org/jira/browse/IMPALA-10165
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Zoltán Borók-Nagy
>Assignee: Gabor Kaszab
>Priority: Major
>  Labels: impala-iceberg
>
> Currently Impala supports the identity and datetime (year, month, day, hour) 
> Iceberg partition transforms.
> Iceberg also has TRUNCATE and BUCKET partition transforms that need to be 
> supported. They take parameters: the truncation width and the number of 
> buckets, respectively.
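
For readers unfamiliar with the two missing transforms, here is a minimal Python sketch of what they compute, assuming the semantics described in the Iceberg table spec. The function names are hypothetical and this is not Impala code. The hashing step of BUCKET (Iceberg specifies a 32-bit Murmur3 hash) is left out; the sketch only shows how an already computed hash is mapped to a bucket.

```python
def truncate_int(value, width):
    """Round an integer down to the nearest multiple of `width`."""
    if width <= 0:
        raise ValueError("width must be positive")
    # Python's % always returns a non-negative remainder for a positive
    # width, so negative values are rounded towards negative infinity.
    return value - (value % width)


def truncate_str(value, length):
    """Keep at most the first `length` characters of a string."""
    if length <= 0:
        raise ValueError("length must be positive")
    return value[:length]


def bucket_from_hash(hash32, num_buckets):
    """Map a precomputed 32-bit hash to a bucket index in [0, num_buckets)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    # Clear the sign bit first so the result is never negative.
    return (hash32 & 0x7FFFFFFF) % num_buckets
```

Both transforms need their parameter (width or bucket count) to be stored with the partition spec, which is why the syntax has to accept an argument.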






[jira] [Created] (IMPALA-10177) run-hive-jdbc.sh throws ClassNotFoundException exception.

2020-09-18 Thread Jira
Zoltán Borók-Nagy created IMPALA-10177:
--

 Summary: run-hive-jdbc.sh throws ClassNotFoundException exception.
 Key: IMPALA-10177
 URL: https://issues.apache.org/jira/browse/IMPALA-10177
 Project: IMPALA
  Issue Type: Bug
Reporter: Zoltán Borók-Nagy


Currently the hive-jdbc jar is added under the 'test' scope.

As a result, it isn't included in build-classpath.txt, which only contains 
jars from the 'runtime' scope, so run-hive-jdbc.sh ends up throwing a 
ClassNotFoundException.

We could create a separate file, 'test-classpath.txt', that contains the jars 
from the test scope, and use its contents during e2e tests.
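
A rough Python sketch of how a test harness might combine the two files, assuming each file holds classpath entries separated by the platform path separator or by newlines. The helper names are hypothetical, and 'test-classpath.txt' does not exist yet; it is only the file proposed above.

```python
import os


def read_classpath_file(path):
    """Return the non-empty classpath entries stored in a text file."""
    with open(path) as f:
        content = f.read()
    # Accept entries separated by newlines or by the platform separator.
    parts = content.replace("\n", os.pathsep).split(os.pathsep)
    return [p.strip() for p in parts if p.strip()]


def merge_classpaths(*paths):
    """Concatenate entries from several classpath files, dropping duplicates."""
    seen, merged = set(), []
    for path in paths:
        for entry in read_classpath_file(path):
            if entry not in seen:
                seen.add(entry)
                merged.append(entry)
    return os.pathsep.join(merged)
```

The merged string could then be exported as CLASSPATH before the JDBC client is started, so the runtime jars keep their order and the test-scope jars are appended.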






[jira] [Created] (IMPALA-10176) test_describe_formatted is broken

2020-09-18 Thread Jira
Zoltán Borók-Nagy created IMPALA-10176:
--

 Summary: test_describe_formatted is broken
 Key: IMPALA-10176
 URL: https://issues.apache.org/jira/browse/IMPALA-10176
 Project: IMPALA
  Issue Type: Bug
Reporter: Zoltán Borók-Nagy


test_describe_formatted hasn't been doing anything for a while because the 
hive-jdbc jar is not in the classpath, so bin/run-jdbc-client.sh always 
produces empty STDOUT. As a result, exec_and_compare_hive_and_impala_hs2 
always compares two empty result sets and always succeeds.

I found the issue while working on [https://gerrit.cloudera.org/#/c/16450/], 
which sets the classpath for run-jdbc-client.sh correctly. That change request 
failed the pre-commit tests because it activated test_describe_formatted, 
which turned out to be failing.
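
To illustrate the failure mode, here is a small Python sketch of a comparison that refuses to pass when both sides are empty. It is not the actual exec_and_compare_hive_and_impala_hs2 implementation; the function name and the choice to raise on empty input are assumptions.

```python
def compare_result_sets(hive_rows, impala_rows):
    """Compare two result sets, refusing to pass when both are empty."""
    if not hive_rows and not impala_rows:
        # Two empty outputs usually mean the clients never ran,
        # not that the two engines agree.
        raise AssertionError("both result sets are empty; nothing was compared")
    return hive_rows == impala_rows
```

A guard like this would have turned the silent pass into a visible failure as soon as the JDBC client stopped producing output.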






[jira] [Updated] (IMPALA-10175) Extend error message when cast(..format..) fails in parse phase

2020-09-18 Thread Gabor Kaszab (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-10175:
--
Labels: supportability  (was: )

> Extend error message when cast(..format..) fails in parse phase
> ---
>
> Key: IMPALA-10175
> URL: https://issues.apache.org/jira/browse/IMPALA-10175
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Gabor Kaszab
>Assignee: Gabor Kaszab
>Priority: Major
>  Labels: supportability
>
> {code:java}
> select cast('0;367' as date format 'YY;DDD'); 
> ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "0;367"
> {code}
> Here the output contains the input string, but it would be more helpful for 
> debugging if it also contained the original format string.
> This applies to String to Date conversions only, as String to Timestamp 
> failures currently don't raise an error.






[jira] [Updated] (IMPALA-10175) Extend error message when cast(..format..) fails in parse phase

2020-09-18 Thread Gabor Kaszab (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-10175:
--
Issue Type: Improvement  (was: New Feature)

> Extend error message when cast(..format..) fails in parse phase
> ---
>
> Key: IMPALA-10175
> URL: https://issues.apache.org/jira/browse/IMPALA-10175
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Gabor Kaszab
>Assignee: Gabor Kaszab
>Priority: Major
>
> {code:java}
> select cast('0;367' as date format 'YY;DDD'); 
> ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "0;367"
> {code}
> Here the output contains the input string, but it would be more helpful for 
> debugging if it also contained the original format string.
> This applies to String to Date conversions only, as String to Timestamp 
> failures currently don't raise an error.






[jira] [Updated] (IMPALA-10175) Extend error message when cast(..format..) fails in parse phase

2020-09-18 Thread Gabor Kaszab (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab updated IMPALA-10175:
--
Description: 
{code:java}
select cast('0;367' as date format 'YY;DDD'); 
ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "0;367"
{code}

Here the output contains the input string, but it would be more helpful for 
debugging if it also contained the original format string.

This applies to String to Date conversions only, as String to Timestamp 
failures currently don't raise an error.

  was:
{code:java}
select cast('0;367' as date format 'YY;DDD'); 
ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "0;367"
{code}

Here the output contains the input string, but it would be more helpful for 
debugging if it also contained the original format string.


> Extend error message when cast(..format..) fails in parse phase
> ---
>
> Key: IMPALA-10175
> URL: https://issues.apache.org/jira/browse/IMPALA-10175
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Reporter: Gabor Kaszab
>Assignee: Gabor Kaszab
>Priority: Major
>
> {code:java}
> select cast('0;367' as date format 'YY;DDD'); 
> ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "0;367"
> {code}
> Here the output contains the input string, but it would be more helpful for 
> debugging if it also contained the original format string.
> This applies to String to Date conversions only, as String to Timestamp 
> failures currently don't raise an error.






[jira] [Assigned] (IMPALA-10175) Extend error message when cast(..format..) fails in parse phase

2020-09-18 Thread Gabor Kaszab (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab reassigned IMPALA-10175:
-

Assignee: Gabor Kaszab

> Extend error message when cast(..format..) fails in parse phase
> ---
>
> Key: IMPALA-10175
> URL: https://issues.apache.org/jira/browse/IMPALA-10175
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Reporter: Gabor Kaszab
>Assignee: Gabor Kaszab
>Priority: Major
>
> {code:java}
> select cast('0;367' as date format 'YY;DDD'); 
> ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "0;367"
> {code}
> Here the output contains the input string, but it would be more helpful for 
> debugging if it also contained the original format string.






[jira] [Work started] (IMPALA-10175) Extend error message when cast(..format..) fails in parse phase

2020-09-18 Thread Gabor Kaszab (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-10175 started by Gabor Kaszab.
-
> Extend error message when cast(..format..) fails in parse phase
> ---
>
> Key: IMPALA-10175
> URL: https://issues.apache.org/jira/browse/IMPALA-10175
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Reporter: Gabor Kaszab
>Assignee: Gabor Kaszab
>Priority: Major
>
> {code:java}
> select cast('0;367' as date format 'YY;DDD'); 
> ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "0;367"
> {code}
> Here the output contains the input string, but it would be more helpful for 
> debugging if it also contained the original format string.






[jira] [Created] (IMPALA-10175) Extend error message when cast(..format..) fails in parse phase

2020-09-18 Thread Gabor Kaszab (Jira)
Gabor Kaszab created IMPALA-10175:
-

 Summary: Extend error message when cast(..format..) fails in parse 
phase
 Key: IMPALA-10175
 URL: https://issues.apache.org/jira/browse/IMPALA-10175
 Project: IMPALA
  Issue Type: New Feature
  Components: Backend
Reporter: Gabor Kaszab


{code:java}
select cast('0;367' as date format 'YY;DDD'); 
ERROR: UDF ERROR: String to Date parse failed. Invalid string val: "0;367"
{code}

Here the output contains the input string, but it would be more helpful for 
debugging if it also contained the original format string.
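
As an illustration only, an extended message could carry both strings. The Python sketch below uses hypothetical wording and a hypothetical helper name; it is not the text Impala emits.

```python
def parse_failed_message(input_str, format_str):
    """Build a parse-failure message that names both the input and the format."""
    return ('String to Date parse failed. Input: "%s", Format: "%s"'
            % (input_str, format_str))
```

For the query above, this would report "0;367" together with the format 'YY;DDD', which makes it clear which cast in a larger query failed.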






[jira] [Assigned] (IMPALA-9652) CTAS doesn't respect transactional properties

2020-09-18 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-9652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy reassigned IMPALA-9652:
-

Assignee: Zoltán Borók-Nagy

> CTAS doesn't respect transactional properties
> -
>
> Key: IMPALA-9652
> URL: https://issues.apache.org/jira/browse/IMPALA-9652
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Csaba Ringhofer
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>  Labels: impala-acid
>
> To reproduce:
> {code}
> set DEFAULT_TRANSACTIONAL_TYPE=insert_only;
> create table tctas as select 1;
> show files in tctas;
> {code}
> The result on my machine is
> hdfs://localhost:20500/test-warehouse/managed/tctas/ae4eb75d6ad848d2-92cd5d31_1108910383_data.0.txt
>  
> which is wrong, as the file was created in the root directory of the table and 
> not in a delta/base directory. This doesn't cause a visible error for users, 
> as the file will still be considered part of the table due to the 
> "upgraded table" logic until the first major compaction.
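
A test could check for this with something like the Python sketch below. It assumes that ACID directories are named base_<id> or delta_<id>_<id>, optionally with further numeric suffixes, and that the data file's immediate parent directory is the one to inspect. The helper name and the regular expression are illustrative, not taken from Impala.

```python
import re

# Assumed naming pattern for ACID directories (an approximation).
_ACID_DIR = re.compile(r"^(base|delta)_\d+(_v?\d+)*$")


def is_in_acid_dir(table_root, file_path):
    """Return True if the file's parent directory looks like a base/delta dir."""
    root = table_root.rstrip("/") + "/"
    if not file_path.startswith(root):
        raise ValueError("file is not under the table root")
    parts = file_path[len(root):].split("/")
    # A file directly in the table root has no parent directory component.
    return len(parts) > 1 and bool(_ACID_DIR.match(parts[-2]))
```

Applied to the path reported above, the check returns False because the file sits directly in the table root.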






[jira] [Commented] (IMPALA-6686) Change the DESCRIBE DATABASE output to look more like Hive output

2020-09-18 Thread Csaba Ringhofer (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-6686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198271#comment-17198271
 ] 

Csaba Ringhofer commented on IMPALA-6686:
-

This came up during IMPALA-10169, as I am adding a new field, "managedlocation", 
to the output of DESCRIBE DATABASE. Hive already has this field.

To keep the Impala format I would have to add another row to the output, but 
this may be a good opportunity to switch to the Hive format instead.

> Change the DESCRIBE DATABASE output to look more like Hive output
> -
>
> Key: IMPALA-6686
> URL: https://issues.apache.org/jira/browse/IMPALA-6686
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Fredy Wijaya
>Priority: Minor
>  Labels: compatibility, incompatibility
>
> In Hive:
> {noformat}
> describe database functional;
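> +----------+----------+------------------------------------------------+-------------+-------------+-------------+
> | db_name  | comment  | location                                       | owner_name  | owner_type  | parameters  |
> +----------+----------+------------------------------------------------+-------------+-------------+-------------+
> | tpch     |          | hdfs://localhost:20500/test-warehouse/tpch.db  | foo         | USER        |             |
> +----------+----------+------------------------------------------------+-------------+-------------+-------------+
> {noformat}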
> In Impala:
> {noformat}
> describe database extended functional;
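> +---------+-----------------------------------------------+---------+
> | name    | location                                      | comment |
> +---------+-----------------------------------------------+---------+
> | tpch    | hdfs://localhost:20500/test-warehouse/tpch.db |         |
> | Owner:  |                                               |         |
> |         | foo                                           | USER    |
> +---------+-----------------------------------------------+---------+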
> {noformat}
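
To make the difference between the two layouts concrete, the following Python sketch folds the three Impala rows into a single Hive-style record. It assumes the row layout shown above (a data row, an "Owner:" marker row, then a row holding owner name and type); the function name is hypothetical.

```python
def impala_rows_to_hive_row(rows):
    """Fold Impala-style DESCRIBE DATABASE rows into one Hive-style record."""
    name, location, comment = [cell.strip() for cell in rows[0]]
    owner_name, owner_type = "", ""
    for i, row in enumerate(rows[:-1]):
        # The row after the "Owner:" marker carries the owner name and type.
        if row[0].strip() == "Owner:":
            owner_name = rows[i + 1][1].strip()
            owner_type = rows[i + 1][2].strip()
            break
    return {
        "db_name": name,
        "comment": comment,
        "location": location,
        "owner_name": owner_name,
        "owner_type": owner_type,
    }
```

In the Hive layout a new field is just one more column, whereas in the Impala layout it needs another row whose meaning depends on its position.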






[jira] [Assigned] (IMPALA-10172) Support Hive metastore managed locations for databases

2020-09-18 Thread Csaba Ringhofer (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer reassigned IMPALA-10172:


Assignee: Csaba Ringhofer

> Support Hive metastore managed locations for databases
> --
>
> Key: IMPALA-10172
> URL: https://issues.apache.org/jira/browse/IMPALA-10172
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Tristan Stevens
>Assignee: Csaba Ringhofer
>Priority: Major
>
> In Hive 3 a database can have both managed and (unmanaged) locations.
> Hive DDL syntax is as follows:
> {noformat}
> CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
>   [COMMENT database_comment]
>   [LOCATION hdfs_path]
>   [MANAGEDLOCATION hdfs_path]
>   [WITH DBPROPERTIES (property_name=property_value, ...)];
> ALTER (DATABASE|SCHEMA) database_name SET MANAGEDLOCATION hdfs_path;
> {noformat}
> Right now, Impala does not appear to support this syntax.
> Also the {{DESCRIBE FORMATTED}} and {{DESCRIBE EXTENDED}} statements should 
> display both the {{LOCATION}} and {{MANAGEDLOCATION}}.
> Example:
> {noformat}
> impala-shell -i host-2.user1-c5.my.example.com -d default -k --ssl 
> --ca_cert=/opt/cloudera/security/pki/chain.pem
> Starting Impala Shell using Kerberos authentication
> Using service name 'impala'
> SSL is enabled
> Opened TCP connection to host-2.user1-c5.my.example.com:21000
> Connected to host-2.user1-c5.my.example.com:21000
> Server version: impalad version 3.4.0-SNAPSHOT RELEASE (build 
> 25402784335c39cc24076d71dab7a3ccbd562094)
> Query: use `default`
> ***
> Welcome to the Impala shell.
> (Impala Shell v3.4.0-SNAPSHOT (2540278) built on Wed Aug  5 11:07:32 UTC 2020)
> To see a summary of a query's progress that updates in real-time, run 'set
> LIVE_PROGRESS=1;'.
> ***
> Query: use `default`
> [host-2.user1-c5.my.example.com:21000] default> create database dbnew4 
> LOCATION 'hdfs://host-1.user1-c5.my.example.com:8020/data/dbnewnew/external' 
> MANAGEDLOCATION 
> 'hdfs://host-1.user1-c5.my.example.com:8020/data/dbnewnew/managed';
> Query: create database dbnew4 LOCATION 
> 'hdfs://host-1.user1-c5.my.example.com:8020/data/dbnewnew/external' 
> MANAGEDLOCATION 
> 'hdfs://host-1.user1-c5.my.example.com:8020/data/dbnewnew/managed'
> ERROR: ParseException: Syntax error in line 1:
> ...0/data/dbnewnew/external' MANAGEDLOCATION 'hdfs://ccyc...
>  ^
> Encountered: IDENTIFIER
> Expected: AS, CACHED, PARTITION, TBLPROPERTIES, UNCACHED
> CAUSED BY: Exception: Syntax error
> [host-2.user1-c5.my.example.com:21000] default> alter database dbnewnew SET 
> MANAGEDLOCATION='hdfs://host-1.user1-c5.my.example.com:8020/data/dbnewnew/managed';
> Query: alter database dbnewnew SET 
> MANAGEDLOCATION='hdfs://host-1.user1-c5.my.example.com:8020/data/dbnewnew/managed'
> ERROR: ParseException: Syntax error in line 1:
> ...newnew SET MANAGEDLOCATION='hdfs://host-1.user1-...
>  ^
> Encountered: =
> Expected: ROLE, IDENTIFIER
> CAUSED BY: Exception: Syntax error
> [host-2.user1-c5.my.example.com:21000] default> describe database formatted 
> db_cust_loc3 ;
> Query: describe database formatted db_cust_loc3
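> +--------------+--------------------------------------------------------------+---------+
> | name         | location                                                     | comment |
> +--------------+--------------------------------------------------------------+---------+
> | db_cust_loc3 | hdfs://host-1.user1-c5.my.example.com:8020/data/db_cust_loc3 |         |
> | Owner:       |                                                              |         |
> |              | admin                                                        | USER    |
> +--------------+--------------------------------------------------------------+---------+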
> Fetched 3 row(s) in 0.03s
> [host-2.user1-c5.my.example.com:21000] default> describe database extended 
> db_cust_loc3 ;
> Query: describe database extended db_cust_loc3
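> +--------------+--------------------------------------------------------------+---------+
> | name         | location                                                     | comment |
> +--------------+--------------------------------------------------------------+---------+
> | db_cust_loc3 | hdfs://host-1.user1-c5.my.example.com:8020/data/db_cust_loc3 |         |
> | Owner:       |                                                              |         |
> |              | admin                                                        | USER    |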
> 

[jira] [Commented] (IMPALA-10153) Support time travel for Iceberg tables

2020-09-18 Thread Shant Hovsepian (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198142#comment-17198142
 ] 

Shant Hovsepian commented on IMPALA-10153:
--

A design doc for time travel might be a good idea. Typically the "AS OF" clause 
is used for this; as an example, there was a patch for Impala support of Kudu 
time travel: https://issues.apache.org/jira/browse/KUDU-3177

[~gaborkaszab] for history, I believe Iceberg has the syntax SELECT * from 
table.history in other systems. Using a select statement is convenient in some 
more advanced use cases, as it lets you put the metadata history query into a 
subquery, so you could do something like

{{AS OF (select snapshot_id from table.history where x=y)}}
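
To sketch what resolving an "AS OF" timestamp against the table history could look like, here is a small Python example. It assumes the history is available as a list of (commit time, snapshot id) pairs sorted by commit time; the function name and the data layout are assumptions, not Iceberg's or Impala's API.

```python
import bisect


def snapshot_as_of(history, ts):
    """Return the id of the latest snapshot committed at or before `ts`.

    `history` is a list of (committed_at, snapshot_id) pairs sorted by
    committed_at. Returns None if no snapshot existed at `ts`.
    """
    times = [committed_at for committed_at, _ in history]
    # bisect_right gives the number of snapshots committed at or before ts.
    i = bisect.bisect_right(times, ts)
    if i == 0:
        return None
    return history[i - 1][1]
```

Exposing the history as a queryable relation, as suggested above, lets users express this kind of lookup themselves in a subquery instead of relying on a dedicated statement.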

> Support time travel for Iceberg tables
> --
>
> Key: IMPALA-10153
> URL: https://issues.apache.org/jira/browse/IMPALA-10153
> Project: IMPALA
>  Issue Type: New Feature
>Reporter: Zoltán Borók-Nagy
>Assignee: Gabor Kaszab
>Priority: Major
>  Labels: impala-iceberg
>
> Iceberg tables support snapshots/data versioning/time travel.
> It means we can query an older version of the table.
> Probably we'll need to extend Impala's SQL syntax to support such queries 
> (Hive will also support such queries, so we should use the same syntax).


