[jira] [Created] (IMPALA-8937) Fine grained table metadata loading on Catalog server

2019-09-10 Thread bharath v (Jira)
bharath v created IMPALA-8937:
-

 Summary: Fine grained table metadata loading on Catalog server
 Key: IMPALA-8937
 URL: https://issues.apache.org/jira/browse/IMPALA-8937
 Project: IMPALA
  Issue Type: Improvement
  Components: Catalog, Frontend
Affects Versions: Impala 2.12.0, Impala 3.3.0
Reporter: bharath v


*Background*:

Currently, a table _on the Catalog server_ is in either a loaded or an unloaded 
state (IncompleteTable). When the Catalog server starts for the first time, it 
first fetches a list of table names for each database, and every table in this 
list starts as an unloaded table. The table lists are propagated to the 
coordinators so that they know whether a table with a given name exists 
and can start analyzing queries. No metadata (schema, ownership, comments, 
etc.) is loaded for incomplete tables.

The table metadata is loaded lazily (and the table moves into a loaded state) 
when it is referenced in any query. When a load request comes in, all the table 
metadata is loaded including file block information. 

*Problem:* 

Coordinators need some additional information when analyzing unloaded tables. 
For example: IMPALA-8228. The ownership information is a part of the HMS table 
schema which is not loaded until the table is marked fully loaded. While this 
is not a problem for regular queries (like select * from ), it is an issue 
with queries like "show tables" which do not trigger a table load. In this 
particular case, due to the lack of ownership information, the output of the 
table listing could be different depending on whether the table is loaded. 
Another example is IMPALA-8606 where the GET_TABLES request does not return the 
table comments because they are not available for unloaded tables.

*Ask:*

We need to consider finer-grained loading on the Catalog server in general. 
Instead of having a binary state (loaded vs. unloaded), the table could be in a 
partially loaded state. We could also start by eagerly fetching certain 
pieces of information that we think could aid analysis and lazily load the 
remaining metadata. Finer-grained loading also integrates well with 
the LocalCatalog implementation on the coordinators, where the entire table 
need not be loaded on the Catalog server to serve partial metadata 
(e.g. show partitions ).
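
To make the "partially loaded" idea concrete, here is a minimal sketch in Python. It is not Impala code: LoadLevel, CatalogTable, FakeLoader and their methods are hypothetical names made up for illustration, and the split into three levels is an assumption, not something this issue specifies.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Optional


class LoadLevel(IntEnum):
    """Hypothetical load levels; higher values include everything below."""
    NAME_ONLY = 0      # what an unloaded (incomplete) table has today
    HMS_SUMMARY = 1    # cheap HMS fields such as owner and comment
    FULL = 2           # everything, including file block information


@dataclass
class CatalogTable:
    name: str
    level: LoadLevel = LoadLevel.NAME_ONLY
    owner: Optional[str] = None
    comment: Optional[str] = None
    file_blocks: list = field(default_factory=list)

    def ensure(self, wanted: LoadLevel, loader) -> None:
        """Load only the pieces missing for `wanted`; never downgrade."""
        if self.level < LoadLevel.HMS_SUMMARY <= wanted:
            self.owner, self.comment = loader.fetch_summary(self.name)
            self.level = LoadLevel.HMS_SUMMARY
        if self.level < LoadLevel.FULL <= wanted:
            self.file_blocks = loader.fetch_file_blocks(self.name)
            self.level = LoadLevel.FULL


class FakeLoader:
    """Stand-in for metastore and filesystem calls; records what was fetched."""

    def __init__(self):
        self.calls = []

    def fetch_summary(self, name):
        self.calls.append(("summary", name))
        return "alice", "demo table"

    def fetch_file_blocks(self, name):
        self.calls.append(("blocks", name))
        return ["block-0", "block-1"]
```

With something along these lines, a "show tables" style request could ask for HMS_SUMMARY only, while a query that scans the table would ask for FULL and pay the file-listing cost at that point.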



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8936) Make queuing reason for unhealthy executor groups more generic

2019-09-10 Thread Lars Volker (Jira)
Lars Volker created IMPALA-8936:
---

 Summary: Make queuing reason for unhealthy executor groups more 
generic
 Key: IMPALA-8936
 URL: https://issues.apache.org/jira/browse/IMPALA-8936
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Reporter: Lars Volker
Assignee: Lars Volker


In some situations, users might actually expect not to have a healthy executor 
group around, e.g. when they're starting one and it takes a while to come 
online. We should make the queuing reason more generic and drop the "unhealthy" 
concept from it to reduce confusion.






[jira] [Commented] (IMPALA-2138) Get rid of unused columns by upstream operators at points of materialization

2019-09-10 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927123#comment-16927123
 ] 

Tim Armstrong commented on IMPALA-2138:
---

I rebased the change here - 
https://github.com/timarmstrong/impala/tree/projection

> Get rid of unused columns by upstream operators at points of materialization
> 
>
> Key: IMPALA-2138
> URL: https://issues.apache.org/jira/browse/IMPALA-2138
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 1.4, Impala 2.0, Impala 2.2
>Reporter: Ippokratis Pandis
>Priority: Major
>  Labels: performance
> Attachments: 0001-Projection-prototype.patch
>
>
> It would be a very good performance improvement if we were able to get rid of 
> columns as soon as we know that they are not going to be used by any other 
> operators upstream. The amount of data we handle would shrink, making the 
> network and I/O (spilling) transfers more efficient. It would also improve 
> cache performance. 
> The current row-wise in-memory format does not make it very easy to get rid 
> of such unused columns. However, there are points of materialization where we 
> copy-out the tuples and we can actually perform these projections. There are 
> multiple points of materialization, notably:
> * The exchange operator
> * The build side of hash join
> * The probe side of hash join when we have spilling
> * The aggregation
> * Sorts and analytic function evaluation
> In order to do these projections we need to modify the FE and know at each 
> operator the minimum set of columns that are referenced by this 
> operator and all the upstream ones. (That minimum set is easy to 
> calculate during an additional top-down traversal of the plan.) We also need 
> to modify the BE and make the copy-out operation aware of such projections.
> Assigning first to Alex, because of the needed FE changes. Happy to take care 
> of the needed BE changes. Perhaps we could split this issue into 2 sub-tasks, 
> the FE and the BE changes.
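
As a rough illustration of the top-down traversal described in the issue, here is a Python sketch. It is not the prototype patch and not Impala's planner: PlanNode and required_outputs are hypothetical names, and for simplicity every child is handed the full needed set rather than only the columns it can supply.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class PlanNode:
    name: str
    references: FrozenSet[str]              # columns this operator reads itself
    produces: FrozenSet[str] = frozenset()  # columns this operator creates
    children: List["PlanNode"] = field(default_factory=list)


def required_outputs(root, wanted_at_root):
    """Top-down pass: map each node name to the columns its consumers need."""
    result = {}

    def visit(node, needed_by_consumers):
        result[node.name] = needed_by_consumers
        # Children must supply what flows through this node plus what the node
        # reads, minus the columns the node computes itself.
        needed_from_children = (
            needed_by_consumers - node.produces) | node.references
        for child in node.children:
            visit(child, needed_from_children)

    visit(root, frozenset(wanted_at_root))
    return result


# Example plan: scan(a, b, c, d) -> exchange -> agg(group by a, sum(b) as s)
scan = PlanNode("scan", frozenset(), produces=frozenset("abcd"))
exchange = PlanNode("exchange", frozenset(), children=[scan])
agg = PlanNode("agg", frozenset({"a", "b"}), produces=frozenset({"s"}),
               children=[exchange])
needed = required_outputs(agg, {"a", "s"})
```

In this example the exchange only has to carry columns a and b, so c and d could be projected away when tuples are copied out at that materialization point.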






[jira] [Commented] (IMPALA-8548) Include Documentation About Ordinal Substitution

2019-09-10 Thread Alex Rodoni (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927122#comment-16927122
 ] 

Alex Rodoni commented on IMPALA-8548:
-

https://gerrit.cloudera.org/#/c/14209/

> Include Documentation About Ordinal Substitution 
> -
>
> Key: IMPALA-8548
> URL: https://issues.apache.org/jira/browse/IMPALA-8548
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Docs
>Affects Versions: Impala 2.0, Impala 3.0
>Reporter: David Mollitor
>Assignee: Alex Rodoni
>Priority: Minor
>
> Update Impala docs to include information on the 'ordinal substitution' 
> feature.
>  
> [https://github.com/apache/impala/blob/master/docs/shared/impala_common.xml#L1104]






[jira] [Work started] (IMPALA-8548) Include Documentation About Ordinal Substitution

2019-09-10 Thread Alex Rodoni (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8548 started by Alex Rodoni.
---
> Include Documentation About Ordinal Substitution 
> -
>
> Key: IMPALA-8548
> URL: https://issues.apache.org/jira/browse/IMPALA-8548
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Docs
>Affects Versions: Impala 2.0, Impala 3.0
>Reporter: David Mollitor
>Assignee: Alex Rodoni
>Priority: Minor
>
> Update Impala docs to include information on the 'ordinal substitution' 
> feature.
>  
> [https://github.com/apache/impala/blob/master/docs/shared/impala_common.xml#L1104]






[jira] [Updated] (IMPALA-8548) Include Documentation About Ordinal Substitution

2019-09-10 Thread Alex Rodoni (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-8548:

Issue Type: Improvement  (was: Documentation)

> Include Documentation About Ordinal Substitution 
> -
>
> Key: IMPALA-8548
> URL: https://issues.apache.org/jira/browse/IMPALA-8548
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Docs
>Affects Versions: Impala 2.0, Impala 3.0
>Reporter: David Mollitor
>Assignee: Alex Rodoni
>Priority: Minor
>
> Update Impala docs to include information on the 'ordinal substitution' 
> feature.
>  
> [https://github.com/apache/impala/blob/master/docs/shared/impala_common.xml#L1104]






[jira] [Updated] (IMPALA-8580) Impala Doc: Explain SimpleDateFormat in Impala

2019-09-10 Thread Alex Rodoni (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-8580:

Description: 
The docs should probably state what SimpleDateFormat masks/values are supported 
and what are not.

https://gerrit.cloudera.org/#/c/14207/

  was:The docs should probably state what SimpleDateFormat masks/values are 
supported and what are not.


> Impala Doc: Explain SimpleDateFormat in Impala
> --
>
> Key: IMPALA-8580
> URL: https://issues.apache.org/jira/browse/IMPALA-8580
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>
> The docs should probably state what SimpleDateFormat masks/values are 
> supported and what are not.
> https://gerrit.cloudera.org/#/c/14207/






[jira] [Work started] (IMPALA-8580) Impala Doc: Explain SimpleDateFormat in Impala

2019-09-10 Thread Alex Rodoni (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8580 started by Alex Rodoni.
---
> Impala Doc: Explain SimpleDateFormat in Impala
> --
>
> Key: IMPALA-8580
> URL: https://issues.apache.org/jira/browse/IMPALA-8580
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>
> The docs should probably state what SimpleDateFormat masks/values are 
> supported and what are not.






[jira] [Created] (IMPALA-8935) Add links to other daemons from webui

2019-09-10 Thread Thomas Tauber-Marshall (Jira)
Thomas Tauber-Marshall created IMPALA-8935:
--

 Summary: Add links to other daemons from webui
 Key: IMPALA-8935
 URL: https://issues.apache.org/jira/browse/IMPALA-8935
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Reporter: Thomas Tauber-Marshall
Assignee: Thomas Tauber-Marshall


It would be convenient for all of the debug webuis to have links to the other 
debug webuis within a single cluster.

For impalads, it would be easy to add links to each other impalad on the 
/backends page (from IMPALA-210 it looks like this even used to be the case, 
but a lot has changed since then, e.g. we weren't even using 
templates at the time, so it got lost somewhere along the way). It's also fairly 
straightforward to add a link to the statestored and catalogd, e.g. on 
the index page or on the nav bar.






[jira] [Resolved] (IMPALA-8917) Including hostnames in debug UI URLs breaks a lot of use cases

2019-09-10 Thread Thomas Tauber-Marshall (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Tauber-Marshall resolved IMPALA-8917.

Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Including hostnames in debug UI URLs breaks a lot of use cases
> --
>
> Key: IMPALA-8917
> URL: https://issues.apache.org/jira/browse/IMPALA-8917
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.4.0
>Reporter: Tim Armstrong
>Assignee: Thomas Tauber-Marshall
>Priority: Blocker
> Fix For: Impala 3.4.0
>
>
> I've run into multiple cases that got broken by IMPALA-8897. The general 
> problem is the assumption that the hostname that the Impala server refers to 
> itself is resolvable by the client accessing the web UI. Cases I've run into:
> * In docker and kubernetes, where the internal hostnames aren't visible 
> outside of the internal network
> * On systems without a DNS-resolvable hostname, e.g. my Ubuntu desktop that I 
> access via a static IP
> I'm not sure what a fix would look like, but I think we at least need some 
> way to work around the problem in situations like this.






[jira] [Commented] (IMPALA-8897) Fully qualify all paths on the webserver

2019-09-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16926833#comment-16926833
 ] 

ASF subversion and git services commented on IMPALA-8897:
-

Commit de77c61f383794601c79d9783d36235482740417 in impala's branch 
refs/heads/master from Thomas Tauber-Marshall
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=de77c61 ]

IMPALA-8917: Remove hostname from webui links if Knox isn't being used

IMPALA-8897 added the hostname to all links on the debug webui in
order to facilitate proxying connections through Apache Knox. This
makes the webui difficult to use in situations where the hostname is
not DNS-resolvable.

This patch fixes this by only including the hostname with links if
Knox proxying is actually being used, which we determine by looking
for the 'x-forwarded-context' header in the request, which Knox adds
to all requests.

It also removes the hidden form fields that were added to support Knox
integration when not being accessed through Knox.

It also adds a class comment on Webserver explaining the requirements
for keeping the webui compatible with Knox.

Testing:
- Added a test that checks that links on the webui are made absolute
  when the 'x-forwarded-context' header is in the request.

Change-Id: Ifcf77058dc6ce1d72422a9e3ca7868cdffacff76
Reviewed-on: http://gerrit.cloudera.org:8080/14199
Reviewed-by: Thomas Tauber-Marshall 
Tested-by: Impala Public Jenkins 
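
The header check described in the commit message can be sketched in a few lines of Python. This is only an illustration of the rule, not the Webserver code from the patch: make_link and its parameters are hypothetical, and the "http://" scheme is an assumption.

```python
def make_link(path, request_headers, host_port):
    """Return an absolute URL only when the request appears to come via Knox."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    names = {name.lower() for name in request_headers}
    if "x-forwarded-context" in names:
        # Proxied through Knox: qualify the link so rewrite rules can match it.
        return "http://" + host_port + path
    # Direct access: a relative link works even if the hostname is not
    # resolvable by the client.
    return path
```

For example, make_link("/backends", {}, "host1:25000") stays relative, while the same call with an 'x-forwarded-context' header present yields a fully qualified link.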


> Fully qualify all paths on the webserver
> 
>
> Key: IMPALA-8897
> URL: https://issues.apache.org/jira/browse/IMPALA-8897
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.4.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
> Fix For: Impala 3.4.0
>
>
> In order to support Knox proxying of the debug webui, we should fully qualify 
> all links on our debug web ui with the host:port.
> This will allow us to create rewrite rules that do the transform:
> ...
> =>
> 

[jira] [Commented] (IMPALA-8917) Including hostnames in debug UI URLs breaks a lot of use cases

2019-09-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16926832#comment-16926832
 ] 

ASF subversion and git services commented on IMPALA-8917:
-

Commit de77c61f383794601c79d9783d36235482740417 in impala's branch 
refs/heads/master from Thomas Tauber-Marshall
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=de77c61 ]

IMPALA-8917: Remove hostname from webui links if Knox isn't being used

IMPALA-8897 added the hostname to all links on the debug webui in
order to facilitate proxying connections through Apache Knox. This
makes the webui difficult to use in situations where the hostname is
not DNS-resolvable.

This patch fixes this by only including the hostname with links if
Knox proxying is actually being used, which we determine by looking
for the 'x-forwarded-context' header in the request, which Knox adds
to all requests.

It also removes the hidden form fields that were added to support Knox
integration when not being accessed through Knox.

It also adds a class comment on Webserver explaining the requirements
for keeping the webui compatible with Knox.

Testing:
- Added a test that checks that links on the webui are made absolute
  when the 'x-forwarded-context' header is in the request.

Change-Id: Ifcf77058dc6ce1d72422a9e3ca7868cdffacff76
Reviewed-on: http://gerrit.cloudera.org:8080/14199
Reviewed-by: Thomas Tauber-Marshall 
Tested-by: Impala Public Jenkins 


> Including hostnames in debug UI URLs breaks a lot of use cases
> --
>
> Key: IMPALA-8917
> URL: https://issues.apache.org/jira/browse/IMPALA-8917
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.4.0
>Reporter: Tim Armstrong
>Assignee: Thomas Tauber-Marshall
>Priority: Blocker
>
> I've run into multiple cases that got broken by IMPALA-8897. The general 
> problem is the assumption that the hostname that the Impala server refers to 
> itself is resolvable by the client accessing the web UI. Cases I've run into:
> * In docker and kubernetes, where the internal hostnames aren't visible 
> outside of the internal network
> * On systems without a DNS-resolvable hostname, e.g. my Ubuntu desktop that I 
> access via a static IP
> I'm not sure what a fix would look like, but I think we at least need some 
> way to work around the problem in situations like this.






[jira] [Created] (IMPALA-8934) Add failpoint tests to result spooling code

2019-09-10 Thread Sahil Takiar (Jira)
Sahil Takiar created IMPALA-8934:


 Summary: Add failpoint tests to result spooling code
 Key: IMPALA-8934
 URL: https://issues.apache.org/jira/browse/IMPALA-8934
 Project: IMPALA
  Issue Type: Sub-task
Affects Versions: Impala 3.2.0
Reporter: Sahil Takiar
Assignee: Sahil Takiar


IMPALA-8924 was discovered while running {{test_failpoints.py}} with results 
spooling enabled. The goal of this JIRA is to add similar failpoint coverage to 
{{test_result_spooling.py}} so that we have sufficient coverage for the various 
failure paths when result spooling is enabled.

The failure paths that should be covered include:
* Failures while executing the exec tree should be handled correctly






[jira] [Commented] (IMPALA-8934) Add failpoint tests to result spooling code

2019-09-10 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16926818#comment-16926818
 ] 

Sahil Takiar commented on IMPALA-8934:
--

Going to take another pass at the code and check if there are more places 
failpoint tests should be added.

> Add failpoint tests to result spooling code
> ---
>
> Key: IMPALA-8934
> URL: https://issues.apache.org/jira/browse/IMPALA-8934
> Project: IMPALA
>  Issue Type: Sub-task
>Affects Versions: Impala 3.2.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> IMPALA-8924 was discovered while running {{test_failpoints.py}} with results 
> spooling enabled. The goal of this JIRA is to add similar failpoint coverage 
> to {{test_result_spooling.py}} so that we have sufficient coverage for the 
> various failure paths when result spooling is enabled.
> The failure paths that should be covered include:
> * Failures while executing the exec tree should be handled correctly






[jira] [Resolved] (IMPALA-7351) Add memory estimates for plan nodes and sinks with missing estimates

2019-09-10 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-7351.

Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Add memory estimates for plan nodes and sinks with missing estimates
> 
>
> Key: IMPALA-7351
> URL: https://issues.apache.org/jira/browse/IMPALA-7351
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: admission-control, resource-management
> Fix For: Impala 3.4.0
>
>
> Many plan nodes and sinks, e.g. KuduScanNode, KuduTableSink, ExchangeNode, 
> etc are missing memory estimates entirely. 
> We should add a basic estimate for all these cases based on experiments and 
> data from real workloads. In some cases 0 may be the right estimate (e.g. for 
> streaming nodes like SelectNode that just pass through data) but we should 
> remove TODOs and document the reasoning in those cases.






[jira] [Resolved] (IMPALA-7312) Non-blocking mode for Fetch() RPC

2019-09-10 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-7312.
--
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Non-blocking mode for Fetch() RPC
> -
>
> Key: IMPALA-7312
> URL: https://issues.apache.org/jira/browse/IMPALA-7312
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Clients
>Reporter: Tim Armstrong
>Assignee: Sahil Takiar
>Priority: Major
>  Labels: resource-management
> Fix For: Impala 3.4.0
>
>
> Currently Fetch() can block for an arbitrary amount of time until a batch of 
> rows is produced. It might be helpful to have a mode where it returns quickly 
> when there is no data available, so that threads and RPC slots are not tied 
> up.
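
A toy model of the blocking vs. non-blocking behaviour, in Python. ResultStream is a hypothetical stand-in built on the standard library queue, not Impala's Fetch() RPC or its client API.

```python
import queue


class ResultStream:
    """Toy stand-in for a query's result queue (hypothetical, not Impala's API)."""

    def __init__(self):
        self._batches = queue.Queue()

    def produce(self, batch):
        self._batches.put(batch)

    def fetch(self, block=True, timeout=None):
        """Return (batch, has_data). With block=False, return at once if empty."""
        try:
            return self._batches.get(block=block, timeout=timeout), True
        except queue.Empty:
            return None, False
```

A caller that gets (None, False) can release its thread and poll again later instead of holding an RPC slot while the query produces its first batch.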






[jira] [Work started] (IMPALA-8704) SQL:2016 datetime patterns - Milestone 2

2019-09-10 Thread Gabor Kaszab (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8704 started by Gabor Kaszab.

> SQL:2016 datetime patterns - Milestone 2
> 
>
> Key: IMPALA-8704
> URL: https://issues.apache.org/jira/browse/IMPALA-8704
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 2.2.4
>Reporter: Gabor Kaszab
>Assignee: Gabor Kaszab
>Priority: Major
>  Labels: ramp-up
>
> Design doc for SQL:2016 datetime patterns:
> https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/
> Milestone 2 content:
> - Nested strings
> - FM/FX modifiers






[jira] [Assigned] (IMPALA-8704) SQL:2016 datetime patterns - Milestone 2

2019-09-10 Thread Gabor Kaszab (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab reassigned IMPALA-8704:


Assignee: Gabor Kaszab

> SQL:2016 datetime patterns - Milestone 2
> 
>
> Key: IMPALA-8704
> URL: https://issues.apache.org/jira/browse/IMPALA-8704
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 2.2.4
>Reporter: Gabor Kaszab
>Assignee: Gabor Kaszab
>Priority: Major
>  Labels: ramp-up
>
> Design doc for SQL:2016 datetime patterns:
> https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/
> Milestone 2 content:
> - Nested strings
> - FM/FX modifiers






[jira] [Commented] (IMPALA-5098) Correct handling of DISTINCT in the select list

2019-09-10 Thread N Campbell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16926532#comment-16926532
 ] 

N Campbell commented on IMPALA-5098:


Any plans for 3.x to be enhanced?

i.e. using 3.1


select distinct 
 sum ( cx ) over ( partition by c1 ) 
from (
 select c1, c2, sum ( c3 ) cx
 from CERT.TOLAP
 group by c1, c2 
) T1 

Error: [Cloudera][ImpalaJDBCDriver](500051) ERROR processing query/statement. 
Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000, 
errorMessage:AnalysisException: cannot combine SELECT DISTINCT with analytic 
functions
), Query: select distinct 
 sum ( cx ) over ( partition by c1 ) 
from (
 select c1, c2, sum ( c3 ) cx
 from CERT.TOLAP
 group by c1, c2 
) T1.
SQLState: HY000
ErrorCode: 500051

vs


select distinct *
from ( select 
 sum ( cx ) over ( partition by c1 ) 
 from ( select c1, c2, sum ( c3 ) cx
 from CERT.TOLAP
 group by c1, c2 
 ) T1 
) T2

> Correct handling of DISTINCT in the select list
> ---
>
> Key: IMPALA-5098
> URL: https://issues.apache.org/jira/browse/IMPALA-5098
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.6.0
>Reporter: N Campbell
>Priority: Major
>  Labels: ansi-sql, sql-language
>
> DB2, ORACLE and various other systems will support the following statement 
> but Impala will not
> {noformat}
> [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error 
> Code: 0, 
> SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000,
> errorMessage:AnalysisException: cannot combine SELECT DISTINCT with analytic 
> functions
> ), Query: SELECT DISTINCT 
> `sno` AS `c1`, 
> `pno` AS `c2`, 
> SUM(`qty`)
> OVER(
> ) AS `c3`
> FROM
> `cert`.`tsupply` 
> ORDER BY 
> `sno` ASC NULLS LAST, 
> `pno` ASC NULLS LAST.
> {noformat}






[jira] [Resolved] (IMPALA-8932) impala shell shouldn't retry with kerberos when connecting over http

2019-09-10 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-8932.
---
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> impala shell shouldn't retry with kerberos when connecting over http
> 
>
> Key: IMPALA-8932
> URL: https://issues.apache.org/jira/browse/IMPALA-8932
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Critical
> Fix For: Impala 3.4.0
>
>
> {noformat}
> Error connecting: EOFError, 
> Kerberos ticket found in the credentials cache, retrying the connection with 
> a secure transport.
> Warning: --connect_timeout_ms is currently ignored with HTTP transport.
> Kerberos not supported with HTTP endpoints.
> Error connecting: NotImplementedError, 
> {noformat}
> The NotImplementedError is confusing.






[jira] [Assigned] (IMPALA-8508) Use Python 3 from toolchain for impala-python

2019-09-10 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-8508:
-

Assignee: (was: Tim Armstrong)

> Use Python 3 from toolchain for impala-python
> -
>
> Key: IMPALA-8508
> URL: https://issues.apache.org/jira/browse/IMPALA-8508
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: Tim Armstrong
>Priority: Major
> Attachments: 
> 0001-WIP-IMPALA-8508-download-Python-2.7-from-toolchain-i.patch
>
>
> We should standardise on a single python version to use for tests and other 
> infrastructure. Python 2.7 is going EOL soon.
> I started adding it to the toolchain - https://gerrit.cloudera.org/#/c/14161/






[jira] [Resolved] (IMPALA-8904) Daemons fails fast when statestore has not started up

2019-09-10 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-8904.
---
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Daemons fails fast when statestore has not started up
> -
>
> Key: IMPALA-8904
> URL: https://issues.apache.org/jira/browse/IMPALA-8904
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 3.1.0, Impala 3.2.0, Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> If you start the statestored and the other services at the same time, there 
> is a race between the statestore starting and the other services trying to 
> register with it. If the other services "win" the race, they abort startup 
> because they can't register with the statestore.
> The log looks like this:
> {noformat}
> I0828 00:19:10.46 1 statestore-subscriber.cc:219] Starting statestore subscriber
> I0828 00:19:10.461310 1 thrift-server.cc:451] ThriftServer 'StatestoreSubscriber' started on port: 23000
> I0828 00:19:10.461320 1 statestore-subscriber.cc:247] Registering with statestore
> I0828 00:19:10.461309   299 TAcceptQueueServer.cpp:314] connection_setup_thread_pool_size is set to 2
> I0828 00:19:10.462744 1 statestore-subscriber.cc:253] statestore registration unsuccessful: RPC Error: Client for statestored:24000 hit an unexpected exception: No more data to read., type: N6apache6thrift9transport19TTransportExceptionE, rpc: N6impala27TRegisterSubscriberResponseE, send: done
> E0828 00:19:10.462818 1 impalad-main.cc:90] Impalad services did not start correctly, exiting.  Error: RPC Error: Client for statestored:24000 hit an unexpected exception: No more data to read., type: N6apache6thrift9transport19TTransportExceptionE, rpc: N6impala27TRegisterSubscriberResponseE, send: done
> Statestore subscriber did not start up.
> {noformat}
> Most management systems will automatically restart failed processes, so 
> typically the impalads will come back up and find the statestore, but the 
> crash loop is unnecessary.
> I propose that the services should retry for a while before giving up (we 
> still want the services to fail when there genuinely isn't a statestore 
> available).



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8932) impala shell shouldn't retry with kerberos when connecting over http

2019-09-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16926349#comment-16926349
 ] 

ASF subversion and git services commented on IMPALA-8932:
-

Commit 954e810b0ec67faca66e68b924b83dd805f455db in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=954e810 ]

IMPALA-8932: shell shouldn't retry kerberos over http

Change-Id: I5dde277a6a0ddbe5a919bcf376bbc19f0b48e95e
Reviewed-on: http://gerrit.cloudera.org:8080/14201
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 


> impala shell shouldn't retry with kerberos when connecting over http
> 
>
> Key: IMPALA-8932
> URL: https://issues.apache.org/jira/browse/IMPALA-8932
> Project: IMPALA
> Issue Type: Bug
> Components: Clients
> Affects Versions: Impala 3.3.0
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Critical
>
> {noformat}
> Error connecting: EOFError, 
> Kerberos ticket found in the credentials cache, retrying the connection with 
> a secure transport.
> Warning: --connect_timeout_ms is currently ignored with HTTP transport.
> Kerberos not supported with HTTP endpoints.
> Error connecting: NotImplementedError, 
> {noformat}
> The NotImplementedError is confusing.






[jira] [Commented] (IMPALA-8904) Daemons fails fast when statestore has not started up

2019-09-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16926348#comment-16926348
 ] 

ASF subversion and git services commented on IMPALA-8904:
-

Commit 19cb8dc1c1c2247e91adc4bf62cab27a7c1e4381 in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=19cb8dc ]

IMPALA-8904: retry statestore RegisterSubscriber() RPC

Previously connection failures triggered a retry, but
failures on the actual RPC did not trigger a retry. This
change moves the retry loop to DoRpcWithRetry(), instead
of relying on the ClientCache to retry the connection.

Note that DoRpcWithRetry() for thrift was dead code since
most backend RPCs were ported to KRPC, but should still work.

Testing:
Added targeted test with debug action to inject error on first
subscribe RPC.

Change-Id: I5d4e6283b5ec83170a1d1d03075b3384a9f108b5
Reviewed-on: http://gerrit.cloudera.org:8080/14198
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
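
The retry behaviour described in the commit message above can be illustrated
with a minimal sketch. This is not Impala's DoRpcWithRetry() (which is C++);
the names do_rpc_with_retry, RpcError and make_flaky_rpc, and the parameters
max_attempts and backoff_s, are hypothetical and chosen only for illustration.
The point it shows is that the retry loop wraps the whole RPC call, so a
failure of the RPC itself is retried in the same way as a connection failure.

```python
import time


class RpcError(Exception):
    """Hypothetical stand-in for any failure raised while performing an RPC."""


def do_rpc_with_retry(rpc, max_attempts=10, backoff_s=0.0, sleep=time.sleep):
    """Call rpc() until it succeeds or max_attempts calls have failed.

    The loop wraps the whole call, so it does not matter whether the
    failure came from connecting or from the RPC itself.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return rpc()
        except RpcError as e:
            last_error = e
            if attempt < max_attempts:
                sleep(backoff_s)  # wait before the next attempt
    # Give up: there genuinely seems to be no server to talk to.
    raise last_error


def make_flaky_rpc(failures):
    """Return an rpc() that fails `failures` times and then succeeds.

    Loosely mimics injecting an error on the first subscribe RPC.
    On success it returns the number of calls made so far.
    """
    state = {"calls": 0}

    def rpc():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise RpcError("No more data to read.")
        return state["calls"]

    return rpc
```

Bounding the number of attempts keeps the behaviour asked for in the issue:
a service that starts slightly before the statestore gets through on a later
attempt, while a service with no statestore at all still fails.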


> Daemons fails fast when statestore has not started up
> -
>
> Key: IMPALA-8904
> URL: https://issues.apache.org/jira/browse/IMPALA-8904
> Project: IMPALA
> Issue Type: Bug
> Components: Distributed Exec
> Affects Versions: Impala 3.1.0, Impala 3.2.0, Impala 3.3.0
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
>
> If you start the statestored and the other services at the same time, there 
> is a race between the statestore starting and the other services trying to 
> register with it. If the other services "win" the race, they abort startup 
> because they can't register with the statestore.
> The log looks like this:
> {noformat}
> I0828 00:19:10.46 1 statestore-subscriber.cc:219] Starting statestore subscriber
> I0828 00:19:10.461310 1 thrift-server.cc:451] ThriftServer 'StatestoreSubscriber' started on port: 23000
> I0828 00:19:10.461320 1 statestore-subscriber.cc:247] Registering with statestore
> I0828 00:19:10.461309   299 TAcceptQueueServer.cpp:314] connection_setup_thread_pool_size is set to 2
> I0828 00:19:10.462744 1 statestore-subscriber.cc:253] statestore registration unsuccessful: RPC Error: Client for statestored:24000 hit an unexpected exception: No more data to read., type: N6apache6thrift9transport19TTransportExceptionE, rpc: N6impala27TRegisterSubscriberResponseE, send: done
> E0828 00:19:10.462818 1 impalad-main.cc:90] Impalad services did not start correctly, exiting.  Error: RPC Error: Client for statestored:24000 hit an unexpected exception: No more data to read., type: N6apache6thrift9transport19TTransportExceptionE, rpc: N6impala27TRegisterSubscriberResponseE, send: done
> Statestore subscriber did not start up.
> {noformat}
> Most management systems will automatically restart failed processes, so 
> typically the impalads will come back up and find the statestore, but the 
> crash loop is unnecessary.
> I propose that the services should retry for a while before giving up (we 
> still want the services to fail when there genuinely isn't a statestore 
> available).




[jira] [Commented] (IMPALA-7312) Non-blocking mode for Fetch() RPC

2019-09-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16926350#comment-16926350
 ] 

ASF subversion and git services commented on IMPALA-7312:
-

Commit 151835116a7972b15a646f8eae6bd8a593bb3564 in impala's branch 
refs/heads/master from Sahil Takiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1518351 ]

IMPALA-7312: Non-blocking mode for Fetch() RPC

Adds the query option FETCH_ROWS_TIMEOUT_MS to control the client
timeout when fetching rows. Set to 10 seconds by default to avoid
unnecessary fetch requests. Timeout applies when result spooling is
enabled or disabled.

When result spooling is disabled, the timeout controls how long the
client thread will wait for a single RowBatch to be produced by the
coordinator fragment. When result spooling is enabled, a client can
fetch multiple RowBatches at a time, so the timeout controls the total
time spent waiting for RowBatches to be produced.

The timeout applies to both waiting for rows to be sent by the fragment
instance thread, and waiting for rows to be materialized (e.g. the time
measured by RowMaterializationTimer).

Testing:
* Added new tests to test_fetch.py
* Ran core tests

Change-Id: I331acaba23a65dab43cca48e9dc0dc957b9c632d
Reviewed-on: http://gerrit.cloudera.org:8080/14157
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
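
The timeout semantics described in the commit message above can be sketched
as follows. This is only an illustration and not Impala's implementation:
fetch_rows, max_batches and timeout_s are hypothetical names, and a
queue.Queue stands in for the coordinator producing RowBatches. With
max_batches=1 the timeout bounds the wait for a single batch (as when result
spooling is disabled); with a larger max_batches the same timeout bounds the
total time spent waiting across all batches (as when spooling is enabled).

```python
import queue
import time


def fetch_rows(batches, max_batches, timeout_s):
    """Fetch up to max_batches row batches from `batches` (a queue.Queue).

    At most timeout_s seconds are spent waiting in total. Returns the
    batches fetched so far, which may be an empty list, instead of
    blocking for an arbitrary amount of time.
    """
    deadline = time.monotonic() + timeout_s
    fetched = []
    while len(fetched) < max_batches:
        # Time left in the overall budget; once it is used up, only
        # batches that are already available are taken.
        remaining = max(deadline - time.monotonic(), 0.0)
        try:
            fetched.append(batches.get(timeout=remaining))
        except queue.Empty:
            break  # nothing arrived in time; return what we have
    return fetched
```

A caller that gets back an empty list can simply issue another fetch later,
so the thread and RPC slot are not tied up while no rows are available.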


> Non-blocking mode for Fetch() RPC
> -
>
> Key: IMPALA-7312
> URL: https://issues.apache.org/jira/browse/IMPALA-7312
> Project: IMPALA
> Issue Type: Sub-task
> Components: Clients
> Reporter: Tim Armstrong
> Assignee: Sahil Takiar
> Priority: Major
> Labels: resource-management
>
> Currently Fetch() can block for an arbitrary amount of time until a batch of 
> rows is produced. It might be helpful to have a mode where it returns quickly 
> when there is no data available, so that threads and RPC slots are not tied 
> up.


