[jira] [Resolved] (IMPALA-7312) Non-blocking mode for Fetch() RPC

2019-09-10 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-7312.
--
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Non-blocking mode for Fetch() RPC
> -
>
> Key: IMPALA-7312
> URL: https://issues.apache.org/jira/browse/IMPALA-7312
> Project: IMPALA
> Issue Type: Sub-task
> Components: Clients
> Reporter: Tim Armstrong
> Assignee: Sahil Takiar
> Priority: Major
> Labels: resource-management
> Fix For: Impala 3.4.0
>
>
> Currently Fetch() can block for an arbitrary amount of time until a batch of 
> rows is produced. It might be helpful to have a mode where it returns quickly 
> when there is no data available, so that threads and RPC slots are not tied 
> up.
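
The requested behaviour can be illustrated with a minimal sketch. It is not Impala's implementation: ResultBuffer, add_batch, close and the timeout_s parameter are hypothetical names invented for this example, and a plain in-process queue stands in for the server-side row batches.

```python
import queue


class ResultBuffer:
    """Hypothetical stand-in for a server-side buffer of row batches."""

    def __init__(self):
        self._batches = queue.Queue()
        self._done = False  # set once the producer has finished

    def add_batch(self, rows):
        # Called by the (hypothetical) query executor when rows are ready.
        self._batches.put(list(rows))

    def close(self):
        # Signals that no further batches will be produced.
        self._done = True

    def fetch(self, timeout_s=0.0):
        """Return (rows, has_more) without blocking longer than timeout_s.

        An empty rows list with has_more=True means "no data yet, retry",
        so the caller's thread is released instead of being tied up.
        """
        try:
            if timeout_s > 0:
                rows = self._batches.get(timeout=timeout_s)
            else:
                rows = self._batches.get_nowait()
        except queue.Empty:
            return [], not self._done
        return rows, not (self._done and self._batches.empty())
```

A client would call fetch() in a loop and back off whenever it receives an empty batch with has_more still true.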



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Resolved] (IMPALA-7351) Add memory estimates for plan nodes and sinks with missing estimates

2019-09-10 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-7351.

Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Add memory estimates for plan nodes and sinks with missing estimates
> 
>
> Key: IMPALA-7351
> URL: https://issues.apache.org/jira/browse/IMPALA-7351
> Project: IMPALA
> Issue Type: Sub-task
> Components: Frontend
> Reporter: Tim Armstrong
> Assignee: Bikramjeet Vig
> Priority: Major
> Labels: admission-control, resource-management
> Fix For: Impala 3.4.0
>
>
> Many plan nodes and sinks, e.g. KuduScanNode, KuduTableSink, ExchangeNode, 
> etc., are missing memory estimates entirely. 
> We should add a basic estimate for all these cases based on experiments and 
> data from real workloads. In some cases 0 may be the right estimate (e.g. for 
> streaming nodes like SelectNode that just pass through data), but we should 
> remove the TODOs and document the reasoning in those cases.





[jira] [Created] (IMPALA-8934) Add failpoint tests to result spooling code

2019-09-10 Thread Sahil Takiar (Jira)
Sahil Takiar created IMPALA-8934:


 Summary: Add failpoint tests to result spooling code
 Key: IMPALA-8934
 URL: https://issues.apache.org/jira/browse/IMPALA-8934
 Project: IMPALA
  Issue Type: Sub-task
Affects Versions: Impala 3.2.0
Reporter: Sahil Takiar
Assignee: Sahil Takiar


IMPALA-8924 was discovered while running {{test_failpoints.py}} with result 
spooling enabled. The goal of this JIRA is to add similar failpoint coverage to 
{{test_result_spooling.py}} so that we have sufficient coverage for the various 
failure paths when result spooling is enabled.

The failure paths that should be covered include:
* Failures while executing the exec tree should be handled correctly





[jira] [Resolved] (IMPALA-8917) Including hostnames in debug UI URLs breaks a lot of use cases

2019-09-10 Thread Thomas Tauber-Marshall (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Tauber-Marshall resolved IMPALA-8917.

Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Including hostnames in debug UI URLs breaks a lot of use cases
> --
>
> Key: IMPALA-8917
> URL: https://issues.apache.org/jira/browse/IMPALA-8917
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Affects Versions: Impala 3.4.0
> Reporter: Tim Armstrong
> Assignee: Thomas Tauber-Marshall
> Priority: Blocker
> Fix For: Impala 3.4.0
>
>
> I've run into multiple cases that got broken by IMPALA-8897. The general 
> problem is the assumption that the hostname the Impala server uses to refer 
> to itself is resolvable by the client accessing the web UI. Cases I've run into:
> * In Docker and Kubernetes, where the internal hostnames aren't visible 
> outside of the internal network
> * On systems without a DNS-resolvable hostname, e.g. my Ubuntu desktop that I 
> access via a static IP
> I'm not sure what a fix would look like, but I think we at least need some 
> way to work around the problem in situations like this.





[jira] [Created] (IMPALA-8935) Add links to other daemons from webui

2019-09-10 Thread Thomas Tauber-Marshall (Jira)
Thomas Tauber-Marshall created IMPALA-8935:
--

 Summary: Add links to other daemons from webui
 Key: IMPALA-8935
 URL: https://issues.apache.org/jira/browse/IMPALA-8935
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Reporter: Thomas Tauber-Marshall
Assignee: Thomas Tauber-Marshall


It would be convenient for all of the debug webuis to have links to the other 
debug webuis within a single cluster.

For impalads, it would be easy to add links to each other impalad on the 
/backends page (from IMPALA-210 it looks like this even used to be the case, 
but everything has changed a ton since then, e.g. we weren't even using 
templates at the time, so it got lost somewhere along the way). It's also fairly 
straightforward to add a link to the statestored and catalogd, e.g. on 
the index page or else on the nav bar.





[jira] [Created] (IMPALA-8936) Make queuing reason for unhealthy executor groups more generic

2019-09-10 Thread Lars Volker (Jira)
Lars Volker created IMPALA-8936:
---

 Summary: Make queuing reason for unhealthy executor groups more generic
 Key: IMPALA-8936
 URL: https://issues.apache.org/jira/browse/IMPALA-8936
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Reporter: Lars Volker
Assignee: Lars Volker


In some situations, users might actually expect that no healthy executor 
group is available, e.g. when they're starting one and it takes a while to come 
online. We should make the queuing reason more generic and drop the "unhealthy" 
concept from it to reduce confusion.





[jira] [Created] (IMPALA-8937) Fine grained table metadata loading on Catalog server

2019-09-10 Thread bharath v (Jira)
bharath v created IMPALA-8937:
-

 Summary: Fine grained table metadata loading on Catalog server
 Key: IMPALA-8937
 URL: https://issues.apache.org/jira/browse/IMPALA-8937
 Project: IMPALA
  Issue Type: Improvement
  Components: Catalog, Frontend
Affects Versions: Impala 2.12.0, Impala 3.3.0
Reporter: bharath v


*Background*:

Currently the table _on the Catalog server_ is either in a loaded or unloaded 
state (IncompleteTable). When the Catalog server starts for the first time, we 
first fetch a list of table names for each database, and every table in this 
list starts as an unloaded table. The table lists are propagated to the 
coordinators so that they know whether a table with a given name exists or not 
and they can start analyzing the queries. No metadata is loaded in the 
incomplete tables (like schema/ownership, comments, etc.).

The table metadata is loaded lazily (and the table moves into a loaded state) 
when it is referenced in any query. When a load request comes in, all the table 
metadata is loaded including file block information. 

*Problem:* 

Coordinators need some additional information when analyzing unloaded tables. 
For example: IMPALA-8228. The ownership information is a part of the HMS table 
schema which is not loaded until the table is marked fully loaded. While this 
is not a problem for regular queries (like select * from ), it is an issue 
with queries like "show tables" which do not trigger a table load. In this 
particular case, due to the lack of ownership information, the output of the 
table listing could be different depending on whether the table is loaded. 
Another example is IMPALA-8606 where the GET_TABLES request does not return the 
table comments because they are not available for unloaded tables.

*Ask:*

We need to consider finer grained loading on the Catalog server in general. 
Instead of having a binary state (loaded vs unloaded), the table could be in a 
partially loaded state. We could also start by aggressively fetching certain 
pieces of information that we think could aid with analysis and lazily load the 
remaining pieces of metadata. Finer grained loading also integrates well with 
the LocalCatalog implementation on the coordinators, where the entire table 
need not be loaded on the Catalog server to serve partial meta information 
(e.g. show partitions ).
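
As a rough illustration of the partially loaded state described above, the sketch below tracks which pieces of metadata a table entry already has and loads only what a request still needs. It is an assumption-laden toy, not Catalog server code: Loaded, EAGER, TableEntry and ensure are hypothetical names, and the choice of which pieces to fetch eagerly is only an example.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Loaded(Flag):
    """Hypothetical pieces of table metadata that can be loaded separately."""
    NONE = 0
    SCHEMA = auto()
    OWNERSHIP = auto()
    COMMENT = auto()
    PARTITIONS = auto()
    FILE_BLOCKS = auto()


# Assumed example of cheap metadata fetched eagerly, so that statements such
# as "show tables" do not depend on whether a full load has happened.
EAGER = Loaded.OWNERSHIP | Loaded.COMMENT


@dataclass
class TableEntry:
    name: str
    loaded: Loaded = Loaded.NONE

    def ensure(self, needed):
        """Mark the still-missing pieces of `needed` as loaded.

        Returns the pieces that had to be loaded by this call; a real
        implementation would fetch them from the metastore here.
        """
        missing = needed & ~self.loaded
        self.loaded |= missing
        return missing
```

With this shape, a request that only needs partition information triggers a partition load without pulling in file block metadata.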


