[jira] [Created] (IMPALA-9142) Failure in PlannerTest.testSpillableBufferSizing: SCAN HDFS columns missing.

2019-11-08 Thread Anurag Mantripragada (Jira)
Anurag Mantripragada created IMPALA-9142:


 Summary: Failure in PlannerTest.testSpillableBufferSizing: SCAN 
HDFS columns missing.
 Key: IMPALA-9142
 URL: https://issues.apache.org/jira/browse/IMPALA-9142
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Reporter: Anurag Mantripragada
Assignee: Tim Armstrong


Not sure if the mt_dop planner changes have anything to do with this. Please 
feel free to reassign this.

It seems like the column stats are missing for the functional_parquet.alltypestiny table.

Happened here: 
[https://master-02.jenkins.cloudera.com/job/impala-asf-master-exhaustive/866]
{code:java}
Actual:
|--03:EXCHANGE [BROADCAST]
|  |  mem-estimate=251.92KB mem-reservation=0B thread-reservation=0
|  |  tuple-ids=1 row-size=80B cardinality=unavailable
|  |  in pipelines: 01(GETNEXT)
|  |
|  F01:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
|  Per-Host Resources: mem-estimate=16.00MB mem-reservation=88.00KB thread-reservation=2
|  01:SCAN HDFS [functional_parquet.alltypestiny, RANDOM]
| HDFS partitions=4/4 files=4 size=11.92KB
| stored statistics:
|   table: rows=unavailable size=unavailable
|   partitions: 0/4 rows=unavailable
|   columns: unavailable


Expected:
|--03:EXCHANGE [BROADCAST]
|  |  mem-estimate=251.92KB mem-reservation=0B thread-reservation=0
|  |  tuple-ids=1 row-size=80B cardinality=unavailable
|  |  in pipelines: 01(GETNEXT)
|  |
|  F01:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
|  Per-Host Resources: mem-estimate=16.00MB mem-reservation=88.00KB thread-reservation=2
|  01:SCAN HDFS [functional_parquet.alltypestiny, RANDOM]
| HDFS partitions=4/4 files=4 size=11.67KB
| stored statistics:
|   table: rows=unavailable size=unavailable
|   partitions: 0/4 rows=unavailable
|   columns missing stats: id, bool_col, tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col, date_string_col, string_col, timestamp_col
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9141) Impala Doc: Document SQL:2016 datetime patterns - Milestone 2

2019-11-08 Thread Alexandra Rodoni (Jira)
Alexandra Rodoni created IMPALA-9141:


 Summary: Impala Doc: Document SQL:2016 datetime patterns - 
Milestone 2
 Key: IMPALA-9141
 URL: https://issues.apache.org/jira/browse/IMPALA-9141
 Project: IMPALA
  Issue Type: Sub-task
  Components: Docs
Reporter: Alexandra Rodoni
Assignee: Alexandra Rodoni








[jira] [Resolved] (IMPALA-9129) Provide a way for negative tests to remove intentionally generated core dumps

2019-11-08 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp resolved IMPALA-9129.
-
Resolution: Fixed

> Provide a way for negative tests to remove intentionally generated core dumps
> -
>
> Key: IMPALA-9129
> URL: https://issues.apache.org/jira/browse/IMPALA-9129
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Major
>
> Occasionally, tests (esp. custom cluster tests) will inject an error or set 
> some invalid config, expecting Impala to generate a core dump.
> We should have a general way for such tests to delete the bogus core dumps; 
> otherwise, they can complicate or confuse later triaging of legitimate 
> test failures.
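A possible shape for such a cleanup helper (illustrative only — the names are hypothetical and this is not the actual test-infrastructure API): after a negative test intentionally crashes a daemon, scan the core-dump directory and delete anything matching the core-file naming pattern.

```java
import java.io.File;

// Hypothetical sketch: remove intentionally generated core dumps after a
// negative test, so later triage of real failures is not confused by them.
class CoreDumpCleanupSketch {
  // Deletes regular files whose names start with "core" in the given
  // directory and returns how many were removed.
  public static int removeCoreDumps(File dir) {
    int removed = 0;
    File[] files = dir.listFiles();
    if (files == null) return 0;
    for (File f : files) {
      if (f.isFile() && f.getName().startsWith("core") && f.delete()) {
        removed++;
      }
    }
    return removed;
  }
}
```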





[jira] [Created] (IMPALA-9140) Get rid of the unnecessary load submitter thread pool in tblLoadingMgr

2019-11-08 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created IMPALA-9140:
---

 Summary: Get rid of the unnecessary load submitter thread pool in 
tblLoadingMgr
 Key: IMPALA-9140
 URL: https://issues.apache.org/jira/browse/IMPALA-9140
 Project: IMPALA
  Issue Type: Bug
Reporter: Vihang Karajgaonkar


This JIRA was created as a follow-up to the discussion on 
https://gerrit.cloudera.org/#/c/14611 related to the various pools used for 
loading tables.

It looks like there are 2 thread pools, both of size 
{{num_metadata_loading_threads}}. One pool is used to submit load requests 
to another pool, {{tblLoadingPool_}}, which does the actual loading of the 
tables. I think we can get rid of the pool which submits the tasks, since 
submitting is not a time-consuming operation and can be done synchronously (all 
it needs to do is add the task to the front or back of the queue, based on 
whether it is a prioritized load or a background load). This will simplify the 
loading code and reduce the number of unnecessary threads created by 
{{TblLoadingMgr}}.
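The proposed synchronous submission could look roughly like this (a hedged sketch with hypothetical names, not the actual {{TblLoadingMgr}} code): prioritized loads go to the front of the deque and background loads to the back, all on the caller's thread, with no submitter pool in between.

```java
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical sketch: synchronous submission into the loading deque,
// replacing the separate submitter thread pool.
class LoadQueueSketch {
  private final LinkedBlockingDeque<String> loadDeque = new LinkedBlockingDeque<>();

  // Prioritized loads jump to the front of the deque; background loads go
  // to the back. Cheap enough to do directly on the caller's thread.
  public void submit(String tblName, boolean prioritized) {
    if (prioritized) {
      loadDeque.addFirst(tblName);
    } else {
      loadDeque.addLast(tblName);
    }
  }

  // The loading pool's workers would pull from the front of the deque.
  public String nextTable() {
    return loadDeque.pollFirst();
  }
}
```

With this shape, only the pool that performs the actual loads remains; submission order is still controlled by front/back insertion.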





[jira] [Created] (IMPALA-9139) Invalidate metadata adds all the tables to background loading pool unnecessarily

2019-11-08 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created IMPALA-9139:
---

 Summary: Invalidate metadata adds all the tables to background 
loading pool unnecessarily
 Key: IMPALA-9139
 URL: https://issues.apache.org/jira/browse/IMPALA-9139
 Project: IMPALA
  Issue Type: Bug
Reporter: Vihang Karajgaonkar


I see the following code in the reset() method of CatalogServiceCatalog
{code:java}
  // Build a new DB cache, populate it, and replace the existing cache in one
  // step.
  Map<String, Db> newDbCache = new ConcurrentHashMap<>();
  List<TTableName> tblsToBackgroundLoad = new ArrayList<>();
  try (MetaStoreClient msClient = getMetaStoreClient()) {
    List<String> allDbs = msClient.getHiveClient().getAllDatabases();
    int numComplete = 0;
    for (String dbName: allDbs) {
      if (isBlacklistedDb(dbName)) {
        LOG.info("skip blacklisted db: " + dbName);
        continue;
      }
      String annotation = String.format("invalidating metadata - %s/%s dbs complete",
          numComplete++, allDbs.size());
      try (ThreadNameAnnotator tna = new ThreadNameAnnotator(annotation)) {
        dbName = dbName.toLowerCase();
        Db oldDb = oldDbCache.get(dbName);
        Pair<Db, List<TTableName>> invalidatedDb = invalidateDb(msClient,
            dbName, oldDb);
        if (invalidatedDb == null) continue;
        newDbCache.put(dbName, invalidatedDb.first);
        tblsToBackgroundLoad.addAll(invalidatedDb.second);
      }
    }
  }
  dbCache_.set(newDbCache);

  // Identify any deleted databases and add them to the delta log.
  Set<String> oldDbNames = oldDbCache.keySet();
  Set<String> newDbNames = newDbCache.keySet();
  oldDbNames.removeAll(newDbNames);
  for (String dbName: oldDbNames) {
    Db removedDb = oldDbCache.get(dbName);
    updateDeleteLog(removedDb);
  }

  // Submit tables for background loading.
  for (TTableName tblName: tblsToBackgroundLoad) {
    tableLoadingMgr_.backgroundLoad(tblName);
  }
{code}

If you notice above, the tables are added to backgroundLoad without 
checking the flag {{loadInBackground_}}. This means that even if the flag is 
unset, after we issue an invalidate metadata command, all the tables in the 
system are loaded in the background. Note that this code only loads the 
tables; it does not add the loaded tables to the catalog, which is good, since 
otherwise the memory footprint of the catalog would grow after every invalidate 
metadata command.

This bug has 2 implications:
1. We are obviously wasting a lot of CPU cycles without getting anything out of 
it.
2. The more subtle side-effect is that this fills up the 
{{tableLoadingDeque_}}, so any other background load task will take longer to 
complete.
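The fix could be as simple as gating the submission loop on the flag. A minimal sketch (hypothetical names, not the actual {{CatalogServiceCatalog}} code):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: only submit tables for background loading when the
// load-in-background flag is set; otherwise invalidate metadata should
// leave tables unloaded until they are referenced.
class BackgroundLoadSketch {
  private final boolean loadInBackground_;
  private final List<String> submitted_ = new ArrayList<>();

  public BackgroundLoadSketch(boolean loadInBackground) {
    loadInBackground_ = loadInBackground;
  }

  // Mirrors the submission loop at the end of reset(), but guarded by the
  // flag so an unset flag skips background loading entirely.
  public void submitAll(List<String> tblsToBackgroundLoad) {
    if (!loadInBackground_) return;
    submitted_.addAll(tblsToBackgroundLoad);
  }

  public int numSubmitted() { return submitted_.size(); }
}
```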





[jira] [Created] (IMPALA-9138) Classify certain errors as retryable

2019-11-08 Thread Sahil Takiar (Jira)
Sahil Takiar created IMPALA-9138:


 Summary: Classify certain errors as retryable
 Key: IMPALA-9138
 URL: https://issues.apache.org/jira/browse/IMPALA-9138
 Project: IMPALA
  Issue Type: Sub-task
  Components: Backend
Reporter: Sahil Takiar
Assignee: Sahil Takiar


Impala should be able to classify certain errors as "retryable". This can be 
done by modifying the {{TStatus}} object to have a "type". For now, the only 
types would be "GENERAL" and "RETRYABLE". This way, when a {{TStatus}} object 
is created, it can be marked as retryable, and if the {{TStatus}} is retryable, 
the Coordinator can trigger a retry of the query.

This approach allows us to incrementally mark more errors as retryable 
as necessary. For now, only RPC failures will be marked as retryable.
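One possible shape for this change (illustrative only — the real {{TStatus}} is a Thrift-generated class, and the names here are hypothetical): an error-type enum carried on the status object, which the Coordinator consults before deciding to retry.

```java
// Hypothetical sketch of a status object carrying an error type that the
// Coordinator can consult before retrying a query.
class StatusSketch {
  public enum ErrorType { GENERAL, RETRYABLE }

  private final ErrorType type_;
  private final String msg_;

  public StatusSketch(ErrorType type, String msg) {
    type_ = type;
    msg_ = msg;
  }

  public boolean isRetryable() { return type_ == ErrorType.RETRYABLE; }
  public String msg() { return msg_; }

  // An RPC failure would be constructed as retryable at the point the
  // error is detected; all other errors default to GENERAL.
  public static StatusSketch rpcError(String msg) {
    return new StatusSketch(ErrorType.RETRYABLE, msg);
  }
}
```

Marking each error at its construction site is what makes the scheme incremental: new error sites default to GENERAL until someone deliberately classifies them as retryable.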





[jira] [Created] (IMPALA-9137) Blacklist node if a DataStreamService RPC to the node fails

2019-11-08 Thread Sahil Takiar (Jira)
Sahil Takiar created IMPALA-9137:


 Summary: Blacklist node if a DataStreamService RPC to the node 
fails
 Key: IMPALA-9137
 URL: https://issues.apache.org/jira/browse/IMPALA-9137
 Project: IMPALA
  Issue Type: Sub-task
  Components: Backend
Reporter: Sahil Takiar
Assignee: Sahil Takiar


If a query fails because an RPC to a specific node failed, the query error 
message will be of the form:

{{ERROR: TransmitData() to 10.65.30.141:27000 failed: Network error: recv got 
EOF from 10.65.30.141:27000 (error 108)}}

or

{{ERROR: TransmitData() to 10.65.29.251:27000 failed: Network error: recv error 
from 0.0.0.0:0: Transport endpoint is not connected (error 107)}}

or

{{ERROR: TransmitData() to 10.65.26.254:27000 failed: Network error: Client 
connection negotiation failed: client connection to 10.65.26.254:27000: 
connect: Connection refused (error 111)}}

or

{{ERROR: EndDataStream() to 127.0.0.1:27002 failed: Network error: recv error 
from 0.0.0.0:0: Transport endpoint is not connected (error 107)}}

RPCs are already retried, so it is likely that something is wrong with the 
target node. Perhaps it crashed, or it is so overloaded that it can't process 
RPC requests. In any case, the Impala Coordinator should blacklist the target 
of the failed RPC so that future queries don't fail with the same error.

If the node crashed, the statestore will eventually remove the failed node from 
the cluster as well. However, the statestore can take a while to detect a 
failed node because it has a long timeout, and queries can still fail within 
that timeout window.

This is necessary for transparent query retries because, if a node does crash, 
it will take too long for the statestore to remove the crashed node from the 
cluster, so any attempt at retrying a query will just fail.
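The blacklisting idea can be sketched as follows (hypothetical names; in reality the blacklist would live in the Coordinator/scheduler and interact with statestore membership updates): record the target of a failed DataStreamService RPC so the scheduler skips it for new queries until the statestore catches up.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: the coordinator records the target of a failed
// DataStreamService RPC so the scheduler avoids it for new queries.
class BlacklistSketch {
  private final Set<String> blacklisted_ = new HashSet<>();

  // Called when e.g. TransmitData() or EndDataStream() to this address
  // fails with a network error (after the RPC layer's own retries).
  public void onRpcFailure(String address) {
    blacklisted_.add(address);
  }

  // The scheduler consults this before placing fragments on a node.
  public boolean isSchedulable(String address) {
    return !blacklisted_.contains(address);
  }

  // The statestore's failure detection eventually removes the node from
  // cluster membership; the blacklist only needs to cover that window.
  public void onStatestoreRemoval(String address) {
    blacklisted_.remove(address);
  }
}
```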





[jira] [Resolved] (IMPALA-8648) Impala ACID read stress tests

2019-11-08 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy resolved IMPALA-8648.
---
Resolution: Fixed

> Impala ACID read stress tests
> -
>
> Key: IMPALA-8648
> URL: https://issues.apache.org/jira/browse/IMPALA-8648
> Project: IMPALA
>  Issue Type: Test
>Reporter: Dinesh Garg
>Assignee: Zoltán Borók-Nagy
>Priority: Critical
>  Labels: impala-acid
>



