[1/9] impala git commit: IMPALA-7622: adds profile metrics when fetching incremental stats

2018-09-30 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master ac33c0c42 -> ddef2cb9b IMPALA-7622: adds profile metrics when fetching incremental stats When computing incremental statistics by fetching the stats directly from catalogd, a potentially expensive RPC is made from the impalad coordinator t

[9/9] impala git commit: IMPALA-376: add built-in functions for parsing JSON

2018-09-30 Thread tarmstrong
IMPALA-376: add built-in functions for parsing JSON This patch implements the same function as Hive UDF get_json_object. We reuse RapidJson to parse the json string. In order to track the memory used in RapidJson, we wrap FunctionContext into an allocator. get_json_object accepts two parameters:

[2/9] impala git commit: IMPALA-7503: SHOW GRANT USER not showing all privileges.

2018-09-30 Thread tarmstrong
http://git-wip-us.apache.org/repos/asf/impala/blob/34e666f5/tests/authorization/test_show_grant_user.py -- diff --git a/tests/authorization/test_show_grant_user.py b/tests/authorization/test_show_grant_user.py new file mode 100644

[3/9] impala git commit: IMPALA-7503: SHOW GRANT USER not showing all privileges.

2018-09-30 Thread tarmstrong
IMPALA-7503: SHOW GRANT USER not showing all privileges. This patch fixes the SHOW GRANT USER statement to show all privileges granted to a user, either directly via object ownership, or granted through a role via a group the user belongs to. The output for SHOW GRANT USER will have two additional

[4/9] impala git commit: IMPALA-7492: Add support for DATE text parser/formatter

2018-09-30 Thread tarmstrong
http://git-wip-us.apache.org/repos/asf/impala/blob/cb493716/be/src/runtime/timestamp-parse-util.cc -- diff --git a/be/src/runtime/timestamp-parse-util.cc b/be/src/runtime/timestamp-parse-util.cc index 3063728..27568a3 100644 --- a

[7/9] impala git commit: Minor cleanup of hash table probe counts

2018-09-30 Thread tarmstrong
Minor cleanup of hash table probe counts * Increment num_probes_ in Probe() instead of in every caller * Removed dead num_failed_probes_ variable. Initialise some members to constants inline while we're here. Change-Id: I179d42300b069d0a34da30bb593d8f97b5846dc8 Reviewed-on: http://gerrit.clouder

[6/9] impala git commit: IMPALA-6271: Impala daemon should log a message when it's being shut down

2018-09-30 Thread tarmstrong
IMPALA-6271: Impala daemon should log a message when it's being shut down Currently Impalad does not log any message when SIGTERM is sent to impalad to terminate or to do a graceful shut down. This change logs a message when SIGTERM is received by impalad/catalogd/statestored. This logging will as

[8/9] impala git commit: IMPALA-7599: make the number of local cache retries configurable

2018-09-30 Thread tarmstrong
IMPALA-7599: make the number of local cache retries configurable Under heavy read/write load, the number of retries needed for queries in order to skip over inconsistent metadata exceptions needs to be set higher. This change makes the number of retries configurable. It can be set with the newly a

[5/9] impala git commit: IMPALA-7492: Add support for DATE text parser/formatter

2018-09-30 Thread tarmstrong
IMPALA-7492: Add support for DATE text parser/formatter This change is the first step in implementing support for DATE type (IMPALA-6169). The DATE parser/formatter is implemented by the new DateParser class. - The parser supports parsing both default and custom formatted DATE values. CCTZ is use

[3/4] impala git commit: IMPALA-7632: fix erasure coding build for custom cluster tests

2018-09-27 Thread tarmstrong
IMPALA-7632: fix erasure coding build for custom cluster tests Fix tests to always pass query options via the query_options parameter. Modified the infrastructure to fail on non-erasure-coding builds if tests pass in default query options in the wrong way. Skip a restart test that makes assumpt

[2/4] impala git commit: [DOCS] A typo and an empty Example section fixed

2018-09-27 Thread tarmstrong
[DOCS] A typo and an empty Example section fixed Change-Id: I28a769c21757f2684cff8325b786d1327e71d3a3 Reviewed-on: http://gerrit.cloudera.org:8080/11538 Reviewed-by: Alex Rodoni Tested-by: Impala Public Jenkins Project: http://git-wip-us.apache.org/repos/asf/impala/repo Commit: http://git-wip-

[1/4] impala git commit: IMPALA-110 (part 3): Add multiple DISTINCT support to query generator

2018-09-27 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master ba27b0381 -> ac33c0c42 IMPALA-110 (part 3): Add multiple DISTINCT support to query generator Previously, Impala was only able to support DISTINCT in aggregate functions over a single expr per SELECT list. IMPALA-110 removes this restrictio

[4/4] impala git commit: IMPALA-7531: Daemon level catalog cache metrics

2018-09-27 Thread tarmstrong
IMPALA-7531: Daemon level catalog cache metrics This patch adds the aggregated CatalogdMetaProvider cache stats to the catalog metrics on the coordinators. They can be accessed under :/metrics#catalog. These metrics are refreshed at the end of planning, for each query run. Testing: --- Visu

[2/7] impala git commit: IMPALA-7596. Adding JvmPauseMonitor (and other GC) metrics to Impala metrics.

2018-09-27 Thread tarmstrong
IMPALA-7596. Adding JvmPauseMonitor (and other GC) metrics to Impala metrics. Following up to IMPALA-6857, it's useful for monitoring tools to see if the pause monitor is getting triggered, and to see other GC metrics. The Java side here, and the Thrift side, were easy enough. However, the Impal

[6/7] impala git commit: IMPALA-7631: Add sentry.db.explicit.grants.permitted in sentry-site*.xml

2018-09-27 Thread tarmstrong
IMPALA-7631: Add sentry.db.explicit.grants.permitted in sentry-site*.xml SENTRY-2413 requires a new configuration: sentry.db.explicit.grants.permitted to be added into sentry-site*.xml to specify which privileges are permitted to be granted explicitly. Testing: - Ran all FE tests - Ran authorizat

[5/7] impala git commit: IMPALA-7606: Fix IllegalStateException in CatalogTableInvalidator

2018-09-27 Thread tarmstrong
IMPALA-7606: Fix IllegalStateException in CatalogTableInvalidator CatalogdTableInvalidator detects if a table is in a normal state using Table.isLoaded() function. This is wrong because if there is an error during the loading of a table, isLoaded() returns true. This patch checks if the table is a

[1/7] impala git commit: Added dumping of minidumps to finalize.sh

2018-09-27 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master 09150f04c -> ba27b0381 Added dumping of minidumps to finalize.sh Adds a step to bin/jenkins/finalize.sh that checks the log dir for any minidumps, and if it finds any, dumps the symbols and symbolizes the minidump. The extracted stacks are

[3/7] impala git commit: IMPALA-7616: Remove the use of privilege name in TPrivilege

2018-09-27 Thread tarmstrong
IMPALA-7616: Remove the use of privilege name in TPrivilege Prior to this patch, privilege name was a field in TPrivilege Thrift message. The privilege name was constructed from any other fields in the TPrivilege. This is very error-prone since setting privilege name prior to setting any other fie

[4/7] impala git commit: IMPALA-7353: Improve memory estimates for Hbase Scan Nodes

2018-09-27 Thread tarmstrong
IMPALA-7353: Improve memory estimates for Hbase Scan Nodes Currently for hbase scan nodes we use a constant estimate of 1GB which is generally a gross over-estimation. This patch improves upon those estimates by using heuristics based on how hbase rows are stored and fetched and how the scanners i

[7/7] impala git commit: IMPALA-6202. mod and % should be equivalent.

2018-09-27 Thread tarmstrong
IMPALA-6202. mod and % should be equivalent. Currently in DECIMAL V2 mode, typeof(9.9 % 3) is DECIMAL(2,1) and typeof(mod(9.9, 3)) is DECIMAL(4,1), while both are expected to be DECIMAL(2,1). This jira fixes V2 mode by replacing "mod" with "%" at parser stage thus they share the same code path

impala git commit: IMPALA-7628: skip test_tls_ecdh on Python 2.6

2018-09-26 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master ce145ffee -> 09150f04c IMPALA-7628: skip test_tls_ecdh on Python 2.6 This is a temporary workaround. On the CentOS 6 build that failed test_tls_v12, test_wildcard_san_ssl and test_wildcard_ssl were all skipped so I figured this will unbloc

impala git commit: IMPALA-7600: bump mem_limit for test_kudu_scan_mem_usage

2018-09-26 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master df53ec238 -> ce145ffee IMPALA-7600: bump mem_limit for test_kudu_scan_mem_usage The estimate for memory consumption for this scan is 9 columns * 384kb per column = 3.375mb. So if we set the mem_limit to 6.5mb, we should still not get more
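The arithmetic quoted in the IMPALA-7600 message can be checked with a short sketch. It assumes 1 MB = 1024 KB, which is the conversion that reproduces the 3.375 figure; the variable names are illustrative and do not come from the Impala code base.

```python
# Worked check of the scan memory estimate quoted in the commit message.
# Assumption: 1 MB = 1024 KB (this is what yields the quoted 3.375 figure).
KB_PER_MB = 1024

columns = 9            # number of columns, per the commit message
kb_per_column = 384    # per-column estimate in KB, per the commit message
mem_limit_mb = 6.5     # mem_limit value named in the commit message

# Total estimate in MB and the headroom left under the memory limit.
estimate_mb = columns * kb_per_column / KB_PER_MB
headroom_mb = mem_limit_mb - estimate_mb

print(f"estimate = {estimate_mb} MB, headroom = {headroom_mb} MB")
```

Under these assumptions the estimate is 3.375 MB, so a 6.5 MB limit leaves a little over 3 MB of headroom above the estimate.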

[02/11] impala git commit: IMPALA-7537: REVOKE GRANT OPTION regression

2018-09-25 Thread tarmstrong
IMPALA-7537: REVOKE GRANT OPTION regression This patch fixes several issues around granting and revoking of privileges. This includes: - REVOKE ALL ON SERVER where the privilege has the grant option was removing from the cache but not Sentry. - With the addition of the grantoption to the name i

[11/11] impala git commit: IMPALA-110: Support for multiple DISTINCT

2018-09-25 Thread tarmstrong
IMPALA-110: Support for multiple DISTINCT This patch adds support for having multiple aggregate functions in a single SELECT block that use DISTINCT over different sets of columns. Planner design: - The existing tree-based plan shape with a two-phased aggregation is maintained. - Existing plans

[05/11] impala git commit: IMPALA-1760: Implement shutdown command

2018-09-25 Thread tarmstrong
IMPALA-1760: Implement shutdown command This is the same patch except with fixes for the test failures on EC and S3 noted in the JIRA. This allows graceful shutdown of executors and partially graceful shutdown of coordinators (new operations fail, old operations can continue). Details: * In orde

[08/11] impala git commit: IMPALA-110: Support for multiple DISTINCT

2018-09-25 Thread tarmstrong
http://git-wip-us.apache.org/repos/asf/impala/blob/df53ec23/testdata/workloads/functional-planner/queries/PlannerTest/multiple-distinct-materialization.test -- diff --git a/testdata/workloads/functional-planner/queries/PlannerTest

[07/11] impala git commit: IMPALA-110: Support for multiple DISTINCT

2018-09-25 Thread tarmstrong
http://git-wip-us.apache.org/repos/asf/impala/blob/df53ec23/testdata/workloads/functional-planner/queries/PlannerTest/multiple-distinct.test -- diff --git a/testdata/workloads/functional-planner/queries/PlannerTest/multiple-distin

[09/11] impala git commit: IMPALA-110: Support for multiple DISTINCT

2018-09-25 Thread tarmstrong
http://git-wip-us.apache.org/repos/asf/impala/blob/df53ec23/fe/src/main/java/org/apache/impala/planner/AggregationNode.java -- diff --git a/fe/src/main/java/org/apache/impala/planner/AggregationNode.java b/fe/src/main/java/org/apa

[03/11] impala git commit: IMPALA-7456: Deprecate file-based authorization

2018-09-25 Thread tarmstrong
IMPALA-7456: Deprecate file-based authorization This patch simply adds a warning message to the log when the authorization_policy_file run-time flag is used. Sentry has deprecated the use of policy files and they do not support user level privileges which are required for object ownership. Here i

[04/11] impala git commit: IMPALA-1760: Implement shutdown command

2018-09-25 Thread tarmstrong
http://git-wip-us.apache.org/repos/asf/impala/blob/f46de211/tests/custom_cluster/test_restart_services.py -- diff --git a/tests/custom_cluster/test_restart_services.py b/tests/custom_cluster/test_restart_services.py index f2bb7fb.

[10/11] impala git commit: IMPALA-110: Support for multiple DISTINCT

2018-09-25 Thread tarmstrong
http://git-wip-us.apache.org/repos/asf/impala/blob/df53ec23/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java -- diff --git a/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java b/fe/src/main/java/org/apach

[06/11] impala git commit: IMPALA-7624: Workaround docker/kernel bug causing test-with-docker to sometimes hang.

2018-09-25 Thread tarmstrong
IMPALA-7624: Workaround docker/kernel bug causing test-with-docker to sometimes hang. I've observed that builds of test-with-docker that have "suite parallelism" sometimes hang when the Docker containers are being created. (The implementation had multiple threads calling "docker create" simultane

[01/11] impala git commit: IMPALA-7546: [DOCS] A new TIMEZONE query option

2018-09-25 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master e38715e25 -> df53ec238 IMPALA-7546: [DOCS] A new TIMEZONE query option Documented the new TIMEZONE query option to set the time zone to be used in timestamp conversions. Change-Id: I734b8b37ae2360422fce269ed87507a04e8c05ac Reviewed-on:

impala git commit: IMPALA-7306: regression test for non-removed transient updates

2018-09-24 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master f8b472ee6 -> e38715e25 IMPALA-7306: regression test for non-removed transient updates Adds a test for IMPALA-7305 that reproduces the bug by delaying heartbeats and updates. Increased some timeouts in the test because they were hit once a

[3/4] impala git commit: IMPALA-7352: Account for clustering in HdfsTableSink

2018-09-24 Thread tarmstrong
IMPALA-7352: Account for clustering in HdfsTableSink Previously, HdfsTableSink::computeResourceProfile() didn't account for clustering while estimating the memory requirement of an insert fragment. This change ensures that the resource estimates produced account for the fact that clustered inserts

[2/4] impala git commit: Make IMPALA_KUDU_* variables override-able

2018-09-24 Thread tarmstrong
Make IMPALA_KUDU_* variables override-able Allows the IMPALA_KUDU_VERSION and IMPALA_KUDU_URL environment variables to be overridden by impala-config-branch.sh. Also adds a feature to bootstrap-toolchain.py that optionally substitutes the CDH platform label into override values for IMPALA_(CDH_COMPO

[1/4] impala git commit: Ignore auto-generated sentry-site*.xml

2018-09-24 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master 15e40a3c9 -> f8b472ee6 Ignore auto-generated sentry-site*.xml IMPALA-7074 introduced a few new generated sentry-site*.xml files and those generated files should be in .gitignore. Change-Id: I788efdac1112cdfd029a79412ffa91fd391af53a Reviewed

[4/4] impala git commit: IMPALA-7595: Revert "IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions"

2018-09-24 Thread tarmstrong
IMPALA-7595: Revert "IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions" This reverts commit 2ee8caeb3053dfa2c434c680ffb2ac756627ee38. Change-Id: Ie51c8ac0d5c9a7065b52d4cd7e6bb70eaded8526 Reviewed-on: http://gerrit.cloudera.org:8080/11504 Reviewed-by: Impala Public Jenkins T

[5/6] impala git commit: IMPALA-7519: Support elliptic curve ssl ciphers

2018-09-24 Thread tarmstrong
IMPALA-7519: Support elliptic curve ssl ciphers Thrift's SSLSocketFactory class does not support setting ciphers that use ecdh. This patch modifies our existing subclass of SSLSocketFactory to override the ciphers() method and enable ECDH. The code for this was taken from be/src/kudu/security/tls

[6/6] impala git commit: IMPALA-589: Add sql function returning the impalad coordinator hostname.

2018-09-24 Thread tarmstrong
IMPALA-589: Add sql function returning the impalad coordinator hostname. In every execution of an Impala query, one of the impalad daemons acts as the coordinator node. In some cases, such as when using a proxy, a user cannot predict which host will act as the coordinator. To aid in diagnosis, we

[3/6] impala git commit: IMPALA-7589: default query options for custom cluster

2018-09-24 Thread tarmstrong
IMPALA-7589: default query options for custom cluster The bug that caused the erasure coding test failure was that the default query options specified by the test overrode the allow_erasure_coded_files option that was added by the custom cluster test infrastructure when running erasure coded test

[2/6] impala git commit: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-24 Thread tarmstrong
IMPALA-7527: add fetch-from-catalogd cache info to profile This patch adds a Java wrapper for a RuntimeProfile object. The wrapper supports some basic operations like non-hierarchical counters and informational strings. During planning, a profile is created, and passed back to the backend as part

[4/6] impala git commit: Mark certain vendored JS/CSS files as "binary" to avoid them showing up in "git grep".

2018-09-24 Thread tarmstrong
Mark certain vendored JS/CSS files as "binary" to avoid them showing up in "git grep". This instructs git to pretend that certain "minimized" and debugging Bootstrap-related files are treated as binary by git tools, especially "git grep". As a result, doing something like "git grep pause" won't y

[1/6] impala git commit: IMPALA-7498: Fix log spew from LocalCatalog startup

2018-09-24 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master 083352b1e -> 15e40a3c9 IMPALA-7498: Fix log spew from LocalCatalog startup Frontend calls LocalCatalog#waitForCatalog() in a tight loop during startup and ends up spewing tons of log messages if the MetaProvider takes some time to initiali

[2/3] impala git commit: IMPALA-7579: fix test_query_profile_contains_all_events on S3

2018-09-19 Thread tarmstrong
IMPALA-7579: fix test_query_profile_contains_all_events on S3 This bug was introduced by IMPALA-6568 which added the test_query_profile_contains_all_events test. This test creates a file in the filesystem so that it can be used by 'load data inpath'. The test was using the hdfs_client object to do

[3/3] impala git commit: IMPALA-7420: different error code for internal cancellation

2018-09-19 Thread tarmstrong
IMPALA-7420: different error code for internal cancellation I started by converting scan and spill-to-disk because the cancellation there is always meant to be internal to the scan and spill-to-disk subsystems. I updated all places that checked for TErrorCode::CANCELLED to treat CANCELLED_INTERNA

[1/3] impala git commit: IMPALA-7488: Fix hang in test_cancellation

2018-09-19 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master 109028e89 -> 4845f98be IMPALA-7488: Fix hang in test_cancellation test_cancellation runs an impala-shell process with a query specified then sends a SIGINT and confirms that the shell cancels the query and exits. The hang was happening bec

impala git commit: IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions

2018-09-17 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master 7793124b6 -> 2ee8caeb3 IMPALA-7521: Speed up sub-second unix time->TimestampValue conversions Impala used to convert from sub-second unix time to TimestampValue (which is split to date_ and time_ similarly to boost::posix_time::ptime) by f

[8/8] impala git commit: IMPALA-7487: fix stress test handling of AC memory rejection

2018-09-14 Thread tarmstrong
IMPALA-7487: fix stress test handling of AC memory rejection The bug was introduced in IMPALA-7356 part 1, which started classifying some memory errors as AC rejection errors. The binary search has special handling of mem_limit_exceeded, and ended up generating incorrect runtime info. Fix the err

[2/8] impala git commit: IMPALA-7448: Invalidate recently unused tables from catalogd

2018-09-14 Thread tarmstrong
IMPALA-7448: Invalidate recently unused tables from catalogd This patch implements an automatic invalidation mechanism in catalogd. There are two invalidation strategies: 1. Periodically the HDFS tables that are not used in a configured period "invalidate_tables_timeout_s" are invalidated from c

[5/8] impala git commit: Revert "IMPALA-1760: Implement shutdown command"

2018-09-14 Thread tarmstrong
Revert "IMPALA-1760: Implement shutdown command" This reverts commit fda44aed9d4df818e95dc1b7d02ac080cdff1102. A couple of the tests broke on S3 and erasure coding. Reverting to unblock testing until we can come up with a proper fix. Change-Id: Icef47b3aa67bc056c40592d47e93c4ebc57be98c Reviewed

[4/8] impala git commit: Revert "IMPALA-1760: Implement shutdown command"

2018-09-14 Thread tarmstrong
http://git-wip-us.apache.org/repos/asf/impala/blob/16a04ce8/tests/custom_cluster/test_restart_services.py -- diff --git a/tests/custom_cluster/test_restart_services.py b/tests/custom_cluster/test_restart_services.py index 9869a2f.

[3/8] impala git commit: IMPALA-7569: [DOCS] Removed "safety valves" from docs

2018-09-14 Thread tarmstrong
IMPALA-7569: [DOCS] Removed "safety valves" from docs Removed the mentions of "safety valves" to avoid confusion with the Cloudera safety valves feature. Change-Id: Iba99d3710bd507052962d0015a36a4e3ba41cd74 Reviewed-on: http://gerrit.cloudera.org:8080/11438 Reviewed-by: Alex Rodoni Tested-by: A

[6/8] impala git commit: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-14 Thread tarmstrong
http://git-wip-us.apache.org/repos/asf/impala/blob/e5b424ba/fe/src/test/resources/sentry-site_oo.xml.template -- diff --git a/fe/src/test/resources/sentry-site_oo.xml.template b/fe/src/test/resources/sentry-site_oo.xml.template ne

[7/8] impala git commit: IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER

2018-09-14 Thread tarmstrong
IMPALA-7074: Update OWNER privilege on CREATE, DROP, and SET OWNER This patch adds calls to automatically create or remove owner privileges in the catalog based on the statement. This is similar to the existing pattern where after privileges are granted in Sentry, they are created in the catalog

[1/8] impala git commit: IMPALA-7559: Disable stat filtering for UTC-normalized timestamp columns

2018-09-14 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master 5129cf94c -> 7793124b6 IMPALA-7559: Disable stat filtering for UTC-normalized timestamp columns If convert_legacy_hive_parquet_utc_timestamps=true and the Parquet file is by parquet-mr (also used by Hive), then timestamps are converted fro

[1/4] impala git commit: [DOCS] Removed the references to the enterprise names and links

2018-09-11 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master d91bc4402 -> fda44aed9 [DOCS] Removed the references to the enterprise names and links Change-Id: I5552a344bfb34b0c5bec8fd8d61388ec3ddcd49d Reviewed-on: http://gerrit.cloudera.org:8080/11424 Reviewed-by: Alex Rodoni Tested-by: Impala Publ

[3/4] impala git commit: IMPALA-1760: Implement shutdown command

2018-09-11 Thread tarmstrong
http://git-wip-us.apache.org/repos/asf/impala/blob/fda44aed/tests/custom_cluster/test_restart_services.py -- diff --git a/tests/custom_cluster/test_restart_services.py b/tests/custom_cluster/test_restart_services.py index f2bb7fb.

[2/4] impala git commit: [DOCS] 2 typos fixed in impala_parquet_dictionary_filtering.xml

2018-09-11 Thread tarmstrong
[DOCS] 2 typos fixed in impala_parquet_dictionary_filtering.xml Change-Id: Ifbd56920d886c4161fedcd4ec4326d1cdb478f7a Reviewed-on: http://gerrit.cloudera.org:8080/11426 Reviewed-by: Alex Rodoni Tested-by: Impala Public Jenkins Project: http://git-wip-us.apache.org/repos/asf/impala/repo Commit:

[4/4] impala git commit: IMPALA-1760: Implement shutdown command

2018-09-11 Thread tarmstrong
IMPALA-1760: Implement shutdown command This allows graceful shutdown of executors and partially graceful shutdown of coordinators (new operations fail, old operations can continue). Details: * In order to allow future admin commands, this is implemented with function-like syntax and does not a

[3/3] impala git commit: IMPALA-7516: Fix query location accounting

2018-09-10 Thread tarmstrong
IMPALA-7516: Fix query location accounting As a side-effect of IMPALA-5216, queries that were scheduled to be executed but eventually got rejected or canceled before starting execution (initializing the coordinator object) got added to the 'query_locations' map but never got removed. As a result,

[1/3] impala git commit: IMPALA-6442: Misleading file offset reporting in error messages.

2018-09-10 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master 9cfa228c2 -> ab6bd74ff IMPALA-6442: Misleading file offset reporting in error messages. The error message described in IMPALA-6442 incorrectly reported the file offset where the Parquet footer starts, as if the offset is counted from the

[2/3] impala git commit: Avoid long line flake8 warnings for gen_ir_descriptions.py

2018-09-10 Thread tarmstrong
Avoid long line flake8 warnings for gen_ir_descriptions.py Change-Id: I025981176fbd4500ddfeed009514faaae75b8bf0 Reviewed-on: http://gerrit.cloudera.org:8080/11412 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong Project: http://git-wip-us.apache.org/repos/asf/impala/repo Commit: http://git-

impala git commit: IMPALA-7483: abort stuck impalad/catalogd on JVM deadlock

2018-08-25 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master a8f8c8d6f -> c285ab979 IMPALA-7483: abort stuck impalad/catalogd on JVM deadlock The polling interval can be adjusted with --jvm_deadlock_detector_interval_s. Setting the interval to <= 0 disables the hang check. I don't think this should

impala git commit: IMPALA-7356 (part 2 of ?): restrict number of coordinators

2018-08-24 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master 063d2c9d5 -> b206aeb71 IMPALA-7356 (part 2 of ?): restrict number of coordinators The immediate motivation is to allow testing admission control with a single coordinator where distributed overadmission is not possible. Testing: Ran local

[2/2] impala git commit: IMPALA-7418: Return first non-ok status from HdfsScanNode::GetNext()

2018-08-23 Thread tarmstrong
IMPALA-7418: Return first non-ok status from HdfsScanNode::GetNext() Previously, HdfsScanNode::GetNext passed the status returned by IssueInitialScanRanges() without inspecting the HdfsScanNodeBase::status_. This resulted in the error status being lost in case a scanner thread hit an error and can

[1/2] impala git commit: [DOCS] Rewrote the opening paragraph for a better readability

2018-08-23 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master 6ce7ba295 -> 1d84d6855 [DOCS] Rewrote the opening paragraph for a better readability Change-Id: I86116cb9163b91133d17b1bb3f40af9e636f02cf Reviewed-on: http://gerrit.cloudera.org:8080/11309 Reviewed-by: Alex Rodoni Tested-by: Impala Public

[2/6] impala git commit: IMPALA-7479: Harmonize parquet versions.

2018-08-23 Thread tarmstrong
IMPALA-7479: Harmonize parquet versions. We have a copy of parquet-avro in testdata/ that wasn't using the same version of parquet as everywhere else; fixing that. I ran core tests. Change-Id: Ia47b0871f25171510d7cb39593f3e94aadb9adeb Reviewed-on: http://gerrit.cloudera.org:8080/11299 Reviewed-by

[5/6] impala git commit: IMPALA-5937: [DOCS] Documented ENABLE_EXPR_REWRITES query option

2018-08-23 Thread tarmstrong
IMPALA-5937: [DOCS] Documented ENABLE_EXPR_REWRITES query option Change-Id: I82a27172a6a6570f9c3cebe1a516a29c755e6d58 Reviewed-on: http://gerrit.cloudera.org:8080/11206 Tested-by: Impala Public Jenkins Reviewed-by: Thomas Marshall Project: http://git-wip-us.apache.org/repos/asf/impala/repo Com

[3/6] impala git commit: IMPALA-7399: Emit a junit xml report when trapping errors

2018-08-23 Thread tarmstrong
IMPALA-7399: Emit a junit xml report when trapping errors This patch will cause a junitxml file to be emitted in the case of errors in build scripts. Instead of simply echoing a message to the console, we set up a trap function that also writes out to a junit xml report that can be consumed by jen

[6/6] impala git commit: docs: typo fix in PARQUET_ARRAY_RESOLUTION

2018-08-23 Thread tarmstrong
docs: typo fix in PARQUET_ARRAY_RESOLUTION Change-Id: I84fcc3f13215879ea4c5bc9737f5188baeaa5749 Reviewed-on: http://gerrit.cloudera.org:8080/11284 Tested-by: Impala Public Jenkins Reviewed-by: Alex Rodoni Project: http://git-wip-us.apache.org/repos/asf/impala/repo Commit: http://git-wip-us.apa

[1/6] impala git commit: IMPALA-6373: Allow primitive type widening on parquet tables

2018-08-23 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master 971cf179f -> 6ce7ba295 IMPALA-6373: Allow primitive type widening on parquet tables This patch implements support for primitive type widening on parquet tables. It only supports conversion to those types without any loss of precision. - ti

[4/6] impala git commit: IMPALA-7433: reduce logging on executors

2018-08-23 Thread tarmstrong
31250 impala-internal-service.cc:49] ExecQueryFInstances(): query_id=fd4ae28bc993236e:27343be1 coord=tarmstrong-box:22000 #instances=2 I0813 12:10:50.250722 31256 query-state.cc:477] Executing instance. instance_id=fd4ae28bc993236e:27343be10006 fragment_idx=1

impala git commit: IMPALA-7402: fix DCHECK when releasing reservation in scan

2018-08-21 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master 3ff4cde77 -> bdd904922 IMPALA-7402: fix DCHECK when releasing reservation in scan The bug is that ScannerContext::Stream::GetNextBuffer(), when reading past the end of a scan range and ScanRange::GetNext() returned cancelled, did not wait

impala git commit: IMPALA-6844: Fix possible NULL dereference in to_date() builtin

2018-08-21 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master bddd7def9 -> 3ff4cde77 IMPALA-6844: Fix possible NULL dereference in to_date() builtin If result.ptr allocation fails for some reason inside the StringVal constructor, we still overwrite result.len and continue. This change checks that th

[1/3] impala git commit: IMPALA-7470: SentryServicePinger logs error messages on success

2018-08-21 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master 3fa05604a -> bddd7def9 IMPALA-7470: SentryServicePinger logs error messages on success SentryServicePinger checks if Sentry is running by calling a Sentry API to get a list of roles. If Sentry is not yet running, an exception will be throw

[3/3] impala git commit: IMPALA-7465: fix test_kudu_scan_mem_usage

2018-08-21 Thread tarmstrong
IMPALA-7465: fix test_kudu_scan_mem_usage The issue was that the row batch queue could grow a lot if the consumer was slow. Also add an additional test to exercise the OOM code path in Kudu for completeness. Testing: Added sleep to kudu-scan-node.cc that reproduced the problem. Looped modified t

[2/3] impala git commit: IMPALA-7449: Fix network throughput calculation of DataStreamSender

2018-08-21 Thread tarmstrong
IMPALA-7449: Fix network throughput calculation of DataStreamSender Currently, the network throughput presented in the query profile for DataStreamSender is computed by dividing the total bytes sent by the total network time which is the sum of observed network time of all individual RPCs. This is

impala git commit: IMPALA-7096: restore scanner thread memory heuristics

2018-08-16 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master 59435fe0a -> 7ccf73690 IMPALA-7096: restore scanner thread memory heuristics This restores some of the heuristics removed in IMPALA-4835 that can help scans avoid hitting OOM conditions. The heuristics are implemented at the query level rat

[3/6] impala git commit: IMPALA-7342: Add initial support for user-level permissions

2018-08-15 Thread tarmstrong
http://git-wip-us.apache.org/repos/asf/impala/blob/a23e6f29/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java -- diff --git a/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java b/fe/src/main/java/org

[2/6] impala git commit: IMPALA-6481: [DOCS] Documented WIDTH_BUCKET function

2018-08-15 Thread tarmstrong
IMPALA-6481: [DOCS] Documented WIDTH_BUCKET function Change-Id: Ife9577a65fe342fde160c7cb5fa666e407d5b093 Reviewed-on: http://gerrit.cloudera.org:8080/11170 Tested-by: Impala Public Jenkins Reviewed-by: Zoltan Borok-Nagy Project: http://git-wip-us.apache.org/repos/asf/impala/repo Commit: http:

[6/6] impala git commit: IMPALA-7442: reduce mem requirement of semi-joins-exhaustive

2018-08-15 Thread tarmstrong
IMPALA-7442: reduce mem requirement of semi-joins-exhaustive The test started running into IMPALA-7446, maybe because of a timing change. This appears to have always been possible. The fix is to reduce the memory requirement of the test. IMPALA-2256 is no longer really possible because the Buffer

[5/6] impala git commit: IMPALA-7392: [DOCS] SCAN_BYTES_LIMIT query option documented

2018-08-15 Thread tarmstrong
IMPALA-7392: [DOCS] SCAN_BYTES_LIMIT query option documented Change-Id: I6430e06cabe21b8080239f3225d3bfdd5cc502cb Reviewed-on: http://gerrit.cloudera.org:8080/11240 Tested-by: Impala Public Jenkins Reviewed-by: Tim Armstrong Project: http://git-wip-us.apache.org/repos/asf/impala/repo Commit: h

[1/6] impala git commit: IMPALA-7206: [DOCS] THREAD_RESERVATION_LIMIT & THREAD_RESERVATION_AGGREGATE_LIMIT

2018-08-15 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master b1a5aebb8 -> 29905d1ea IMPALA-7206: [DOCS] THREAD_RESERVATION_LIMIT & THREAD_RESERVATION_AGGREGATE_LIMIT Change-Id: I94b560f6098711a6458ac5a7c8584f6fe3e3fdb8 Reviewed-on: http://gerrit.cloudera.org:8080/11237 Reviewed-by: Tim Armstrong T

[4/6] impala git commit: IMPALA-7342: Add initial support for user-level permissions

2018-08-15 Thread tarmstrong
IMPALA-7342: Add initial support for user-level permissions This patch refactors the authorization code in preparation to add initial support for user-level permissions (IMPALA-6794) and object ownership (IMPALA-7075). It introduces the notion of Principal that can be either Role or User. The

[1/2] impala git commit: IMPALA-7364: Bump RapidJSON version to 1.1.0

2018-08-15 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master 1455548c8 -> b1a5aebb8 IMPALA-7364: Bump RapidJSON version to 1.1.0 There're three kinds of broken APIs to fix: * Document::AddMember can't accept parameters in const char* types. These parameters should be wrapped with rapidjson::String

[2/2] impala git commit: IMPALA-7434: Heimdal Kerberos is not supported in Impala

2018-08-15 Thread tarmstrong
IMPALA-7434: Heimdal Kerberos is not supported in Impala Change-Id: Id488665d8e6037e3743750dbd32b48def5b58b0f Reviewed-on: http://gerrit.cloudera.org:8080/11204 Reviewed-by: Michael Ho Tested-by: Impala Public Jenkins Project: http://git-wip-us.apache.org/repos/asf/impala/repo Commit: http://g

impala git commit: Download gdb from the toolchain and add it to the path

2018-08-15 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master cddb35be9 -> 1455548c8 Download gdb from the toolchain and add it to the path This patch extends the toolchain bootstrap code with the toolchain version of GDB (v7.9.1, built in the toolchain since its inception), and adds it to the path.

[1/2] impala git commit: IMPALA-7440: remove --nlj-filter from stress test

2018-08-14 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master c0ff4fe8f -> 2a2c3daaa IMPALA-7440: remove --nlj-filter from stress test This hides the option and makes it a no-op to avoid breaking any driver scripts that pass in the option. Testing: Started local stress test with and without setting

[2/2] impala git commit: IMPALA-7356 (part 1 of ?): admission control stress

2018-08-14 Thread tarmstrong
IMPALA-7356 (part 1 of ?): admission control stress Add initial support for running the stress test in a mode where it tests that memory-based admission control prevents out-of-memory. A new mode is added that can be enabled by passing --test-admission-control=true to concurrent_select.py In this

[1/5] impala git commit: IMPALA-7231: group plan nodes into pipelines

2018-08-10 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master 4af3a7853 -> 9961c33e8 http://git-wip-us.apache.org/repos/asf/impala/blob/b7d509d7/testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test

[3/5] impala git commit: IMPALA-7231: group plan nodes into pipelines

2018-08-10 Thread tarmstrong
http://git-wip-us.apache.org/repos/asf/impala/blob/b7d509d7/testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test -- diff --git a/testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validati

[5/5] impala git commit: IMPALA-7415: Fix flakiness in test_multiline_queries_in_history

2018-08-10 Thread tarmstrong
IMPALA-7415: Fix flakiness in test_multiline_queries_in_history This fixes flakiness in test_multiline_queries_in_history, where part of the shell prompt would be absorbed in a previous regex search, ultimately causing the failure of the subsequent regex search that looks for the

[4/5] impala git commit: IMPALA-7231: group plan nodes into pipelines

2018-08-10 Thread tarmstrong
IMPALA-7231: group plan nodes into pipelines This adds some informational output to explain plans and sends the information to the backend. The idea is that this will make it easier to explain how Impala's pipelined execution works and also enable future work on profile analysis that can more int

[2/5] impala git commit: IMPALA-7231: group plan nodes into pipelines

2018-08-10 Thread tarmstrong
http://git-wip-us.apache.org/repos/asf/impala/blob/b7d509d7/testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test -- diff --git a/testdata/workloads/functional-planner/queries/PlannerTest/resource-re

[1/2] impala git commit: IMPALA-7203. Support UDFs in LocalCatalog

2018-08-10 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master 3e17705ec -> 4af3a7853 IMPALA-7203. Support UDFs in LocalCatalog This adds support to LocalCatalog to load persistent UDFs (both Java and native) from the HMS. Transient UDFs are not supported, since, without a central component to store t

[2/2] impala git commit: IMPALA-7333: remove MarkNeedsDeepCopy() in agg and BTS

2018-08-10 Thread tarmstrong
IMPALA-7333: remove MarkNeedsDeepCopy() in agg and BTS This takes advantage of work (e.g. IMPALA-3200, IMPALA-5844) to remove a couple of uses of the API. Testing: Ran core, ASAN and exhaustive builds. Added unit tests to directly test the attaching behaviour. Change-Id: I5f5b8a418d4816f603a64d

impala git commit: IMPALA-6034: Add scanned bytes limits per query

2018-08-09 Thread tarmstrong
ter the resources have been used. IMPALA-7318 tracks enabling this. Query profile is updated to include query wide and per backend metrics for CPU and scanned bytes. Example from "select count(*) from tpch_parquet.lineitem": Per Node Peak Memory Usage: tarmstrong-box:22000(289.50 KB) t

[1/2] impala git commit: IMPALA-7386. Replace CatalogObjectVersionQueue with a multiset

2018-08-08 Thread tarmstrong
Repository: impala Updated Branches: refs/heads/master a6c356850 -> 3c9fef2ae IMPALA-7386. Replace CatalogObjectVersionQueue with a multiset The implementation of CatalogObjectVersionQueue was based on a priority queue, which is not a good choice of data structure for the use case. This class

[2/2] impala git commit: Switch a couple of lists to deques in AnalyticEvalNode

2018-08-08 Thread tarmstrong
Switch a couple of lists to deques in AnalyticEvalNode I noticed this inefficiency while looking at this code. std::deque generally offers better performance because it does fewer memory allocations and has better memory locality. The main advantages of std::list are O(1) insert/delete in the mid
