[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure

2018-05-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10347 )

Change subject: Fix diagnostics path to not include the parent dir structure
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c
Gerrit-Change-Number: 10347
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 09:29:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure

2018-05-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10347 )

Change subject: Fix diagnostics path to not include the parent dir structure
..

Fix diagnostics path to not include the parent dir structure

Without the fix, the diagnostics tar file included the entire
directory structure of the diagnostics root dir.

Before:
===
$ tar tf /tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh.tar.gz
tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/
tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/stacks/
tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/stacks/jstack-0.txt


After:
=
$ tar tf /tmp/impala-diagnostics-2018-05-08-12-01-51-Y0nlQI.tar.gz
impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/
impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/stacks/
impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/stacks/jstack-0.txt
.

Tested with python 2.6

Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c
Reviewed-on: http://gerrit.cloudera.org:8080/10347
Reviewed-by: Bharath Vissapragada 
Tested-by: Impala Public Jenkins 
---
M bin/diagnostics/collect_diagnostics.py
1 file changed, 4 insertions(+), 1 deletion(-)

Approvals:
  Bharath Vissapragada: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c
Gerrit-Change-Number: 10347
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata

2018-05-10 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Zoltan Borok-Nagy, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10116

to look at the new patch set (#10).

Change subject: IMPALA-6131: Track time of last statistics update in metadata
..

IMPALA-6131: Track time of last statistics update in metadata

The timestamp of the last COMPUTE STATS operation is saved to
table property "impala.lastComputeStatsTime". The format is
the same as in "transient_lastDdlTime", so the two can be
compared to check if the schema has changed since computing
statistics.

Other changes:
- Handling of "transient_lastDdlTime" is simplified - the old
  logic set it to current time + 1, if the old version was
  >= current time, to ensure that it is always increased by
  DDL operations. This was useful in the past, as IMPALA-387
  used lastDdlTime to check if partition data needs to be
  reloaded, but since IMPALA-1480, Impala does not rely on
  lastDdlTime at all.

- Computing / setting stats on HDFS tables no longer increases
  "transient_lastDdlTime".

- When Kudu tables are (re)loaded, it is checked if their
  HMS representation is up to date, and if it is, then
  IMetaStoreClient.alter_table() is not called. The old
  logic always called alter_table() after loading metadata
  from Kudu. This change was needed to ensure that
  "transient_lastDdlTime" works similarly in HDFS and Kudu
  tables, and should also make (re)loading Kudu tables faster.

Notes:
- Kudu will be able to sync its tables to HMS in the near
  future (see KUDU-2191), so the Kudu metadata handling in
  Impala may need to be redesigned.

Testing:
tests/metadata/test_last_ddl_time_update.py is extended by
- also checking "impala.lastComputeStatsTime"
- testing more SQL statements
- tests for Kudu tables

Note that test_last_ddl_time_update.py is ran only in
exhaustive testing.

Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
M tests/metadata/test_last_ddl_time_update.py
7 files changed, 203 insertions(+), 161 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/10116/10
--
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db

2018-05-10 Thread Attila Jeges (Code Review)
Hello Gabor Kaszab, Zoltan Borok-Nagy, Csaba Ringhofer,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9986

to look at the new patch set (#5).

Change subject: IMPALA-3307: Add support for IANA time-zone db
..

IMPALA-3307: Add support for IANA time-zone db

Impala currently uses two different libraries for timestamp
manipulations: boost and glibc.

Issues with boost:
- Time-zone database is currently hard coded in timezone_db.cc.
  Impala admins cannot update it without upgrading Impala.
- Time-zone database is flat, therefore can’t track year-to-year
  changes.
- Time-zone database is not updated on a regular basis.

Issues with glibc:
- Uses /usr/share/zoneinfo/ database which could be out of sync on
  some of the nodes in the Impala cluster.
- Uses the host system’s local time-zone. Different nodes in the
  Impala cluster might use a different local time-zone.
- Conversion functions take a global lock, which causes severe
  performance degradation.

In addition to the issues above, the fact that /usr/share/zoneinfo/
and the hard-coded boost time-zone database are both in use is a
source of inconsistency in itself.

This patch makes the following changes:
- Instead of boost and glibc, impalad uses Google's CCTZ to implement
  time-zone conversions.
- Introduces a new startup flag (--hdfs_zone_info_dir) to impalad to
  specify an HDFS/S3/ADLS location that contains the shared compiled
  IANA time-zone database. If the startup flag is set, impalad will
  use the specified time-zone database. Otherwise, impalad will use
  the default /usr/share/zoneinfo time-zone database.
- impalad reads the entire time-zone database into an in-memory
  map on startup for fast lookups.
- The name of the coordinator node’s local time-zone is saved to the
  query context when preparing query execution. This time-zone is used
  whenever the current time-zone is referred afterwards in an
  execution node.
- Introduces a new startup flag (--hdfs_zone_abbrev_conf) to impalad
  to specify an HDFS/S3/ADLS path to a shared config file that
  contains definitions for non-standard time-zone abbreviations.

Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77
---
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/common/global-types.h
M be/src/exec/data-source-scan-node.cc
M be/src/exec/data-source-scan-node.h
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exprs/CMakeLists.txt
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/decimal-operators-ir.cc
M be/src/exprs/decimal-operators.h
M be/src/exprs/expr-test.cc
M be/src/exprs/literal.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
A be/src/exprs/timezone_db-test.cc
M be/src/exprs/timezone_db.cc
M be/src/exprs/timezone_db.h
M be/src/runtime/raw-value-test.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M be/src/runtime/timestamp-value.inline.h
M be/src/service/impala-server.cc
M be/src/service/impalad-main.cc
M be/src/util/filesystem-util-test.cc
M be/src/util/filesystem-util.cc
M be/src/util/filesystem-util.h
M be/src/util/hdfs-util-test.cc
M be/src/util/hdfs-util.cc
M be/src/util/hdfs-util.h
M be/src/util/time-test.cc
M be/src/util/time.cc
M be/src/util/time.h
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
A cmake_modules/FindCctz.cmake
M common/thrift/ImpalaInternalService.thrift
M common/thrift/metrics.json
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
M testdata/bin/create-load-data.sh
M testdata/data/timezoneverification.csv
A testdata/tzdb/abbrev.conf
A testdata/tzdb/zoneinfo/AmerICA/ArgeNTINA/MendOZA
A testdata/tzdb/zoneinfo/AmerICA/CancUN
A testdata/tzdb/zoneinfo/UTC
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
A tests/custom_cluster/test_custom_tzdb.py
53 files changed, 2,556 insertions(+), 1,096 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/9986/5
--
To view, visit http://gerrit.cloudera.org:8080/9986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77
Gerrit-Change-Number: 9986
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db

2018-05-10 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9986 )

Change subject: IMPALA-3307: Add support for IANA time-zone db
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc@526
PS4, Line 526:   const string& tz_name = (start_lookup.abbr != nullptr) ? 
start_lookup.abbr :
 :   context->impl()->state()->local_time_zone()->name();
> What is the goal of this logic? To print timezone abbreviations instead of
Both, I guess. I just replicated the original behavior: localtime_r() sets 
tzone.tm_zone to the time-zone abbreviation. Added a comment.

I don't think that the TimestampValue class would be a good place for this 
function. The timezone abbreviation corresponds to 'start_unix_millis', which 
is not a TimestampValue. I'll keep this logic here for now.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/runtime/timestamp-value.cc@93
PS4, Line 93: inline bool CheckIfDateOutOfRange(const cctz::civil_day& date) {
:   static const cctz::civil_day 
max_date(TimestampFunctions::MAX_YEAR, 12, 31);
:   static const cctz::civil_day 
min_date(TimestampFunctions::MIN_YEAR, 1, 1);
:   return date < min_date || date > max_date;
: }
> This could be simpler and possibly faster by expecting a cctz::civil_second
Done


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/runtime/timestamp-value.cc@128
PS4, Line 128:
> cctz explains pretty well the handling of dst boundaries, maybe we could ad
Done


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/runtime/timestamp-value.cc@129
PS4, Line 129:   // In case of ambiguity invalidate TimestampValue.
 :   const cctz::time_zone::civil_lookup from_cl = 
local_tz->lookup(from_cs);
 :   if (UNLIKELY(from_cl.kind != 
cctz::time_zone::civil_lookup::UNIQUE)) {
 : SetToInvalidDateTime();
 :   } else {
 : cctz::time_point from_tp = from_cl.pre;
 :
 : // Convert 'from_tp' time_point to civil_second assuming 
'UTC' time-zone.
 : cctz::civil_second to_cs = cctz::convert(from_tp, 
TimezoneDatabase::GetUtcTimezone());
 :
 : // boost::gregorian::date() throws 
boost::gregorian::bad_year if year is not in the
 : // 1400.. range. Need to check validity before creating 
the date object.
 : if (UNLIKELY(CheckIfDateOutOfRange(cctz::civil_day(to_cs 
{
> I may be possible to get TimestampValue from cctz::time_zone::civil_lookup
Done


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/runtime/timestamp-value.cc@146
PS4, Line 146:   time_ = boost::posix_time::time_duration(to_cs.hour(), 
to_cs.minute(), to_cs.second(),
> nit: long line
Done



--
To view, visit http://gerrit.cloudera.org:8080/9986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77
Gerrit-Change-Number: 9986
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 10 May 2018 15:28:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Update version to 3.1.0-SNAPSHOT

2018-05-10 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10360 )

Change subject: Update version to 3.1.0-SNAPSHOT
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167
Gerrit-Change-Number: 10360
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 10 May 2018 15:30:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] Update version to 3.1.0-SNAPSHOT

2018-05-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10360 )

Change subject: Update version to 3.1.0-SNAPSHOT
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2445/


--
To view, visit http://gerrit.cloudera.org:8080/10360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167
Gerrit-Change-Number: 10360
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 10 May 2018 15:31:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6999: Upgrade to sqlparse-0.1.19 for Impala shell

2018-05-10 Thread David Knupp (Code Review)
David Knupp has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10354 )

Change subject: IMPALA-6999: Upgrade to sqlparse-0.1.19 for Impala shell
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10354
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ide51ef3ac52d25a96b0fa832e29b6535197d23cb
Gerrit-Change-Number: 10354
Gerrit-PatchSet: 1
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 16:02:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6999: Upgrade to sqlparse-0.1.19 for Impala shell

2018-05-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10354 )

Change subject: IMPALA-6999: Upgrade to sqlparse-0.1.19 for Impala shell
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2447/


--
To view, visit http://gerrit.cloudera.org:8080/10354
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ide51ef3ac52d25a96b0fa832e29b6535197d23cb
Gerrit-Change-Number: 10354
Gerrit-PatchSet: 1
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 16:03:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata

2018-05-10 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10116 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata
..


Patch Set 10: Code-Review+1

Sorry, some of the authorization test changes were missing from patch 9.
Carry +1


--
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 10 May 2018 16:10:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6998: test bloom wait time fails due to late arrival of filters on Isilon

2018-05-10 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10366


Change subject: IMPALA-6998: test_bloom_wait_time fails due to late arrival of 
filters on Isilon
..

IMPALA-6998: test_bloom_wait_time fails due to late arrival of filters on Isilon

This test has been failing on Isilon runs, most likely due to timing issues
which makes it a test issue rather than a product bug.

This patch disables the test for Isilon. We should revisit what tests we run
on non-HDFS filesystems later on, but until then, this should unblock
the build.

Change-Id: I2df6983a65a50b7efdd482124b70f518ee4c3229
---
M tests/query_test/test_runtime_filters.py
1 file changed, 3 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/10366/1
--
To view, visit http://gerrit.cloudera.org:8080/10366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2df6983a65a50b7efdd482124b70f518ee4c3229
Gerrit-Change-Number: 10366
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 


[Impala-ASF-CR](2.x) IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2

2018-05-10 Thread Joe McDonnell (Code Review)
Hello Philip Zeyliger, Impala Public Jenkins,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/10367

to review the following change.


Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2
..

IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2

There is a Hive bug in Hive 1.1.0 that can result
in a NullPointerException when doing parallel Hive
operations (see IMPALA-6532). Since dataload goes
parallel on Hive loads starting with IMPALA-6372,
dataload can hit this error on Hive 1.1.0 (i.e.
IMPALA_MINICLUSTER_PROFILE=2). This is impacting
builds on the 2.x branch.

This disables parallel dataload for IMPALA_MINICLUSTER_PROFILE=2.

IMPALA_MINICLUSTER_PROFILE=3 uses a newer version
of Hive that has a fix for this, so this continues
to use parallel dataload for that case.

Parallelism can be reenabled when Hive 1.1.0 gets the
fix from Hive 2.1.1.

Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Reviewed-on: http://gerrit.cloudera.org:8080/10306
Reviewed-by: Philip Zeyliger 
Tested-by: Impala Public Jenkins 
---
M bin/impala-config.sh
M bin/load-data.py
M testdata/bin/create-load-data.sh
3 files changed, 12 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/10367/1
-- 
To view, visit http://gerrit.cloudera.org:8080/10367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Gerrit-Change-Number: 10367
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata

2018-05-10 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10116 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata
..


Patch Set 10:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1235
PS10, Line 1235:* TODO: the following paragraph seems at least partially 
obsolate
Why not clean it up then? Point (1) is obsolete but point (2) is still valid.


http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@244
PS10, Line 244: Long.toString(System.currentTimeMillis() / 1000));
Doesn't the HMS set this in alter_table()?


http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@785
PS10, Line 785:   msTbl.putToParameters("impala.lastComputeStatsTime",
Create a constant for the table property in Table similar to what we have in 
HdfsTable, e.g. TBL_PROP_SKIP_HEADER_LINE_COUNT


http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2868
PS10, Line 2868: // HMS updates transient_lastDdlTime if it is removed.
I understand what this comment says, but I don't understand it's relevance in 
this context. Does that mean we should add a dummy time if !setLastDdlTime?


http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2870
PS10, Line 2870: Long.toString(System.currentTimeMillis() / 1000));
We're doing "System.currentTimeMillis() / 1000" in quite a few places, I 
suggest adding a helper function in Table or MetastoreUtils or similar.


http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py
File tests/metadata/test_last_ddl_time_update.py:

http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py@74
PS10, Line 74: def run_test(self, query, expect_changed_ddl_time, 
expect_changed_stats_time):
Passing a ";" delimited string is really weird. Why not pass a list of queries? 
As far as I can tell there's a single use of feature in L229, can just split up 
those two queries?


http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py@93
PS10, Line 93:   # Hive uses a seconds granularity on the last ddl time.
Not just Hive - we do too. Just say "All times are stored in seconds" or 
something


http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py@155
PS10, Line 155: h.expect_no_time_change("drop incremental stats %(TBL)s 
partition (j=1, s='2012')")
use "drop stats" instead to wipe everything (including incremental stats)



--
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 10 May 2018 16:58:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 14:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@522
PS15, Line 522: query_events_->MarkEvent(exec_state_to_event.at(new_state));
  :   // TODO: is this needed? This will also happen on the 
"backend" side of cancel rpc.
  :   // And in the case of EOS, sink already knows this.
> makes sense.
I filed IMPALA-7011 for that.


http://gerrit.cloudera.org:8080/#/c/10158/16/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10158/16/be/src/runtime/coordinator.cc@132
PS16, Line 132:
> nit: extra space
Done


http://gerrit.cloudera.org:8080/#/c/10158/16/be/src/runtime/coordinator.cc@618
PS16, Line 618:
> concurrently
Done



--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 14
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 May 2018 17:08:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-10 Thread Dan Hecht (Code Review)
Hello Michael Ho, Sailesh Mukil, Tim Armstrong, Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10158

to look at the new patch set (#17).

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..

IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

The is the final change to clarify and break up the Coordinator's lock.
The state machine for the coordinator is made explicit, distinguishing
between executing state and multiple terminal states. Logic to
transition into a terminal state is centralized in one location and
executes exactly once for each coordinator object.

Derived from a patch for IMPALA_5384 by Marcel Kornacker.

Testing:
- exhaustive functional tests
- stress test on minicluster with memory overcommitment. Verified from
  the logs that this exercises all these paths:
  - successful queries
  - client requested cancellation
  - error from exec FInstances RPC
  - error reported asynchronously via report status RPC
  - eos before backend execution completed

Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
---
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/client-request-state.cc
M be/src/service/impala-server.h
M be/src/util/counting-barrier.h
6 files changed, 389 insertions(+), 391 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/10158/17
--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 17
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 18: Code-Review+1

Thanks for the reviews! rebase and carry.


--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 18
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 May 2018 17:10:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10060 )

Change subject: IMPALA-5216: Make admission control queuing async
..


Patch Set 7:

(5 comments)

Will have some more comments but wanted to get the discussion going on these.

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@203
PS7, Line 203: cancelled
given that there's no Cancel() API in this class, how is that done? (I'm 
guessing the client can set the promise, but in any case, let's explain it).


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@671
PS7, Line 671:   if (coord() != NULL) {
 : RETURN_IF_ERROR(coord()->Wait());
 : RETURN_IF_ERROR(UpdateCatalog());
 :   }
now that Execute() doesn't set coord(), isn't this racy? i.e. this wait thread 
could get here before the admission control submission thread created and set 
the Coordinator.  That wasn't possible before because the order was
Execute(); // sets coord_ if necessary.
WaitAsync()


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@897
PS7, Line 897: if (uses_admission_control()) 
admit_outcome_.TrySet(AdmissionOutcome::CANCELLED);
Does this somehow cause the entry in the AC queue to be removed proactively? I 
don't think it's clear how this works from just reading the AC public header.


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@1147
PS7, Line 1147:   lock_guard l(lock_);
  :   if (!UpdateQueryStatus(admit_status).ok()) return;
  :   if (is_cancelled_) {
  : VLOG_QUERY << "Cancelled right after successful 
admission, query id="
  :<< schedule_->query_id();
  : // If query cancelled after being admitted, release 
Admission control resources.
  : 
exec_env_->admission_controller()->ReleaseQuery(*schedule_.get());
  : // If cancellation was user initiated, then 
query_status would be OK, therefore
  : // set it to CANCELLED.
  : discard_result(UpdateQueryStatus(Status::CANCELLED));
  : return;
  :   }
why is that needed? I guess it's really just an optimization to avoid 
Coordinator::Exec()/Cancel() for when the cancel happens in this small window, 
is that right?

Hmm, or is the the log misleading and this path is taken when the cancel 
happens while queued? I think that's true, and if so, then it is worthwhile, 
but then you should reword the log/comments.


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@1171
PS7, Line 1171:lock_guard l(lock_);
  : if (!UpdateQueryStatus(exec_status).ok()) return;
  : cancelled = is_cancelled_;
  : if (!cancelled) {
  :   // This needs to be done while holding lock_ as it will 
ensure that coord_ is
  :   // visible to the cancellation thread if the query is 
marked cancelled after exiting
  :   // this critical section.
  :   coord_exec_started_.Store(true);
  : } else {
  :   VLOG_QUERY << "Cancelled right after starting the 
coordinator query id="
  :  << schedule_->query_id();
  :   discard_result(UpdateQueryStatus(Status::CANCELLED));
  :   // If there was already a non OK query status before we 
set it to CANCELLED then
  :   // take it to update coord_.
  :   cancellation_status = query_status();
  : }
  :   }
As alluded to earlier, I think we might be able to simplfiying this by not 
setting the coord_ pointer until we do the final cancellation check.



--
To view, visit http://gerrit.cloudera.org:8080/10060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Gerrit-Change-Number: 10060
Gerrit-PatchSet: 7
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 May 2018 17:45:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10060 )

Change subject: IMPALA-5216: Make admission control queuing async
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@671
PS7, Line 671:   if (coord() != NULL) {
 : RETURN_IF_ERROR(coord()->Wait());
 : RETURN_IF_ERROR(UpdateCatalog());
 :   }
> now that Execute() doesn't set coord(), isn't this racy? i.e. this wait thr
Oh, I guess that's what the Join() is for on line 660. Could you add a comment 
there saying something like:
// Wait until the query has passed through admission control and is running.
(Later when we clean up and clearly define the possible CLR states, I think 
this will become easier to remember).



--
To view, visit http://gerrit.cloudera.org:8080/10060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Gerrit-Change-Number: 10060
Gerrit-PatchSet: 7
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 May 2018 17:50:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR](2.x) IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2

2018-05-10 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10367 )

Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Gerrit-Change-Number: 10367
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 18:00:09 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2

2018-05-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10367 )

Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2448/


--
To view, visit http://gerrit.cloudera.org:8080/10367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Gerrit-Change-Number: 10367
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 18:04:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6035: Add query options to limit reserved threads

2018-05-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10365


Change subject: IMPALA-6035: Add query options to limit reserved threads
..

IMPALA-6035: Add query options to limit reserved threads

Adds two options: THREAD_RESERVATION_LIMIT and
THREAD_RESERVATION_AGGREGATE_LIMIT, which are both enforced by admission
control based on planner resource requirements and the schedule. The
mechanism used is the same as the minimum reservation checks.

THREAD_RESERVATION_LIMIT limits the total number of reserved threads in
fragments scheduled on a single backend.
THREAD_RESERVATION_AGGREGATE_LIMIT limits the sum of reserved threads
across all fragments.

This also slightly improves the minimum reservation error message to
include the host name.

Testing:
Added end-to-end tests that exercise the code paths.

Ran core tests.

Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007
---
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler.cc
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M 
testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
A testdata/workloads/functional-query/queries/QueryTest/resource-limits.test
M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test
A tests/query_test/test_resource_limits.py
12 files changed, 244 insertions(+), 17 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/10365/3
--
To view, visit http://gerrit.cloudera.org:8080/10365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007
Gerrit-Change-Number: 10365
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6035: Add query options to limit reserved threads

2018-05-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10365 )

Change subject: IMPALA-6035: Add query options to limit reserved threads
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10365/3/be/src/scheduling/query-schedule.h
File be/src/scheduling/query-schedule.h:

http://gerrit.cloudera.org:8080/#/c/10365/3/be/src/scheduling/query-schedule.h@61
PS3, Line 61:   int64_t min_reservation_bytes = 0;
This looked like a bug, because these were never explicitly initialised. 
Actually it isn't a bug because std::map guarantees primitives in values are 
zero-initialised if there isn't a default constructor: 
http://en.cppreference.com/w/cpp/container/map/operator_at



--
To view, visit http://gerrit.cloudera.org:8080/10365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007
Gerrit-Change-Number: 10365
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 May 2018 18:18:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db

2018-05-10 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9986 )

Change subject: IMPALA-3307: Add support for IANA time-zone db
..


Patch Set 5:

(19 comments)

Thanks for dealing with my previous comments, Attila! I filed some more but 
basically I'm fine with the changes and feel free to start involving someone 
with more experience on this topic.

http://gerrit.cloudera.org:8080/#/c/9986/4/CMakeLists.txt
File CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/9986/4/CMakeLists.txt@281
PS4, Line 281: Cctz
nit: CCTZ


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@139
PS4, Line 139: new_time_zone_(time_zone), new_tz_
>From reading the names of these 2 variables it's not clear what de difference 
>is. Can you have a new_time_zone_ and a new_time_zone_name_ or such?


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@140
PS4, Line 140: /*overwrite*/
Do you need this comment?


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@153
PS4, Line 153: Timezone *
nit: Timezone* Expect..()


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@155
PS4, Line 155: new_tz_ = TimezoneDatabase::FindTimezone(new_time_zone_);
I wonder if it makes sense to do this assignment in the constructor and then 
this function can be changed to something like "getTimezone" that simply 
returns the corresponding member.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc@503
PS4, Line 503: namespace
Why did you need this namespace?


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc@504
PS4, Line 504: nline
What is the plan to get rid of the "Duplicate code" TODOs in this review?


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc
File be/src/exprs/timestamp-functions.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc@55
PS4, Line 55: // This should raise some sort of error or at least return 
null. Hive just ignores it.
Shouldn't this be a TODO?


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc@87
PS4, Line 87: // This should raise some sort of error or at least return 
null. Hive just ignores it.
Same as above


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc
File be/src/exprs/timezone_db-test.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@57
PS4, Line 57: TzAbbev
nit: TzAbbrev?


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@68
PS4, Line 68:   // Abbreviations must start with an uppercase letter.
If it has to start with an uppercase letter, can we add a test this with an 
input that would be valid if the first letter was uppercase?
e.g. "pST", "pst", "singapore" etc.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@105
PS4, Line 105:   // Misformatted time-zone names.
Can you again play around with upper vs lower case letters here?


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h
File be/src/exprs/timezone_db.h:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h@1
PS4, Line 1: // with the License.  You may obtain a copy of the License at
Hmm, is the top of the Apache comment missing?


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h@23
PS4, Line 23: /// 'TimezoneDatabase' class contains functions to load and 
access the IANA time-zone
Nice description, thx :)


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h@91
PS4, Line 91: tz_seg
nit: might be just my preference but tz_segment is still short and I think 
better to read. Up to you.


http://gerrit.cloudera.org:8080/#/c/9986/1/be/src/exprs/timezone_db.cc
File be/src/exprs/timezone_db.cc:

http://gerrit.cloudera.org:8080/#/c/9986/1/be/src/exprs/timezone_db.cc@147
PS1, Line 147:
> Returning a pointer just to be able to signal failure with nullptr seems co
What I meant is that you could get rid of the offset_sec paramater if you 
changed the return value to int64_t* and you would still be able to detect 
failure if the return value is null. Not a big deal, though. Your choice.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc
File be/src/exprs/timezone_db.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc@365
PS4, Line 365:   hdfsFile hdfs_file = hdfsOpenFile(
Just for my information: LoadZoneInfoFromHdfs() copies the zone info file to a 
local dir, meanwhile this function uses hdfsOpenFile to get the abbrev data. Is 
there a reason that they don't follow the same approac

[Impala-ASF-CR] Update version to 3.1.0-SNAPSHOT

2018-05-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10360 )

Change subject: Update version to 3.1.0-SNAPSHOT
..

Update version to 3.1.0-SNAPSHOT

Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167
Reviewed-on: http://gerrit.cloudera.org:8080/10360
Reviewed-by: Michael Brown 
Tested-by: Impala Public Jenkins 
---
M bin/save-version.sh
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Michael Brown: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167
Gerrit-Change-Number: 10360
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] Update version to 3.1.0-SNAPSHOT

2018-05-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10360 )

Change subject: Update version to 3.1.0-SNAPSHOT
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167
Gerrit-Change-Number: 10360
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 10 May 2018 18:55:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6966: sort table memory by size in catalogd web UI

2018-05-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10292 )

Change subject: IMPALA-6966: sort table memory by size in catalogd web UI
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2449/


--
To view, visit http://gerrit.cloudera.org:8080/10292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60dc253f862f5fde6fa96147f114d8765bb31a85
Gerrit-Change-Number: 10292
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 10 May 2018 19:17:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6966: sort table memory by size in catalogd web UI

2018-05-10 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10292 )

Change subject: IMPALA-6966: sort table memory by size in catalogd web UI
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60dc253f862f5fde6fa96147f114d8765bb31a85
Gerrit-Change-Number: 10292
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 10 May 2018 19:17:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6999: Upgrade to sqlparse-0.1.19 for Impala shell

2018-05-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10354 )

Change subject: IMPALA-6999: Upgrade to sqlparse-0.1.19 for Impala shell
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10354
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ide51ef3ac52d25a96b0fa832e29b6535197d23cb
Gerrit-Change-Number: 10354
Gerrit-PatchSet: 1
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 19:27:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6999: Upgrade to sqlparse-0.1.19 for Impala shell

2018-05-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10354 )

Change subject: IMPALA-6999: Upgrade to sqlparse-0.1.19 for Impala shell
..

IMPALA-6999: Upgrade to sqlparse-0.1.19 for Impala shell

sqlparse-0.1.19 is the last version of sqlparse that supports Python
2.6.

Testing:
- Ran all end-to-end tests

Change-Id: Ide51ef3ac52d25a96b0fa832e29b6535197d23cb
Reviewed-on: http://gerrit.cloudera.org:8080/10354
Reviewed-by: David Knupp 
Tested-by: Impala Public Jenkins 
---
M LICENSE.txt
M README.md
M bin/rat_exclude_files.txt
M shell/.gitignore
D shell/ext-py/sqlparse-0.1.14/PKG-INFO
D shell/ext-py/sqlparse-0.1.14/setup.cfg
A shell/ext-py/sqlparse-0.1.19/.travis.yml
R shell/ext-py/sqlparse-0.1.19/AUTHORS
R shell/ext-py/sqlparse-0.1.19/CHANGES
R shell/ext-py/sqlparse-0.1.19/COPYING
R shell/ext-py/sqlparse-0.1.19/MANIFEST.in
R shell/ext-py/sqlparse-0.1.19/README.rst
R shell/ext-py/sqlparse-0.1.19/TODO
R shell/ext-py/sqlparse-0.1.19/bin/sqlformat
R shell/ext-py/sqlparse-0.1.19/docs/source/analyzing.rst
R shell/ext-py/sqlparse-0.1.19/docs/source/api.rst
R shell/ext-py/sqlparse-0.1.19/docs/source/changes.rst
R shell/ext-py/sqlparse-0.1.19/docs/source/conf.py
R shell/ext-py/sqlparse-0.1.19/docs/source/index.rst
R shell/ext-py/sqlparse-0.1.19/docs/source/indices.rst
R shell/ext-py/sqlparse-0.1.19/docs/source/intro.rst
R shell/ext-py/sqlparse-0.1.19/docs/source/ui.rst
R shell/ext-py/sqlparse-0.1.19/docs/sqlformat.1
R shell/ext-py/sqlparse-0.1.19/pytest.ini
R shell/ext-py/sqlparse-0.1.19/setup.py
R shell/ext-py/sqlparse-0.1.19/sqlparse/__init__.py
R shell/ext-py/sqlparse-0.1.19/sqlparse/engine/__init__.py
R shell/ext-py/sqlparse-0.1.19/sqlparse/engine/filter.py
R shell/ext-py/sqlparse-0.1.19/sqlparse/engine/grouping.py
R shell/ext-py/sqlparse-0.1.19/sqlparse/exceptions.py
R shell/ext-py/sqlparse-0.1.19/sqlparse/filters.py
R shell/ext-py/sqlparse-0.1.19/sqlparse/formatter.py
R shell/ext-py/sqlparse-0.1.19/sqlparse/functions.py
R shell/ext-py/sqlparse-0.1.19/sqlparse/keywords.py
R shell/ext-py/sqlparse-0.1.19/sqlparse/lexer.py
R shell/ext-py/sqlparse-0.1.19/sqlparse/pipeline.py
R shell/ext-py/sqlparse-0.1.19/sqlparse/sql.py
R shell/ext-py/sqlparse-0.1.19/sqlparse/tokens.py
R shell/ext-py/sqlparse-0.1.19/sqlparse/utils.py
R shell/ext-py/sqlparse-0.1.19/tests/__init__.py
R shell/ext-py/sqlparse-0.1.19/tests/files/_Make_DirEntry.sql
R shell/ext-py/sqlparse-0.1.19/tests/files/begintag.sql
R shell/ext-py/sqlparse-0.1.19/tests/files/begintag_2.sql
R shell/ext-py/sqlparse-0.1.19/tests/files/dashcomment.sql
R shell/ext-py/sqlparse-0.1.19/tests/files/function.sql
R shell/ext-py/sqlparse-0.1.19/tests/files/function_psql.sql
R shell/ext-py/sqlparse-0.1.19/tests/files/function_psql2.sql
R shell/ext-py/sqlparse-0.1.19/tests/files/function_psql3.sql
R shell/ext-py/sqlparse-0.1.19/tests/files/huge_select.sql
R shell/ext-py/sqlparse-0.1.19/tests/files/test_cp1251.sql
R shell/ext-py/sqlparse-0.1.19/tests/test_filters.py
R shell/ext-py/sqlparse-0.1.19/tests/test_format.py
R shell/ext-py/sqlparse-0.1.19/tests/test_functions.py
R shell/ext-py/sqlparse-0.1.19/tests/test_grouping.py
R shell/ext-py/sqlparse-0.1.19/tests/test_parse.py
R shell/ext-py/sqlparse-0.1.19/tests/test_pipeline.py
R shell/ext-py/sqlparse-0.1.19/tests/test_regressions.py
R shell/ext-py/sqlparse-0.1.19/tests/test_split.py
R shell/ext-py/sqlparse-0.1.19/tests/test_tokenize.py
R shell/ext-py/sqlparse-0.1.19/tests/utils.py
R shell/ext-py/sqlparse-0.1.19/tox.ini
61 files changed, 558 insertions(+), 188 deletions(-)

Approvals:
  David Knupp: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10354
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ide51ef3ac52d25a96b0fa832e29b6535197d23cb
Gerrit-Change-Number: 10354
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] test-with-docker: work with git worktree

2018-05-10 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10335 )

Change subject: test-with-docker: work with git worktree
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10335/1/docker/test-with-docker.py
File docker/test-with-docker.py:

http://gerrit.cloudera.org:8080/#/c/10335/1/docker/test-with-docker.py@417
PS1, Line 417: self.git_common_dir = os.path.realpath(
 : _check_output(["git", "rev-parse", 
"--git-common-dir"]).strip())
When I run this on one of my normal checkouts, it does this:
$ git rev-parse --git-common-dir
--git-common-dir

This is on git version 1.9.1, so maybe that is old.



--
To view, visit http://gerrit.cloudera.org:8080/10335
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9186e0b6f068aacc25f8d691508165c04329fa8b
Gerrit-Change-Number: 10335
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 20:19:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7000: [DOCS] Correct info about dedicated executors

2018-05-10 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10357 )

Change subject: IMPALA-7000: [DOCS] Correct info about dedicated executors
..


Patch Set 1:

Hi Juan,
Do you have any comment on this change? Thank you!


--
To view, visit http://gerrit.cloudera.org:8080/10357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4b7e6c57188a41a45d5813882b6dbc37cf47cf1f
Gerrit-Change-Number: 10357
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Juan Yu 
Gerrit-Comment-Date: Thu, 10 May 2018 20:19:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls

2018-05-10 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8523 )

Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls
..


Patch Set 11:

(1 comment)

getting back to this change... the upcoming update includes the idea to better 
separate the split specs vs. concrete scan ranges. rebase changes caught this 
in the middle so the update also includes rebase changes (only affected 
HdfsScanNode.java).

http://gerrit.cloudera.org:8080/#/c/8523/11/common/thrift/PlanNodes.thrift
File common/thrift/PlanNodes.thrift:

http://gerrit.cloudera.org:8080/#/c/8523/11/common/thrift/PlanNodes.thrift@228
PS11, Line 228: 4: optional THdfsFileSplitGeneratorSpec 
hdfs_file_split_generator_spec
> will try it out.
The additional changes are relatively mechanical. Since this design explicitly 
does not allow these specs to propagate further in the backend, I agree that 
we're better off going this route.



--
To view, visit http://gerrit.cloudera.org:8080/8523
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
Gerrit-Change-Number: 8523
Gerrit-PatchSet: 11
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 10 May 2018 21:42:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls

2018-05-10 Thread Vuk Ercegovac (Code Review)
Hello Lars Volker, Dimitris Tsirogiannis, Alex Behm, Mostafa Mokhtar, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8523

to look at the new patch set (#12).

Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls
..

IMPALA-5931: Generates scan ranges in planner for s3/adls

Currently, for filesystems that do not include physical
block information (e.g., block replica locations, caching),
synthetic blocks are generated and stored in the catalog
when metadata is loaded. Example file systems for which this is done
includes S3, ADLS, and local fs.

This change avoids generating these blocks when metadata is loaded.
Instead, scan ranges are directly generated from such files by the
backend coordinator. Previously, all scan ranges were produced by
the planner in HDFSScanNode in the frontend. Now, those files without
block information are sent to the coordinator represented by a split
specification that determines how the coordinator will create scan ranges
to send to executors.

This change reduces the space needed in the catalog and reduces the
scan range data structures that are passed from the frontend to the
backend when planning and coordinating a query.
In addition a bug is avoided where non-splittable files were being
split anyways to support the query parameter that places a limit on
scan ranges.

Testing:
- added backend scheduler tests
- mixed-filesystems test covers tables/queries with multiple fs's.
- local fs tests cover the code paths in this change
- all core tests pass when configured with s3
- manually tried larger local filesystem tables (tpch) with multiple
  partitions and observed the same scan ranges.
  - TODO: adls testing

Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
---
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler-test-util.h
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/util/CMakeLists.txt
A be/src/util/flat_buffer.cc
A be/src/util/flat_buffer.h
M common/thrift/Frontend.thrift
M common/thrift/PlanNodes.thrift
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
21 files changed, 718 insertions(+), 233 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/12
--
To view, visit http://gerrit.cloudera.org:8080/8523
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
Gerrit-Change-Number: 8523
Gerrit-PatchSet: 12
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR](2.x) IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2

2018-05-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10367 )

Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2
..

IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2

There is a Hive bug in Hive 1.1.0 that can result
in a NullPointerException when doing parallel Hive
operations (see IMPALA-6532). Since dataload goes
parallel on Hive loads starting with IMPALA-6372,
dataload can hit this error on Hive 1.1.0 (i.e.
IMPALA_MINICLUSTER_PROFILE=2). This is impacting
builds on the 2.x branch.

This disables parallel dataload for IMPALA_MINICLUSTER_PROFILE=2.

IMPALA_MINICLUSTER_PROFILE=3 uses a newer version
of Hive that has a fix for this, so this continues
to use parallel dataload for that case.

Parallelism can be reenabled when Hive 1.1.0 gets the
fix from Hive 2.1.1.

Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Reviewed-on: http://gerrit.cloudera.org:8080/10306
Reviewed-by: Philip Zeyliger 
Tested-by: Impala Public Jenkins 
Reviewed-on: http://gerrit.cloudera.org:8080/10367
---
M bin/impala-config.sh
M bin/load-data.py
M testdata/bin/create-load-data.sh
3 files changed, 12 insertions(+), 1 deletion(-)

Approvals:
  Philip Zeyliger: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Gerrit-Change-Number: 10367
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR](2.x) IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2

2018-05-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10367 )

Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Gerrit-Change-Number: 10367
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 21:48:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6997: Avoid redundant dumping in SetMemLimitExceeded()

2018-05-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10364 )

Change subject: IMPALA-6997: Avoid redundant dumping in SetMemLimitExceeded()
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10364/1/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/10364/1/be/src/runtime/runtime-state.cc@211
PS1, Line 211:   if (!query_status_.ok()) return;
This probably works in practice but we don't guarantee that Status is 
thread-safe. It's probably best to hold query_status_lock_ when checking this 
(or do something with atomics, but the lock approach seems fine to me).



--
To view, visit http://gerrit.cloudera.org:8080/10364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I92b87f370a68a2f695ebbc2520a98dd143730701
Gerrit-Change-Number: 10364
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 May 2018 22:09:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10060 )

Change subject: IMPALA-5216: Make admission control queuing async
..


Patch Set 7:

(8 comments)

I think these are my final comments for this iteration round.

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.cc@112
PS7, Line 112: Q
nit: don't capitalize queue

but maybe use "Cancelled (queued)"
which seems consistent with e.g. "Admitted (queued)" and "Timed out (queued)" 
which I think means the action happened while in the queue.


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.cc@899
PS7, Line 899: false;
use && rather than ?:false


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@481
PS7, Line 481: query-exec-state
Is there a reason we can't update the group name to client-request-state? It 
might be just an oversight.
(Maybe good to do a small commit in front of this one for that).


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@482
PS7, Line 482: SubmitToAdmissionController, this,
 :   &admission_controller_thread_,
I think it's a bit confusing that we call this the "admission controller" 
thread given that this thread also finishes the Exec path (i.e. 
Coordinator::Exec()).

How about we rename ProcessQueryOrDmlRequest() to ExecAsyncQueryOrDmlRequest(). 
Then we rename SubmitToAdmissionController() to FinishExecQueryOrDmlRequest() 
or similar, and we call the thread the "exec-thread"? I think that follows the 
naming pattern elsewhere (e.g. WaitAsync()/Wait(), child-query 
ExecAsync()/ExecChildQueries() etc).


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@1148
PS7, Line 1148:   if (!UpdateQueryStatus(admit_status).ok()) return;
in the case that cancel happened while inside AdmitQuery(), AdmitQuery() can 
return CANCELLED, which then will set the status to CANCELLED here. Is that 
intentional?
If so, it seems like we should just get rid of the cause != nullptr nonsense in 
Cancel() and always set the status at that point.


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@1155
PS7, Line 1155: // set it to CANCELLED.
isn't that a user visible behavior change? (though it's probably okay).
but same point - why not just proactively set it to CANCELLED when Cancel() was 
called?

we really need to clean up the state machine (and clarify the legal 
states/transitions) for this class soon.


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@1147
PS7, Line 1147:   lock_guard l(lock_);
  :   if (!UpdateQueryStatus(admit_status).ok()) return;
  :   if (is_cancelled_) {
  : VLOG_QUERY << "Cancelled right after successful 
admission, query id="
  :<< schedule_->query_id();
  : // If query cancelled after being admitted, release 
Admission control resources.
  : 
exec_env_->admission_controller()->ReleaseQuery(*schedule_.get());
  : // If cancellation was user initiated, then 
query_status would be OK, therefore
  : // set it to CANCELLED.
  : discard_result(UpdateQueryStatus(Status::CANCELLED));
  : return;
  :   }
> why is that needed? I guess it's really just an optimization to avoid Coord
After reading the admission controller changes, I see that this window is the 
smaller one since the promise would propagate the cancel and turn the status 
into CANCELLED.

I think we need to simplify the cancellation logic.


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-hs2-server.cc@451
PS7, Line 451:   if (!request_state->uses_admission_control()) {
 : 
request_state->UpdateNonErrorOperationState(TOperationState::RUNNING_STATE);
 :   }
maybe it'd be clearer to tuck this into CRS::Execute().  That way, this code 
doesn't really have to care as much about how the query will be executed.



--
To view, visit http://gerrit.cloudera.org:8080/10060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Gerrit-Change-Number: 10060
Gerrit-PatchSet: 7
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bik

[Impala-ASF-CR] IMPALA-6966: sort table memory by size in catalogd web UI

2018-05-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10292 )

Change subject: IMPALA-6966: sort table memory by size in catalogd web UI
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60dc253f862f5fde6fa96147f114d8765bb31a85
Gerrit-Change-Number: 10292
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 10 May 2018 22:39:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6966: sort table memory by size in catalogd web UI

2018-05-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10292 )

Change subject: IMPALA-6966: sort table memory by size in catalogd web UI
..

IMPALA-6966: sort table memory by size in catalogd web UI

This patch fix the sorting order in "Top-K Tables with Highest
Memory Requirements" in which "Estimated memory" column is sorted
as strings.

Values got from the catalog-server are changed from pretty-printed
strings to bytes numbers. So the web UI is able to sort and render
them correctly.

Change-Id: I60dc253f862f5fde6fa96147f114d8765bb31a85
Reviewed-on: http://gerrit.cloudera.org:8080/10292
Reviewed-by: Dimitris Tsirogiannis 
Tested-by: Impala Public Jenkins 
---
M be/src/catalog/catalog-server.cc
M www/catalog.tmpl
2 files changed, 17 insertions(+), 4 deletions(-)

Approvals:
  Dimitris Tsirogiannis: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I60dc253f862f5fde6fa96147f114d8765bb31a85
Gerrit-Change-Number: 10292
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-7003: Deflake erasure coding data loading

2018-05-10 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10362 )

Change subject: IMPALA-7003: Deflake erasure coding data loading
..


Patch Set 1:

(1 comment)

Did you test data loading on a Jenkins worker to make sure it's not broken?

http://gerrit.cloudera.org:8080/#/c/10362/1/testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
File testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl:

http://gerrit.cloudera.org:8080/#/c/10362/1/testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl@123
PS1, Line 123:   
Comment why we need this setting



--
To view, visit http://gerrit.cloudera.org:8080/10362
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I219106cd3ec7ffab7a834700f2a722b165e5f66c
Gerrit-Change-Number: 10362
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Thu, 10 May 2018 22:47:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6987: [DOCS] Update when INVALIDATE METADATA is required

2018-05-10 Thread Alex Rodoni (Code Review)
Hello Balazs Jeszenszky, Alex Behm, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10339

to look at the new patch set (#2).

Change subject: IMPALA-6987: [DOCS] Update when INVALIDATE METADATA is required
..

IMPALA-6987: [DOCS] Update when INVALIDATE METADATA is required

Change-Id: I2124e14900d0f82569c061cc46006447bb054b36
---
M docs/shared/impala_common.xml
M docs/topics/impala_invalidate_metadata.xml
2 files changed, 133 insertions(+), 149 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/10339/2
--
To view, visit http://gerrit.cloudera.org:8080/10339
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2124e14900d0f82569c061cc46006447bb054b36
Gerrit-Change-Number: 10339
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-6987: [DOCS] Update when INVALIDATE METADATA is required

2018-05-10 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10339 )

Change subject: IMPALA-6987: [DOCS] Update when INVALIDATE METADATA is required
..


Patch Set 2:

The newly generated google doc is shared with you. Thanks!


--
To view, visit http://gerrit.cloudera.org:8080/10339
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2124e14900d0f82569c061cc46006447bb054b36
Gerrit-Change-Number: 10339
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 10 May 2018 23:06:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10165/10/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
File fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java:

http://gerrit.cloudera.org:8080/#/c/10165/10/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java@216
PS10, Line 216:   public boolean isFileCompressionTypeSupported(String fileName,
I took another look at this and realised we don't have any code coverage for 
calling this for compressed files. It's only used in "LOAD DATA" and we don't 
test that with compressed files.

After digging in more, I think we can actually just delete this code and the 
check. The current scenarios are:
* No file suffix -> this check passes
* Recognised and supported compression suffix -> this check passes
* Recognised and unsupported compression suffix -> this check fails
* Unrecognised file suffix -> treated as NONE, the check passes

This is inconsistent, in that in the unrecognised case we're "optimistic" and 
proceed assuming that it's uncompressed, but in the case where we don't have 
the plugin loaded yet, we fail the operation.

"LOAD DATA" doesn't actually read the file contents and doesn't try to validate 
the files in any other ways.

I think it would be best to just remove this check and proceed optimistically.



--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 10 May 2018 23:17:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10165/10/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
File fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java:

http://gerrit.cloudera.org:8080/#/c/10165/10/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java@216
PS10, Line 216:   public boolean isFileCompressionTypeSupported(String fileName,
> I took another look at this and realised we don't have any code coverage fo
Doing this would also prevent a minor regression where loading, say, files with 
a ".lz4" suffix would work before this patch but not after this patch.



--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 10 May 2018 23:23:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6337: Fix infinite loop in Impala shell

2018-05-10 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#13). ( 
http://gerrit.cloudera.org:8080/9195 )

Change subject: IMPALA-6337: Fix infinite loop in Impala shell
..

IMPALA-6337: Fix infinite loop in Impala shell

This patch fixes a bug in sqlparse where sqlparse incorrectly splits a
statement that has a new line inside double quotes. The bug in sqlparse
causes Impala shell to go to infinite loop when a statement contains a
new line inside double quotes.

The patch in sqlparse is based on the upstream fix at
https://github.com/andialbrecht/sqlparse/pull/396

Testing:
- Added new end-to-end shell tests
- Ran end-to-end shell tests

Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
---
M shell/ext-py/sqlparse-0.1.19/sqlparse/lexer.py
M shell/ext-py/sqlparse-0.1.19/tests/test_split.py
M tests/shell/test_shell_interactive.py
3 files changed, 25 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/9195/13
-- 
To view, visit http://gerrit.cloudera.org:8080/9195
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
Gerrit-Change-Number: 9195
Gerrit-PatchSet: 13
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6337: Fix infinite loop in Impala shell

2018-05-10 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9195 )

Change subject: IMPALA-6337: Fix infinite loop in Impala shell
..


Patch Set 13:

> Patch Set 12:
>
> (1 comment)
>
> > Yeah I think it's a good idea to upgrade sqlparse to the last version that
> > supports Python 2.6 especially if we intend to use a vendored sqlparse. IMO
> > it's cleaner to do the upgrade in a separate patch and the apply the bug fix
> > in another patch. We can either upgrade first and apply the bug fix or apply
> > the bug fix first and then upgrade. Either way works for me. Thoughts?
>
> I think upgrading first makes the most sense to me.
>
> And thanks for clarifying my other questions.

The backported bugfix is now based on the upgraded sqlparse-0.1.19. Please 
re-review this CR again.


--
To view, visit http://gerrit.cloudera.org:8080/9195
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
Gerrit-Change-Number: 9195
Gerrit-PatchSet: 13
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 10 May 2018 23:26:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-10 Thread Dan Hecht (Code Review)
Hello Michael Ho, Sailesh Mukil, Tim Armstrong, Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10158

to look at the new patch set (#19).

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..

IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

The is the final change to clarify and break up the Coordinator's lock.
The state machine for the coordinator is made explicit, distinguishing
between executing state and multiple terminal states. Logic to
transition into a terminal state is centralized in one location and
executes exactly once for each coordinator object.

Derived from a patch for IMPALA_5384 by Marcel Kornacker.

Testing:
- exhaustive functional tests
- stress test on minicluster with memory overcommitment. Verified from
  the logs that this exercises all these paths:
  - successful queries
  - client requested cancellation
  - error from exec FInstances RPC
  - error reported asynchronously via report status RPC
  - eos before backend execution completed

Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
---
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/client-request-state.cc
M be/src/service/impala-server.h
M be/src/util/counting-barrier.h
6 files changed, 387 insertions(+), 391 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/10158/19
--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 19
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 19:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10158/19/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10158/19/be/src/runtime/coordinator.cc@599
PS19, Line 599:   WaitForBackends();
somewhere along the way I broke Kudu DML. Moving this back outside the if-stmt 
fixes that.



--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 19
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 May 2018 23:36:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-10 Thread Tim Armstrong (Code Review)
Hello Zoltan Borok-Nagy, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10165

to look at the new patch set (#11).

Change subject: IMPALA-6941: load more text scanner compression plugins
..

IMPALA-6941: load more text scanner compression plugins

Add extensions for LZ4 and ZSTD (which are supported by Hadoop).
Even without a plugin this results in better behaviour because
we don't try to treat the files with unknown extensions as
uncompressed text.

Also allow loading tables containing files with unsupported
compression types. There was weird behaviour before we knew
of the file extension but didn't support querying the table -
the catalog would load the table but the impalad would fail
processing the catalog update. The simplest way to fix it
is to just allow loading the tables.

Similarly, make the "LOAD DATA" operation more permissive -
we can copy files into a directory even if we can't
decompress them.

Switch to always checking plugin version - running mismatched plugin
is inherently unsafe.

Testing:
Positive case where LZO is loaded is exercised. Added
coverage for negative case where LZO is disabled.

Fixed test gaps:
* Querying LZO table with LZO plugin not available.
* Interacting with tables with known but unsupported text
  compressions.
* Querying files with unknown compression suffixes (which are
  treated as uncompressed text).

Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
---
M be/src/common/global-flags.cc
M be/src/exec/CMakeLists.txt
D be/src/exec/hdfs-lzo-text-scanner.cc
D be/src/exec/hdfs-lzo-text-scanner.h
A be/src/exec/hdfs-plugin-text-scanner.cc
A be/src/exec/hdfs-plugin-text-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/analysis/LoadDataStmt.java
M fe/src/main/java/org/apache/impala/catalog/HdfsCompression.java
M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
A testdata/workloads/functional-query/queries/QueryTest/disable-lzo-plugin.test
A 
testdata/workloads/functional-query/queries/QueryTest/unsupported-compression-partitions.test
A tests/custom_cluster/test_scanner_plugin.py
M tests/metadata/test_partition_metadata.py
18 files changed, 458 insertions(+), 280 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/10165/11
--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner

2018-05-10 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10256 )

Change subject: IMPALA-6957: calc thread resource requirement in planner
..


Patch Set 6:

(2 comments)

FE and explain changes look good to me

http://gerrit.cloudera.org:8080/#/c/10256/6/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/10256/6/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@261
PS6, Line 261: .setRequiredThreads(1).build()
Why is this 1 and not getNumInstancesPerHost()


http://gerrit.cloudera.org:8080/#/c/10256/4/testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test:

http://gerrit.cloudera.org:8080/#/c/10256/4/testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test@64
PS4, Line 64: instances
> I'm open to the idea but maybe we should decouple it from this change?
I like short names. I don't think a longer name will necessarily make it 
clearer. Even if we call it "total-instances-estimate" I can definitely see 
users asking about its precise meaning.

Many things here are estimates, so not sure it makes sense to add an explicit 
"estimate" prefix.



--
To view, visit http://gerrit.cloudera.org:8080/10256
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140
Gerrit-Change-Number: 10256
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 May 2018 23:42:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS

2018-05-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10370


Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS
..

IMPALA-7010: don't run memory usage tests on non-HDFS

Moved a number of tests with tuned mem_limits. In some cases
this required separating the tests from non-tuned functional
tests.

TestQueryMemLimit used very high and very low limits only, so seemed
safe to run in all configurations.

Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
---
A testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
A 
testdata/workloads/functional-query/queries/QueryTest/kudu_insert_mem_limit.test
A 
testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch-mem-limit.test
M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-aggs.test
R 
testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test
M tests/common/skip.py
M tests/query_test/test_insert.py
M tests/query_test/test_kudu.py
M tests/query_test/test_mem_usage_scaling.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_sort.py
M tests/query_test/test_spilling.py
15 files changed, 183 insertions(+), 150 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/10370/1
--
To view, visit http://gerrit.cloudera.org:8080/10370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
Gerrit-Change-Number: 10370
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner

2018-05-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10256 )

Change subject: IMPALA-6957: calc thread resource requirement in planner
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10256/6/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/10256/6/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@261
PS6, Line 261: .setRequiredThreads(1).build()
> Why is this 1 and not getNumInstancesPerHost()
The resource profile being computed here is for each instance of the fragment 
(see the method comment), computing the per-host value is done outside of this.


http://gerrit.cloudera.org:8080/#/c/10256/4/testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test:

http://gerrit.cloudera.org:8080/#/c/10256/4/testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test@64
PS4, Line 64: instances
> I like short names. I don't think a longer name will necessarily make it cl
Sounds like deferring it is a good idea :)



--
To view, visit http://gerrit.cloudera.org:8080/10256
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140
Gerrit-Change-Number: 10256
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 May 2018 23:51:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10165/10/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
File fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java:

http://gerrit.cloudera.org:8080/#/c/10165/10/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java@216
PS10, Line 216:   public boolean isFileCompressionTypeSupported(String fileName,
> Doing this would also prevent a minor regression where loading, say, files
Done



--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 10 May 2018 23:52:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 11:

The last PS includes a non-trivial change, so this really needs another quick 
look. Hopefully the change makes sense.


--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 10 May 2018 23:53:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS

2018-05-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10370 )

Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
Gerrit-Change-Number: 10370
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 11 May 2018 00:15:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5842: Write page index in Parquet files

2018-05-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9693 )

Change subject: IMPALA-5842: Write page index in Parquet files
..


Patch Set 16: Code-Review+1

(1 comment)

LGTM aside from one bug

http://gerrit.cloudera.org:8080/#/c/9693/16/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/9693/16/be/src/exec/hdfs-parquet-table-writer.cc@769
PS16, Line 769: page_index_memory_consumption_ += new_memory_allocation;
Need to increment this after the TryConsume() succeeds, otherwise we'll release 
too much.



--
To view, visit http://gerrit.cloudera.org:8080/9693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Gerrit-Change-Number: 9693
Gerrit-PatchSet: 16
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Anonymous Coward #248
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 11 May 2018 00:17:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-10 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 19:

(12 comments)

LGTM overall, just a few comments, some relating to style.

http://gerrit.cloudera.org:8080/#/c/10158/19/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10158/19/be/src/runtime/coordinator.h@93
PS19, Line 93: /// 2. exec_state_lock_, filter_lock_, ExecSummary::lock (leafs)
What about backend_states_init_lock_ ?


http://gerrit.cloudera.org:8080/#/c/10158/19/be/src/runtime/coordinator.h@206
PS19, Line 206: lock_
comment requires an update.


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h@325
PS18, Line 325: ExecState state
const ref


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h@350
PS18, Line 350: ExecState old_state, ExecState new_state
These look like they're read only, so they should be passed as const refs.


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h@357
PS18, Line 357: ExecState state
const ref


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h@360
PS18, Line 360: ExecState s
const ref


http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@127
PS14, Line 127:
You mean Cancel() and UpdateBackendExecStatus()? Maybe this is not worth 
mentioning here and can be specified in those functions instead.


http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@279
PS14, Line 279: _RESULTS: GetNext() set eos
To be more explicit, we should say "... or when Cancel() is called."

And we should document that Cancel() is called on an error in the appropriate 
place if we haven't done that already.


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@457
PS18, Line 457: ret_status
Preferable to get rid of the automatic variable 'ret_status' and just return 
exec_status_ here instead. Every time we copy statuses around, we're calling 
the copy constructor that does a string copy.


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@462
PS18, Line 462: Status ret_status;
Same here.


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@476
PS18, Line 476:   DCHECK(exec_status_.ok()) << exec_status_;
If an error happens after exec_state_ is already set to 
ExecState::RETURNED_RESULTS, do we still want to relay that in the logs?

Eg: Limit queries.

Currently, we will be adding that to the logs at L493 but we need to think 
through whether that's necessary, since calling HandleExecStateTransition() in 
L500 will also be a no-op in this case.


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@486
PS18, Line 486: !status.ok() && exec_status_.IsCancelled()
(!status.ok() && !status.IsCancelled() && exec_status_.IsCancelled())



--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 19
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 00:33:59 +
Gerrit-HasComments: Yes