[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10347 )

Change subject: Fix diagnostics path to not include the parent dir structure
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2443/


--
To view, visit http://gerrit.cloudera.org:8080/10347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c
Gerrit-Change-Number: 10347
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 05:53:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure

2018-05-09 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10347 )

Change subject: Fix diagnostics path to not include the parent dir structure
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c
Gerrit-Change-Number: 10347
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 05:50:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure

2018-05-09 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10347 )

Change subject: Fix diagnostics path to not include the parent dir structure
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10347/2/bin/diagnostics/collect_diagnostics.py
File bin/diagnostics/collect_diagnostics.py:

http://gerrit.cloudera.org:8080/#/c/10347/2/bin/diagnostics/collect_diagnostics.py@450
PS2, Line 450: # collection_root_dir is an absoulte path. There is no 
point in preserving its
> nit: its
Done


http://gerrit.cloudera.org:8080/#/c/10347/2/bin/diagnostics/collect_diagnostics.py@451
PS2, Line 451: # entire directory structure in the archive, so set the 
arcname accordingly.
> nit: "archive, so set"
Done



--
To view, visit http://gerrit.cloudera.org:8080/10347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c
Gerrit-Change-Number: 10347
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 05:49:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure

2018-05-09 Thread Bharath Vissapragada (Code Review)
Hello Lars Volker, Philip Zeyliger, Kim Jin Chul,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10347

to look at the new patch set (#3).

Change subject: Fix diagnostics path to not include the parent dir structure
..

Fix diagnostics path to not include the parent dir structure

Without the fix, the diagnostics tar file included the entire
directory structure of the diagnostics root dir.

Before:
===
$ tar tf /tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh.tar.gz
tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/
tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/stacks/
tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/stacks/jstack-0.txt


After:
=
$ tar tf /tmp/impala-diagnostics-2018-05-08-12-01-51-Y0nlQI.tar.gz
impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/
impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/stacks/
impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/stacks/jstack-0.txt
.

Tested with python 2.6

Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c
---
M bin/diagnostics/collect_diagnostics.py
1 file changed, 4 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/10347/3
--
To view, visit http://gerrit.cloudera.org:8080/10347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c
Gerrit-Change-Number: 10347
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure

2018-05-09 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10347 )

Change subject: Fix diagnostics path to not include the parent dir structure
..


Patch Set 2: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10347/2/bin/diagnostics/collect_diagnostics.py
File bin/diagnostics/collect_diagnostics.py:

http://gerrit.cloudera.org:8080/#/c/10347/2/bin/diagnostics/collect_diagnostics.py@450
PS2, Line 450: # collection_root_dir is an absoulte path. There is no 
point in preserving it's
nit: its


http://gerrit.cloudera.org:8080/#/c/10347/2/bin/diagnostics/collect_diagnostics.py@451
PS2, Line 451: # entire directory structure in the archive. So set the 
arcname accordingly.
nit: "archive, so set"



--
To view, visit http://gerrit.cloudera.org:8080/10347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c
Gerrit-Change-Number: 10347
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 04:27:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure

2018-05-09 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10347 )

Change subject: Fix diagnostics path to not include the parent dir structure
..


Patch Set 2:

Kim Jun Chul,

If you want to discuss a policy change about requiring JIRA mentions, it seems 
like the dev mailing list may be a better venue than individual code reviews.


--
To view, visit http://gerrit.cloudera.org:8080/10347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c
Gerrit-Change-Number: 10347
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 04:24:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] test-with-docker: work with git worktree

2018-05-09 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10335 )

Change subject: test-with-docker: work with git worktree
..


Patch Set 1:

For small changes, I think we think it's fine to avoid the paperwork. I looked, 
and I'm not the only person who does it.


--
To view, visit http://gerrit.cloudera.org:8080/10335
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9186e0b6f068aacc25f8d691508165c04329fa8b
Gerrit-Change-Number: 10335
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 04:22:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] Remove IMPALA THRIFT JAVA VERSION and untested Darwin Thrift versions.

2018-05-09 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10361 )

Change subject: Remove IMPALA_THRIFT_JAVA_VERSION and untested Darwin Thrift 
versions.
..


Patch Set 1:

In terms of additional color, I've heard from multiple folks that the 
environment variables are a major pain. It makes it hard to get started in 
Impala, harder to get tools set up just right, etc. One of the motivations here 
is that I've run into tools that parse pom.xml files but are confused by 
variable references. This is a drop in a big ocean, but I think it's 
directionally reasonable to work towards fewer things in impala-config.sh.


--
To view, visit http://gerrit.cloudera.org:8080/10361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I98f8c0b344afd001864653659c045b5d3c3e94ac
Gerrit-Change-Number: 10361
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 04:17:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] Remove IMPALA THRIFT JAVA VERSION and untested Darwin Thrift versions.

2018-05-09 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10361


Change subject: Remove IMPALA_THRIFT_JAVA_VERSION and untested Darwin Thrift 
versions.
..

Remove IMPALA_THRIFT_JAVA_VERSION and untested Darwin Thrift versions.

The singular use of IMPALA_THRIFT_JAVA_VERSION was in
impala-parent/pom.xml. We can reduce complexity by just inlining
the version there, like we do with several other Java dependencies.

Meanwhile, with the upgrade to Thrift 0.9.3, it doesn't make sense
to retain the Darwin path for Thrift 0.9.2. The reality is likely
that nobody is building Impala on Darwin.

Change-Id: I98f8c0b344afd001864653659c045b5d3c3e94ac
---
M bin/impala-config.sh
M impala-parent/pom.xml
2 files changed, 1 insertion(+), 7 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/10361/1
--
To view, visit http://gerrit.cloudera.org:8080/10361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I98f8c0b344afd001864653659c045b5d3c3e94ac
Gerrit-Change-Number: 10361
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR](2.x) IMPALA-6948,IMPALA-6962: add end-to-end tests

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10363 )

Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10363
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 10 May 2018 03:39:12 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-6948,IMPALA-6962: add end-to-end tests

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has uploaded a new patch set (#2) to the change 
originally created by Vuk Ercegovac. ( http://gerrit.cloudera.org:8080/10363 )

Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests
..

IMPALA-6948,IMPALA-6962: add end-to-end tests

Adds end-to-end tests to validate that following
various metadata operations, the catalog state
in catalogd and impalads is the same.

For IMPALA-6962, catalogd process restart for tests
is fixed.

Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Reviewed-on: http://gerrit.cloudera.org:8080/10291
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins 
---
M be/src/catalog/catalog-server.cc
M tests/common/impala_cluster.py
M tests/common/impala_service.py
A tests/custom_cluster/test_metadata_replicas.py
M tests/metadata/test_hms_integration.py
A tests/util/hive_utils.py
6 files changed, 242 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/10363/2
--
To view, visit http://gerrit.cloudera.org:8080/10363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10363
Gerrit-PatchSet: 2
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR](2.x) IMPALA-6948,IMPALA-6962: add end-to-end tests

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10363 )

Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests
..

IMPALA-6948,IMPALA-6962: add end-to-end tests

Adds end-to-end tests to validate that following
various metadata operations, the catalog state
in catalogd and impalads is the same.

For IMPALA-6962, catalogd process restart for tests
is fixed.

Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Reviewed-on: http://gerrit.cloudera.org:8080/10291
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins 
---
M be/src/catalog/catalog-server.cc
M tests/common/impala_cluster.py
M tests/common/impala_service.py
A tests/custom_cluster/test_metadata_replicas.py
M tests/metadata/test_hms_integration.py
A tests/util/hive_utils.py
6 files changed, 242 insertions(+), 68 deletions(-)

--
To view, visit http://gerrit.cloudera.org:8080/10363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10363
Gerrit-PatchSet: 2
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-09 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 16: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@522
PS15, Line 522: // TODO: is this needed? This will also happen on the "backend" 
side of cancel rpc.
  :   // And in the case of EOS, sink already knows this.
  :   if (coord_sink_ != nullptr) coord_sink_->CloseConsumer();
> Yeah, I agree, but would prefer to do it in an independent commit in case t
makes sense.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@943
PS15, Line 943: e : backend_states_) {
> In principle, I agree that's how we'd want it to work. But given the way Cl
sounds good


http://gerrit.cloudera.org:8080/#/c/10158/16/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10158/16/be/src/runtime/coordinator.cc@132
PS16, Line 132:
nit: extra space



--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 16
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 May 2018 02:16:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10306 )

Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2
..

IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2

There is a Hive bug in Hive 1.1.0 that can result
in a NullPointerException when doing parallel Hive
operations (see IMPALA-6532). Since dataload goes
parallel on Hive loads starting with IMPALA-6372,
dataload can hit this error on Hive 1.1.0 (i.e.
IMPALA_MINICLUSTER_PROFILE=2). This is impacting
builds on the 2.x branch.

This disables parallel dataload for IMPALA_MINICLUSTER_PROFILE=2.

IMPALA_MINICLUSTER_PROFILE=3 uses a newer version
of Hive that has a fix for this, so this continues
to use parallel dataload for that case.

Parallelism can be reenabled when Hive 1.1.0 gets the
fix from Hive 2.1.1.

Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Reviewed-on: http://gerrit.cloudera.org:8080/10306
Reviewed-by: Philip Zeyliger 
Tested-by: Impala Public Jenkins 
---
M bin/impala-config.sh
M bin/load-data.py
M testdata/bin/create-load-data.sh
3 files changed, 11 insertions(+), 1 deletion(-)

Approvals:
  Philip Zeyliger: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Gerrit-Change-Number: 10306
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10306 )

Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Gerrit-Change-Number: 10306
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 01:30:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners

2018-05-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8936 )

Change subject: IMPALA-3833: Fix invalid data handling in Sequence and RCFile 
scanners
..


Patch Set 19: Code-Review+2

Will wait for the other prerequisite to be merged.


--
To view, visit http://gerrit.cloudera.org:8080/8936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9cfc38af3f30c65ada9734eb471dbfa6ecdd74a
Gerrit-Change-Number: 8936
Gerrit-PatchSet: 19
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Thu, 10 May 2018 00:48:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 16: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10158/16/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10158/16/be/src/runtime/coordinator.cc@618
PS16, Line 618: currently
concurrently



--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 16
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 May 2018 00:47:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] test-with-docker: work with git worktree

2018-05-09 Thread Kim Jin Chul (Code Review)
Kim Jin Chul has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10335 )

Change subject: test-with-docker: work with git worktree
..


Patch Set 1:

Why don't you put JIRA on the commit msg?


--
To view, visit http://gerrit.cloudera.org:8080/10335
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9186e0b6f068aacc25f8d691508165c04329fa8b
Gerrit-Change-Number: 10335
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Comment-Date: Thu, 10 May 2018 00:34:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] test-with-docker: exit properly on failures

2018-05-09 Thread Kim Jin Chul (Code Review)
Kim Jin Chul has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10318 )

Change subject: test-with-docker: exit properly on failures
..


Patch Set 1:

Why don't you put JIRA on the commit msg?


--
To view, visit http://gerrit.cloudera.org:8080/10318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I285e2f4d07e34898d73beba857e9ac325ed4e6db
Gerrit-Change-Number: 10318
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 10 May 2018 00:34:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 14:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@291
PS14, Line 291:   ExecState exec_state_ = ExecState::EXECUTING;
> I agree it's a little confusing, but there is nothing you can do with this
I think the changes to the header comment made it clearer.



--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 14
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 May 2018 00:31:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR](2.x) IMPALA-6948,IMPALA-6962: add end-to-end tests

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10363 )

Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2440/


--
To view, visit http://gerrit.cloudera.org:8080/10363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10363
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 10 May 2018 00:24:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-09 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 16:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h@322
PS15, Line 322: cu
> double space snuck back in
Done


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h@341
PS15, Line 341: ith 'instance_ho
> nit: update to "failed_finstance"
Done


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h@414
PS15, Line 414:
> nit: extra space
Done


http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@a115
PS14, Line 115:
  :
> put that back
Done


http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@121
PS14, Line 121: may call any
> should we call this a non-OK status, because for cancellation through Cance
Done


http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@131
PS14, Line 131:
> do you mean cancels?
I think I meant to delete that.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@147
PS15, Line 147: Status prepare_status = query_state_->WaitForPrepare();
  :   DCHECK(!prepare_status.ok());
  :   return UpdateExecState
> should we call UpdateExecStateIfError() here?
yes, done.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@486
PS15, Line 486: !status.ok() && exec_status_.IsCancelled()
> can we have a case with  =  ?
Ideally no. but I think the backend can sometimes do it due to complications in 
the error propagation path. I would like to get to a point where this can't be 
cancelled, but I don't think we're there yet.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@522
PS15, Line 522: // TODO: is this needed? This will also happen on the "backend" 
side of cancel rpc.
  :   // And in the case of EOS, sink already knows this.
  :   if (coord_sink_ != nullptr) coord_sink_->CloseConsumer();
> I dont think we need this, I looked at why we originally needed this and it
Yeah, I agree, but would prefer to do it in an independent commit in case there 
is some unforeseen issue.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@530
PS15, Line 530:
  :   } else {
> in case of cancelBackends(), should we also wait for resources to be freed?
I think the argument can be made in both directions, but we eventually settled 
on the current behavior. The reasoning is documented in the header comment for 
ReleaseAdmissionControlResources(). I'll update this todo as to not add 
confusion.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@605
PS15, Line 605:
> maybe change the name to something like UpdateExecState, since this is also
The "if error" was meant to say that the exec state is only updated if the 
status is an not-ok and only to move into an error state (as opposed to, say, 
cancell state). But I can see the confusion so i'll change it since you feel 
it'd be clearer. I guess the header comment should clarify that this shouldn't 
be used for cancellation.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@639
PS15, Line 639: DCHECK(exec_rpcs_complete_barrier_ != nullptr &&
  :   exec_rpcs_complete_barrier_->pending() <= 0) << "Exec() 
must be called first";
  :   discard_result(SetNonErrorTerminalSt
> does this mean we can call cancel before Exec() completes? what if cancel i
Yeah, this is bogus. Fixed. The comments in the header already say Exec() needs 
to precede this.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@943
PS15, Line 943: e : backend_states_) {
> Do we expect these methods to be called before Exec() completes? If not, we
In principle, I agree that's how we'd want it to work. But given the way 
ClientRequestState currently works and that things like the webserver just 
reach in and get the coord() object, I think they can currently call these 
webserver support rourtines before Exec() completes. How about I revisit this 
after your patch is merged?



--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 16
Gerrit-Owner: Dan Hecht 
Ger

[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-09 Thread Dan Hecht (Code Review)
Hello Michael Ho, Sailesh Mukil, Tim Armstrong, Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10158

to look at the new patch set (#16).

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..

IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

The is the final change to clarify and break up the Coordinator's lock.
The state machine for the coordinator is made explicit, distinguishing
between executing state and multiple terminal states. Logic to
transition into a terminal state is centralized in one location and
executes exactly once for each coordinator object.

Derived from a patch for IMPALA_5384 by Marcel Kornacker.

Testing:
- exhaustive functional tests
- stress test on minicluster with memory overcommitment. Verified from
  the logs that this exercises all these paths:
  - successful queries
  - client requested cancellation
  - error from exec FInstances RPC
  - error reported asynchronously via report status RPC
  - eos before backend execution completed

Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
---
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/client-request-state.cc
M be/src/service/impala-server.h
M be/src/util/counting-barrier.h
6 files changed, 389 insertions(+), 391 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/10158/16
--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 16
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR](2.x) IMPALA-6948,IMPALA-6962: add end-to-end tests

2018-05-09 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10363 )

Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10363
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Comment-Date: Thu, 10 May 2018 00:15:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6923:Update scripts in benchmark folder to store workload and few minor updates

2018-05-09 Thread Kim Jin Chul (Code Review)
Kim Jin Chul has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10100 )

Change subject: IMPALA-6923:Update scripts in benchmark folder to store 
workload and few minor updates
..


Patch Set 8:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10100/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10100/8//COMMIT_MSG@7
PS8, Line 7: Update
Please add a space in front of this.


http://gerrit.cloudera.org:8080/#/c/10100/8/tests/benchmark/report_benchmark_results.py
File tests/benchmark/report_benchmark_results.py:

http://gerrit.cloudera.org:8080/#/c/10100/8/tests/benchmark/report_benchmark_results.py@141
PS8, Line 141: Dont
I think either "Don't" or "Do not" is better than Dont. This is used for user's 
helper message.


http://gerrit.cloudera.org:8080/#/c/10100/8/tests/benchmark/report_benchmark_results.py@1058
PS8, Line 1058:   if exec_summaries[0] is None:
  : return ""
I think this is a redundant special handling.  I believe L1054-1055 can be 
merged into the L1058-1059 because there is no symbol dependent issue.

e.g.)
if options.hive_results or exec_summaries[0] is None:
  return ""



--
To view, visit http://gerrit.cloudera.org:8080/10100
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ica7c00ad59963d466bae9e607a4692af0138962c
Gerrit-Change-Number: 10100
Gerrit-PatchSet: 8
Gerrit-Owner: Nithya Janarthanan 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Nithya Janarthanan 
Gerrit-Comment-Date: Thu, 10 May 2018 00:12:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6997: Avoid redundant dumping in SetMemLimitExceeded()

2018-05-09 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10364


Change subject: IMPALA-6997: Avoid redundant dumping in SetMemLimitExceeded()
..

IMPALA-6997: Avoid redundant dumping in SetMemLimitExceeded()

When a UDF hits a MemLimitExceeded, the query does not
immediately abort. Instead, UDFs rely on the caller
checking the query_status_ periodically. This means that
on some codepaths, UDFs can call SetMemLimitExceeded()
many times (e.g. once per row) before the query fragment
exits.

RuntimeState::SetMemLimitExceeded() currently constructs
a MemLimitExceeded Status and dumps it for each call, even
if the query has already hit an error. This is expensive
and can delay an fragment from exiting when UDFs are
repeatedly hitting MemLimitExceeded.

This changes SetMemLimitExceeded() to avoid dumping if
the query_status_ is already not ok.

Change-Id: I92b87f370a68a2f695ebbc2520a98dd143730701
---
M be/src/runtime/runtime-state.cc
1 file changed, 7 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/10364/1
--
To view, visit http://gerrit.cloudera.org:8080/10364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I92b87f370a68a2f695ebbc2520a98dd143730701
Gerrit-Change-Number: 10364
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR](2.x) IMPALA-6948,IMPALA-6962: add end-to-end tests

2018-05-09 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10363


Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests
..

IMPALA-6948,IMPALA-6962: add end-to-end tests

Adds end-to-end tests to validate that following
various metadata operations, the catalog state
in catalogd and impalads is the same.

For IMPALA-6962, catalogd process restart for tests
is fixed.

Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Reviewed-on: http://gerrit.cloudera.org:8080/10291
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins 
(cherry picked from commit 28c1f76529b79d437b66a80b954ec227e0ddc6cd)
---
M be/src/catalog/catalog-server.cc
M tests/common/impala_cluster.py
M tests/common/impala_service.py
A tests/custom_cluster/test_metadata_replicas.py
M tests/metadata/test_hms_integration.py
A tests/util/hive_utils.py
6 files changed, 242 insertions(+), 68 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/10363/1
--
To view, visit http://gerrit.cloudera.org:8080/10363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10363
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac 


[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure

2018-05-09 Thread Kim Jin Chul (Code Review)
Kim Jin Chul has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10347 )

Change subject: Fix diagnostics path to not include the parent dir structure
..


Patch Set 2:

Why don't you put JIRA in your commit message?
As far as I know, most changes include JIRA in the commit message. But some of 
the merged changes did not exist in JIRA. I think it's a good idea to include 
JIRA consistently even with small changes. What do you think?


--
To view, visit http://gerrit.cloudera.org:8080/10347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c
Gerrit-Change-Number: 10347
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 09 May 2018 23:36:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-09 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10060 )

Change subject: IMPALA-5216: Make admission control queuing async
..


Patch Set 7:

(27 comments)

I need to dig into the details some more but some initial comments/questions.

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@44
PS7, Line 44: ClientRequestState::admit_outcome_
if this enum is meant to be consumed publicly, then it's best to not refer to 
this private variable when explaining it. let's try to explain it in terms of 
the public portion of the AdmissionController abstraction.


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@201
PS7, Line 201: admit_status
admit_outcome?


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@204
PS7, Line 204: admit_status
admit_outcome


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h
File be/src/service/client-request-state.h:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@72
PS7, Line 72:   /// Must *not* be called with lock_ held.
given that the coordinator object is (unfortunately) current exposed by this 
class, let's document when, and in which cases, that gets created.


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@114
PS7, Line 114: non-error states (RUNNING_STATE and FINISHED_STATE)
is PENDING_STATE one of those?


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@152
PS7, Line 152: coord_
let's try not to talk about private members when documenting the public 
interface (so that we can try to make sense of the abstractions without knowing 
all the implementation details).


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@172
PS7, Line 172:   Coordinator* coord() const {
 : return coord_exec_started_.Load() ? coord_.get() : nullptr;
 :   }
it's confusing that coord() is not actually an accessor given it's name. If we 
really need this then we should name it GetCoord() or something.

But, it seems weird that we need this. We really need to cleanup the CRS 
states, but in the mean time, why can't we just defer setting coord_ until 
after coord->Exec() returns (and maybe hold the lock while setting coord_ if 
that makes the synchronization simpler)?


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@323
PS7, Line 323: Exec() has been successfully called on it
why do we bother creating the coordinator object ahead of when we call Exec()? 
i.e. what use is it before Exec() is called?


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-hs2-server.cc@448
PS7, Line 448:   DCHECK(status.ok() || 
!request_state->uses_admission_control());
why is that the case, and why does this code care?


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-hs2-server.cc@464
PS7, Line 464: if (!status.ok()) goto return_error;
if we hit an error here, how does the WaitAsync thread finish?


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-http-handler.cc
File be/src/service/impala-http-handler.cc:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-http-handler.cc@769
PS7, Line 769:  == nullptr
not your change, and let's not fix it here, but given that the coord_ ptr is 
set outside of holding the lock, these checks aren't really technically safe. 
This is another example of why we should tighten up the specification of the 
CRS states.


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-http-handler.cc@769
PS7, Line 769: uses_admission_control(
should that check for PENDING_STATE instead, to be more specific?


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-http-handler.cc@778
PS7, Line 778: request_state->coord() != nullptr
this would also be more about the states, but okay to leave it for now.


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-http-handler.cc@779
PS7, Line 779: request_state->coord()->GetTExecSummary(&summary);
I guess PrintExecSummary() is okay with an empty summary? have you tested that? 
I think previously the summary would at least be initialized (though given that 
the CRS lock isn't held across setting the coord_ ptr and calling Exec(), there 
was a small window previously that could also lead to this empty case).


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/i

[Impala-ASF-CR] IMPALA-7003: Deflake erasure coding data loading

2018-05-09 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10362


Change subject: IMPALA-7003: Deflake erasure coding data loading
..

IMPALA-7003: Deflake erasure coding data loading

Erasure coding data loading is flaky in two ways:
1. HBase sometimes doesn't work because of HBase-19369
2. Nested data loading sometimes fails because the HDFS namenode cannot
   find enough good datanodes.

For problem 1, this patch enables erasure coding only on /test-warehouse
directory. For problem 2, this patch sets
dfs.namenode.redundancy.considerLoad to false, preventing namenode to
exclude heavily-loaded datanodes.

Change-Id: I219106cd3ec7ffab7a834700f2a722b165e5f66c
---
M bin/impala-config.sh
M testdata/bin/create-load-data.sh
M testdata/bin/load-test-warehouse-snapshot.sh
M testdata/bin/setup-hdfs-env.sh
M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
5 files changed, 20 insertions(+), 10 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/10362/1
--
To view, visit http://gerrit.cloudera.org:8080/10362
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I219106cd3ec7ffab7a834700f2a722b165e5f66c
Gerrit-Change-Number: 10362
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-6819: Add new queries to targeted-perf workload

2018-05-09 Thread David Knupp (Code Review)
David Knupp has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9979 )

Change subject: IMPALA-6819: Add new queries to targeted-perf workload
..

IMPALA-6819: Add new queries to targeted-perf workload

Description:
Adding new queries to the targeted-perf workload that is
used by Impala performance tests run via $IMPALA_HOME/bin/run-workload.py

Testing:
Ran the performance tests for the targeted-perf workload
and all the tests passed

Change-Id: I5c415924d0bb6da1b1f5df6cb16b95a1d2eaa3ab
Reviewed-on: http://gerrit.cloudera.org:8080/9979
Tested-by: Impala Public Jenkins 
Reviewed-by: David Knupp 
---
A testdata/workloads/targeted-perf/queries/primitive_conjunct_ordering_1.test
A testdata/workloads/targeted-perf/queries/primitive_conjunct_ordering_2.test
A testdata/workloads/targeted-perf/queries/primitive_conjunct_ordering_3.test
A testdata/workloads/targeted-perf/queries/primitive_conjunct_ordering_4.test
A testdata/workloads/targeted-perf/queries/primitive_count_star.test
A testdata/workloads/targeted-perf/queries/primitive_decimal_arithmetic.test
A testdata/workloads/targeted-perf/queries/primitive_filter_bigint_in_list.test
A testdata/workloads/targeted-perf/queries/primitive_intrinsic_appx_median.test
A testdata/workloads/targeted-perf/queries/primitive_intrinsic_to_date.test
M testdata/workloads/targeted-perf/queries/primitive_long_predicate.test
A testdata/workloads/targeted-perf/queries/primitive_many_fragments.test
A 
testdata/workloads/targeted-perf/queries/primitive_many_independent_fragments.test
A 
testdata/workloads/targeted-perf/queries/primitive_orderby_bigint_expression.test
A testdata/workloads/targeted-perf/queries/primitive_shuffle_1mb_rows.test
14 files changed, 675 insertions(+), 171 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  David Knupp: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/9979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I5c415924d0bb6da1b1f5df6cb16b95a1d2eaa3ab
Gerrit-Change-Number: 9979
Gerrit-PatchSet: 6
Gerrit-Owner: Nithya Janarthanan 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 


[Impala-ASF-CR] IMPALA-6819: Add new queries to targeted-perf workload

2018-05-09 Thread David Knupp (Code Review)
David Knupp has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9979 )

Change subject: IMPALA-6819: Add new queries to targeted-perf workload
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5c415924d0bb6da1b1f5df6cb16b95a1d2eaa3ab
Gerrit-Change-Number: 9979
Gerrit-PatchSet: 5
Gerrit-Owner: Nithya Janarthanan 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Comment-Date: Wed, 09 May 2018 23:07:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6987: [DOCS] Update when INVALIDATE METADATA is required

2018-05-09 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10339 )

Change subject: IMPALA-6987: [DOCS] Update when INVALIDATE METADATA is required
..


Patch Set 1:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml
File docs/topics/impala_invalidate_metadata.xml:

http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@47
PS1, Line 47: relatively
> replace: very
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@48
PS1, Line 48: in the common scenario of adding new data files to an existing 
table
> replace: whenever possible (link to INVALIDATE vs. REFRESH usage page, if e
Replaced with "whenever possible" No link to add, though.


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@59
PS1, Line 59: By default
> replace: If there is no table specified
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@60
PS1, Line 60: Even for a single table, INVALIDATE METADATA is 
more expensive
:   than REFRESH, so prefer 
REFRESH in the common case where you add new data
:   files for an existing table.
> Same thing mentioned ~10 lines above - remove?
Removed


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@69
PS1, Line 69: Therefore, if some other entity modifies information used by 
Impala in the metastore
:   that Impala and Hive share, the information cached by 
Impala must be updated. However, this does not mean
:   that all metadata updates require an Impala update.
> This is vague, explicitly state when is manual invalidate needed (see L100-
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@74
PS1, Line 74: 
:   
:   
: In Impala 1.2 and higher, a dedicated daemon 
(catalogd) broadcasts DDL changes made
: through Impala to all Impala nodes. Formerly, after you 
created a database or table while connected to one
: Impala node, you needed to issue an INVALIDATE 
METADATA statement on another Impala node
: before accessing the new database or table from the other 
node. Now, newly created or altered objects are
: picked up automatically by all Impala nodes. You must 
still use the INVALIDATE METADATA
: technique after creating or altering objects through 
Hive. See
:  for 
more information on the catalog service.
:   
:   
: The INVALIDATE METADATA statement is new 
in Impala 1.1 and higher, and takes over some of
: the use cases of the Impala 1.0 REFRESH 
statement. Because REFRESH now
: requires a table name parameter, to flush the metadata 
for all tables at once, use the INVALIDATE
: METADATA statement.
:   
:   
: 
> This section is very outdated and mixes up usage of old vs. new usage of RE
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@99
PS1, Line 99: instance
> Don't mention individual instances in this context, it's the service as a w
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@100
PS1, Line 100: required if a change is made from another 
impalad
 :   instance in your cluster, or through Hive and is 
distributed by
 : catalogd.
> INVALIDATE METADATA is required when:
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@106
PS1, Line 106: same Impala node
> This is pre-1.2 information. No INVALIDATE is needed as long as the changes
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@127
PS1, Line 127: INVALIDATE METADATA causes the metadata for 
that table to be marked as stale, and reloaded
 :   the next time the table is referenced. For a huge table, 
that process could take a noticeable amount of time;
> Repeat of L44-47
The whole paragraph is replaced.


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@129
PS1, Line 129: thus you might prefer to use REFRESH where 
practical, to avoid an unpredictable delay later,
 :   for example if the next reference to the table is during a 
benchmark test.
> Key information missing: use REFRESH after invalidating a specific table to
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@137
PS1, Line 137: (such as SequenceFile or HBase tables)
> remove
Done


http://gerrit.cloudera.o

[Impala-ASF-CR] IMPALA-6923:Update scripts in benchmark folder to store workload and few minor updates

2018-05-09 Thread Nithya Janarthanan (Code Review)
Nithya Janarthanan has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/10100 )

Change subject: IMPALA-6923:Update scripts in benchmark folder to store 
workload and few minor updates
..

IMPALA-6923:Update scripts in benchmark folder to store workload and few minor 
updates

Updates to the scripts in benchmark folder to
 - Store test execution results to ExecutionResults table
 - Store workload metrics to WorkloadMetrics
 - Option to mark a performance run as official
 - Option to skip saving of profiles

Testing: Executed these script to create the database and
populate the tables with the above mentioned data

Change-Id: Ica7c00ad59963d466bae9e607a4692af0138962c
---
M tests/benchmark/create_database.py
M tests/benchmark/perf_result_datastore.py
M tests/benchmark/report_benchmark_results.py
3 files changed, 252 insertions(+), 74 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/10100/8
--
To view, visit http://gerrit.cloudera.org:8080/10100
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ica7c00ad59963d466bae9e607a4692af0138962c
Gerrit-Change-Number: 10100
Gerrit-PatchSet: 8
Gerrit-Owner: Nithya Janarthanan 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Nithya Janarthanan 


[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10291 )

Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10291
Gerrit-PatchSet: 6
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 09 May 2018 22:27:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10291 )

Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests
..

IMPALA-6948,IMPALA-6962: add end-to-end tests

Adds end-to-end tests to validate that following
various metadata operations, the catalog state
in catalogd and impalads is the same.

For IMPALA-6962, catalogd process restart for tests
is fixed.

Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Reviewed-on: http://gerrit.cloudera.org:8080/10291
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins 
---
M be/src/catalog/catalog-server.cc
M tests/common/impala_cluster.py
M tests/common/impala_service.py
A tests/custom_cluster/test_metadata_replicas.py
M tests/metadata/test_hms_integration.py
A tests/util/hive_utils.py
6 files changed, 242 insertions(+), 68 deletions(-)

Approvals:
  Alex Behm: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10291
Gerrit-PatchSet: 7
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10306 )

Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2437/


--
To view, visit http://gerrit.cloudera.org:8080/10306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Gerrit-Change-Number: 10306
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 09 May 2018 21:59:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2

2018-05-09 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10306 )

Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Gerrit-Change-Number: 10306
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 09 May 2018 21:11:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2

2018-05-09 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10306 )

Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2
..


Patch Set 4:

Fixed variable initialization. Tested dataload locally.


--
To view, visit http://gerrit.cloudera.org:8080/10306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Gerrit-Change-Number: 10306
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 09 May 2018 21:02:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6972: Disable parallel dataload on MINICLUSTER PROFILE=2

2018-05-09 Thread Joe McDonnell (Code Review)
Hello Philip Zeyliger, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10306

to look at the new patch set (#4).

Change subject: IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2
..

IMPALA-6972: Disable parallel dataload on MINICLUSTER_PROFILE=2

There is a Hive bug in Hive 1.1.0 that can result
in a NullPointerException when doing parallel Hive
operations (see IMPALA-6532). Since dataload goes
parallel on Hive loads starting with IMPALA-6372,
dataload can hit this error on Hive 1.1.0 (i.e.
IMPALA_MINICLUSTER_PROFILE=2). This is impacting
builds on the 2.x branch.

This disables parallel dataload for IMPALA_MINICLUSTER_PROFILE=2.

IMPALA_MINICLUSTER_PROFILE=3 uses a newer version
of Hive that has a fix for this, so this continues
to use parallel dataload for that case.

Parallelism can be reenabled when Hive 1.1.0 gets the
fix from Hive 2.1.1.

Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
---
M bin/impala-config.sh
M bin/load-data.py
M testdata/bin/create-load-data.sh
3 files changed, 11 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/10306/4
--
To view, visit http://gerrit.cloudera.org:8080/10306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I90a0f2b3756d7192fa7db2958031b8c88eb606e6
Gerrit-Change-Number: 10306
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-09 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 15:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h@341
PS15, Line 341: failed_fragment'
nit: update to "failed_finstance"


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h@414
PS15, Line 414:
nit: extra space


http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@121
PS14, Line 121: s not thread
should we call this a non-OK status, because for cancellation through Cancel() 
we should not call it error


http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@131
PS14, Line 131: tate_
do you mean cancels?


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@147
PS15, Line 147: Status prepare_status = query_state_->WaitForPrepare();
  :   DCHECK(!prepare_status.ok());
  :   return prepare_status;
should we call UpdateExecStateIfError() here?


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@486
PS15, Line 486: !status.ok() && exec_status_.IsCancelled()
can we have a case with  =  ?


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@522
PS15, Line 522: // TODO: is this needed? This will also happen on the "backend" 
side of cancel rpc.
  :   // And in the case of EOS, sink already knows this.
  :   if (coord_sink_ != nullptr) coord_sink_->CloseConsumer();
I dont think we need this, I looked at why we originally needed this and it was 
because the fragment lifecycle was a little different and we needed to to close 
the sink in case the Prepare phase of the fragments failed. After patch 
IMPALA-2550, we handle partially-prepared state failure differently and hence 
dont require this anymore.

Moreover, since this was used to close partially-prepared state failure and now 
if we fail prepare, coord_sink_ is not even set, so this is essentially 
redundant.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@530
PS15, Line 530: ensure resources are
  : // really freed before admitting more queries?
in case of cancelBackends(), should we also wait for resources to be freed?


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@605
PS15, Line 605: UpdateExecStateIfError
maybe change the name to something like UpdateExecState, since this is also 
used to only get the overall status and it may or may not be an error status


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@639
PS15, Line 639: // Wait until backends have started executing, otherwise the 
cancel rpc might arrive
  :   // before the exec rpc.
  :   exec_rpcs_complete_barrier_->Wait();
does this mean we can call cancel before Exec() completes? what if cancel is 
called before exec_rpcs_complete_barrier_ is initialized and is still nullptr.
We should probably just mark this method to only be called after Exec completes 
because currently that how it is used.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@943
PS15, Line 943: backend_states_init_lock_
Do we expect these methods to be called before Exec() completes? If not, we can 
get rid of backend_states_init_lock_ and mark these methods to only be called 
after Exec()



--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 15
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 May 2018 20:55:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6995: avoid DCHECK in TimestampParse::Parse()

2018-05-09 Thread Tim Armstrong (Code Review)
Hello Kim Jin Chul,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10349

to look at the new patch set (#2).

Change subject: IMPALA-6995: avoid DCHECK in TimestampParse::Parse()
..

IMPALA-6995: avoid DCHECK in TimestampParse::Parse()

The bug was that the string's length is checked before trimming leading
and trailing spaces instead of afterwards. The bug has been present for
a long time but couldn't hit a DCHECK until recently.

Testing:
Added some backend tests that reproduce the crash.

Change-Id: I02a18ffd8893fe74f5830144300f745ce31477b1
---
M be/src/exprs/expr-test.cc
M be/src/runtime/timestamp-parse-util.cc
2 files changed, 56 insertions(+), 51 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/10349/2
--
To view, visit http://gerrit.cloudera.org:8080/10349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I02a18ffd8893fe74f5830144300f745ce31477b1
Gerrit-Change-Number: 10349
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR](asf-site) [DOCS] Impala doc site update for 3.0

2018-05-09 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10322 )

Change subject: [DOCS] Impala doc site update for 3.0
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10322
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: comment
Gerrit-Change-Id: Icf5927efa7baa965095a3ff2fd4ec7411313342d
Gerrit-Change-Number: 10322
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Ambreen Kazi 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 09 May 2018 20:55:18 +
Gerrit-HasComments: No


[Impala-ASF-CR](asf-site) [DOCS] Impala doc site update for 3.0

2018-05-09 Thread Alex Rodoni (Code Review)
Alex Rodoni has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10322 )

Change subject: [DOCS] Impala doc site update for 3.0
..

[DOCS] Impala doc site update for 3.0

Add Impala docs from branch master.
Commit hash: f20415755401d56103df7f16348cea8ed12fb3d8

Change-Id: Icf5927efa7baa965095a3ff2fd4ec7411313342d
Reviewed-on: http://gerrit.cloudera.org:8080/10322
Reviewed-by: Michael Brown 
Tested-by: Alex Rodoni 
---
A docs/build3x/html/commonltr.css
A docs/build3x/html/commonrtl.css
A docs/build3x/html/images/impala_arch.jpeg
A docs/build3x/html/index.html
A docs/build3x/html/topics/impala_abort_on_error.html
A docs/build3x/html/topics/impala_adls.html
A docs/build3x/html/topics/impala_admin.html
A docs/build3x/html/topics/impala_admission.html
A docs/build3x/html/topics/impala_aggregate_functions.html
A docs/build3x/html/topics/impala_aliases.html
A docs/build3x/html/topics/impala_allow_unsupported_formats.html
A docs/build3x/html/topics/impala_alter_table.html
A docs/build3x/html/topics/impala_alter_view.html
A docs/build3x/html/topics/impala_analytic_functions.html
A docs/build3x/html/topics/impala_appx_count_distinct.html
A docs/build3x/html/topics/impala_appx_median.html
A docs/build3x/html/topics/impala_array.html
A docs/build3x/html/topics/impala_auditing.html
A docs/build3x/html/topics/impala_authentication.html
A docs/build3x/html/topics/impala_authorization.html
A docs/build3x/html/topics/impala_avg.html
A docs/build3x/html/topics/impala_avro.html
A docs/build3x/html/topics/impala_batch_size.html
A docs/build3x/html/topics/impala_bigint.html
A docs/build3x/html/topics/impala_bit_functions.html
A docs/build3x/html/topics/impala_boolean.html
A docs/build3x/html/topics/impala_breakpad.html
A docs/build3x/html/topics/impala_buffer_pool_limit.html
A docs/build3x/html/topics/impala_char.html
A docs/build3x/html/topics/impala_comments.html
A docs/build3x/html/topics/impala_complex_types.html
A docs/build3x/html/topics/impala_components.html
A docs/build3x/html/topics/impala_compression_codec.html
A docs/build3x/html/topics/impala_compute_stats.html
A docs/build3x/html/topics/impala_compute_stats_min_sample_size.html
A docs/build3x/html/topics/impala_concepts.html
A docs/build3x/html/topics/impala_conditional_functions.html
A docs/build3x/html/topics/impala_config.html
A docs/build3x/html/topics/impala_config_options.html
A docs/build3x/html/topics/impala_config_performance.html
A docs/build3x/html/topics/impala_connecting.html
A docs/build3x/html/topics/impala_conversion_functions.html
A docs/build3x/html/topics/impala_count.html
A docs/build3x/html/topics/impala_create_database.html
A docs/build3x/html/topics/impala_create_function.html
A docs/build3x/html/topics/impala_create_role.html
A docs/build3x/html/topics/impala_create_table.html
A docs/build3x/html/topics/impala_create_view.html
A docs/build3x/html/topics/impala_databases.html
A docs/build3x/html/topics/impala_datatypes.html
A docs/build3x/html/topics/impala_datetime_functions.html
A docs/build3x/html/topics/impala_ddl.html
A docs/build3x/html/topics/impala_debug_action.html
A docs/build3x/html/topics/impala_decimal.html
A docs/build3x/html/topics/impala_decimal_v2.html
A docs/build3x/html/topics/impala_default_join_distribution_mode.html
A docs/build3x/html/topics/impala_default_spillable_buffer_size.html
A docs/build3x/html/topics/impala_delegation.html
A docs/build3x/html/topics/impala_delete.html
A docs/build3x/html/topics/impala_describe.html
A docs/build3x/html/topics/impala_development.html
A docs/build3x/html/topics/impala_disable_codegen.html
A docs/build3x/html/topics/impala_disable_row_runtime_filtering.html
A docs/build3x/html/topics/impala_disable_streaming_preaggregations.html
A docs/build3x/html/topics/impala_disable_unsafe_spills.html
A docs/build3x/html/topics/impala_disk_space.html
A docs/build3x/html/topics/impala_distinct.html
A docs/build3x/html/topics/impala_dml.html
A docs/build3x/html/topics/impala_double.html
A docs/build3x/html/topics/impala_drop_database.html
A docs/build3x/html/topics/impala_drop_function.html
A docs/build3x/html/topics/impala_drop_role.html
A docs/build3x/html/topics/impala_drop_stats.html
A docs/build3x/html/topics/impala_drop_table.html
A docs/build3x/html/topics/impala_drop_view.html
A docs/build3x/html/topics/impala_exec_single_node_rows_threshold.html
A docs/build3x/html/topics/impala_exec_time_limit_s.html
A docs/build3x/html/topics/impala_explain.html
A docs/build3x/html/topics/impala_explain_level.html
A docs/build3x/html/topics/impala_explain_plan.html
A docs/build3x/html/topics/impala_faq.html
A docs/build3x/html/topics/impala_file_formats.html
A docs/build3x/html/topics/impala_fixed_issues.html
A docs/build3x/html/topics/impala_float.html
A docs/build3x/html/topics/impala_functions.html
A docs/build3x/html/topics/impala_functions_overview.html
A docs/build3x/html/topics/impala_grant.html
A do

[Impala-ASF-CR] IMPALA-6995: avoid DCHECK in TimestampParse::Parse()

2018-05-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10349 )

Change subject: IMPALA-6995: avoid DCHECK in TimestampParse::Parse()
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10349/1/be/src/runtime/timestamp-parse-util.cc
File be/src/runtime/timestamp-parse-util.cc:

http://gerrit.cloudera.org:8080/#/c/10349/1/be/src/runtime/timestamp-parse-util.cc@399
PS1, Line 399: *d = boost::gregorian::date();
 : *t = 
boost::posix_time::time_duration(boost::posix_time::not_a_date_time);
> From a code refactoring perspective, I would suggest you make a new static
Created a helper function. I actually ended up refactoring the handling of 
lengths in this function too in order to better distinguish the different 
length values.



--
To view, visit http://gerrit.cloudera.org:8080/10349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a18ffd8893fe74f5830144300f745ce31477b1
Gerrit-Change-Number: 10349
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 May 2018 20:54:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR](asf-site) [DOCS] Impala doc site update for 3.0

2018-05-09 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10322 )

Change subject: [DOCS] Impala doc site update for 3.0
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10322
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: comment
Gerrit-Change-Id: Icf5927efa7baa965095a3ff2fd4ec7411313342d
Gerrit-Change-Number: 10322
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Ambreen Kazi 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 09 May 2018 20:53:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] Update version to 3.1.0-SNAPSHOT

2018-05-09 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10360


Change subject: Update version to 3.1.0-SNAPSHOT
..

Update version to 3.1.0-SNAPSHOT

Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167
---
M bin/save-version.sh
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/10360/1
--
To view, visit http://gerrit.cloudera.org:8080/10360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167
Gerrit-Change-Number: 10360
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6645: Enable disk spill encryption by default

2018-05-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10345 )

Change subject: IMPALA-6645: Enable disk spill encryption by default
..


Patch Set 5:

Hmm, hit an assertion when running tests. I wonder if this is CentOS 6 versus 
Ubuntu

Log file created at: 2018/05/09 19:16:57
Running on machine: ip-172-31-32-199
Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
F0509 19:16:57.214826 117722 openssl_util.h:201] Check failed: ERR_peek_error() 
== 0 (101171331 vs. 0) Expected no pending OpenSSL errors on kudu::Status 
kudu::security::TlsContext::InitiateHandshake(kudu::security::TlsHandshakeType, 
kudu::security::TlsHandshake*) const entry, but had: error:0607C083:digital 
envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 
error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher 
set:evp_enc.c:610 error:0607C083:digital envelope 
routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital 
envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 
error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher 
set:evp_enc.c:610 error:0607C083:digital envelope 
routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital 
envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 
error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher 
set:evp_enc.c:610 error:0607C083:digital envelope 
routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital 
envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 
error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher 
set:evp_enc.c:610 error:0607C083:digital envelope 
routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 error:0607C083:digital 
envelope routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610 
error:0607C083:digital envelope routines:EVP_CIPHER_CTX_ctrl:no cipher 
set:evp_enc.c:610 error:0607C083:digital envelope 
routines:EVP_CIPHER_CTX_ctrl:no cipher set:evp_enc.c:610


--
To view, visit http://gerrit.cloudera.org:8080/10345
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee4be2a95d689f66c3663d99e4df0fb3968893a9
Gerrit-Change-Number: 10345
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 May 2018 20:36:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6645: Enable disk spill encryption by default

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10345 )

Change subject: IMPALA-6645: Enable disk spill encryption by default
..


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2434/


--
To view, visit http://gerrit.cloudera.org:8080/10345
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee4be2a95d689f66c3663d99e4df0fb3968893a9
Gerrit-Change-Number: 10345
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 May 2018 20:27:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6802 (part 3): Clean up authorization tests

2018-05-09 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10358


Change subject: IMPALA-6802 (part 3): Clean up authorization tests
..

IMPALA-6802 (part 3): Clean up authorization tests

The third part of this patch is to rewrite the following authorization
tests:
- with
- union
- reset metadata
- show

Testing:
- Added new authorization tests
- Ran all front-end tests

Change-Id: I9681cc3c7094db33ab7c5caa69b99dd803b908b7
Cherry-picks: not for 2.x
---
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java
1 file changed, 436 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/10358/1
--
To view, visit http://gerrit.cloudera.org:8080/10358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9681cc3c7094db33ab7c5caa69b99dd803b908b7
Gerrit-Change-Number: 10358
Gerrit-PatchSet: 1
Gerrit-Owner: Fredy Wijaya 


[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata

2018-05-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10116 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata
..


Patch Set 9: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 09 May 2018 19:17:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata

2018-05-09 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10116 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata
..


Patch Set 9:

The change broke AuthorizationTest#TestDescribeTableResults. I fixed this in 
patch set 9.


--
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 09 May 2018 19:11:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata

2018-05-09 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Zoltan Borok-Nagy, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10116

to look at the new patch set (#9).

Change subject: IMPALA-6131: Track time of last statistics update in metadata
..

IMPALA-6131: Track time of last statistics update in metadata

The timestamp of the last COMPUTE STATS operation is saved to
table property "impala.lastComputeStatsTime". The format is
the same as in "transient_lastDdlTime", so the two can be
compared to check if the schema has changed since computing
statistics.

Other changes:
- Handling of "transient_lastDdlTime" is simplified - the old
  logic set it to current time + 1, if the old version was
  >= current time, to ensure that it is always increased by
  DDL operations. This was useful in the past, as IMPALA-387
  used lastDdlTime to check if partition data needs to be
  reloaded, but since IMPALA-1480, Impala does not rely on
  lastDdlTime at all.

- Computing / setting stats on HDFS tables no longer increases
  "transient_lastDdlTime".

- When Kudu tables are (re)loaded, it is checked if their
  HMS representation is up to date, and if it is, then
  IMetaStoreClient.alter_table() is not called. The old
  logic always called alter_table() after loading metadata
  from Kudu. This change was needed to ensure that
  "transient_lastDdlTime" works similarly in HDFS and Kudu
  tables, and should also make (re)loading Kudu tables faster.

Notes:
- Kudu will be able to sync its tables to HMS in the near
  future (see KUDU-2191), so the Kudu metadata handling in
  Impala may need to be redesigned.

Testing:
tests/metadata/test_last_ddl_time_update.py is extended by
- also checking "impala.lastComputeStatsTime"
- testing more SQL statements
- tests for Kudu tables

Note that test_last_ddl_time_update.py is ran only in
exhaustive testing.

Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
M tests/metadata/test_last_ddl_time_update.py
7 files changed, 202 insertions(+), 161 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/10116/9
--
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6781: expand ORDER BY in some TPCH queries

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10351 )

Change subject: IMPALA-6781: expand ORDER BY in some TPCH queries
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10351
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If74d127fb57546b1948a34aa6d2e68cdc6880fae
Gerrit-Change-Number: 10351
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 09 May 2018 19:06:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6781: expand ORDER BY in some TPCH queries

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10351 )

Change subject: IMPALA-6781: expand ORDER BY in some TPCH queries
..

IMPALA-6781: expand ORDER BY in some TPCH queries

Fix determinism in TPCH Q3, Q10, Q18 by adding another column to the
queries' ORDER BY to guarantee deterministic results. With TPCH 1
these queries were producing differing results across stress test runs.
They were all valid, but the LIMIT without the more specific ORDER BY
meant that several different result sets were possible. By adding these
columns, we sort by a column that has uniqueness across all rows
returned.

Testing:

Repeated runs of these specific TPCH queries via:

  impala-py.test -k Q18 tests/query_test/test_tpch_queries.py

Stress test on a 140-node cluster with TPCH 1 loaded. Previously when
using these queries, the stress test would incorrectly report incorrect
results.

Change-Id: If74d127fb57546b1948a34aa6d2e68cdc6880fae
Reviewed-on: http://gerrit.cloudera.org:8080/10351
Reviewed-by: Michael Brown 
Tested-by: Impala Public Jenkins 
---
M testdata/workloads/tpch/queries/tpch-q10.test
M testdata/workloads/tpch/queries/tpch-q18.test
M testdata/workloads/tpch/queries/tpch-q3.test
3 files changed, 12 insertions(+), 6 deletions(-)

Approvals:
  Michael Brown: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10351
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If74d127fb57546b1948a34aa6d2e68cdc6880fae
Gerrit-Change-Number: 10351
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10291 )

Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2435/


--
To view, visit http://gerrit.cloudera.org:8080/10291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10291
Gerrit-PatchSet: 6
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 09 May 2018 18:45:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests

2018-05-09 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10291 )

Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10291
Gerrit-PatchSet: 6
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 09 May 2018 18:25:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests

2018-05-09 Thread Vuk Ercegovac (Code Review)
Hello Dimitris Tsirogiannis, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10291

to look at the new patch set (#6).

Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests
..

IMPALA-6948,IMPALA-6962: add end-to-end tests

Adds end-to-end tests to validate that following
various metadata operations, the catalog state
in catalogd and impalads is the same.

For IMPALA-6962, catalogd process restart for tests
is fixed.

Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
---
M be/src/catalog/catalog-server.cc
M tests/common/impala_cluster.py
M tests/common/impala_service.py
A tests/custom_cluster/test_metadata_replicas.py
M tests/metadata/test_hms_integration.py
A tests/util/hive_utils.py
6 files changed, 242 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/10291/6
--
To view, visit http://gerrit.cloudera.org:8080/10291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10291
Gerrit-PatchSet: 6
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests

2018-05-09 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10291 )

Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10291/3/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/10291/3/be/src/catalog/catalog-server.cc@304
PS3, Line 304: void CatalogServer::CatalogUrlCallback(const 
Webserver::ArgumentMap& args,
> I just think it's weird that we don't show the dbs if there was an error wi
done. makes sense, I was treating it as all-or-nothing.


http://gerrit.cloudera.org:8080/#/c/10291/5/tests/custom_cluster/test_metadata_replicas.py
File tests/custom_cluster/test_metadata_replicas.py:

http://gerrit.cloudera.org:8080/#/c/10291/5/tests/custom_cluster/test_metadata_replicas.py@50
PS5, Line 50:   self.client.execute("invalidate metadata 
functional.alltypes")
> check that the catalog version is at least 50 at this point to make debuggi
Done



--
To view, visit http://gerrit.cloudera.org:8080/10291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10291
Gerrit-PatchSet: 5
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 09 May 2018 18:23:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests

2018-05-09 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10291 )

Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10291/3/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/10291/3/be/src/catalog/catalog-server.cc@304
PS3, Line 304: void CatalogServer::CatalogUrlCallback(const 
Webserver::ArgumentMap& args,
> did it this way to share the error handling and avoid multiple "error" fiel
I just think it's weird that we don't show the dbs if there was an error with 
getting the catalog version. We return in L315 even though the dbs are just 
fine. Might make debugging slightly harder if something goes wrong.

Don't think we need to do something too complicated, just want to be sure we 
can easily pinpoint the error if something does wrong.


http://gerrit.cloudera.org:8080/#/c/10291/5/tests/custom_cluster/test_metadata_replicas.py
File tests/custom_cluster/test_metadata_replicas.py:

http://gerrit.cloudera.org:8080/#/c/10291/5/tests/custom_cluster/test_metadata_replicas.py@50
PS5, Line 50:   self.client.execute("invalidate metadata 
functional.alltypes")
check that the catalog version is at least 50 at this point to make debugging 
easier if something goes wrong (e.g. we  purposely or accidentally change no-op 
ddls to not increment the version)



--
To view, visit http://gerrit.cloudera.org:8080/10291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5
Gerrit-Change-Number: 10291
Gerrit-PatchSet: 5
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 09 May 2018 18:08:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6907: Close stale connections to removed cluster members

2018-05-09 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10327 )

Change subject: IMPALA-6907: Close stale connections to removed cluster members
..


Patch Set 4: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10327/3/tests/custom_cluster/test_restart_services.py
File tests/custom_cluster/test_restart_services.py:

http://gerrit.cloudera.org:8080/#/c/10327/3/tests/custom_cluster/test_restart_services.py@57
PS3, Line 57:   def test_restart_impala(self):
> Yes, the way the client cache code is structured makes it hard to discern w
We technically could introduce a test only function that checks the number of 
stale connections in the cache, and have a way to call that from here, but I 
think that's not necessary, and it's going to probably introduce a lot more 
complexity.

So, I think this is fine for now.



--
To view, visit http://gerrit.cloudera.org:8080/10327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d
Gerrit-Change-Number: 10327
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Wed, 09 May 2018 18:00:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5842: Write page index in Parquet files

2018-05-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9693 )

Change subject: IMPALA-5842: Write page index in Parquet files
..


Patch Set 16:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@1225
PS15, Line 1225:
> Switched to ++i.
Ah, I had overlooked row_group->columns[i]


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/util/string-util-test.cc
File be/src/util/string-util-test.cc:

http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/util/string-util-test.cc@33
PS15, Line 33: void EvalTruncation(const string& original, const string& 
expected_result,
> In the test it was easier to organize the code like this.
Let's leave it as it is, I also don't feel strongly about it.


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/util/string-util-test.cc@68
PS15, Line 68:   EvalTruncation(normal_plus_max, "abcdeg" + string(4, 0), 10, 
UP);
> Added new tess. Already had tests with 0xFF at L57-L66
Thanks for adding more tests. I meant the exact 3-character string consisting 
of {0, (char)0x7F, (char)0xFF} to make sure that truncating up would hit the 
signed char overflow. Apologies for not being more clear.


http://gerrit.cloudera.org:8080/#/c/9693/16/be/src/util/string-util.cc
File be/src/util/string-util.cc:

http://gerrit.cloudera.org:8080/#/c/9693/16/be/src/util/string-util.cc@46
PS16, Line 46:   unsigned char uch = (*result)[i];
I think this could use a comment and/or an explicit cast to make the intent 
clear.


http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py
File tests/query_test/test_parquet_page_index.py:

http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@36
PS16, Line 36: PAGE_INDEX_TRUNCATE_LENGTH
update


http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@55
PS16, Line 55: """
nit: PEP8 says the closing """ should be on a new line. You can use flake8 to 
check the whole file. It was recently added to our codebase: 
https://github.com/apache/impala/commit/2fb73f94b425fde488166a19f78050ddbc3d7b50


http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@90
PS16, Line 90:   column_info = ColumnInfo._make((schema, stats, 
offset_index, column_index,
I think you should be able to just pass all the parameters to the ctor 
"ColumnInfo(schema, stats,...). Currently you wrap it in a tuple to get an 
iterable to pass to _make(), but that should not be necessary.


http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@96
PS16, Line 96: nametuples
"ColumnInfo objects" or "column infos"


http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@97
PS16, Line 97: a row group
the first row group?


http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@110
PS16, Line 110: current_loc in page_locations[1:]:
You can zip a list with itself to create (prev, cur) pairs:

 In [6]: l = range(5)

 In [7]: zip(l[:-1], l[1:])
 Out[7]: [(0, 1), (1, 2), (2, 3), (3, 4)]


http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@124
PS16, Line 124: null_page
nit: page_is_null might be more clear


http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@182
PS16, Line 182: previous_max = None
Would it make sense to validate the ordering instead of re-computing it? I.e., 
extract all the min_values and max_values into lists and then just all 
assert(sorted(min_values))?


http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@214
PS16, Line 214: sorting_column
unused?


http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@216
PS16, Line 216: skip_col_idx
Stale comment?


http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@249
PS16, Line 249: vector.get_value('exec_option')['num_nodes'] = 1
move closer to the comment explaining the intent


http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@321
PS16, Line 321: ends
nit: end


http://gerrit.cloudera.org:8080/#/c/9693/16/tests/query_test/test_parquet_page_index.py@348
PS16, Line 348: followed by
  : # zeroes.
I had missed this looking at the truncation code. Why do we keep the zeroes at 
the end?


http://gerrit.cloudera.org:8080/#/c/9693/16/tests/util/get_parquet_metadata.py
File tests/util/get_parquet_metadata.py:

http://gerrit.cloudera.org:8080/#/c/9693/16/tests/util/get_parquet_metadata.py@31
PS16, Line 31: The range must start with a serialized
 :   thrift object.
That does not actually seem to be a requirement of th

[Impala-ASF-CR](asf-site) [DOCS] Impala doc site update for 3.0

2018-05-09 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10322 )

Change subject: [DOCS] Impala doc site update for 3.0
..


Patch Set 5:

The 3.0 change log was committed. Could you review this again? Thanks!


--
To view, visit http://gerrit.cloudera.org:8080/10322
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: comment
Gerrit-Change-Id: Icf5927efa7baa965095a3ff2fd4ec7411313342d
Gerrit-Change-Number: 10322
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Ambreen Kazi 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 09 May 2018 17:45:20 +
Gerrit-HasComments: No


[Impala-ASF-CR](asf-site) Update download and signature links for 3.0.0 release.

2018-05-09 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10333 )

Change subject: Update download and signature links for 3.0.0 release.
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10333
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: comment
Gerrit-Change-Id: I307d786bf4709d24f1a9511dcc0e6759183e8eae
Gerrit-Change-Number: 10333
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 09 May 2018 17:39:42 +
Gerrit-HasComments: No


[Impala-ASF-CR](asf-site) Added changelog for 3.0.0

2018-05-09 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10334 )

Change subject: Added changelog for 3.0.0
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10334
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: comment
Gerrit-Change-Id: I895f54fc47bada6b5131f20167405781c7d284b5
Gerrit-Change-Number: 10334
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 09 May 2018 17:39:53 +
Gerrit-HasComments: No


[Impala-ASF-CR](asf-site) Added changelog for 3.0.0

2018-05-09 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10334 )

Change subject: Added changelog for 3.0.0
..

Added changelog for 3.0.0

Change-Id: I895f54fc47bada6b5131f20167405781c7d284b5
Reviewed-on: http://gerrit.cloudera.org:8080/10334
Reviewed-by: Michael Brown 
Tested-by: Sailesh Mukil 
---
A docs/changelog-3.0.html
1 file changed, 410 insertions(+), 0 deletions(-)

Approvals:
  Michael Brown: Looks good to me, approved
  Sailesh Mukil: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10334
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: merged
Gerrit-Change-Id: I895f54fc47bada6b5131f20167405781c7d284b5
Gerrit-Change-Number: 10334
Gerrit-PatchSet: 3
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR](asf-site) Update download and signature links for 3.0.0 release.

2018-05-09 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10333 )

Change subject: Update download and signature links for 3.0.0 release.
..

Update download and signature links for 3.0.0 release.

Change-Id: I307d786bf4709d24f1a9511dcc0e6759183e8eae
Reviewed-on: http://gerrit.cloudera.org:8080/10333
Reviewed-by: Michael Brown 
Tested-by: Sailesh Mukil 
---
M downloads.html
1 file changed, 26 insertions(+), 15 deletions(-)

Approvals:
  Michael Brown: Looks good to me, approved
  Sailesh Mukil: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10333
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: merged
Gerrit-Change-Id: I307d786bf4709d24f1a9511dcc0e6759183e8eae
Gerrit-Change-Number: 10333
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6645: Enable disk spill encryption by default

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10345 )

Change subject: IMPALA-6645: Enable disk spill encryption by default
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2434/


--
To view, visit http://gerrit.cloudera.org:8080/10345
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee4be2a95d689f66c3663d99e4df0fb3968893a9
Gerrit-Change-Number: 10345
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 May 2018 17:21:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6645: Enable disk spill encryption by default

2018-05-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10345 )

Change subject: IMPALA-6645: Enable disk spill encryption by default
..


Patch Set 5: Code-Review+2

carry


--
To view, visit http://gerrit.cloudera.org:8080/10345
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee4be2a95d689f66c3663d99e4df0fb3968893a9
Gerrit-Change-Number: 10345
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 May 2018 17:20:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6645: Enable disk spill encryption by default

2018-05-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10345 )

Change subject: IMPALA-6645: Enable disk spill encryption by default
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10345/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10345/3//COMMIT_MSG@7
PS3, Line 7: IMPALA-6635:
> IMPALA-6645
Done (this also addresses Kim's comment).


http://gerrit.cloudera.org:8080/#/c/10345/3//COMMIT_MSG@11
PS3, Line 11: PCLMULQDQ
> which CPUs introduced that?
Added to comment. Any server-class x86 processors < 5 years will have it.


http://gerrit.cloudera.org:8080/#/c/10345/3/tests/custom_cluster/test_disk_spill_configurations.py
File tests/custom_cluster/test_disk_spill_configurations.py:

http://gerrit.cloudera.org:8080/#/c/10345/3/tests/custom_cluster/test_disk_spill_configurations.py@33
PS3, Line 33: with a customer
> what does that mean?
I'm not sure what I meant.



--
To view, visit http://gerrit.cloudera.org:8080/10345
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee4be2a95d689f66c3663d99e4df0fb3968893a9
Gerrit-Change-Number: 10345
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 May 2018 17:20:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6645: Enable disk spill encryption by default

2018-05-09 Thread Tim Armstrong (Code Review)
Hello Dan Hecht, Kim Jin Chul,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10345

to look at the new patch set (#4).

Change subject: IMPALA-6645: Enable disk spill encryption by default
..

IMPALA-6645: Enable disk spill encryption by default

Perf:
Targeted benchmarks with a heavily spilling query on a machine
with PCLMULQDQ support show < 5% of CPU time spent in encryption and
decryption. PCLMULQDQ was introduced in AMD Bulldozer (c. 2011)
and Intel Westmere (c. 2010).

Testing:
Ran core tests with the change.

Updated the custom cluster test to exercise the non-default
configuration.

Change-Id: Iee4be2a95d689f66c3663d99e4df0fb3968893a9
---
M be/src/runtime/tmp-file-mgr.cc
R testdata/workloads/functional-query/queries/QueryTest/basic-spilling.test
R tests/custom_cluster/test_disk_spill_configurations.py
3 files changed, 9 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/10345/4
--
To view, visit http://gerrit.cloudera.org:8080/10345
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iee4be2a95d689f66c3663d99e4df0fb3968893a9
Gerrit-Change-Number: 10345
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Kim Jin Chul 


[Impala-ASF-CR] IMPALA-7000: [DOCS] Correct info about dedicated executors

2018-05-09 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10357


Change subject: IMPALA-7000: [DOCS] Correct info about dedicated executors
..

IMPALA-7000: [DOCS] Correct info about dedicated executors

Change-Id: I4b7e6c57188a41a45d5813882b6dbc37cf47cf1f
---
M docs/topics/impala_scalability.xml
1 file changed, 7 insertions(+), 5 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/10357/1
--
To view, visit http://gerrit.cloudera.org:8080/10357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4b7e6c57188a41a45d5813882b6dbc37cf47cf1f
Gerrit-Change-Number: 10357
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR] IMPALA-6980: [DOCS] You can add or change column comment in ALTER TABLE

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10317 )

Change subject: IMPALA-6980: [DOCS] You can add or change column comment in 
ALTER TABLE
..

IMPALA-6980: [DOCS] You can add or change column comment in ALTER TABLE

Change-Id: Ia317a4b74d96aa064d375f6afc114f2cc8d30bf4
Reviewed-on: http://gerrit.cloudera.org:8080/10317
Reviewed-by: Fredy Wijaya 
Reviewed-by: Alex Rodoni 
Tested-by: Impala Public Jenkins 
---
M docs/topics/impala_alter_table.xml
1 file changed, 2 insertions(+), 2 deletions(-)

Approvals:
  Fredy Wijaya: Looks good to me, but someone else must approve
  Alex Rodoni: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia317a4b74d96aa064d375f6afc114f2cc8d30bf4
Gerrit-Change-Number: 10317
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-6980: [DOCS] You can add or change column comment in ALTER TABLE

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10317 )

Change subject: IMPALA-6980: [DOCS] You can add or change column comment in 
ALTER TABLE
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia317a4b74d96aa064d375f6afc114f2cc8d30bf4
Gerrit-Change-Number: 10317
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 09 May 2018 16:42:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6980: [DOCS] You can add or change column comment in ALTER TABLE

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10317 )

Change subject: IMPALA-6980: [DOCS] You can add or change column comment in 
ALTER TABLE
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-docs-submit/287/


--
To view, visit http://gerrit.cloudera.org:8080/10317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia317a4b74d96aa064d375f6afc114f2cc8d30bf4
Gerrit-Change-Number: 10317
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 09 May 2018 16:38:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6980: [DOCS] You can add or change column comment in ALTER TABLE

2018-05-09 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10317 )

Change subject: IMPALA-6980: [DOCS] You can add or change column comment in 
ALTER TABLE
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10317
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia317a4b74d96aa064d375f6afc114f2cc8d30bf4
Gerrit-Change-Number: 10317
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 09 May 2018 16:38:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6635: Enable disk spill encryption by default

2018-05-09 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10345 )

Change subject: IMPALA-6635: Enable disk spill encryption by default
..


Patch Set 3: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10345/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10345/3//COMMIT_MSG@7
PS3, Line 7: IMPALA-6635:
IMPALA-6645


http://gerrit.cloudera.org:8080/#/c/10345/3//COMMIT_MSG@11
PS3, Line 11: PCLMULQDQ
which CPUs introduced that?


http://gerrit.cloudera.org:8080/#/c/10345/3/tests/custom_cluster/test_disk_spill_configurations.py
File tests/custom_cluster/test_disk_spill_configurations.py:

http://gerrit.cloudera.org:8080/#/c/10345/3/tests/custom_cluster/test_disk_spill_configurations.py@33
PS3, Line 33: with a customer
what does that mean?



--
To view, visit http://gerrit.cloudera.org:8080/10345
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee4be2a95d689f66c3663d99e4df0fb3968893a9
Gerrit-Change-Number: 10345
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Comment-Date: Wed, 09 May 2018 15:58:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR](asf-site) Update download and signature links for 3.0.0 release.

2018-05-09 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10333 )

Change subject: Update download and signature links for 3.0.0 release.
..


Patch Set 1: Code-Review+2

Oops, I'll try to remember that in the future.


--
To view, visit http://gerrit.cloudera.org:8080/10333
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: comment
Gerrit-Change-Id: I307d786bf4709d24f1a9511dcc0e6759183e8eae
Gerrit-Change-Number: 10333
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 09 May 2018 15:54:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6781: expand ORDER BY in some TPCH queries

2018-05-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10351 )

Change subject: IMPALA-6781: expand ORDER BY in some TPCH queries
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2432/


--
To view, visit http://gerrit.cloudera.org:8080/10351
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If74d127fb57546b1948a34aa6d2e68cdc6880fae
Gerrit-Change-Number: 10351
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 09 May 2018 15:43:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6781: expand ORDER BY in some TPCH queries

2018-05-09 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10351 )

Change subject: IMPALA-6781: expand ORDER BY in some TPCH queries
..


Patch Set 3: Code-Review+2

IMPALA-6970


--
To view, visit http://gerrit.cloudera.org:8080/10351
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If74d127fb57546b1948a34aa6d2e68cdc6880fae
Gerrit-Change-Number: 10351
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 09 May 2018 15:43:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure

2018-05-09 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10347 )

Change subject: Fix diagnostics path to not include the parent dir structure
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10347/1/bin/diagnostics/collect_diagnostics.py
File bin/diagnostics/collect_diagnostics.py:

http://gerrit.cloudera.org:8080/#/c/10347/1/bin/diagnostics/collect_diagnostics.py@451
PS1, Line 451: arcname=os.path.basename(self.collection_root_dir))
> Some comments will be helpful to understand why an alternative name is requ
Done



--
To view, visit http://gerrit.cloudera.org:8080/10347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c
Gerrit-Change-Number: 10347
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 09 May 2018 15:18:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Fix diagnostics path to not include the parent dir structure

2018-05-09 Thread Bharath Vissapragada (Code Review)
Hello Philip Zeyliger, Kim Jin Chul,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10347

to look at the new patch set (#2).

Change subject: Fix diagnostics path to not include the parent dir structure
..

Fix diagnostics path to not include the parent dir structure

Without the fix, the diagnostics tar file included the entire
directory structure of the diagnostics root dir.

Before:
===
$ tar tf /tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh.tar.gz
tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/
tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/stacks/
tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/stacks/jstack-0.txt


After:
=
$ tar tf /tmp/impala-diagnostics-2018-05-08-12-01-51-Y0nlQI.tar.gz
impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/
impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/stacks/
impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/stacks/jstack-0.txt
.

Tested with python 2.6

Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c
---
M bin/diagnostics/collect_diagnostics.py
1 file changed, 4 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/10347/2
--
To view, visit http://gerrit.cloudera.org:8080/10347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c
Gerrit-Change-Number: 10347
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-5842: Write page index in Parquet files

2018-05-09 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9693 )

Change subject: IMPALA-5842: Write page index in Parquet files
..


Patch Set 15:

(29 comments)

http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.h
File be/src/exec/hdfs-parquet-table-writer.h:

http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.h@108
PS15, Line 108: PAGE_INDEX_TRUNCATE_LENGTH
> PAGE_INDEX_MAX_STRING_LENGTH?
Changed the name


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.h@130
PS15, Line 130: Write
> nit: "Writes"
Done, also fixed the previous and the next comment as well.


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@48
PS15, Line 48: using namespace parquet;
> Can you remove this? I find omitting the parquet:: prefix confusing, and we
I agree, done.


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@109
PS15, Line 109: HdfsSinkTable
> HdfsTableSink?
Done


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@118
PS15, Line 118: 
table_sink_mem_tracker_->Release(page_index_memory_consumption_);
> Can we do this in Close() instead? We have generally been trying to do reso
Done


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@312
PS15, Line 312:   OffsetIndex offset_index_;
> I find it useful to prefix Parquet types with parquet:: and we do that in p
I'm also in favor of prefixing. Done.


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@318
PS15, Line 318:   MemTracker *table_sink_mem_tracker_;
> nit: MemTracker*
Done


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@651
PS15, Line 651:   location.compressed_page_size = -1;
> Probably doesn't have much of an effect, but should we set the size to 0 fo
Yeah, it makes more sense.


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@668
PS15, Line 668: location.compressed_page_size = 
page.header.compressed_page_size + len;
> Should we add a comment for this line to explain that one includes the leng
Added a comment.


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@752
PS15, Line 752:   string min_val, max_val;
> I think we usually do one declaration per line, but I don't feel strongly a
Done


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@754
PS15, Line 754: Status s_min = TruncateMinValue(page_stats.min_value, 
PAGE_INDEX_TRUNCATE_LENGTH,
> Could this truncate non-string values if PAGE_INDEX_TRUNCATE_LENGTH was sma
Added a comment.


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@761
PS15, Line 761: column_index_.null_pages.push_back(true);
> Maybe DCHECK that both are false, to catch errors that result in only one b
Added DCHECK.


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@766
PS15, Line 766: Consume
> We prefer TryConsume() and returning a MEM_LIMIT_EXCEEDED error synchronous
Switched to TryConsume() and check its result.


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@1220
PS15, Line 1220:   // Currently we only support Parquet files with one row 
group.
> Can you add a sentence why this particular code relies on this limited supp
Updated the comment.


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/hdfs-parquet-table-writer.cc@1225
PS15, Line 1225: ++)
> nit: ++i
Switched to ++i.
I need 'i' to index 'columns_' and 'row_group->columns'.
The same stands for the next for-loop.


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/parquet-column-stats.h
File be/src/exec/parquet-column-stats.h:

http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/parquet-column-stats.h@88
PS15, Line 88:   if (a != a && b != b) return 0;
> Can we use std::isnan()?
Yes, done.


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/parquet-column-stats.h@113
PS15, Line 113: Used for merging page statistics into column chunk statistics.
> I think it's better to omit usage from the comment here. Instead say that i
Changed the added comment to your sentence.


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/parquet-column-stats.h@164
PS15, Line 164: Let's a
> nit: "We assume..." so it reads less like a suggestions.
Done


http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/exec/parquet-column-stats.h@165
PS15, Line 165: If not all values are equal
> What happens for NaN, i.e. when all values are NaN?
In ColumnStats::Merge(), MinMaxTrait::Compare() will compare the NaNs, 
and the cu

[Impala-ASF-CR] IMPALA-5842: Write page index in Parquet files

2018-05-09 Thread Zoltan Borok-Nagy (Code Review)
Hello Lars Volker, Anonymous Coward #248, Tim Armstrong, Csaba Ringhofer,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9693

to look at the new patch set (#16).

Change subject: IMPALA-5842: Write page index in Parquet files
..

IMPALA-5842: Write page index in Parquet files

This commit builds on the previous work of
Pooja Nilangekar: https://gerrit.cloudera.org/#/c/7464/

The commit implements the write path of PARQUET-922:
"Add column indexes to parquet.thrift". As specified in the
parquet-format, Impala writes the page indexes just before
the footer. This allows much more efficient page filtering
than using the same information from the 'statistics' field
of DataPageHeader.

I updated Pooja's python tests as well.

Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
---
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/parquet-column-stats.h
M be/src/exec/parquet-column-stats.inline.h
M be/src/util/CMakeLists.txt
A be/src/util/string-util-test.cc
A be/src/util/string-util.cc
A be/src/util/string-util.h
M common/thrift/parquet.thrift
M testdata/bin/load-dependent-tables.sql
M tests/query_test/test_chars.py
A tests/query_test/test_parquet_page_index.py
M tests/util/get_parquet_metadata.py
13 files changed, 999 insertions(+), 98 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/9693/16
--
To view, visit http://gerrit.cloudera.org:8080/9693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Gerrit-Change-Number: 9693
Gerrit-PatchSet: 16
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Anonymous Coward #248
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6999: Upgrade to sqlparse-0.1.19 for Impala shell

2018-05-09 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10354 )

Change subject: IMPALA-6999: Upgrade to sqlparse-0.1.19 for Impala shell
..


Patch Set 1:

Jenkins pre-review tests passed: 
https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/2124/


--
To view, visit http://gerrit.cloudera.org:8080/10354
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ide51ef3ac52d25a96b0fa832e29b6535197d23cb
Gerrit-Change-Number: 10354
Gerrit-PatchSet: 1
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 09 May 2018 14:51:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6987: [DOCS] Update when INVALIDATE METADATA is required

2018-05-09 Thread Balazs Jeszenszky (Code Review)
Balazs Jeszenszky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10339 )

Change subject: IMPALA-6987: [DOCS] Update when INVALIDATE METADATA is required
..


Patch Set 1:

(18 comments)

IMO the page overall needs a wider cleanup, commented on that too.

http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml
File docs/topics/impala_invalidate_metadata.xml:

http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@47
PS1, Line 47: relatively
replace: very


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@48
PS1, Line 48: in the common scenario of adding new data files to an existing 
table
replace: whenever possible (link to INVALIDATE vs. REFRESH usage page, if 
exists)


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@59
PS1, Line 59: By default
replace: If there is no table specified


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@60
PS1, Line 60: Even for a single table, INVALIDATE METADATA is 
more expensive
:   than REFRESH, so prefer 
REFRESH in the common case where you add new data
:   files for an existing table.
Same thing mentioned ~10 lines above - remove?


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@69
PS1, Line 69: Therefore, if some other entity modifies information used by 
Impala in the metastore
:   that Impala and Hive share, the information cached by 
Impala must be updated. However, this does not mean
:   that all metadata updates require an Impala update.
This is vague, explicitly state when is manual invalidate needed (see 
L100-102), and remove this.


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@74
PS1, Line 74: 
:   
:   
: In Impala 1.2 and higher, a dedicated daemon 
(catalogd) broadcasts DDL changes made
: through Impala to all Impala nodes. Formerly, after you 
created a database or table while connected to one
: Impala node, you needed to issue an INVALIDATE 
METADATA statement on another Impala node
: before accessing the new database or table from the other 
node. Now, newly created or altered objects are
: picked up automatically by all Impala nodes. You must 
still use the INVALIDATE METADATA
: technique after creating or altering objects through 
Hive. See
:  for 
more information on the catalog service.
:   
:   
: The INVALIDATE METADATA statement is new 
in Impala 1.1 and higher, and takes over some of
: the use cases of the Impala 1.0 REFRESH 
statement. Because REFRESH now
: requires a table name parameter, to flush the metadata 
for all tables at once, use the INVALIDATE
: METADATA statement.
:   
:   
: 
This section is very outdated and mixes up usage of old vs. new usage of 
REFRESH, remove.


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@99
PS1, Line 99: instance
Don't mention individual instances in this context, it's the service as a whole 
that needs update.


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@100
PS1, Line 100: required if a change is made from another 
impalad
 :   instance in your cluster, or through Hive and is 
distributed by
 : catalogd.
INVALIDATE METADATA is required when:
* metadata changes of existing tables are done outside of Impala
* new tables are added outside of Impala, that are to be used by Impala
* SERVER or DATABASE level Sentry privileges are changed outside of Impala
* block metadata changes outside of Impala, but files remain the same (HDFS 
rebalance)
* possibly when UDF jars change (needs verification)

All metadata changes go through catalogd and are distributed by statestored.
'Outside of Impala' for table metadata means Hive and any other Hive client, 
e.g. SparkSQL.


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@106
PS1, Line 106: same Impala node
This is pre-1.2 information. No INVALIDATE is needed as long as the changes are 
done through the Impala service (using any impalad).


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@127
PS1, Line 127: INVALIDATE METADATA causes the metadata for 
that table to be marked as stale, and reloaded
 :   the next time the table is referenced. For a huge table, 
that process could take a noticeable amount of time;
Repeat of L44-4

[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db

2018-05-09 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9986 )

Change subject: IMPALA-3307: Add support for IANA time-zone db
..


Patch Set 4: Code-Review+1

(6 comments)

Few nits otherwise looks good to me. The LocalToUtc performance part is 
optional - as it does not affect other parts of the code, it can be easily done 
later when general structure is already accepted by other reviewers.

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc@526
PS4, Line 526:   const string& tz_name = (start_lookup.abbr != nullptr) ? 
start_lookup.abbr :
 :   context->impl()->state()->local_time_zone()->name();
What is the goal of this logic? To print timezone abbreviations instead of the 
full names, or to distinguish between summer/winter time, or both? A comment 
would be nice, and maybe the logic could be moved to a TimestampValue member 
function like string GetTimezoneName(Timezone*).


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/runtime/timestamp-value.cc@93
PS4, Line 93: inline bool CheckIfDateOutOfRange(const cctz::civil_day& date) {
:   static const cctz::civil_day 
max_date(TimestampFunctions::MAX_YEAR, 12, 31);
:   static const cctz::civil_day 
min_date(TimestampFunctions::MIN_YEAR, 1, 1);
:   return date < min_date || date > max_date;
: }
This could be simpler and possibly faster by expecting a cctz::civil_second 
argument and check if 1400<=year<1.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/runtime/timestamp-value.cc@128
PS4, Line 128:
cctz explains pretty well the handling of dst boundaries, maybe we could add a 
link to it, for example to 
https://github.com/google/cctz/blob/a2dd3d0fbc811fe0a1d4d2dbb0341f1a3d28cb2a/include/cctz/time_zone.h#L147


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/runtime/timestamp-value.cc@129
PS4, Line 129:   // In case of ambiguity invalidate TimestampValue.
 :   const cctz::time_zone::civil_lookup from_cl = 
local_tz->lookup(from_cs);
 :   if (UNLIKELY(from_cl.kind != 
cctz::time_zone::civil_lookup::UNIQUE)) {
 : SetToInvalidDateTime();
 :   } else {
 : cctz::time_point from_tp = from_cl.pre;
 :
 : // Convert 'from_tp' time_point to civil_second assuming 
'UTC' time-zone.
 : cctz::civil_second to_cs = cctz::convert(from_tp, 
TimezoneDatabase::GetUtcTimezone());
 :
 : // boost::gregorian::date() throws 
boost::gregorian::bad_year if year is not in the
 : // 1400.. range. Need to check validity before creating 
the date object.
 : if (UNLIKELY(CheckIfDateOutOfRange(cctz::civil_day(to_cs 
{
I may be possible to get TimestampValue from cctz::time_zone::civil_lookup in a 
faster way - splitting from_tp to a day part (since a constant date) and the 
remainder seconds part is enough for us and should be faster then getting 
cctz::civil_second (which contains year/month/day split).

The code could look something like this:
int64 secs_since_base = from_tp - BASETIME_AS_CCTZ_SYS_SEC;
time_=sec_since_base%(24*60*60)+time_.fractional_seconds();
int32 days_since_base = sec_since_base/(24*60*60);
if(out_of_range(days_since_base)) SetToInvalidDateTime();
date_ = days_since_base - BASEDATE_AS_BOOST_GREG_DATE;


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/runtime/timestamp-value.cc@146
PS4, Line 146:   time_ = boost::posix_time::time_duration(to_cs.hour(), 
to_cs.minute(), to_cs.second(),
nit: long line


http://gerrit.cloudera.org:8080/#/c/9986/3/be/src/util/filesystem-util.h
File be/src/util/filesystem-util.h:

http://gerrit.cloudera.org:8080/#/c/9986/3/be/src/util/filesystem-util.h@66
PS3, Line 66: iff
> "iff" stands for "if and only if".
Thanks for the explanation!



--
To view, visit http://gerrit.cloudera.org:8080/9986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77
Gerrit-Change-Number: 9986
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 09 May 2018 11:47:13 +
Gerrit-HasComments: Yes