[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures

2016-08-19 Thread Michael Ho (Code Review)
Michael Ho has uploaded a new patch set (#4).

Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures
..

IMPALA-3962: Clean up scratch tuple batch on scan failures

The parquet scanner doesn't clean up 'scratch_batch_' properly
which causes it to process a partially filled scratch_batch_
if any of the column reader fails. This change cleans up the
scratch batch if any of the parquet column readers fails.
The clean up involves freeing the mem_pool of scratch_batch_
and setting number of tuples in scratch_batch_ to 0.

This change also extends debug action to emulate the behavior
of exceeding the query's memory limit.

Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
---
M be/src/exec/exec-node.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M common/thrift/PlanNodes.thrift
M tests/failure/test_failpoints.py
10 files changed, 77 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/3991/4
-- 
To view, visit http://gerrit.cloudera.org:8080/3991
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures

2016-08-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/3991/3/be/src/exec/exec-node.cc
File be/src/exec/exec-node.cc:

PS3, Line 441: Status::MemLimitExceeded();
> shoudl this use mem_tracker()->SetMemLimitExceeded() since Tim is convertin
Done


http://gerrit.cloudera.org:8080/#/c/3991/3/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

PS3, Line 350: column_readers_[0]->RowGroupAtEnd()
> couldn't we have hit column 0's end of rowgroup, but not column 1's, which 
Good point. Reverted to the check in the old code.


http://gerrit.cloudera.org:8080/#/c/3991/3/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

PS3, Line 381: current
> currently
Done


PS3, Line 383: nodes
> node's
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3991
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or AVX2.

2016-08-19 Thread Youwei Wang (Code Review)
Hello Jim Apple, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3081

to look at the new patch set (#44).

Change subject: IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or 
AVX2.
..

IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or AVX2.

Using SSSE3/AVX2 intrinsic to accelerate the function
"static inline void ByteSwap(void* dst, const void* src, int len)" of BitUtil 
class,
and a scalar byte-swap routine is added as fallback.
Also the runtime selector for CPUs of different capacity is included,
as well as performance test and data verification.
Brief performance comparison is listed here:
CPU: Intel(R) Core(TM) i5-4460  CPU@3.20GHz
Result:
I0725 20:47:02.402506  2078 bswap-benchmark.cc:117] Machine Info: Intel(R) 
Core(TM) i5-4460  CPU @ 3.20GHz
ByteSwap benchmark:Function  iters/ms   10%ile   50%ile   90%ile 
10%ile 50%ile 90%ile
 
(relative) (relative) (relative)
-
 FastScalar675  725  731
 1X 1X 1X
  SSSE3   6.12e+03  6.2e+03 6.23e+03  
9.06X  8.55X  8.53X
   AVX2   1.87e+04 1.88e+04 1.89e+04  
27.7X  25.9X  25.9X
   SIMD   1.82e+04 1.88e+04 1.89e+04
27X  25.9X  25.9X

Change-Id: I392ed5a8d5683f30f161282c228c1aedd7b648c1
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/bswap-benchmark.cc
M be/src/exprs/string-functions-ir.cc
M be/src/util/bit-util-test.cc
M be/src/util/bit-util.cc
M be/src/util/bit-util.h
6 files changed, 372 insertions(+), 51 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/81/3081/44
-- 
To view, visit http://gerrit.cloudera.org:8080/3081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I392ed5a8d5683f30f161282c228c1aedd7b648c1
Gerrit-PatchSet: 44
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Youwei Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Youwei Wang 


[Impala-CR](cdh5-trunk) IMPALA-889: Add support for ISO-SQL trim()

2016-08-19 Thread Youwei Wang (Code Review)
Youwei Wang has posted comments on this change.

Change subject: IMPALA-889: Add support for ISO-SQL trim()
..


Patch Set 10:

Hi Jim. Sorry for this late response. 
Since Impala 2809: https://gerrit.cloudera.org/#/c/3081/ is my current task of 
highest priority, I have to resolve it first. I will try my best to re-pick 
this one as soon as possible. 
Sorry for this.

-- 
To view, visit http://gerrit.cloudera.org:8080/3213
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602
Gerrit-PatchSet: 10
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Youwei Wang 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Youwei Wang 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page

2016-08-19 Thread Kathy Sun (Code Review)
Kathy Sun has uploaded a new patch set (#7).

Change subject: IMPALA-3981: Fix crash when accessing statestored / catalogd 
/memz page
..

IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page

The /memz page tried to add JVM metrics even when they didn't exist for
all daemons, not just Impala. This led to a crash when they tried to
access ExecEnv::GetInstance() without an initialised ExecEnv at
statestored and catalogd

To fix, changed the memz handler method to take an optional metric
group, provided by the caller.  memz handler will check the existence of
the metric group.

Used C++11 lambdas rather than boost::bind to help simplify the code.

Testing: Ran locally and looked at impalad/memz, statestored/memz
and catalogd/memz

Add a test file test_web_pages.py to test sending request to /memz on
impalad / statestored / catalogd

Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29
---
M be/src/catalog/catalogd-main.cc
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/impala-server.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestored-main.cc
M be/src/util/default-path-handlers.cc
M be/src/util/default-path-handlers.h
M be/src/util/memory-metrics.cc
M be/src/util/metrics-test.cc
M be/src/util/metrics.cc
M be/src/util/metrics.h
A tests/webserver/test_web_pages.py
M www/memz.tmpl
15 files changed, 111 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/3998/7
-- 
To view, visit http://gerrit.cloudera.org:8080/3998
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Kathy Sun 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Kathy Sun 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata

2016-08-19 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4063

Change subject: IMPALA-1659: Netezza compatibility functions: metadata
..

IMPALA-1659: Netezza compatibility functions: metadata

Added the SQL functions current_catalog(), current_schema(), current_user() and
session_user() as aliases to existing ones and a new SQL function current_sid().

Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca
---
M be/src/exprs/expr-test.cc
M be/src/exprs/utility-functions-ir.cc
M be/src/exprs/utility-functions.h
M be/src/runtime/runtime-state.h
M common/function-registry/impala_functions.py
5 files changed, 37 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/4063/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 


[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata

2016-08-19 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-1659: Netezza compatibility functions: metadata
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4063/1/be/src/exprs/utility-functions-ir.cc
File be/src/exprs/utility-functions-ir.cc:

Line 120: StringVal UtilityFunctions::CurrentSid(FunctionContext* ctx) {
This function was requested in the feature request, but I feel that it should 
not be added because of the reasons listed below, please let me know your 
opinion.

There is no documentation available for Netazza's current_sid. The docs linked 
in the issue attachment are Microsoft's and their current_sid has a different 
meaning: it stands for Security ID there.

Although there are no docs about Netazza's current_sid, there are some examples 
on the net, for example here: 
http://stackoverflow.com/questions/27402899/netezza-sql-to-find-client-ip

Two things are obvious from this example:
- current_sid is not a function, but a magic constant
- current_sid is a number, not a string

Adding a current_sid function will not provide any Netezza-compatibility, 
because it won't work without () and our session IDs are strings not numbers 
(although in theory we could use a different representation of our session ID, 
but a 128-bits may be too much to return as a number).

For this reason I suggest to omit this function.


-- 
To view, visit http://gerrit.cloudera.org:8080/4063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


Re: Contributions to Cloudera Impala

2016-08-19 Thread Nishidha Panpaliya

Hi Jim,

Thanks a lot for your response.

Please see my comments inline.

Adding Zhi Zhi, our colleague from China in the thread.

Regards,
Nishidha



From:   Sudarshan Jagadale/Austin/Contr/IBM
To: Jim Apple 
Cc: "dev@impala" , Manish
Patil/Austin/Contr/IBM@IBMUS, Silvius Rus ,
Valencia Serrao/Austin/Contr/IBM@IBMUS, Nishidha
Panpaliya/Austin/Contr/IBM@IBMUS
Date:   08/18/2016 10:45 AM
Subject:Re: Contributions to Cloudera Impala


Hi Jim,

+Nishidha

Thank you for your inputs...

Thanks and Regards
Sudarshan Jagadale
Power Open Source Solutions




From:   Jim Apple 
To: "dev@impala" 
Cc: Silvius Rus , Sudarshan
Jagadale/Austin/Contr/IBM@IBMUS, Manish
Patil/Austin/Contr/IBM@IBMUS, Valencia
Serrao/Austin/Contr/IBM@IBMUS
Date:   08/17/2016 09:43 PM
Subject:Re: Contributions to Cloudera Impala



> I'm glad to tell you that we are able to build and test Impala on Ubuntu
> linux ppc64le with the great support from the Cloudera Community.

Excellent!

> Our next action is to upstream all our changes to Cloudera Impala.

Great!

Cloudera has donated Impala to the Apache Software Foundation (aka
"ASF"). Cloudera now contributes to the project, and the project is
managed by the Impala community.

> With
> this, our plan is to start building latest Impala on Power8 as we'd been
> porting quite an old version (code from cdh5-trunk branch till 23rd
March,
> 2016). Since then, I know there have been many many changes happened
which
> are yet to be ported, specially kudu stuff.

Yes, there have been many changes. One is that Impala is now hosted on
ASF-owned git. Please see
https://cwiki.apache.org/confluence/display/IMPALA/How+to+switch+to
+Apache-hosted+git
[Nishidha] I've read a few pages from this Confluence. Indeed, very useful
to start with.

>We know we need CLA to be signed to start contributions. We have
already
>initiated the process and hoping to get it done soon.

I think the right thing to do here is use the Apache CLAs. See

https://www.apache.org/licenses/cla-corporate.txt

and

https://www.apache.org/licenses/icla.txt
[Nishidha] We'll start with this.

> By the time we get CLA signed, we would start porting the changes
done
>in last 5 months. So, I wanted to know which tag/branch should we take
>up for this.

This is a question we could all discuss together, and it might end up
being a decision made by the Project Management Committee (PMC).

This is a big question about how Apache Impala will evolve. Our bylaws
state:

"Significant, pervasive features may be developed in a speculative
branch of the repository. The PMC may grant commit rights on the
branch to its consistent contributors for the duration of the
initiative. Branch committers are responsible for shepherding their
feature into an active release and do not cast binding votes or vetoes
in the project."

So perhaps this should happen on a separate branch?

One question the community should also consider, IMHO, is whether the
community will have sufficient resources to maintain a working ppc64le
codebase indefinitely into the future.
[Nishidha] We found two new source code URLs as one mentioned in Confluence
https://git-wip-us.apache.org/repos/asf/incubator-impala.git and another to
be https://github.com/apache/incubator-impala.
Commits wise both look same, though former one says "wip" in the URL.
Please suggest the URL to be forked and worked upon. We didn't know if we
could directly work on a separate branch on apache's Impala. We thought of
forking first into our repo and then working on it.

> Working on cdh5-trunk will put us into an unending loop of
>porting as it is being modified everyday. We are thinking to create a
>branch from cdh5.8.0-release tag and start working on it. Please
suggest
>us the best way to do this.

Since Impala is now developed on Apache infrastructure, we have
switched branching schemas. Our main branch is now "master". We do not
have any release branches yet.
[Nishidha] Okay. So, after CLA, can we work directly on Apache's Impala or
we'll need to fork it into our repository and create a new branch from
master, and then generate PRs/Gerrit code reviews from it?

>Verifying all the changes on x86 platforms ourself here will also be
>time consuming and add potential delays in upstreaming. So, we were
>thinking if we can get a job on Cloudera's Continuous integration
server
>which would simply fetch our branch and verify it on all the supported
>platforms and do all the required checks. I'm not sure if this is
>feasible but just a thought. Any other suggestions  to foster this
>activity would be appreciated.

We are working on making a publicly-available CI setup, but we aren't done
yet.

Do you have a CI setup and x86-64 machines that your CI workers can run on?
[Nishidha] Which CI do you have or working on? We can setup Jenkins here
and can get x86-64

[Impala-CR](cdh5-trunk) IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or AVX2.

2016-08-19 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or 
AVX2.
..


Patch Set 44:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3081/44/be/src/util/bit-util.cc
File be/src/util/bit-util.cc:

Line 171:   void (*ByteSwapFunc)(const uint8_t* src, uint8_t* dst) = NULL;
name formatting (this is a variable name)


Line 172:   if( TEMPLATE_DATA_WIDTH == 16 ) {
parentheses formatting


-- 
To view, visit http://gerrit.cloudera.org:8080/3081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I392ed5a8d5683f30f161282c228c1aedd7b648c1
Gerrit-PatchSet: 44
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Youwei Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Youwei Wang 
Gerrit-HasComments: Yes


Re: Contributions to Cloudera Impala

2016-08-19 Thread Jim Apple
> [Nishidha] We found two new source code URLs as one mentioned in Confluence
> https://git-wip-us.apache.org/repos/asf/incubator-impala.git and another to
> be https://github.com/apache/incubator-impala.
> Commits wise both look same, though former one says "wip" in the URL. Please
> suggest the URL to be forked and worked upon.

I don't know exactly what you mean by "forked" in this context. You
should use the repo referred to as "asf-gerrit" in
https://cwiki.apache.org/confluence/display/IMPALA/How+to+switch+to+Apache-hosted+git

> [Nishidha] Okay. So, after CLA, can we work directly on Apache's Impala or
> we'll need to fork it into our repository and create a new branch from
> master, and then generate PRs/Gerrit code reviews from it?

I don't understand this question, but let me try and clarify some
things that may help you answer it.

All of Impala's commits get code reviews on gerrit.

We don't use PRs.

> [Nishidha] Which CI do you have or working on?

Cloudera uses Jenkins.

> What is the expected timeline for your CI to be publicly available?

You can follow https://issues.cloudera.org/browse/IMPALA-3228 to learn
about updates.

Apache projects are sometimes referred to as "do-ocracies". If you
would like to find a CI hosting provider that will support Apache
Impala, you are welcome to help - you don't have to be an Impala
committer or a CLoudera employee to help.

> 1. Would you suggest me to start a new thread for branch discussion,
> although I created a gist in github.com
> https://gist.github.com/npanpaliya/bd58e554370455babc5c4f290e4b1723 for the
> same?

Yes. Apache Impala project direction discussions are unlikely to
happen on private gists.


Re: Failed to submit/update my patch

2016-08-19 Thread Jim Apple
It looks like you pushed https://gerrit.cloudera.org/#/c/3081/. How
did you fix it?

On Thu, Aug 18, 2016 at 7:04 PM, Wang, Youwei A  wrote:
> Greetings, Jim.
> I don't see that Change-Id I8777cf76f04d34a46f53d53005412e0f1d63b5b7 using 
> following command as you do:
> git fetch gerrit && git log gerrit/cdh5-trunk
>
> The output of "git remote -v" is:
>
> root@debian:~/Impala# git remote -v
> gerrit  ssh://hayabusa-in...@gerrit.cloudera.org:29418/Impala (fetch)
> gerrit  ssh://hayabusa-in...@gerrit.cloudera.org:29418/Impala (push)
> origin  https://github.com/cloudera/Impala.git (fetch)
> origin  https://github.com/cloudera/Impala.git (push)
>
>
> -Original Message-
> From: Jim Apple [mailto:jbap...@cloudera.com]
> Sent: Friday, August 19, 2016 9:07 AM
> To: Wang, Youwei A 
> Cc: dev@impala.incubator.apache.org
> Subject: Re: Failed to submit/update my patch
>
> I don't see this using "git fetch gerrit && git log gerrit/cdh5-trunk".
>
> What is the output of "git remote -v" for you?
>
> On Thu, Aug 18, 2016 at 6:03 PM, Wang, Youwei A  
> wrote:
>> Greetings, Jim. Yes, the Change-Id I8777cf76f04d34a46f53d53005412e0f1d63b5b7 
>> does appear in my branch. And it also appears in the cdh5-trunk. As a proof, 
>> the following message is copied from the output of "git log" for cdh5-trunk:
>>
>> commit 4bc2f0557bce48b04897c0123eba598812931785
>> Author: Tim Armstrong 
>> Date:   Mon May 23 14:09:39 2016 -0700
>>
>> IMPALA-3611: track unused Disk IO buffer memory
>>
>> Track I/O buffers against separate MemTrackers. This gives us better
>> visibility into memory consumption from the debug webpage and from
>> MemTracker consumption dumps. The immediate motivation was in trying to
>> determine whether idle memory consumption of an impalad was caused by a
>> memory leak.
>>
>> We add two trackers: for buffers cached in DiskIoMgr's free list,
>> and another for clients that don't provide a MemTracker (the only
>> one is BufferedBlockMgr, which will be removed at some point).
>>
>> The previous code "tracked" the buffers against the process-wide
>> tracker, but it was a no-op outside of ASAN builds since the
>> process-wide tracker took its value from TCMalloc.
>>
>> The test code required fixing because it assumed that buffers were
>> always credited against the DiskIoMgr's tracker. This only made sense
>> when the DiskIoMgr's tracker is the root process-wide tracker.
>>
>> Fix backend test logging for disk-io-mgr-test.
>>
>> Testing:
>> Ran exhaustive tests.
>>
>> Change-Id: I8777cf76f04d34a46f53d53005412e0f1d63b5b7
>> Reviewed-on: http://gerrit.cloudera.org:8080/3799
>> Reviewed-by: Dan Hecht 
>> Tested-by: Internal Jenkins
>> (cherry picked from commit
>> 17bf14417e3438d772b19111431453bdd537742a)
>>
>> PS: And I have conducted some experiments last night:
>> 1. Create an empty VMWare workstation and install Debian 8.5 (the
>> known latest version); 2. Install git; No other unrelated packages are
>> installed; 3. git clone https://github.com/cloudera/Impala.git
>> 4. cd Impala
>> 5. git checkout cdh5-trunk
>> 6. Follow every step from this page:
>> https://github.com/cloudera/Impala/wiki/Using-Gerrit-to-submit-and-rev
>> iew-patches 7. Apply my patch to cdh5-trunk and then commit it; 8. git
>> push gerrit HEAD:refs/for/cdh5-trunk
>> Result: I got the exact same error as we discussed.
>>
>> I have also try to substitute the VMWare workstation using a Debian docker 
>> image in the step 1.
>> And the remained steps are identical. The same error still happens.
>>
>> -Original Message-
>> From: Jim Apple [mailto:jbap...@cloudera.com]
>> Sent: Thursday, August 18, 2016 8:28 PM
>> To: Wang, Youwei A 
>> Cc: dev@impala.incubator.apache.org
>> Subject: Re: Failed to submit/update my patch
>>
>> Does Change-Id: I8777cf76f04d34a46f53d53005412e0f1d63b5b7 show up in your 
>> git history when trying to push? If it does, something is wrong, since this 
>> change should not have been merged into gerrit/cdh5-trunk.
>>
>> On Thu, Aug 18, 2016 at 12:16 AM, Wang, Youwei A  
>> wrote:
>>> Greetings, everyone.
>>> Thank you for providing such great solution, @Todd and @Jim.
>>> But I am afraid I have to say the  'rebase -i' approach doesn't work for me 
>>> and yes, I have made 100% sure that my patch is the only patch that show up 
>>> in my interactive rebase is my own.
>>>
>>> I have also tried another approach: I have diffed my branch and the 
>>> cdh5-trunk, then saved it to another place. After that, I switched to the 
>>> cdh5-trunk, run git pull,  applied my patch, and then finally commit my 
>>> changes. Then I run "git push gerrit HEAD:refs/for/cdh5-trunk", still got 
>>> the same error. Since now I am working on the latest cdh5-trunk, I believe 
>>> the possibility that other branches introduce this error is eliminated. 
>>> However, this issue still happens.
>>>
>>> I am trying to use another clean machine to re-commit 

[Impala-ASF-CR] IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for Kudu

2016-08-19 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for 
Kudu
..


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/3986/5/fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java
File fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java:

Line 118:* Normalizes a 'predicate' consisting of an uncast SlotRef and a 
constant Expr into
what was wrong with the previous version of this function?

given just the description in the function comment, it's not clear why this is 
disabled by casting. you should leave a comment in here, so the next person 
doesn't try to add the casting logic again.

also, doesn't this break other things? i remember we specifically added the 
casting logic to address a problem.


http://gerrit.cloudera.org:8080/#/c/3986/5/fe/src/main/java/com/cloudera/impala/analysis/Expr.java
File fe/src/main/java/com/cloudera/impala/analysis/Expr.java:

Line 1193:* For children of 'this' that are constant expressions capable of 
being expressed
which constants cannot be expressed as literals?


http://gerrit.cloudera.org:8080/#/c/3986/5/fe/src/main/java/com/cloudera/impala/analysis/LiteralExpr.java
File fe/src/main/java/com/cloudera/impala/analysis/LiteralExpr.java:

Line 55:   public static LiteralExpr create(String value, Type type) throws 
AnalysisException {
remove throw


Line 154:* Returns null for types that cannot be expressed as literals, 
e.g. TIMESTAMP.
you mean for which don't have a LiteralExpr subclass? 'cannot be expressed' 
sounds like a sql limitation.


-- 
To view, visit http://gerrit.cloudera.org:8080/3986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae7612433a2e27f8887abe6624f9ee0f4867b934
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[DISCUSS] Branching for a release soon?

2016-08-19 Thread Jim Apple
I would like to volunteer to be the next release manager for an
upcoming Apache Impala 2.7 (incubating) release. I plan to:

1. Get a vote done adding to our bylaws the procedure for creating a
branch. (See thread from yesterday)

2. Create a release branch, possibly backdated a few commits from HEAD
to a commit that looks stable.

3. Maintain that branch by cherry-picking bugfix commits from "master"
as necessary to get the tests running cleanly.

4. Make a release candidate and call for a vote here, then on the IPMC
list (which is required for Incubator releases)

Obviously, this glosses over some details about publishing and signing
releases, but I would welcome feedback on this outline.


Re: [DISCUSS] Branching for a release soon?

2016-08-19 Thread Todd Lipcon
Sounds great to me!

Todd

On Aug 19, 2016 8:41 AM, "Jim Apple"  wrote:

> I would like to volunteer to be the next release manager for an
> upcoming Apache Impala 2.7 (incubating) release. I plan to:
>
> 1. Get a vote done adding to our bylaws the procedure for creating a
> branch. (See thread from yesterday)
>
> 2. Create a release branch, possibly backdated a few commits from HEAD
> to a commit that looks stable.
>
> 3. Maintain that branch by cherry-picking bugfix commits from "master"
> as necessary to get the tests running cleanly.
>
> 4. Make a release candidate and call for a vote here, then on the IPMC
> list (which is required for Incubator releases)
>
> Obviously, this glosses over some details about publishing and signing
> releases, but I would welcome feedback on this outline.
>


[Impala-ASF-CR] Add .clang-format for Impala's C++ style

2016-08-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Add .clang-format for Impala's C++ style
..


Patch Set 5: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for Kudu

2016-08-19 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for 
Kudu
..


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/3986/5/fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java
File fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java:

Line 118:* Normalizes a 'predicate' consisting of an uncast SlotRef and a 
constant Expr into
> what was wrong with the previous version of this function?
For Kudu predicates, at least, we can't have the slotref wrapped by a cast. So 
this fn now returns the very specific kind of predicate described here if 
possible.
This shouldn't break anything else because its only caller is the KuduScanNode. 
Would you prefer if I change the name of the fn to 
normalizeNoCastSlotRefComparison() (or similar)?


http://gerrit.cloudera.org:8080/#/c/3986/5/fe/src/main/java/com/cloudera/impala/analysis/Expr.java
File fe/src/main/java/com/cloudera/impala/analysis/Expr.java:

Line 1193:* For children of 'this' that are constant expressions capable of 
being expressed
> which constants cannot be expressed as literals?
DATE/DATETIME/TIMESTAMP don't have LiteralExpr subclasses. I'll change.


http://gerrit.cloudera.org:8080/#/c/3986/5/fe/src/main/java/com/cloudera/impala/analysis/LiteralExpr.java
File fe/src/main/java/com/cloudera/impala/analysis/LiteralExpr.java:

Line 55:   public static LiteralExpr create(String value, Type type) throws 
AnalysisException {
> remove throw
analyze(), uncheckedCastTo(), and BoolLiteral()/NumericLiteral() all throw 
AnalysisEx.


Line 154:* Returns null for types that cannot be expressed as literals, 
e.g. TIMESTAMP.
> you mean for which don't have a LiteralExpr subclass? 'cannot be expressed'
Yes, I'll change the comment


-- 
To view, visit http://gerrit.cloudera.org:8080/3986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae7612433a2e27f8887abe6624f9ee0f4867b934
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


Re: [DISCUSS] Branching for a release soon?

2016-08-19 Thread Tom White
Thanks for volunteering to do the release Jim! The plan looks fine to me.

Tom

On Fri, Aug 19, 2016 at 4:42 PM, Todd Lipcon  wrote:
> Sounds great to me!
>
> Todd
>
> On Aug 19, 2016 8:41 AM, "Jim Apple"  wrote:
>
>> I would like to volunteer to be the next release manager for an
>> upcoming Apache Impala 2.7 (incubating) release. I plan to:
>>
>> 1. Get a vote done adding to our bylaws the procedure for creating a
>> branch. (See thread from yesterday)
>>
>> 2. Create a release branch, possibly backdated a few commits from HEAD
>> to a commit that looks stable.
>>
>> 3. Maintain that branch by cherry-picking bugfix commits from "master"
>> as necessary to get the tests running cleanly.
>>
>> 4. Make a release candidate and call for a vote here, then on the IPMC
>> list (which is required for Incubator releases)
>>
>> Obviously, this glosses over some details about publishing and signing
>> releases, but I would welcome feedback on this outline.
>>


[Impala-ASF-CR] IMPALA-3342: Runtime profile TotalCpuTime should eliminate wait times

2016-08-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3342: Runtime profile TotalCpuTime should eliminate wait 
times
..


Patch Set 1:

Unless I'm missing something, TotalCpuTime at the fragment level isn't 
correctly updated with the user/sys time. It looks like we try to approximate 
it with:

  int64_t cpu_time = cpu_and_wait_time
  - runtime_state_->total_storage_wait_timer()->value()
  - runtime_state_->total_network_send_timer()->value()
  - runtime_state_->total_network_receive_timer()->value();

But it's not clear that that's accurate.

I agree that we already do the right thing for ScannerThreadsUserTime and 
ScannerThreadsSysTime

-- 
To view, visit http://gerrit.cloudera.org:8080/4052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If8edd0c3559968ff12e42f83dd970e424fa4a3c8
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3943: Adhere to abort on error in ProcessFooter().

2016-08-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3943: Adhere to abort_on_error in ProcessFooter().
..


Patch Set 7:

(1 comment)

Not this patch, but we should consider changing HdfsScanNode so that it handles 
the abort_on_error logic at the callsite for Open() if Open() fails. It might 
simplify this and the other scanners

http://gerrit.cloudera.org:8080/#/c/3862/7/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 188:   if (!status.ok()) return state_->LogOrReturnError(status.msg());
Don't we want to return early if there was any kind of error? If its a 
non-aborting error I think we want to return early with Status::OK(), since 
it's not clear if it's safe to execute the below code.


-- 
To view, visit http://gerrit.cloudera.org:8080/3862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6aff766a1ce6376efb329bdde51c648149dfe08c
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


Re: [DISCUSS] Branching for a release soon?

2016-08-19 Thread Tim Armstrong
How do you plan to choose which commits to cherry-pick? Should we let you
know if we think a patch should/shouldn't be part of the release?

On Fri, Aug 19, 2016 at 8:44 AM, Tom White  wrote:

> Thanks for volunteering to do the release Jim! The plan looks fine to me.
>
> Tom
>
> On Fri, Aug 19, 2016 at 4:42 PM, Todd Lipcon  wrote:
> > Sounds great to me!
> >
> > Todd
> >
> > On Aug 19, 2016 8:41 AM, "Jim Apple"  wrote:
> >
> >> I would like to volunteer to be the next release manager for an
> >> upcoming Apache Impala 2.7 (incubating) release. I plan to:
> >>
> >> 1. Get a vote done adding to our bylaws the procedure for creating a
> >> branch. (See thread from yesterday)
> >>
> >> 2. Create a release branch, possibly backdated a few commits from HEAD
> >> to a commit that looks stable.
> >>
> >> 3. Maintain that branch by cherry-picking bugfix commits from "master"
> >> as necessary to get the tests running cleanly.
> >>
> >> 4. Make a release candidate and call for a vote here, then on the IPMC
> >> list (which is required for Incubator releases)
> >>
> >> Obviously, this glosses over some details about publishing and signing
> >> releases, but I would welcome feedback on this outline.
> >>
>


[Impala-ASF-CR] IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for Kudu

2016-08-19 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for 
Kudu
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3986/5/fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java
File fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java:

Line 118:* Normalizes a 'predicate' consisting of an uncast SlotRef and a 
constant Expr into
> For Kudu predicates, at least, we can't have the slotref wrapped by a cast.
maybe we should change the name, to warn potential future callers. the initial 
thinking behind putting this logic here was that it seems relatively general, 
so would be useful outside of kudu. i'm not sure that's still the case.


-- 
To view, visit http://gerrit.cloudera.org:8080/3986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae7612433a2e27f8887abe6624f9ee0f4867b934
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


Re: [DISCUSS] Branching for a release soon?

2016-08-19 Thread Jim Apple
I hadn't decided yet. What do you think?

I could test the release, then look for bug fixes that landed in
master after my branch started that fix anything that doesn't pass.

Or I could reach out to commit authors and ask them, or I could ask
them to email me, like you mentioned.

Or I could read commit messages and cherry-pick anything that is a bug fix.

On Fri, Aug 19, 2016 at 9:01 AM, Tim Armstrong  wrote:
> How do you plan to choose which commits to cherry-pick? Should we let you
> know if we think a patch should/shouldn't be part of the release?
>
> On Fri, Aug 19, 2016 at 8:44 AM, Tom White  wrote:
>
>> Thanks for volunteering to do the release Jim! The plan looks fine to me.
>>
>> Tom
>>
>> On Fri, Aug 19, 2016 at 4:42 PM, Todd Lipcon  wrote:
>> > Sounds great to me!
>> >
>> > Todd
>> >
>> > On Aug 19, 2016 8:41 AM, "Jim Apple"  wrote:
>> >
>> >> I would like to volunteer to be the next release manager for an
>> >> upcoming Apache Impala 2.7 (incubating) release. I plan to:
>> >>
>> >> 1. Get a vote done adding to our bylaws the procedure for creating a
>> >> branch. (See thread from yesterday)
>> >>
>> >> 2. Create a release branch, possibly backdated a few commits from HEAD
>> >> to a commit that looks stable.
>> >>
>> >> 3. Maintain that branch by cherry-picking bugfix commits from "master"
>> >> as necessary to get the tests running cleanly.
>> >>
>> >> 4. Make a release candidate and call for a vote here, then on the IPMC
>> >> list (which is required for Incubator releases)
>> >>
>> >> Obviously, this glosses over some details about publishing and signing
>> >> releases, but I would welcome feedback on this outline.
>> >>
>>


Re: [DISCUSS] Branching for a release soon?

2016-08-19 Thread Tim Armstrong
I think the first sounds easiest. If commit authors really think something
should go in the release they can always reach out to you.

On Fri, Aug 19, 2016 at 9:15 AM, Jim Apple  wrote:

> I hadn't decided yet. What do you think?
>
> I could test the release, then look for bug fixes that landed in
> master after my branch started that fix anything that doesn't pass.
>
> Or I could reach out to commit authors and ask them, or I could ask
> them to email me, like you mentioned.
>
> Or I could read commit messages and cherry-pick anything that is a bug fix.
>
> On Fri, Aug 19, 2016 at 9:01 AM, Tim Armstrong 
> wrote:
> > How do you plan to choose which commits to cherry-pick? Should we let you
> > know if we think a patch should/shouldn't be part of the release?
> >
> > On Fri, Aug 19, 2016 at 8:44 AM, Tom White  wrote:
> >
> >> Thanks for volunteering to do the release Jim! The plan looks fine to
> me.
> >>
> >> Tom
> >>
> >> On Fri, Aug 19, 2016 at 4:42 PM, Todd Lipcon  wrote:
> >> > Sounds great to me!
> >> >
> >> > Todd
> >> >
> >> > On Aug 19, 2016 8:41 AM, "Jim Apple"  wrote:
> >> >
> >> >> I would like to volunteer to be the next release manager for an
> >> >> upcoming Apache Impala 2.7 (incubating) release. I plan to:
> >> >>
> >> >> 1. Get a vote done adding to our bylaws the procedure for creating a
> >> >> branch. (See thread from yesterday)
> >> >>
> >> >> 2. Create a release branch, possibly backdated a few commits from
> HEAD
> >> >> to a commit that looks stable.
> >> >>
> >> >> 3. Maintain that branch by cherry-picking bugfix commits from
> "master"
> >> >> as necessary to get the tests running cleanly.
> >> >>
> >> >> 4. Make a release candidate and call for a vote here, then on the
> IPMC
> >> >> list (which is required for Incubator releases)
> >> >>
> >> >> Obviously, this glosses over some details about publishing and
> signing
> >> >> releases, but I would welcome feedback on this outline.
> >> >>
> >>
>


[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata

2016-08-19 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-1659: Netezza compatibility functions: metadata
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4063/1/be/src/exprs/utility-functions-ir.cc
File be/src/exprs/utility-functions-ir.cc:

Line 120: StringVal UtilityFunctions::CurrentSid(FunctionContext* ctx) {
> This function was requested in the feature request, but I feel that it shou
agree, let's leave it out


http://gerrit.cloudera.org:8080/#/c/4063/1/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

Line 551:   [['user', 'current_user', 'current_schema'], 'STRING', [], 
'impala::UtilityFunctions::User'],
current_schema is the same as the user?


-- 
To view, visit http://gerrit.cloudera.org:8080/4063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-1654: Support general predicates in most partition DDL operations.

2016-08-19 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-1654: Support general predicates in most partition DDL 
operations.
..


Patch Set 9:

This is now getting reviewed on https://gerrit.cloudera.org/#/c/3942/. Amos, if 
you don't need this anymore, can you click "Abandon"? It doesn't delete 
anything, so you'll always be able to come back and look at it later.

-- 
To view, visit http://gerrit.cloudera.org:8080/1563
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f
Gerrit-PatchSet: 9
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Amos Bird 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Amos Bird 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3153: Incorrect behaviour around slash escaping single quotes

2016-08-19 Thread Jim Apple (Code Review)
Jim Apple has abandoned this change.

Change subject: IMPALA-3153: Incorrect behaviour around slash escaping single 
quotes
..


Abandoned

This change is small enough that it should be re-sent on the new gerrit project 
Impala-ASF. See 
https://cwiki.apache.org/confluence/display/IMPALA/How+to+switch+to+Apache-hosted+git

-- 
To view, visit http://gerrit.cloudera.org:8080/3335
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I002f75f3330ee319d82213331a5a046ecf198189
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Feng Guangyuan 
Gerrit-Reviewer: Feng Guangyuan 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] IMPALA-2581: LIMIT can be propagated down into some aggregations

2016-08-19 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-2581: LIMIT can be propagated down into some aggregations
..


Patch Set 4:

I will be able to look at this again in about 9 days.

-- 
To view, visit http://gerrit.cloudera.org:8080/3822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I59c5b7af7a73ccdbc5496b28eacb9b6859d202bc
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Add .clang-format for Impala's C++ style

2016-08-19 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: Add .clang-format for Impala's C++ style
..


Patch Set 5: Code-Review+2

Seems like there's consensus on the mailing list. Could you add a page to the 
wiki explaining what this is, how we expect it to be used and so on?

-- 
To view, visit http://gerrit.cloudera.org:8080/3886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3662: Don't double allocate tuples' buffer in parquet scanner

2016-08-19 Thread Michael Ho (Code Review)
Michael Ho has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4064

Change subject: IMPALA-3662: Don't double allocate tuples' buffer in parquet 
scanner
..

IMPALA-3662: Don't double allocate tuples' buffer in parquet scanner

HdfsScanner::StartNewRowBatch() is called once per row batch
by the parquet scanner to allocate a new row batch and tuple
buffer. Similarly, a scratch batch is created for each row
batch in HdfsParquetScanner::AssembleRows() which also contains
the tuple buffer. In reality, only the tuple buffer in the
scratch batch is used. So, the tuple buffer allocated by
HdfsScanner::StartNewRowBatch() is unused memory for the
parquet scanner.

This change fixes the problem above by implementing
HdfsParquetScanner::StartNewRowBatch() which creates
a new row batch without allocating the tuple buffer.
With this patch, the memory consumption when
materializing very wide tuples is reduced by half.

Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scanner.h
3 files changed, 14 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/4064/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata

2016-08-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1659: Netezza compatibility functions: metadata
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4063/1/be/src/exprs/utility-functions-ir.cc
File be/src/exprs/utility-functions-ir.cc:

Line 120: StringVal UtilityFunctions::CurrentSid(FunctionContext* ctx) {
> This function was requested in the feature request, but I feel that it shou
I agree that the function lists in the netezza compatibility JIRAs were not 
thoroughly vetted, thanks for thinking through this.

I think some of the same points apply to current_schema, etc as well - it looks 
like they may be magic constants too.

It seems like having functions with the same name as the magic would make 
porting queries much easier, even though though would still require 
modifications, so it might be worth considering.

We don't have a UDF to return the current session. Maybe that's a useful piece 
of functionality too even if it's not exactly the same as Netezza. What do you 
think? Returning it as a string seems ok to me, since that's how we display it 
everywhere in the system.


-- 
To view, visit http://gerrit.cloudera.org:8080/4063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page

2016-08-19 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3981: Fix crash when accessing statestored / catalogd 
/memz page
..


Patch Set 7:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/3998/7/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

PS7, Line 17: #
: # Verification of impalad metrics after a test run.
remove


PS7, Line 21: from tests.common.errors import Timeout
is this used?


PS7, Line 27: (\
remove the \, and try and put some of the parameters onto this line.


PS7, Line 31: resp.raise_for_status()
does this raise an error when status != 200? If so you don't need the checks on 
lines 41, 43, 45. Maybe you could check if the contents of the webpage has what 
you're expecting?


PS7, Line 35: """request web page /memz at imapalad / statestored / catalogd."""
I'd remove this, it doesn't tell the reader anything that's not in the test 
comments.


-- 
To view, visit http://gerrit.cloudera.org:8080/3998
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Kathy Sun 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Kathy Sun 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-2033: Netezza compatibility functions quote ident

2016-08-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-2033: Netezza compatibility functions quote_ident
..


Patch Set 4:

Hi Shirish, have you had a chance to look at the comments?

-- 
To view, visit http://gerrit.cloudera.org:8080/3263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idbac4cd10fe08a6bd4871a03ba3b5f2c20fa8ee1
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Shirish Gajera 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Shirish Gajera 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3662: Don't double allocate tuples' buffer in parquet scanner

2016-08-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3662: Don't double allocate tuples' buffer in parquet 
scanner
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4064/1/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

PS1, Line 304: StartNewRowBatch
I think we should disambiguate so that it's clear which variant is being used. 
Maybe StartNewRowBatchNoTupleBuffer() or StartNewParquetRowBatch()?


Line 341: RETURN_IF_ERROR(CommitRows(row_batch, num_row_to_commit));
Doesn't CommitRows() call HdfsScanner::StartNewRowBatch() anyway?


-- 
To view, visit http://gerrit.cloudera.org:8080/4064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures

2016-08-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures
..


Patch Set 4: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3991/4/be/src/exec/exec-node.cc
File be/src/exec/exec-node.cc:

Line 442: "Debug Action: MEM_LIMIT_EXCEEDED");
this is fine with me, but any reason we don't use mem_tracker() (i.e. 
exec-node's tracker)?


http://gerrit.cloudera.org:8080/#/c/3991/4/be/src/exec/hdfs-scan-node.h
File be/src/exec/hdfs-scan-node.h:

PS4, Line 576: if requested by scanner
I think we should delete this part because it sounds like the function itself 
will check if the scanner requested it, but you're really just saying that this 
is called by the scanner.


-- 
To view, visit http://gerrit.cloudera.org:8080/3991
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] REVIEW-ONLY: the results of running clang-format

2016-08-19 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: REVIEW-ONLY: the results of running clang-format
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4046/1/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

Line 261
i'd prefer to keep it this way, because that makes complicated boolean exprs 
easier to read


Line 1038:**next_backend_on_host) != backends_on_host.end());
do  you know what the rationale for this indentation is?


-- 
To view, visit http://gerrit.cloudera.org:8080/4046
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If00bcdb7957ec55a3b8568887de161b7f30d58de
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add .clang-format for Impala's C++ style

2016-08-19 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: Add .clang-format for Impala's C++ style
..


Patch Set 5: Code-Review-1

i posted a comment on the preview of the reformatting results you sent out, 
please address that first.

-- 
To view, visit http://gerrit.cloudera.org:8080/3886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Add .clang-format for Impala's C++ style

2016-08-19 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: Add .clang-format for Impala's C++ style
..


Patch Set 5: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/106/

-- 
To view, visit http://gerrit.cloudera.org:8080/3886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Add .clang-format for Impala's C++ style

2016-08-19 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Add .clang-format for Impala's C++ style
..


Patch Set 5:

> i posted a comment on the preview of the reformatting results you
 > sent out, please address that first.

OK - halted the in-progress verification job to address your comments first.

-- 
To view, visit http://gerrit.cloudera.org:8080/3886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3662: Don't double allocate tuples buffer in parquet scanner

2016-08-19 Thread Michael Ho (Code Review)
Michael Ho has uploaded a new patch set (#2).

Change subject: IMPALA-3662: Don't double allocate tuples buffer in parquet 
scanner
..

IMPALA-3662: Don't double allocate tuples buffer in parquet scanner

HdfsScanner::StartNewRowBatch() is called once per row batch
by the parquet scanner to allocate a new row batch and tuple
buffer. Similarly, a scratch batch is created for each row
batch in HdfsParquetScanner::AssembleRows() which also contains
the tuple buffer. In reality, only the tuple buffer in the
scratch batch is used. So, the tuple buffer allocated by
HdfsScanner::StartNewRowBatch() is unused memory for the
parquet scanner.

This change fixes the problem above by implementing
HdfsParquetScanner::StartNewRowBatch() which creates
a new row batch without allocating the tuple buffer.
With this patch, the memory consumption when
materializing very wide tuples is reduced by half.

Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scanner.h
3 files changed, 14 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/4064/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3662: Don't double allocate tuples buffer in parquet scanner

2016-08-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3662: Don't double allocate tuples buffer in parquet 
scanner
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4064/1/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

PS1, Line 304: StartNewRowBatch
> I think we should disambiguate so that it's clear which variant is being us
Done


Line 341: RETURN_IF_ERROR(CommitRows(row_batch, num_row_to_commit));
> Doesn't CommitRows() call HdfsScanner::StartNewRowBatch() anyway?
I believe it's calling HdfsParquetScanner::CommitRows() which doesn't call 
StartNewRowBatch().


-- 
To view, visit http://gerrit.cloudera.org:8080/4064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] REVIEW-ONLY: the results of running clang-format

2016-08-19 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: REVIEW-ONLY: the results of running clang-format
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4046/1/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

Line 261
> i'd prefer to keep it this way, because that makes complicated boolean expr
Setting BreakBeforeBinaryOperators to NonAssignment or All fixes this, but 
increases the total diff from 79697 to 81122 or 85238 lines, respectively.


Line 1038:**next_backend_on_host) != backends_on_host.end());
> do  you know what the rationale for this indentation is?
I think it is ContinuationIndentWidth - 4 spaces, plus the length of the 
grandparent ('DCHECK')


-- 
To view, visit http://gerrit.cloudera.org:8080/4046
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If00bcdb7957ec55a3b8568887de161b7f30d58de
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures

2016-08-19 Thread Michael Ho (Code Review)
Hello Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3991

to look at the new patch set (#5).

Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures
..

IMPALA-3962: Clean up scratch tuple batch on scan failures

The parquet scanner doesn't clean up 'scratch_batch_' properly
which causes it to process a partially filled scratch_batch_
if any of the column reader fails. This change cleans up the
scratch batch if any of the parquet column readers fails.
The clean up involves freeing the mem_pool of scratch_batch_
and setting number of tuples in scratch_batch_ to 0.

This change also extends debug action to emulate the behavior
of exceeding the query's memory limit.

Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
---
M be/src/exec/exec-node.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M common/thrift/PlanNodes.thrift
M tests/failure/test_failpoints.py
10 files changed, 76 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/3991/5
-- 
To view, visit http://gerrit.cloudera.org:8080/3991
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures

2016-08-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3991/4/be/src/exec/exec-node.cc
File be/src/exec/exec-node.cc:

Line 442: "Debug Action: MEM_LIMIT_EXCEEDED");
> this is fine with me, but any reason we don't use mem_tracker() (i.e. exec-
Done


http://gerrit.cloudera.org:8080/#/c/3991/4/be/src/exec/hdfs-scan-node.h
File be/src/exec/hdfs-scan-node.h:

PS4, Line 576: if requested by scanner
> I think we should delete this part because it sounds like the function itse
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3991
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures

2016-08-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures
..


Patch Set 5: Code-Review+2

Carry +2 forward.

-- 
To view, visit http://gerrit.cloudera.org:8080/3991
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata

2016-08-19 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-1659: Netezza compatibility functions: metadata
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4063/1/be/src/exprs/utility-functions-ir.cc
File be/src/exprs/utility-functions-ir.cc:

Line 120: StringVal UtilityFunctions::CurrentSid(FunctionContext* ctx) {
> I agree that the function lists in the netezza compatibility JIRAs were not
I agree that providing a function to query the session ID would be a useful 
addition. If you are fine with this implementation knowing that it will not 
provide any Netezza-compatility after all, I am happy to submit it as well - 
after all, I already wrote the code for it. ;)


http://gerrit.cloudera.org:8080/#/c/4063/1/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

Line 551:   [['user', 'current_user', 'current_schema'], 'STRING', [], 
'impala::UtilityFunctions::User'],
> current_schema is the same as the user?
Since there is no universal definition of a schema, we can not provide a 
function that will fit everyone's definition. Based on my short research, 
according to IBM and Oracle, schemas seem to be conceptually similar to users 
(although I'm not an expert of their database products, so I may be mistaken). 
The document attached to the feature request also linked to IBM's page. Based 
on this it seems like a reasonable compromise to return the user name here.

My sources in more details:

netezza-functions.pdf [1], which is attached to the issue IMPALA-1660 contains 
the list of functions to be added with some context. Despite the name of the 
document, most of the functions are not Netezza functions, but are from other 
vendors. The info for CURRENT_SCHEMA links to an IBM page (as many other 
functions in that PDF), but it is a dead link. The link for CURRENT_CATALOG 
however, leads to another IBM page [2] that lists all functions.

This page describes CURRENT_SCHEMA as "Returns the current schema name (user 
name)". Also a highly rated answer on Stack Overflow [3] states that in Oracle 
"schema == namespace within database, identical to user account" but also that 
"The meaning of 'catalog', 'schema' and 'database' vary from one implementation 
to another."

[1] https://issues.cloudera.org/secure/attachment/12199/netezza-functions.pdf
[2] 
http://www.ibm.com/support/knowledgecenter/SSULQD_7.0.3/com.ibm.nz.dbu.doc/r_dbuser_functions.html
[3] http://stackoverflow.com/a/7944489/5613485


-- 
To view, visit http://gerrit.cloudera.org:8080/4063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3662: Don't double allocate tuples buffer in parquet scanner

2016-08-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3662: Don't double allocate tuples buffer in parquet 
scanner
..


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4064/1/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 341: RETURN_IF_ERROR(CommitRows(row_batch, num_row_to_commit));
> I believe it's calling HdfsParquetScanner::CommitRows() which doesn't call 
Ah you're right. This is confusing.


-- 
To view, visit http://gerrit.cloudera.org:8080/4064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


Re: [DISCUSS] Branching for a release soon?

2016-08-19 Thread Todd Lipcon
I believe the newer versions of gerrit support a label-like concept called
'hashtags' (apparently the authors of Gerrit love Instagram). They allow
you to tag a review/commit with arbitrary strings. Perhaps this feature
could be used so that patch authors can tag their reviews/commits as
"stable-candidate" and Jim can easily query for them? Then when it has been
cherry-picked to the branch, a "stable-picked" tag can be added to indicate
it has been merged?

If this sounds useful, let me know and I can look into enabling the feature
on gerrit.cloudera.org. I think some non-trivial config change has to be
made, though, so would take a week or two.

-Todd

On Fri, Aug 19, 2016 at 9:38 AM, Tim Armstrong 
wrote:

> I think the first sounds easiest. If commit authors really think something
> should go in the release they can always reach out to you.
>
> On Fri, Aug 19, 2016 at 9:15 AM, Jim Apple  wrote:
>
> > I hadn't decided yet. What do you think?
> >
> > I could test the release, then look for bug fixes that landed in
> > master after my branch started that fix anything that doesn't pass.
> >
> > Or I could reach out to commit authors and ask them, or I could ask
> > them to email me, like you mentioned.
> >
> > Or I could read commit messages and cherry-pick anything that is a bug
> fix.
> >
> > On Fri, Aug 19, 2016 at 9:01 AM, Tim Armstrong 
> > wrote:
> > > How do you plan to choose which commits to cherry-pick? Should we let
> you
> > > know if we think a patch should/shouldn't be part of the release?
> > >
> > > On Fri, Aug 19, 2016 at 8:44 AM, Tom White  wrote:
> > >
> > >> Thanks for volunteering to do the release Jim! The plan looks fine to
> > me.
> > >>
> > >> Tom
> > >>
> > >> On Fri, Aug 19, 2016 at 4:42 PM, Todd Lipcon 
> wrote:
> > >> > Sounds great to me!
> > >> >
> > >> > Todd
> > >> >
> > >> > On Aug 19, 2016 8:41 AM, "Jim Apple"  wrote:
> > >> >
> > >> >> I would like to volunteer to be the next release manager for an
> > >> >> upcoming Apache Impala 2.7 (incubating) release. I plan to:
> > >> >>
> > >> >> 1. Get a vote done adding to our bylaws the procedure for creating
> a
> > >> >> branch. (See thread from yesterday)
> > >> >>
> > >> >> 2. Create a release branch, possibly backdated a few commits from
> > HEAD
> > >> >> to a commit that looks stable.
> > >> >>
> > >> >> 3. Maintain that branch by cherry-picking bugfix commits from
> > "master"
> > >> >> as necessary to get the tests running cleanly.
> > >> >>
> > >> >> 4. Make a release candidate and call for a vote here, then on the
> > IPMC
> > >> >> list (which is required for Incubator releases)
> > >> >>
> > >> >> Obviously, this glosses over some details about publishing and
> > signing
> > >> >> releases, but I would welcome feedback on this outline.
> > >> >>
> > >>
> >
>



-- 
Todd Lipcon
Software Engineer, Cloudera


[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata

2016-08-19 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-1659: Netezza compatibility functions: metadata
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4063/1/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

Line 551:   [['user', 'current_user', 'current_schema'], 'STRING', [], 
'impala::UtilityFunctions::User'],
> Since there is no universal definition of a schema, we can not provide a fu
To clarify: Netezza is a subsidiary of IBM, but [2] is from the documentation 
of IBM's DB2, which I believe to be unrelated apart from the common owner of 
the products.


-- 
To view, visit http://gerrit.cloudera.org:8080/4063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata

2016-08-19 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-1659: Netezza compatibility functions: metadata
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4063/1/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

Line 551:   [['user', 'current_user', 'current_schema'], 'STRING', [], 
'impala::UtilityFunctions::User'],
> To clarify: Netezza is a subsidiary of IBM, but [2] is from the documentati
Actually [2] is from the Netezza documentation, my mistake.


-- 
To view, visit http://gerrit.cloudera.org:8080/4063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-(3895,3859): Don't log file data on parse errors

2016-08-19 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new patch set (#3).

Change subject: IMPALA-(3895,3859): Don't log file data on parse errors
..

IMPALA-(3895,3859): Don't log file data on parse errors

Logging file or table data is a bad idea, and doing it by default is
particularly bad. This patch changes HdfsScanNode::LogRowParseError() to
log a file and offset only.

Testing: See rewritten tests.

To support testing this change, we also fix IMPALA-3895, by introducing
a canonical string __HDFS_FILENAME__ that all Hadoop filenames in the ERROR
output are replaced with before comparing with the expected
results. This fixes a number of issues with the old way of matching
filenames which purported to be a regex, but really wasn't. In
particular, we can now match the rest of an ERROR line after the
filename, which was not possible before.

In some cases, we don't want to substitute filenames because the ERROR
output is looking for a very specific output. In that case we can write:

$NAMENODE/

and this patch will not perform _any_ filename substitutions on ERROR
sections that contain the $NAMENODE string.

Finally, this patch fixes a bug where a test that had an ERRORS section
but no RESULTS section would silently pass without testing anything.

Change-Id: I5a604f8784a9ff7b4bf878f82ee7f56697df3272
---
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
M testdata/workloads/functional-query/queries/DataErrorsTest/avro-errors.test
M 
testdata/workloads/functional-query/queries/DataErrorsTest/hbase-scan-node-errors.test
M 
testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-rcfile-scan-node-errors.test
M 
testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-scan-node-errors.test
M 
testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-sequence-scan-errors.test
M 
testdata/workloads/functional-query/queries/QueryTest/parquet-continue-on-error.test
M testdata/workloads/functional-query/queries/QueryTest/strict-mode-abort.test
M testdata/workloads/functional-query/queries/QueryTest/strict-mode.test
M tests/common/impala_test_suite.py
M tests/common/test_result_verifier.py
17 files changed, 354 insertions(+), 367 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/4020/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4020
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a604f8784a9ff7b4bf878f82ee7f56697df3272
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-(3895,3859): Don't log file data on parse errors

2016-08-19 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-(3895,3859): Don't log file data on parse errors
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4020/2/be/src/exec/hdfs-scanner.cc
File be/src/exec/hdfs-scanner.cc:

PS2, Line 635: TO
> weird caps - should be lower case?
Done


http://gerrit.cloudera.org:8080/#/c/4020/2/tests/common/impala_test_suite.py
File tests/common/impala_test_suite.py:

Line 328: replace_filenames_with_placeholder = True
> I think like this is getting big enough to be its own function. E.g. verify
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4020
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5a604f8784a9ff7b4bf878f82ee7f56697df3272
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata

2016-08-19 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-1659: Netezza compatibility functions: metadata
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4063/1/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

Line 551:   [['user', 'current_user', 'current_schema'], 'STRING', [], 
'impala::UtilityFunctions::User'],
> Actually [2] is from the Netezza documentation, my mistake.
maybe we should leave this out for now until we can get confirmation (from a 
netezza user) what this really means. intuitively i wouldn't expect user == 
schema.


-- 
To view, visit http://gerrit.cloudera.org:8080/4063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail

2016-08-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new patch set (#3).

Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot 
fail
..

IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail

HdfsTableSink::Close() makes calls to functions that can fail with a
Status. However, since the function has a void return type, these
error statuses are just logged and we cannot take any action according
to the type of error.

This patch moves the closing of the partition file into the FlushFinal
function from Close(), so that in case of an error on closing the
file, the error is propagated up and some action can be taken.

We try and close all the partition files in the map in Close() as well
because if a query is cancelled, FlushFinal will not be called and we
would end up leaking some file descriptors.

Also fixed some long lines in this patch.

Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b
---
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
2 files changed, 14 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4018/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail

2016-08-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot 
fail
..


Patch Set 3:

(1 comment)

Passed private build:
http://sandbox.jenkins.cloudera.com/job/impala-private-build-and-test/3984/

http://gerrit.cloudera.org:8080/#/c/4018/2/be/src/exec/hdfs-table-sink.cc
File be/src/exec/hdfs-table-sink.cc:

Line 647: RETURN_IF_ERROR(FinalizePartitionFile(state, 
cur_partition->second.first));
> I still don't see why this is MergeStatus() instead of RETURN_IF_ERROR.  We
Right, that makes sense. I've made that change.


-- 
To view, visit http://gerrit.cloudera.org:8080/4018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail

2016-08-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new patch set (#4).

Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot 
fail
..

IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail

HdfsTableSink::Close() makes calls to functions that can fail with a
Status. However, since the function has a void return type, these
error statuses are just logged and we cannot take any action according
to the type of error.

This patch moves the closing of the partition file into the FlushFinal
function from Close(), so that in case of an error on closing the
file, the error is propagated up and some action can be taken.

We try and close all the partition files in the map in Close() as well
because if a query is cancelled, FlushFinal will not be called and we
would end up leaking some file descriptors.

Also fixed some long lines in this patch.

Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b
---
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
2 files changed, 13 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4018/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail

2016-08-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot 
fail
..


Patch Set 4:

Does this pass the tests?

-- 
To view, visit http://gerrit.cloudera.org:8080/4018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail

2016-08-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot 
fail
..


Patch Set 4:

> Does this pass the tests?

Yes, I mentioned about that it passed the private build:
http://sandbox.jenkins.cloudera.com/job/impala-private-build-and-test/3984/

-- 
To view, visit http://gerrit.cloudera.org:8080/4018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail

2016-08-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot 
fail
..


Patch Set 4: Code-Review+2

(1 comment)

Oops, sorry missed your comment.

http://gerrit.cloudera.org:8080/#/c/4018/4/be/src/exec/hdfs-table-sink.cc
File be/src/exec/hdfs-table-sink.cc:

Line 626:   ImpaladMetrics::NUM_FILES_OPEN_FOR_INSERT->Increment(-1);
Should we do these before returning error? otherwise, the impalad metric can 
get stuck at non-zero, but there's nothing we can do about it right?


-- 
To view, visit http://gerrit.cloudera.org:8080/4018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3662: Don't double allocate tuples buffer in parquet scanner

2016-08-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3662: Don't double allocate tuples buffer in parquet 
scanner
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page

2016-08-19 Thread Kathy Sun (Code Review)
Kathy Sun has posted comments on this change.

Change subject: IMPALA-3981: Fix crash when accessing statestored / catalogd 
/memz page
..


Patch Set 7:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/3998/6/be/src/util/metrics-test.cc
File be/src/util/metrics-test.cc:

PS6, Line 350: reinterpret_cast(NULL)
> just wondering - does nullptr work here?
Yeah, It could work, I write in this way since lines 372 did so.
to make them consistent


http://gerrit.cloudera.org:8080/#/c/3998/6/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

PS6, Line 27: quest_w
> should be:
Done


PS6, Line 37: "Test all memz/ webpages"""
> if you raise here, line 38 will never be executed. I think you don't need t
Done


PS6, Line 27: def request_web_page(\
: host_name, web_ui_port, relative_url, params={}, 
timeout_secs=DEFAULT_TIMEOUT):
:   url = "http://%s:%s%s"; % (host_name, web_ui_port, relative_url)
:   resp = requests.get(url, params=params, timeout=timeout_secs)
:   resp.raise_for_status()
:   return resp
: 
: class TestWebPage(ImpalaTestSuite):
:   """request web page /memz at imapalad / statestored / 
catalogd."""
:   def test_memz(self):
: """Test all memz/ webpages"""
: relative_url = "/memz"
: 
> Don't think you need a class for this. Just have:
Done


PS6, Line 44: ge = request_web_page("lo
> comment needs finishing
Done


PS6, Line 45: assert page.status_cod
> I feel like this test can be written like:
Done


PS6, Line 49: 
> Does this fail if the web page doesn't exist?
I add assert for the status_code, so it will fail now


http://gerrit.cloudera.org:8080/#/c/3998/7/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

PS7, Line 17: #
: # Verification of impalad metrics after a test run.
> remove
Done


PS7, Line 21: from tests.common.errors import Timeout
> is this used?
no, removed


PS7, Line 21: from tests.common.errors import Timeout
> is this used?
Done


PS7, Line 27: (\
> remove the \, and try and put some of the parameters onto this line.
I decided not to wrap requests.get() into another function
remove request_web_page() for cleaner code


PS7, Line 31: resp.raise_for_status()
> does this raise an error when status != 200? If so you don't need the check
I think status_code is enough for our purpose of checking whether the system 
crashed on clicking /memz.


PS7, Line 35: """request web page /memz at imapalad / statestored / catalogd."""
> I'd remove this, it doesn't tell the reader anything that's not in the test
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3998
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Kathy Sun 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Kathy Sun 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail

2016-08-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot 
fail
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4018/4/be/src/exec/hdfs-table-sink.cc
File be/src/exec/hdfs-table-sink.cc:

Line 626:   ImpaladMetrics::NUM_FILES_OPEN_FOR_INSERT->Increment(-1);
> Should we do these before returning error? otherwise, the impalad metric ca
Yes, we would want that to not be decremented in case of an error.
So that test_verify_metrics can catch that a file(s) was not closed, in case of 
our test runs.

And otherwise too, users can see that there are some files still open.


-- 
To view, visit http://gerrit.cloudera.org:8080/4018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page

2016-08-19 Thread Kathy Sun (Code Review)
Kathy Sun has uploaded a new patch set (#8).

Change subject: IMPALA-3981: Fix crash when accessing statestored / catalogd 
/memz page
..

IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page

The /memz page tried to add JVM metrics even when they didn't exist for
all daemons, not just Impala. This led to a crash when they tried to
access ExecEnv::GetInstance() without an initialised ExecEnv at
statestored and catalogd

To fix, changed the memz handler method to take an optional metric
group, provided by the caller.  memz handler will check the existence of
the metric group.

Used C++11 lambdas rather than boost::bind to help simplify the code.

Testing: Ran locally and looked at impalad/memz, statestored/memz
and catalogd/memz

Add a test file test_web_pages.py to test sending request to /memz on
impalad / statestored / catalogd

Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29
---
M be/src/catalog/catalogd-main.cc
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/impala-server.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestored-main.cc
M be/src/util/default-path-handlers.cc
M be/src/util/default-path-handlers.h
M be/src/util/memory-metrics.cc
M be/src/util/metrics-test.cc
M be/src/util/metrics.cc
M be/src/util/metrics.h
A tests/webserver/test_web_pages.py
M www/memz.tmpl
15 files changed, 96 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/3998/8
-- 
To view, visit http://gerrit.cloudera.org:8080/3998
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Kathy Sun 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Kathy Sun 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page

2016-08-19 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3981: Fix crash when accessing statestored / catalogd 
/memz page
..


Patch Set 8: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3998/8/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

PS8, Line 23: imapalad
impalad


-- 
To view, visit http://gerrit.cloudera.org:8080/3998
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Kathy Sun 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Kathy Sun 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail

2016-08-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot 
fail
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4018/4/be/src/exec/hdfs-table-sink.cc
File be/src/exec/hdfs-table-sink.cc:

Line 626:   ImpaladMetrics::NUM_FILES_OPEN_FOR_INSERT->Increment(-1);
> Yes, we would want that to not be decremented in case of an error.
Hmm, but what can a user do about that? and does an error mean the file isn't 
closed or that there is something else wrong with the file (maybe not even 
open).


-- 
To view, visit http://gerrit.cloudera.org:8080/4018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] REVIEW-ONLY: the results of running clang-format

2016-08-19 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: REVIEW-ONLY: the results of running clang-format
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4046/1/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

Line 261
> Setting BreakBeforeBinaryOperators to NonAssignment or All fixes this, but 
what exactly does BreakBeforeBinaryOperators do? can you limit the scope to 
logical operators and boolean exprs?


Line 1038:**next_backend_on_host) != backends_on_host.end());
> I think it is ContinuationIndentWidth - 4 spaces, plus the length of the gr
shouldn't it be one or the other? i don't recall seeing this before.


-- 
To view, visit http://gerrit.cloudera.org:8080/4046
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If00bcdb7957ec55a3b8568887de161b7f30d58de
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3996: Migrate to updated Kudu insert string API

2016-08-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3996: Migrate to updated Kudu insert string API
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4055
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I910c24724d0bc887b2d4a3e62ecdf72420a76f6f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail

2016-08-19 Thread Sailesh Mukil (Code Review)
Hello Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4018

to look at the new patch set (#5).

Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot 
fail
..

IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail

HdfsTableSink::Close() makes calls to functions that can fail with a
Status. However, since the function has a void return type, these
error statuses are just logged and we cannot take any action according
to the type of error.

This patch moves the closing of the partition file into the FlushFinal
function from Close(), so that in case of an error on closing the
file, the error is propagated up and some action can be taken.

We try and close all the partition files in the map in Close() as well
because if a query is cancelled, FlushFinal will not be called and we
would end up leaking some file descriptors.

Also fixed some long lines in this patch.

Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b
---
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
2 files changed, 15 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4018/5
-- 
To view, visit http://gerrit.cloudera.org:8080/4018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page

2016-08-19 Thread Kathy Sun (Code Review)
Hello Henry Robinson,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3998

to look at the new patch set (#9).

Change subject: IMPALA-3981: Fix crash when accessing statestored / catalogd 
/memz page
..

IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page

The /memz page tried to add JVM metrics even when they didn't exist for
all daemons, not just Impala. This led to a crash when they tried to
access ExecEnv::GetInstance() without an initialised ExecEnv at
statestored and catalogd

To fix, changed the memz handler method to take an optional metric
group, provided by the caller.  memz handler will check the existence of
the metric group.

Used C++11 lambdas rather than boost::bind to help simplify the code.

Testing: Ran locally and looked at impalad/memz, statestored/memz
and catalogd/memz

Add a test file test_web_pages.py to test sending request to /memz on
impalad / statestored / catalogd

Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29
---
M be/src/catalog/catalogd-main.cc
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/impala-server.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestored-main.cc
M be/src/util/default-path-handlers.cc
M be/src/util/default-path-handlers.h
M be/src/util/memory-metrics.cc
M be/src/util/metrics-test.cc
M be/src/util/metrics.cc
M be/src/util/metrics.h
A tests/webserver/test_web_pages.py
M www/memz.tmpl
15 files changed, 96 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/3998/9
-- 
To view, visit http://gerrit.cloudera.org:8080/3998
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Kathy Sun 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Kathy Sun 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail

2016-08-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot 
fail
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4018/4/be/src/exec/hdfs-table-sink.cc
File be/src/exec/hdfs-table-sink.cc:

Line 626:   }
> Hmm, but what can a user do about that? and does an error mean the file isn
As we spoke, this metric is not to catch errors, but just to count the number 
of tries to open and close files.


http://gerrit.cloudera.org:8080/#/c/4018/5/be/src/exec/hdfs-table-sink.cc
File be/src/exec/hdfs-table-sink.cc:

PS5, Line 620: partition->tmp_hdfs_file = NULL;
Had to move this up too. Else, we could have cases where the metric is 
decremented twice for the same file.


-- 
To view, visit http://gerrit.cloudera.org:8080/4018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3786: Replace "cloudera" with "apache" (part 2)

2016-08-19 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3786: Replace "cloudera" with "apache" (part 2)
..


Patch Set 3: -Code-Review

It passed the packaging build: 
http://golden.jenkins.cloudera.com/job/CDH5-Packaging-Impala-On-Demand/347/

-- 
To view, visit http://gerrit.cloudera.org:8080/3937
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0d35fa6602a7fc0c212b2ef5e2b3322b77dde7e2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3786: Replace "cloudera" with "apache" (part 1)

2016-08-19 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3786: Replace "cloudera" with "apache" (part 1)
..


Patch Set 2:

It passed the packaging build: 
http://golden.jenkins.cloudera.com/job/CDH5-Packaging-Impala-On-Demand/348/

(posted the wrong link for this patch in my previous comment)

-- 
To view, visit http://gerrit.cloudera.org:8080/3936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3767dd1ee86df767075fdf1b371eb6b0b06668db
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] REVIEW-ONLY: the results of running clang-format

2016-08-19 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: REVIEW-ONLY: the results of running clang-format
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4046/1/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

Line 261
> what exactly does BreakBeforeBinaryOperators do? can you limit the scope to
> The way to wrap binary operators.
>
> Possible values:
>
> BOS_None (in configuration: None) Break after operators.
> BOS_NonAssignment (in configuration: NonAssignment) Break before operators 
> that aren't assignments.
> BOS_All (in configuration: All) Break before operators.

It looks to me like you cannot limit the scope to logical operators and binary 
expressions.

BTW, I got that description from http://zed0.co.uk/clang-format-configurator/


Line 1038:**next_backend_on_host) != backends_on_host.end());
> shouldn't it be one or the other? i don't recall seeing this before.
I guess that's not how clang-format sees it.


-- 
To view, visit http://gerrit.cloudera.org:8080/4046
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If00bcdb7957ec55a3b8568887de161b7f30d58de
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose

2016-08-19 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect 
coordinator address, overly verbose
..


Patch Set 2:

(6 comments)

Were any tests affected?

http://gerrit.cloudera.org:8080/#/c/3994/2/be/src/util/webserver.cc
File be/src/util/webserver.cc:

PS2, Line 223: http_address_.hostname.compare
Can you use IsWildcardAddress(http_address_) ?


PS2, Line 224: If GetHostname() returns an error, hostname will be 0.0.0.0
um? if GetHostname() returns an error, hostname will be undefined.


http://gerrit.cloudera.org:8080/#/c/3994/2/shell/impala_client.py
File shell/impala_client.py:

PS2, Line 238: esult.version, result.webserver_address
Just return result?


http://gerrit.cloudera.org:8080/#/c/3994/2/shell/impala_shell.py
File shell/impala_shell.py:

PS2, Line 686: self.webserver_address.replace("localhost", "127.0.0.1")
why?


Line 877: self._print_if_verbose("Query submitted at: %s (Coordinator: 
%s)" % (time.strftime(
long line


PS2, Line 880: self._print_if_verbose("Query submitted at: %s" % time.strftime(
 : "%Y-%m-%d %H:%M:%S", time.localtime()))
does this print for 'use' statements as well? I still think that's a bit 
verbose. Maybe remove this branch?


-- 
To view, visit http://gerrit.cloudera.org:8080/3994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] REVIEW-ONLY: the results of running clang-format

2016-08-19 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new patch set (#2).

Change subject: REVIEW-ONLY: the results of running clang-format
..

REVIEW-ONLY: the results of running clang-format

See https://gerrit.cloudera.org/#/c/3886 for the configuration

time -p (for FILENAME in $(find be/src/ -name '*.cc' -or -name '*.h' \
| grep -v '/gutil/' | grep -v '/thirdparty/'); do echo $FILENAME; \
clang-format-3.6 -i -style=file $FILENAME; done)

Change-Id: If00bcdb7957ec55a3b8568887de161b7f30d58de
---
M .clang-format
M be/src/benchmarks/atod-benchmark.cc
M be/src/benchmarks/atof-benchmark.cc
M be/src/benchmarks/atoi-benchmark.cc
M be/src/benchmarks/bitmap-benchmark.cc
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/benchmarks/expr-benchmark.cc
M be/src/benchmarks/hash-benchmark.cc
M be/src/benchmarks/in-predicate-benchmark.cc
M be/src/benchmarks/int-hash-benchmark.cc
M be/src/benchmarks/lock-benchmark.cc
M be/src/benchmarks/multiint-benchmark.cc
M be/src/benchmarks/network-perf-benchmark.cc
M be/src/benchmarks/overflow-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/benchmarks/rle-benchmark.cc
M be/src/benchmarks/row-batch-serialize-benchmark.cc
M be/src/benchmarks/status-benchmark.cc
M be/src/benchmarks/string-benchmark.cc
M be/src/benchmarks/string-compare-benchmark.cc
M be/src/benchmarks/string-search-benchmark.cc
M be/src/benchmarks/thread-create-benchmark.cc
M be/src/benchmarks/tuple-layout-benchmark.cc
M be/src/bufferpool/buffer-pool.h
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/catalog/catalog-util.cc
M be/src/catalog/catalog-util.h
M be/src/catalog/catalog.cc
M be/src/catalog/catalog.h
M be/src/catalog/catalogd-main.cc
M be/src/codegen/codegen-anyval-ir.cc
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/codegen/codegen-symbol-emitter.cc
M be/src/codegen/codegen-symbol-emitter.h
M be/src/codegen/impala-ir-data.h
M be/src/codegen/impala-ir.h
M be/src/codegen/instruction-counter-test.cc
M be/src/codegen/instruction-counter.cc
M be/src/codegen/instruction-counter.h
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/codegen/mcjit-mem-mgr.h
M be/src/common/atomic-test.cc
M be/src/common/atomic.h
M be/src/common/compiler-util.h
M be/src/common/global-flags.cc
M be/src/common/global-types.h
M be/src/common/hdfs.h
M be/src/common/init.cc
M be/src/common/init.h
M be/src/common/logging.cc
M be/src/common/logging.h
M be/src/common/names.h
M be/src/common/object-pool.h
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exec/aggregation-node-ir.cc
M be/src/exec/aggregation-node.cc
M be/src/exec/aggregation-node.h
M be/src/exec/analytic-eval-node.cc
M be/src/exec/analytic-eval-node.h
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/base-sequence-scanner.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/catalog-op-executor.cc
M be/src/exec/catalog-op-executor.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/data-source-scan-node.cc
M be/src/exec/data-source-scan-node.h
M be/src/exec/delimited-text-parser-test.cc
M be/src/exec/delimited-text-parser.cc
M be/src/exec/delimited-text-parser.h
M be/src/exec/delimited-text-parser.inline.h
M be/src/exec/empty-set-node.cc
M be/src/exec/empty-set-node.h
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/external-data-source-executor.cc
M be/src/exec/external-data-source-executor.h
M be/src/exec/filter-context.cc
M be/src/exec/filter-context.h
M be/src/exec/hash-join-node-ir.cc
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-join-node.h
M be/src/exec/hash-table-ir.cc
M be/src/exec/hash-table-test.cc
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hbase-table-scanner.cc
M be/src/exec/hbase-table-scanner.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hbase-table-writer.cc
M be/src/exec/hbase-table-writer.h
M be/src/exec/hdfs-avro-scanner-ir.cc
M be/src/exec/hdfs-avro-scanner-test.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-avro-table-writer.cc
M be/src/exec/hdfs-avro-table-writer.h
M be/src/exec/hdfs-lzo-text-scanner.cc
M be/src/exec/hdfs-lzo-text-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-scanner.h
M be/src/exec/hdfs-sequence-ta

[Impala-ASF-CR] REVIEW-ONLY: the results of running clang-format

2016-08-19 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: REVIEW-ONLY: the results of running clang-format
..


Patch Set 2: Verified-1

This version adds NonAssignment and then shows how it behaves on 
simple-scheduler.cc

-- 
To view, visit http://gerrit.cloudera.org:8080/4046
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If00bcdb7957ec55a3b8568887de161b7f30d58de
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] REVIEW-ONLY: the results of running clang-format

2016-08-19 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: REVIEW-ONLY: the results of running clang-format
..


Patch Set 2:

looks good

-- 
To view, visit http://gerrit.cloudera.org:8080/4046
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If00bcdb7957ec55a3b8568887de161b7f30d58de
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] Add .clang-format for Impala's C++ style

2016-08-19 Thread Jim Apple (Code Review)
Hello Marcel Kornacker, Henry Robinson, Internal Jenkins, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3886

to look at the new patch set (#6).

Change subject: Add .clang-format for Impala's C++ style
..

Add .clang-format for Impala's C++ style

Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa
---
A .clang-format
1 file changed, 13 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/3886/6
-- 
To view, visit http://gerrit.cloudera.org:8080/3886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Add .clang-format for Impala's C++ style

2016-08-19 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Add .clang-format for Impala's C++ style
..


Patch Set 6:

PS6 Added

BreakBeforeBinaryOperators: 'NonAssignment'

from  http://gerrit.cloudera.org:8080/4046

-- 
To view, visit http://gerrit.cloudera.org:8080/3886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


Re: [DISCUSS] Branching for a release soon?

2016-08-19 Thread Jim Apple
Todd, I see this as being useful for future releases, even if it
doesn't get used for this one. Thanks for volunteering to look into
that.

On Fri, Aug 19, 2016 at 11:12 AM, Todd Lipcon  wrote:
> I believe the newer versions of gerrit support a label-like concept called
> 'hashtags' (apparently the authors of Gerrit love Instagram). They allow
> you to tag a review/commit with arbitrary strings. Perhaps this feature
> could be used so that patch authors can tag their reviews/commits as
> "stable-candidate" and Jim can easily query for them? Then when it has been
> cherry-picked to the branch, a "stable-picked" tag can be added to indicate
> it has been merged?
>
> If this sounds useful, let me know and I can look into enabling the feature
> on gerrit.cloudera.org. I think some non-trivial config change has to be
> made, though, so would take a week or two.
>
> -Todd
>
> On Fri, Aug 19, 2016 at 9:38 AM, Tim Armstrong 
> wrote:
>
>> I think the first sounds easiest. If commit authors really think something
>> should go in the release they can always reach out to you.
>>
>> On Fri, Aug 19, 2016 at 9:15 AM, Jim Apple  wrote:
>>
>> > I hadn't decided yet. What do you think?
>> >
>> > I could test the release, then look for bug fixes that landed in
>> > master after my branch started that fix anything that doesn't pass.
>> >
>> > Or I could reach out to commit authors and ask them, or I could ask
>> > them to email me, like you mentioned.
>> >
>> > Or I could read commit messages and cherry-pick anything that is a bug
>> fix.
>> >
>> > On Fri, Aug 19, 2016 at 9:01 AM, Tim Armstrong 
>> > wrote:
>> > > How do you plan to choose which commits to cherry-pick? Should we let
>> you
>> > > know if we think a patch should/shouldn't be part of the release?
>> > >
>> > > On Fri, Aug 19, 2016 at 8:44 AM, Tom White  wrote:
>> > >
>> > >> Thanks for volunteering to do the release Jim! The plan looks fine to
>> > me.
>> > >>
>> > >> Tom
>> > >>
>> > >> On Fri, Aug 19, 2016 at 4:42 PM, Todd Lipcon 
>> wrote:
>> > >> > Sounds great to me!
>> > >> >
>> > >> > Todd
>> > >> >
>> > >> > On Aug 19, 2016 8:41 AM, "Jim Apple"  wrote:
>> > >> >
>> > >> >> I would like to volunteer to be the next release manager for an
>> > >> >> upcoming Apache Impala 2.7 (incubating) release. I plan to:
>> > >> >>
>> > >> >> 1. Get a vote done adding to our bylaws the procedure for creating
>> a
>> > >> >> branch. (See thread from yesterday)
>> > >> >>
>> > >> >> 2. Create a release branch, possibly backdated a few commits from
>> > HEAD
>> > >> >> to a commit that looks stable.
>> > >> >>
>> > >> >> 3. Maintain that branch by cherry-picking bugfix commits from
>> > "master"
>> > >> >> as necessary to get the tests running cleanly.
>> > >> >>
>> > >> >> 4. Make a release candidate and call for a vote here, then on the
>> > IPMC
>> > >> >> list (which is required for Incubator releases)
>> > >> >>
>> > >> >> Obviously, this glosses over some details about publishing and
>> > signing
>> > >> >> releases, but I would welcome feedback on this outline.
>> > >> >>
>> > >>
>> >
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera


[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail

2016-08-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot 
fail
..


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1731,IMPALA-3868: Float values are not parsed correctly

2016-08-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-1731,IMPALA-3868: Float values are not parsed correctly
..


Patch Set 4: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/3791
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9e17d0f051b300a22a520ce34e276c2d4460d35e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3221: Copyright / license audit

2016-08-19 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3221: Copyright / license audit
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3995/1/LICENSE.txt
File LICENSE.txt:

Line 461: shell/ext-py/prettytable-0.7.1: 3-clause BSD (see 
shell/ext-py/prettytable-0.7.1/PKG-INFO)
> I noticed that one 3-clause BSD licence is copied in above, while these are
There's no license or copyright text included with prettytable etc. that I 
could find, so nothing to copy here. For d3, note that there are a couple of 
mentions of the author.


-- 
To view, visit http://gerrit.cloudera.org:8080/3995
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I24a868aec6a4f17f4ccca1b088d2f0de32f75d87
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3221: Copyright / license audit

2016-08-19 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new patch set (#2).

Change subject: IMPALA-3221: Copyright / license audit
..

IMPALA-3221: Copyright / license audit

Populates LICENSE.txt with known third-party licenses in the Impala
codebase.

Change-Id: I24a868aec6a4f17f4ccca1b088d2f0de32f75d87
---
M LICENSE.txt
D be/src/gutil/LICENSE.txt
M be/src/runtime/string-search.h
3 files changed, 297 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/3995/2
-- 
To view, visit http://gerrit.cloudera.org:8080/3995
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I24a868aec6a4f17f4ccca1b088d2f0de32f75d87
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] IMPALA-3221: Copyright / license audit

2016-08-19 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3221: Copyright / license audit
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3995/1/LICENSE.txt
File LICENSE.txt:

Line 461: shell/ext-py/prettytable-0.7.1: 3-clause BSD (see 
shell/ext-py/prettytable-0.7.1/PKG-INFO)
> There's no license or copyright text included with prettytable etc. that I 
There is sopy copyright text in shell/ext-py/prettytable-0.7.1/COPYING. 
sqlparse is also in COPYING. sasl points to toddlipcon's github page, which 
says the code was borrowed from qpid, which is an apache project. Do you think 
there needs to be any copying there?


-- 
To view, visit http://gerrit.cloudera.org:8080/3995
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I24a868aec6a4f17f4ccca1b088d2f0de32f75d87
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3999: Embed the query-wide fragment instance index in the instance id

2016-08-19 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has uploaded a new patch set (#2).

Change subject: IMPALA-3999: Embed the query-wide fragment instance index in 
the instance id
..

IMPALA-3999: Embed the query-wide fragment instance index in the instance id

This adds utility function in uid-util.h to create query and instance ids and 
convert between them.
It also adapts SimpleScheduler to utilize those functions when creating the 
instance id
(TPlanFragmentInstanceCtx.fragment_instance_id).

Change-Id: Ia56a03ed9a1d7e77c72b66a01cd48c5b6bf3624f
---
M be/src/scheduling/simple-scheduler.cc
M be/src/scheduling/simple-scheduler.h
M be/src/service/impala-server.cc
M be/src/util/CMakeLists.txt
A be/src/util/uid-util-test.cc
M be/src/util/uid-util.h
M common/thrift/ImpalaInternalService.thrift
7 files changed, 106 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/4065/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia56a03ed9a1d7e77c72b66a01cd48c5b6bf3624f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose

2016-08-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new patch set (#3).

Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect 
coordinator address, overly verbose
..

IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, 
overly verbose

The webserver address was always configured as 0.0.0.0 (meaning that
the webserver could be reached on any IP for that machine) unless
otherwise specified. This is not a correct value to dispay to the
user. This patch returns the hostname of the node, when requested,
if the webserver host address is 0.0.0.0.

This patch also does not print the coordinator link for very simple
queries, as it's not necessary and is unnecessarily verbose.

This patch also does away with pinging the impalad an extra time per
query for finding the host time and webserver address. It instead
remembers the webserver address at connect time and displays client
local time for every query instead.

Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
---
M be/src/service/impala-beeswax-server.cc
M be/src/util/webserver.cc
M common/thrift/ImpalaService.thrift
M shell/impala_client.py
M shell/impala_shell.py
5 files changed, 31 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/3994/3
-- 
To view, visit http://gerrit.cloudera.org:8080/3994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] Add .clang-format for Impala's C++ style

2016-08-19 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: Add .clang-format for Impala's C++ style
..


Patch Set 6: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/3886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose

2016-08-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect 
coordinator address, overly verbose
..


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/3994/2/be/src/util/webserver.cc
File be/src/util/webserver.cc:

PS2, Line 223: sWildcardAddress(http_address_
> Can you use IsWildcardAddress(http_address_) ?
Done


PS2, Line 224: If GetHostname() returns an error, hostname will remain as
> um? if GetHostname() returns an error, hostname will be undefined.
I assign hostname in L222. So if GetHostname() returns an error, hostname will 
remain whatever value it was (which in this case is 0.0.0.0). Clarified the 
comment a bit.


http://gerrit.cloudera.org:8080/#/c/3994/2/shell/impala_client.py
File shell/impala_client.py:

PS2, Line 238: esult
> Just return result?
Done


http://gerrit.cloudera.org:8080/#/c/3994/2/shell/impala_shell.py
File shell/impala_shell.py:

PS2, Line 686: ult.version
> why?
Else it will print http://localhost:25000 vs http://127.0.0.1:25000
If you think that's fine, I can remove this line.


Line 877: 
> long line
Done


PS2, Line 880: (time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()),
 : self.webserver_address))
> does this print for 'use' statements as well? I still think that's a bit ve
Yes it does. I forgot that we never used to print the time before. I've removed 
it.


-- 
To view, visit http://gerrit.cloudera.org:8080/3994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose

2016-08-19 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect 
coordinator address, overly verbose
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3994/2/be/src/util/webserver.cc
File be/src/util/webserver.cc:

PS2, Line 224: If GetHostname() returns an error, hostname will be 0.0.0.0
> I assign hostname in L222. So if GetHostname() returns an error, hostname w
GetHostname() makes no guarantees about what happens to the argument in the 
error case.


-- 
To view, visit http://gerrit.cloudera.org:8080/3994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose

2016-08-19 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect 
coordinator address, overly verbose
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3994/2/shell/impala_shell.py
File shell/impala_shell.py:

PS2, Line 686: self.webserver_address.replace("localhost", "127.0.0.1")
> Else it will print http://localhost:25000 vs http://127.0.0.1:25000
What's wrong with that? They're both the same address.


-- 
To view, visit http://gerrit.cloudera.org:8080/3994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3999: Embed the query-wide fragment instance index in the instance id

2016-08-19 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3999: Embed the query-wide fragment instance index in 
the instance id
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/4065/2//COMMIT_MSG
Commit Message:

PS2, Line 7: 3999
nit: please use 3988; 3999 is a duplicate.


PS2, Line 9: This adds utility function in uid-util.h to create query and 
instance ids and convert between them.
   : It also adapts SimpleScheduler to utilize those functions when 
creating the instance id
   : (TPlanFragmentInstanceCtx.fragment_instance_id).
Long lines.


http://gerrit.cloudera.org:8080/#/c/4065/2/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

PS2, Line 512: CreateInstanceId(schedule->query_id(), instance_idx, 
&instance_id);
not clear that you really need this instead of just keeping a TUniqueId and 
incrementing id.lo on every time through this loop.


http://gerrit.cloudera.org:8080/#/c/4065/2/be/src/util/uid-util.h
File be/src/util/uid-util.h:

PS2, Line 62: void
why not return TUniqueId? Easier to use inline.


PS2, Line 69: void
again, why not return?


http://gerrit.cloudera.org:8080/#/c/4065/2/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

Line 257:   // The bottom NUM_FRAGMENT_INSTANCE_IDX_BITS are 0.
refer to where this is defined


-- 
To view, visit http://gerrit.cloudera.org:8080/4065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia56a03ed9a1d7e77c72b66a01cd48c5b6bf3624f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3610: Account for memory used by filters in the coordinator

2016-08-19 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4066

Change subject: IMPALA-3610: Account for memory used by filters in the 
coordinator
..

IMPALA-3610: Account for memory used by filters in the coordinator

Before this patch, Impala would not account for the memory used to
aggregate runtime filters together in the coordinator. Impala's memory
could therefore be silently overcommitted.

This patch accounts for aggregated filter memory in a new filter
memtracker that is attached to the coordinator's query_mem_tracker(). If
the query memory limit is exceeded when a filter update arrives, that
update is discarded. If the filter is from a partitioned join, the
entire filter can therefore be discarded immediately (to alleviate
memory pressure) and a dummy 'always true' filter is sent to backends to
unblock them.

If the filter is from a broadcast join, no aggregation is done, so there
is no tracking. The Thrift input and output filter data structures are
not tracked (as we generally don't track RPC objects, but plan to in the
future).

Memory that is added to a memtracker must always be released. To do
this, we need to signal to the coordinator that it is finished, and that
there is no point trying to process any future updates that might arrive
concurrently. This patch adds Coordinator::Done() which is called from
QueryExecState::Done(), and which releases memory from all in-process
runtime filters.

Finally, this patch increases the upper limit for runtime filters to
512MB. This allows testing on very large datasets. The default maximum
is still 16MB, per RUNTIME_FILTER_MAX_SIZE.

Testing: Added a new test that triggers the OOM condition on the
coordinator. All existing runtime filter tests pass.

Change-Id: I3c52c8a1c2e79ef370c77bf264885fc859678d1b
---
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/mem-tracker.h
M be/src/runtime/runtime-filter-bank.h
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M 
testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters_phj.test
7 files changed, 138 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/4066/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4066
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I3c52c8a1c2e79ef370c77bf264885fc859678d1b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 


[Impala-ASF-CR] IMPALA-3610: Account for memory used by filters in the coordinator

2016-08-19 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3610: Account for memory used by filters in the 
coordinator
..


Patch Set 1:

I took Sailesh's suggestion of special-casting the treatment of broadcast 
filters and not copying them into an intermediate filter in the filter bank.

While this is a big simplification, it does make it a bit harder to track the 
outgoing Thrift filter data structure; the rough argument in favour of that 
being ok is that the incoming data structure is not tracked (yet) anyhow, so 
we're really moving the memory used for that to the outgoing data structure.  
It would be good to fix that for all Thrift structures.

-- 
To view, visit http://gerrit.cloudera.org:8080/4066
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3c52c8a1c2e79ef370c77bf264885fc859678d1b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose

2016-08-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new patch set (#4).

Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect 
coordinator address, overly verbose
..

IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, 
overly verbose

The webserver address was always configured as 0.0.0.0 (meaning that
the webserver could be reached on any IP for that machine) unless
otherwise specified. This is not a correct value to dispay to the
user. This patch returns the hostname of the node, when requested,
if the webserver host address is 0.0.0.0.

This patch also does not print the coordinator link for very simple
queries, as it's not necessary and is unnecessarily verbose.

This patch also does away with pinging the impalad an extra time per
query for finding the host time and webserver address. It instead
remembers the webserver address at connect time and displays client
local time for every query instead.

Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
---
M be/src/service/impala-beeswax-server.cc
M be/src/util/webserver.cc
M common/thrift/ImpalaService.thrift
M shell/impala_client.py
M shell/impala_shell.py
5 files changed, 30 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/3994/4
-- 
To view, visit http://gerrit.cloudera.org:8080/3994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose

2016-08-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect 
coordinator address, overly verbose
..


Patch Set 4:

(2 comments)

Also, all the shell tests pass with these changes.

http://gerrit.cloudera.org:8080/#/c/3994/2/be/src/util/webserver.cc
File be/src/util/webserver.cc:

PS2, Line 224: e).ok()) {
> GetHostname() makes no guarantees about what happens to the argument in the
Makes sense. Done.


http://gerrit.cloudera.org:8080/#/c/3994/2/shell/impala_shell.py
File shell/impala_shell.py:

PS2, Line 686: ult.version
> What's wrong with that? They're both the same address.
I was under the impression that some shell environments wouldn't create an 
automatic hyperlink for an address with "localhost". But changed it in any case 
as it doesn't matter that much.


-- 
To view, visit http://gerrit.cloudera.org:8080/3994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3943: Adhere to abort on error in ProcessFooter().

2016-08-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3943: Adhere to abort_on_error in ProcessFooter().
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3862/7/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 188:   if (!status.ok()) return state_->LogOrReturnError(status.msg());
> Don't we want to return early if there was any kind of error? If its a non-
Please disregard this comment, I was confused.


-- 
To view, visit http://gerrit.cloudera.org:8080/3862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6aff766a1ce6376efb329bdde51c648149dfe08c
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures

2016-08-19 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures
..


IMPALA-3962: Clean up scratch tuple batch on scan failures

The parquet scanner doesn't clean up 'scratch_batch_' properly
which causes it to process a partially filled scratch_batch_
if any of the column reader fails. This change cleans up the
scratch batch if any of the parquet column readers fails.
The clean up involves freeing the mem_pool of scratch_batch_
and setting number of tuples in scratch_batch_ to 0.

This change also extends debug action to emulate the behavior
of exceeding the query's memory limit.

Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
Reviewed-on: http://gerrit.cloudera.org:8080/3991
Reviewed-by: Michael Ho 
Tested-by: Internal Jenkins
---
M be/src/exec/exec-node.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M common/thrift/PlanNodes.thrift
M tests/failure/test_failpoints.py
10 files changed, 76 insertions(+), 22 deletions(-)

Approvals:
  Michael Ho: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/3991
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures

2016-08-19 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures
..


Patch Set 5: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3991
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3999: Embed the query-wide fragment instance index in the instance id

2016-08-19 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3999: Embed the query-wide fragment instance index in 
the instance id
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/4065/2//COMMIT_MSG
Commit Message:

PS2, Line 7: 3999
> nit: please use 3988; 3999 is a duplicate.
Done


PS2, Line 9: This adds utility function in uid-util.h to create query and 
instance ids and convert between them.
   : It also adapts SimpleScheduler to utilize those functions when 
creating the instance id
   : (TPlanFragmentInstanceCtx.fragment_instance_id).
> Long lines.
this is a weird line limit, where did this come from?


http://gerrit.cloudera.org:8080/#/c/4065/2/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

PS2, Line 512: CreateInstanceId(schedule->query_id(), instance_idx, 
&instance_id);
> not clear that you really need this instead of just keeping a TUniqueId and
this way it's easier to change the layout, if that should ever be needed. and i 
don't see any downside.


http://gerrit.cloudera.org:8080/#/c/4065/2/be/src/util/uid-util.h
File be/src/util/uid-util.h:

PS2, Line 62: void
> why not return TUniqueId? Easier to use inline.
we typically only return scalars


PS2, Line 69: void
> again, why not return?
see above


http://gerrit.cloudera.org:8080/#/c/4065/2/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

Line 257:   // The bottom NUM_FRAGMENT_INSTANCE_IDX_BITS are 0.
> refer to where this is defined
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia56a03ed9a1d7e77c72b66a01cd48c5b6bf3624f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add .clang-format for Impala's C++ style

2016-08-19 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Add .clang-format for Impala's C++ style
..


Patch Set 6:

> Seems like there's consensus on the mailing list. Could you add a
 > page to the wiki explaining what this is, how we expect it to be
 > used and so on?

Added a brief note: 
https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=65147115&selectedPageVersions=10&selectedPageVersions=9

The official clang-format documentation I linked to explains how to use 
clang-format well, and briefly.

-- 
To view, visit http://gerrit.cloudera.org:8080/3886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


  1   2   >