[Impala-ASF-CR] IMPALA-3610: Account for memory used by filters in the coordinator

2016-08-23 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3610: Account for memory used by filters in the 
coordinator
..


Patch Set 1:

(1 comment)

Added code to move, rather than copy, input filter to output, which prevents 
briefly having two copies of the same large filter.

http://gerrit.cloudera.org:8080/#/c/4066/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS1, Line 1994: ++state->num_received;
  : DCHECK_LE(state->num_received, 
state->src_fragment_instance_idxs.size());
> This doesn't seem very necessary anymore. It seems redundant seeing that we
Good catch, done.


-- 
To view, visit http://gerrit.cloudera.org:8080/4066
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3c52c8a1c2e79ef370c77bf264885fc859678d1b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3610: Account for memory used by filters in the coordinator

2016-08-23 Thread Henry Robinson (Code Review)
Hello Sailesh Mukil,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4066

to look at the new patch set (#2).

Change subject: IMPALA-3610: Account for memory used by filters in the 
coordinator
..

IMPALA-3610: Account for memory used by filters in the coordinator

Before this patch, Impala would not account for the memory used to
aggregate runtime filters together in the coordinator. Impala's memory
could therefore be silently overcommitted.

This patch accounts for aggregated filter memory in a new filter
memtracker that is attached to the coordinator's query_mem_tracker(). If
the query memory limit is exceeded when a filter update arrives, that
update is discarded. If the filter is from a partitioned join, the
entire filter can therefore be discarded immediately (to alleviate
memory pressure) and a dummy 'always true' filter is sent to backends to
unblock them.

If the filter is from a broadcast join, no aggregation is done, so there
is no tracking. The Thrift input and output filter data structures are
not tracked (as we generally don't track RPC objects, but plan to in the
future). The filter payload is moved from the input request structure to
the output broadcast structure without copying.

Memory that is added to a memtracker must always be released. To do
this, we need to signal to the coordinator that it is finished, and that
there is no point trying to process any future updates that might arrive
concurrently. This patch adds Coordinator::Done() which is called from
QueryExecState::Done(), and which releases memory from all in-process
runtime filters.

Finally, this patch increases the upper limit for runtime filters to
512MB. This allows testing on very large datasets. The default maximum
is still 16MB, per RUNTIME_FILTER_MAX_SIZE.

Testing: Added a new test that triggers the OOM condition on the
coordinator. All existing runtime filter tests pass.

Change-Id: I3c52c8a1c2e79ef370c77bf264885fc859678d1b
---
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/mem-tracker.h
M be/src/runtime/runtime-filter-bank.h
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M 
testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters_phj.test
7 files changed, 139 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/4066/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4066
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3c52c8a1c2e79ef370c77bf264885fc859678d1b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


Re: Error linking impalad with -build_shared_libs option

2016-08-23 Thread Daniel Hecht
Hi Martin,

Nice to hear from you. https://issues.cloudera.org/browse/IMPALA-3979

Thanks,
Dan

On Tuesday, August 23, 2016, Martin Grund (Das Grundprinzip.de) <
grundprin...@gmail.com> wrote:

> I've seen this in the past and propose a patch to fix it:
> https://gerrit.cloudera.org/#/c/4108/
>
> Let me know if you have JIRA to track this to update the commit message.
>
> Martin
>
> On Mon, Aug 22, 2016 at 10:15 PM Laszlo Gaal  >
> wrote:
>
> > link.txt is generated by cmake based on the contents of CMakeLists.txt
> and
> > the project/environment settings. Most of the link-time dependencies seem
> > to be set up in $(IMPALA_HOME)/be/CMakeLists.txt, based on settings
> > calculated one level up in $(IMPALA_HOME)/CMakeLists.txt.
> > $(IMPALA_HOME)/be/CMakeLists.txt looks complex; I'll look more into it
> > tomorrow.
> >
> > Thanks,
> >
> > - LaszloG
> >
> > On Mon, Aug 22, 2016 at 6:07 PM, Yonghyun Hwang  >
> > wrote:
> >
> > > I observed the same linking error and digged this into further. The
> below
> > > is what I've found.
> > >
> > > *1. how to reproduce*
> > >
> > > # assuming that impala-config.sh is sourced
> > > $ cd ${IMPALA_HOME}
> > > *$ make -j8 impalad*
> > > ...
> > > [100%] Built target CodeGen
> > > Linking CXX executable ../../build/debug/service/impalad
> > > ...
> > > ../../build/debug/util/libUtil.so: undefined reference to
> > > `google_breakpad::ExceptionHandler::WriteMinidump(std::string const&,
> > bool
> > > (*)(google_breakpad::MinidumpDescriptor const&, void*, bool), void*)'
> > > ../../build/debug/util/libUtil.so: undefined reference to
> > > `snappy::MaxCompressedLength(unsigned long)'
> > > ../../build/debug/util/libUtil.so: undefined reference to `my_strlen'
> > > ../../build/debug/util/libUtil.so: undefined reference to
> `LZ4_compress'
> > > ../../build/debug/util/libUtil.so: undefined reference to
> > > `BZ2_bzDecompress'
> > > ...
> > >
> > >
> > > *$ cd ${IMPALA_HOME}/be/src/service*
> > > *$ ${IMPALA_HOME}/toolchain/cmake-3.2.3-p1/bin/cmake -E  *
> > > *cmake_link_script CMakeFiles/impalad.dir/link.txt --verbose=*
> > > ...
> > > ../../build/debug/util/libUtil.so: undefined reference to
> > > ...
> > > *../../build/debug/util/libUtil.so: undefined reference to
> > `LZ4_compress'*
> > > *../../build/debug/util/libUtil.so: undefined reference to
> > > `BZ2_bzDecompress'*
> > > ../../build/debug/util/libUtil.so: undefined reference to
> > > *`snappy::RawCompress*(char const*,
> > > unsigned long, char*, unsigned long*)'
> > > ...
> > > boost::regex_traits >
> > >::do_assign(char
> > > const*, char
> > > const*, unsigned int)'
> > > ...
> > > ../../build/debug/util/libUtil.so: undefined reference to *`snappy::*
> > > RawUncompress(char
> > > const*, unsigned long, char*)'
> > >
> > > *2. cause of the failure*
> > >
> > > $ cd ${IMPALA_HOME}/be/src/service
> > > *$ cat CMakeFiles/impalad.dir/link.txt*
> > > ${IMPALA_HOME}/toolchain/gcc-4.9.2/bin/g++-Wall -Wno-sign-compare
> > > -Wno-unknown-pragmas -pthread -fno-strict-aliasing -std=c++14
> > > -Wno-deprecated ...
> > > *../../../toolchain/snappy-1.1.3/lib/libsnappy.a
> > > ../../../toolchain/lz4-svn/lib/liblz4.a*
> > > ...
> > > *../../../toolchain/boost-1.57.0/lib/libboost_regex.a*
> > > ...
> > > *../../build/debug/util/libUtil.so*
> > > -lpthread -ldl -lssl -lcrypto
> > >
> > > Apparently, libUtil.so __should__ show before libsnappy.a and *regex.a.
> > >
> > >
> > > Next step is to find out who creates CMakeFiles/impalad.dir/link.txt
> and
> > > why libUtil.so is placed after libsnappy.a, liblz4.a, and
> > libboost_regex.a
> > >
> > > Thank you,
> > > Yonghyun
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Aug 17, 2016 at 10:13 AM, Henry Robinson  >
> > > wrote:
> > >
> > > > Only that it's IMPALA-3979. Does anyone have any theories yet?
> > > >
> > > > On 17 August 2016 at 10:10, Dimitris Tsirogiannis <
> > > > dtsirogian...@cloudera.com > wrote:
> > > >
> > > > > Hey guys,
> > > > >
> > > > > When I build from the latest master using the -build_shared_libs
> > > option I
> > > > > get linking errors (see [1]). Building with static works fine. Any
> > > ideas?
> > > > >
> > > > > Thanks
> > > > > Dimitris
> > > > >
> > > > >
> > > > > [1]
> > > > > Linking CXX executable ../../build/debug/service/impalad
> > > > > ../../build/debug/util/libUtil.so: error: undefined reference to
> > > > > 'BZ2_bzBuffToBuffCompress'
> > > > > ../../build/debug/util/libUtil.so: error: undefined reference to
> > > > > 'snappy::MaxCompressedLength(unsigned long)'
> > > > > ../../build/debug/util/libUtil.so: error: undefined reference to
> > > > > 'snappy::RawCompress(char const*, unsigned long, char*, unsigned
> > > long*)'
> > > > > ../../build/debug/util/libUtil.so: error: undefined reference to
> > > > > 'LZ4_compress'
> > > > > ../../build/debug/util/libUtil.so: error: undefined reference to
> > > > > 'BZ2_bzDecompressEnd'
> > > > > ../../build/debug/util/libUtil.so: error: undefined reference to
> > > > > 'BZ2_bzDecompressInit'
> 

[Impala-CR](cdh5-trunk) IMPALA-3610: Account for memory used by filters in the coordinator

2016-08-23 Thread Henry Robinson (Code Review)
Henry Robinson has abandoned this change.

Change subject: IMPALA-3610: Account for memory used by filters in the 
coordinator
..


Abandoned

Moved to ASF project.

-- 
To view, visit http://gerrit.cloudera.org:8080/3618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I3c52c8a1c2e79ef370c77bf264885fc859678d1b
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3221: Copyright / license audit

2016-08-23 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3221: Copyright / license audit
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3995/1/LICENSE.txt
File LICENSE.txt:

Line 461: shell/ext-py/prettytable-0.7.1: 3-clause BSD (see 
shell/ext-py/prettytable-0.7.1/PKG-INFO)
> There is sopy copyright text in shell/ext-py/prettytable-0.7.1/COPYING. sql
Thanks for finding those! I don't *think* we need a COPYING for sasl; the entry 
in LICENSE is enough (I think).


-- 
To view, visit http://gerrit.cloudera.org:8080/3995
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I24a868aec6a4f17f4ccca1b088d2f0de32f75d87
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3221: Copyright / license audit

2016-08-23 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new patch set (#3).

Change subject: IMPALA-3221: Copyright / license audit
..

IMPALA-3221: Copyright / license audit

Populates LICENSE.txt with known third-party licenses in the Impala
codebase.

Change-Id: I24a868aec6a4f17f4ccca1b088d2f0de32f75d87
---
M LICENSE.txt
D be/src/gutil/LICENSE.txt
M be/src/runtime/string-search.h
3 files changed, 354 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/3995/3
-- 
To view, visit http://gerrit.cloudera.org:8080/3995
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I24a868aec6a4f17f4ccca1b088d2f0de32f75d87
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 


Re: Error linking impalad with -build_shared_libs option

2016-08-23 Thread Martin Grund (Das Grundprinzip.de)
I've seen this in the past and propose a patch to fix it:
https://gerrit.cloudera.org/#/c/4108/

Let me know if you have JIRA to track this to update the commit message.

Martin

On Mon, Aug 22, 2016 at 10:15 PM Laszlo Gaal 
wrote:

> link.txt is generated by cmake based on the contents of CMakeLists.txt and
> the project/environment settings. Most of the link-time dependencies seem
> to be set up in $(IMPALA_HOME)/be/CMakeLists.txt, based on settings
> calculated one level up in $(IMPALA_HOME)/CMakeLists.txt.
> $(IMPALA_HOME)/be/CMakeLists.txt looks complex; I'll look more into it
> tomorrow.
>
> Thanks,
>
> - LaszloG
>
> On Mon, Aug 22, 2016 at 6:07 PM, Yonghyun Hwang 
> wrote:
>
> > I observed the same linking error and digged this into further. The below
> > is what I've found.
> >
> > *1. how to reproduce*
> >
> > # assuming that impala-config.sh is sourced
> > $ cd ${IMPALA_HOME}
> > *$ make -j8 impalad*
> > ...
> > [100%] Built target CodeGen
> > Linking CXX executable ../../build/debug/service/impalad
> > ...
> > ../../build/debug/util/libUtil.so: undefined reference to
> > `google_breakpad::ExceptionHandler::WriteMinidump(std::string const&,
> bool
> > (*)(google_breakpad::MinidumpDescriptor const&, void*, bool), void*)'
> > ../../build/debug/util/libUtil.so: undefined reference to
> > `snappy::MaxCompressedLength(unsigned long)'
> > ../../build/debug/util/libUtil.so: undefined reference to `my_strlen'
> > ../../build/debug/util/libUtil.so: undefined reference to `LZ4_compress'
> > ../../build/debug/util/libUtil.so: undefined reference to
> > `BZ2_bzDecompress'
> > ...
> >
> >
> > *$ cd ${IMPALA_HOME}/be/src/service*
> > *$ ${IMPALA_HOME}/toolchain/cmake-3.2.3-p1/bin/cmake -E  *
> > *cmake_link_script CMakeFiles/impalad.dir/link.txt --verbose=*
> > ...
> > ../../build/debug/util/libUtil.so: undefined reference to
> > ...
> > *../../build/debug/util/libUtil.so: undefined reference to
> `LZ4_compress'*
> > *../../build/debug/util/libUtil.so: undefined reference to
> > `BZ2_bzDecompress'*
> > ../../build/debug/util/libUtil.so: undefined reference to
> > *`snappy::RawCompress*(char const*,
> > unsigned long, char*, unsigned long*)'
> > ...
> > boost::regex_traits >
> >::do_assign(char
> > const*, char
> > const*, unsigned int)'
> > ...
> > ../../build/debug/util/libUtil.so: undefined reference to *`snappy::*
> > RawUncompress(char
> > const*, unsigned long, char*)'
> >
> > *2. cause of the failure*
> >
> > $ cd ${IMPALA_HOME}/be/src/service
> > *$ cat CMakeFiles/impalad.dir/link.txt*
> > ${IMPALA_HOME}/toolchain/gcc-4.9.2/bin/g++-Wall -Wno-sign-compare
> > -Wno-unknown-pragmas -pthread -fno-strict-aliasing -std=c++14
> > -Wno-deprecated ...
> > *../../../toolchain/snappy-1.1.3/lib/libsnappy.a
> > ../../../toolchain/lz4-svn/lib/liblz4.a*
> > ...
> > *../../../toolchain/boost-1.57.0/lib/libboost_regex.a*
> > ...
> > *../../build/debug/util/libUtil.so*
> > -lpthread -ldl -lssl -lcrypto
> >
> > Apparently, libUtil.so __should__ show before libsnappy.a and *regex.a.
> >
> >
> > Next step is to find out who creates CMakeFiles/impalad.dir/link.txt and
> > why libUtil.so is placed after libsnappy.a, liblz4.a, and
> libboost_regex.a
> >
> > Thank you,
> > Yonghyun
> >
> >
> >
> >
> >
> > On Wed, Aug 17, 2016 at 10:13 AM, Henry Robinson 
> > wrote:
> >
> > > Only that it's IMPALA-3979. Does anyone have any theories yet?
> > >
> > > On 17 August 2016 at 10:10, Dimitris Tsirogiannis <
> > > dtsirogian...@cloudera.com> wrote:
> > >
> > > > Hey guys,
> > > >
> > > > When I build from the latest master using the -build_shared_libs
> > option I
> > > > get linking errors (see [1]). Building with static works fine. Any
> > ideas?
> > > >
> > > > Thanks
> > > > Dimitris
> > > >
> > > >
> > > > [1]
> > > > Linking CXX executable ../../build/debug/service/impalad
> > > > ../../build/debug/util/libUtil.so: error: undefined reference to
> > > > 'BZ2_bzBuffToBuffCompress'
> > > > ../../build/debug/util/libUtil.so: error: undefined reference to
> > > > 'snappy::MaxCompressedLength(unsigned long)'
> > > > ../../build/debug/util/libUtil.so: error: undefined reference to
> > > > 'snappy::RawCompress(char const*, unsigned long, char*, unsigned
> > long*)'
> > > > ../../build/debug/util/libUtil.so: error: undefined reference to
> > > > 'LZ4_compress'
> > > > ../../build/debug/util/libUtil.so: error: undefined reference to
> > > > 'BZ2_bzDecompressEnd'
> > > > ../../build/debug/util/libUtil.so: error: undefined reference to
> > > > 'BZ2_bzDecompressInit'
> > > > ../../build/debug/util/libUtil.so: error: undefined reference to
> > > > 'BZ2_bzBuffToBuffDecompress'
> > > > ../../build/debug/util/libUtil.so: error: undefined reference to
> > > > 'BZ2_bzDecompress'
> > > > ../../build/debug/util/libUtil.so: error: undefined reference to
> > > > 'snappy::GetUncompressedLength(char const*, unsigned long, unsigned
> > > > long*)'
> > > > ../../build/debug/util/libUtil.so: error: undefined reference to
> > > > 'snappy::RawUncompres

[Impala-ASF-CR] Fix dynamic linking for Impala

2016-08-23 Thread Martin Grund (Code Review)
Martin Grund has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4108

Change subject: Fix dynamic linking for Impala
..

Fix dynamic linking for Impala

When building Impala, the user can choose between two different
build modes: static and dynamically linked libraries. The first
one is used for release builds, while the latter one is used
mainly locally for developers to shorten build times.

However, for some time now, the shared build has been broken in
some configurations. This patch fixes this issue. The main
solution is to move the webserver into the Util library and to
remove the manual linking of Util into the error-utils-test.

Change-Id: I2fb1efd600bfaf17de730186fc80a0dbd976a049
---
M be/CMakeLists.txt
M be/src/util/CMakeLists.txt
2 files changed, 8 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/4108/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4108
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I2fb1efd600bfaf17de730186fc80a0dbd976a049
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Martin Grund 


[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

2016-08-23 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
..


Patch Set 10:

(51 comments)

http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/runtime/fragment-instance-exec-state.h
File be/src/runtime/fragment-instance-exec-state.h:

Line 32: /// Execution state of a single plan fragment.
instance


PS10, Line 33: FragmentInstanceExecState
Let's consider FragmentInstanceState and QueryState (if they're not executing, 
what are they doing having state?)


PS10, Line 39: lambda(
 :   
boost::mem_fn(&FragmentInstanceExecState::ReportStatusCb),
 :   this, _1, _2, _3)
Try and find a clearer way to write this. If you don't want to write the lambda 
in the constructor list, consider a static function that returns a lambda given 
'this' as an input:

  std::function 
MakeCb(FragmentInstanceExecState* fragment_instance) {
return [fragment_instance](const Status& s, RuntimeProfile* profile, bool 
done) {
  fragment_instance->ReportStatusCb(s, profile, done);
 }
  }

Please leave a TODO to aggregate all the reports into a single CB managed by 
the QES (and file a JIRA as well).


PS10, Line 80: ImpalaBackendClientCache* client_cache_;
Is this necessary given that ExecEnv can give us the same pointer?


PS10, Line 77: QueryExecState* query_exec_state_;  // not owned
 :   TPlanFragmentInstanceCtx fragment_instance_ctx_;
 :   FragmentInstanceExecutor executor_;
 :   ImpalaBackendClientCache* client_cache_;
 :   TExecPlanFragmentParams exec_params_;
Good opportunity to add some comments for these.


PS10, Line 83: plan fragment
fragment instance.


http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/runtime/fragment-instance-executor.cc
File be/src/runtime/fragment-instance-executor.cc:

PS10, Line 80: Close
Since we are aiming for deterministic, manual management of finst lifetimes, 
how about moving this out of the destructor and ensuring that the FInst calls 
Close() explicitly?


Line 349: // We also want to call sink_->Close() here rather than in 
FragmentInstanceExecutor::Close
long line


http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/runtime/fragment-instance-executor.h
File be/src/runtime/fragment-instance-executor.h:

Line 48: /// FragmentInstanceExecutor handles all aspects of the execution of a 
single plan fragment,
long line


PS10, Line 155: ExecEnv* exec_env_;  // not owned
Can you remove this? ExecEnv::GetInstance() works fine.


http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/runtime/query-exec-state.cc
File be/src/runtime/query-exec-state.cc:

PS10, Line 60:  
remove space


PS10, Line 73: 
DCHECK_EQ(fragment_inst_exec_state_map_.find(GetInstanceIdx(fragment_instance_id)),
 :   fragment_inst_exec_state_map_)
these aren't the same type (does this compile?).


PS10, Line 76: FragmentInstanceExecState* fragment_inst_exec_state =
 :   new FragmentInstanceExecState(exec_params, this, 
ExecEnv::GetInstance());
Presumably this should be managed by obj_pool. If not, the value type of the 
map should be unique_ptr<...>.


PS10, Line 80: his RPC returns
This isn't an RPC any more. Update comment.


PS10, Line 113: lexical_cast
PrintId()


PS10, Line 128: lexical_cast
PrintId()


http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/runtime/query-exec-state.h
File be/src/runtime/query-exec-state.h:

Line 38: /// This class is responsible for creating, running and destroying 
FragmentInstanceExecStates.
Talk about lifetime here: when is this created, when is it destroyed, what are 
the guarantees about the subordinate data structures like finstexecstates. Can 
this outlive a query on the coordinator? etc.


PS10, Line 41: // Special constructor for the coordinator.
TODO: remove.


PS10, Line 59: CleanFragmentInstanceStates
Don't talk about implementation details (cleaning the map). When would this be 
called?


Line 61:   /// Registers a new FragmentInstanceExecState and launches the 
thread that calls Prepare() and
long line.


PS10, Line 90: std::unique_ptr obj_pool_
Comment broadly what this is used to manage, since that affects lifecycles.


PS10, Line 90: *
do you mean unique_ptr ?


PS10, Line 90: Owned
redundant wrt unique_ptr


PS10, Line 94: active
expand. What does it mean for a fragment to be active?


PS10, Line 105: shared_ptr
Update comment.


PS10, Line 109: FragmentInstanceExecState
Are the exec states managed by the obj pool? If so, say so otherwise the 
lifecycle is not clear.


PS10, Line 113: shared pointer 
update comment


http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/service/client-request-exec-state.cc
File be/src/service/client-request-exec-state.cc:

Line 17: #include "service/client-request-exec-state.h"
add a blank line before this one


http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/service/client-request-ex

[Impala-ASF-CR] IMPALA-(3895,3859): Don't log file data on parse errors

2016-08-23 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-(3895,3859): Don't log file data on parse errors
..


Patch Set 5:

Exhaustive tests failed because the offsets for the GZIP version of 
alltypeserrorornulls is different than for most other file formats. I need to 
change that test to use row_regex instead.

-- 
To view, visit http://gerrit.cloudera.org:8080/4020
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5a604f8784a9ff7b4bf878f82ee7f56697df3272
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1731,IMPALA-3868: Float values are not parsed correctly

2016-08-23 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-1731,IMPALA-3868: Float values are not parsed correctly
..


Patch Set 5: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3791
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9e17d0f051b300a22a520ce34e276c2d4460d35e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1731,IMPALA-3868: Float values are not parsed correctly

2016-08-23 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-1731,IMPALA-3868: Float values are not parsed correctly
..


IMPALA-1731,IMPALA-3868: Float values are not parsed correctly

Fixed StringToFloatInternal() not to parse strings like "1.23inf"
and "infinite" with leading/trailing garbage as Infinity. These
strings are now rejected with PARSE_FAILURE.
Only "inf" and "infinity" are accepted, parsing is case-insensitive.

"NaN" values are handled similarly: strings with leading/trailing
garbage like "nana" are rejected, parsing is case-insensitive.

Other changes:
- StringToFloatInternal() was cleaned up a bit. Parsing inf and NaN
strings was moved out of the main loop.
- Use std::numeric_limits::infinity() instead of INFINITY macro
and std::numeric_limits::quiet_NaN() instead of NAN macro.
- Fixed another minor bug: multiple dots are allowed when parsing
float values (e.g. "2.1..6" is interpreted as 2.16).
- New BE and E2E tests were added.

Change-Id: I9e17d0f051b300a22a520ce34e276c2d4460d35e
Reviewed-on: http://gerrit.cloudera.org:8080/3791
Reviewed-by: Michael Ho 
Tested-by: Internal Jenkins
---
M be/src/exprs/expr-test.cc
M be/src/util/string-parser-test.cc
M be/src/util/string-parser.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
4 files changed, 205 insertions(+), 41 deletions(-)

Approvals:
  Michael Ho: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/3791
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I9e17d0f051b300a22a520ce34e276c2d4460d35e
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-3832: test invalid data handling in lzo text scanner

2016-08-23 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3832: test invalid data handling in lzo text scanner
..


Patch Set 4: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib707014c1fcfb80cb8076f644fc2b62a5ae758d7
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3832: test invalid data handling in lzo text scanner

2016-08-23 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3832: test invalid data handling in lzo text scanner
..


IMPALA-3832: test invalid data handling in lzo text scanner

This adds the lzo text scanner to the fuzz testing.

I ran the test in a loop under ASAN overnight and didn't see any
failures.

Change-Id: Ib707014c1fcfb80cb8076f644fc2b62a5ae758d7
Reviewed-on: http://gerrit.cloudera.org:8080/4096
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins
---
M tests/query_test/test_scanners_fuzz.py
1 file changed, 17 insertions(+), 8 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ib707014c1fcfb80cb8076f644fc2b62a5ae758d7
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4001: qgen: add proof of concept tests for Query() objects

2016-08-23 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-4001: qgen: add proof of concept tests for Query() 
objects
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4081/1/tests/comparison/tests/fake_query.py
File tests/comparison/tests/fake_query.py:

PS1, Line 97: must at least a
"must at least *contain* a" ...?
"must at least *be* a" ...?


PS1, Line 97: fram
s/fram/from/


http://gerrit.cloudera.org:8080/#/c/4081/1/tests/comparison/tests/query_object_testdata.py
File tests/comparison/tests/query_object_testdata.py:

PS1, Line 64: DATASET
So -- these are really test cases, yes? Could this just be called 
QUERY_TEST_CASES, or something. DATASET seems confusing, at least to me.


http://gerrit.cloudera.org:8080/#/c/4081/1/tests/comparison/tests/test_query_objects.py
File tests/comparison/tests/test_query_objects.py:

PS1, Line 25: query_something
I'm a little confused by this. query_something is a ... QueryTest object 
(namedtuple), right? From the DATASET list? Maybe a more descriptive variable 
name could be employed here, to help clarify some of the pytest magic going on.


Line 53: @pytest.mark.parametrize('query_test', DATASET, ids=_idfn)
How do you feel about moving all of the tests to the end of the file, rather 
than interspersing them with utility functions?


-- 
To view, visit http://gerrit.cloudera.org:8080/4081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2ed1960430ae0af469986e33f88aecb6fa74e999
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-2428: Support multiple-character string as the field delimiter

2016-08-23 Thread Yuanhao Luo (Code Review)
Yuanhao Luo has abandoned this change.

Change subject: IMPALA-2428: Support multiple-character string as the field 
delimiter
..


Abandoned

I will re-push this commit when Impala comes to 3.0

-- 
To view, visit http://gerrit.cloudera.org:8080/3314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: Id1437ca35dc4fdc58a7db1c2c70d4da30adf0c3e
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Yuanhao Luo 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Yuanhao Luo 


[Impala-ASF-CR] IMPALA-3945: Forbid create text table with nonsensical delimiter combinations.

2016-08-23 Thread Yuanhao Luo (Code Review)
Yuanhao Luo has abandoned this change.

Change subject: IMPALA-3945: Forbid create text table with nonsensical 
delimiter combinations.
..


Abandoned

I will re-push this commit when Impala comes to 3.0

-- 
To view, visit http://gerrit.cloudera.org:8080/3839
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I836b45f930e680ea03274405b90b1ee14822152b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Yuanhao Luo 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yuanhao Luo 


[Impala-ASF-CR] IMPALA-3988: Only use first 96 bits of query id

2016-08-23 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3988: Only use first 96 bits of query id
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4065/8/be/src/util/uid-util.h
File be/src/util/uid-util.h:

PS8, Line 51: const
> what's the advantage in this context?
Tells the reader and the compiler that this is really and truly a constant 
(i.e. not dependent on any variable), and is available for use in other 
constexpr contexts; for example template parameters. At this point it's mostly 
a readability aid.


Line 55:   memcpy(&result.hi, &uuid.data[0], 8);
> what is the concern?
Seemed more readable as an assignment to me. But the compiler special-cases the 
8-byte copy AFAICT, so not worth changing.


-- 
To view, visit http://gerrit.cloudera.org:8080/4065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia56a03ed9a1d7e77c72b66a01cd48c5b6bf3624f
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2932: Extend DistributedPlanner to account for hash table build cost

2016-08-23 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-2932: Extend DistributedPlanner to account for hash 
table build cost
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4098/2/fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java
File fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java:

Line 461: lhsTree.getCardinality() * 
ExchangeNode.getAvgSerializedRowSize(lhsTree));
> you're not guaranteed to have computed rhsDataSize at this point
Sorry about that.

It seems like this might be a bug with the existing code, though? Since it 
means that we don't do the hash table size check at line 489 in situations 
where we have a rhs cardinality estimate but no lhs num nodes, since we don't 
calculate the broadcast cost even though we could calculate the size of the rhs 
in this situation.

I submitted a new version with both of these things fixed, but of course I can 
do it a different way if it was intentional that we don't do the hash table 
size check for the broadcast in situations where we can't calculate the 
broadcast cost.


-- 
To view, visit http://gerrit.cloudera.org:8080/4098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I03a0f56f69c8deae68d48dfdb9dc95b71aec11f1
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2932: Extend DistributedPlanner to account for hash table build cost

2016-08-23 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new patch set (#3).

Change subject: IMPALA-2932: Extend DistributedPlanner to account for hash 
table build cost
..

IMPALA-2932: Extend DistributedPlanner to account for hash table build cost

When deciding between a broadcast or repartition join, Impala calculates
the cost of each join as the total amount of data that is sent over the
network. This ignores some relevant costs, and can lead to bad plans.

One such relevant cost is the work to create the hash table used in the
join. This patch accounts for this by adding the amount of data inserted
into the hash table (the size of the right side of the join) to the
previous cost.

This generally increases the estimated cost of broadcast joins relative
to repartitioning joins, as the broadcast join must build the hash table
on each node the data was broadcast to, so its effect will be to make
repartitioning joins more likely to be chosen, especially in large
clusters.

This patch has not yet been performance tested.

Change-Id: I03a0f56f69c8deae68d48dfdb9dc95b71aec11f1
---
M fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-planner/queries/PlannerTest/views.test
8 files changed, 153 insertions(+), 130 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/4098/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I03a0f56f69c8deae68d48dfdb9dc95b71aec11f1
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-2932: Extend DistributedPlanner to account for hash table build cost

2016-08-23 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2932: Extend DistributedPlanner to account for hash 
table build cost
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4098/2/fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java
File fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java:

Line 461:   partitionCost = Math.round(lhsNetworkCost + rhsNetworkCost + 
rhsDataSize);
you're not guaranteed to have computed rhsDataSize at this point


-- 
To view, visit http://gerrit.cloudera.org:8080/4098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I03a0f56f69c8deae68d48dfdb9dc95b71aec11f1
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3944: Fix crash in SimpleScheduler

2016-08-23 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4102

Change subject: IMPALA-3944: Fix crash in SimpleScheduler
..

IMPALA-3944: Fix crash in SimpleScheduler

SimpleScheduler::BackendConfig maintains two maps, hostname->IP and
IP->[list of backends]. In situations where multiple backends run on a
single host, BackendConfig::RemoveBackend was erroneously removing the
hostname->IP mapping, even when the list of backends for a host was not
empty. The fix is to only remove this mapping when the last backend of a
host gets removed.

I will push a test in a separate change.

Change-Id: I1c2c58c2f9e7de0936aba03dcce86cdc4b31a866
---
M be/src/scheduling/simple-scheduler.cc
1 file changed, 11 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/4102/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I1c2c58c2f9e7de0936aba03dcce86cdc4b31a866
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-1731,IMPALA-3868: Float values are not parsed correctly

2016-08-23 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-1731,IMPALA-3868: Float values are not parsed correctly
..


Patch Set 5: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/124/

-- 
To view, visit http://gerrit.cloudera.org:8080/3791
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9e17d0f051b300a22a520ce34e276c2d4460d35e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

2016-08-23 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new patch set (#10).

Change subject: IMPALA-4014: Introduce query-wide execution state.
..

IMPALA-4014: Introduce query-wide execution state.

This patch is a header preview of a query wide execution state for
a backend.

Changes in design are as follows:

FragmentMgr -> QueryExecMgr
  The QueryExecMgr now receives incoming fragments, creates a new
  QueryExecState if it is the first fragment to arrive for that
  query, and passes this received fragment along to the QueryExecState.
  This class is responsible for cleaning up the QueryExecState.

QueryExecState (new):
  This is the query wide state for the backend. It is initialized by
  the first fragment to arrive to the QueryExecMgr. This class is now
  responsible for creating FragmentExecStates and executing them.
  Its life is protected by a ref counting mechanism and it is
  scheduled for destruction once the ref count reaches zero.
  Once scheduled for destruction, a thread in the QueryExecMgr will
  destroy the QueryExecState.
  Every user of the QueryExecState must access it within the scope of
  ScopedQueryExecStateRef which guarantees ref count incrementing and
  decrementing at entry/exit of the scope.

QueryExecState (old) -> ClientRequestExecState:
  This is just a class name change.

  We still do not include shared state into the QueryExecState because
  there needs to be changes to DescriptorTbl, etc. before we can
  incorporate them into the QueryExecState. Will be done as separate
  patches.

P.S: This only shows what the changes would look like and as of yet
still does not compile.

Excludes:
 - Some class renames in places not important to the core patch logic.
 - Most .cc files to make use of changes made.

Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
---
M be/src/runtime/CMakeLists.txt
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
R be/src/runtime/fragment-instance-exec-state.cc
R be/src/runtime/fragment-instance-exec-state.h
R be/src/runtime/fragment-instance-executor.cc
R be/src/runtime/fragment-instance-executor.h
A be/src/runtime/query-exec-state.cc
A be/src/runtime/query-exec-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/service/CMakeLists.txt
R be/src/service/client-request-exec-state.cc
R be/src/service/client-request-exec-state.h
D be/src/service/fragment-mgr.cc
D be/src/service/fragment-mgr.h
M be/src/service/impala-internal-service.h
A be/src/service/query-exec-mgr.cc
A be/src/service/query-exec-mgr.h
M common/thrift/ImpalaInternalService.thrift
20 files changed, 735 insertions(+), 360 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/3817/10
-- 
To view, visit http://gerrit.cloudera.org:8080/3817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

2016-08-23 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3817/7/be/src/service/query-exec-mgr.cc
File be/src/service/query-exec-mgr.cc:

Line 85: 
> The decrementing of ref cnt occurs in ReleaseQueryExecState, which will rem
Ah yes, my bad. I've made this a DCHECK


-- 
To view, visit http://gerrit.cloudera.org:8080/3817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3832: test invalid data handling in lzo text scanner

2016-08-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3832: test invalid data handling in lzo text scanner
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4096/3/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

PS3, Line 143: corrupt LZO index file
> should we file a JIRA for this?  technically we should skip the file when a
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib707014c1fcfb80cb8076f644fc2b62a5ae758d7
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3832: test invalid data handling in lzo text scanner

2016-08-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3832: test invalid data handling in lzo text scanner
..


Patch Set 4: Code-Review+2

Carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/4096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib707014c1fcfb80cb8076f644fc2b62a5ae758d7
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3832: test invalid data handling in lzo text scanner

2016-08-23 Thread Tim Armstrong (Code Review)
Hello Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4096

to look at the new patch set (#4).

Change subject: IMPALA-3832: test invalid data handling in lzo text scanner
..

IMPALA-3832: test invalid data handling in lzo text scanner

This adds the lzo text scanner to the fuzz testing.

I ran the test in a loop under ASAN overnight and didn't see any
failures.

Change-Id: Ib707014c1fcfb80cb8076f644fc2b62a5ae758d7
---
M tests/query_test/test_scanners_fuzz.py
1 file changed, 17 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4096/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib707014c1fcfb80cb8076f644fc2b62a5ae758d7
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2550 Introduce query-wide execution context.

2016-08-23 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-2550 Introduce query-wide execution context.
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3817/7/be/src/service/query-exec-mgr.cc
File be/src/service/query-exec-mgr.cc:

Line 85:   if (i->second->num_current_references_ == 0) return NULL;
> No that's not how it works.
The decrementing of ref cnt occurs in ReleaseQueryExecState, which will remove 
from the map when the cnt hits 0.  So we never have a QES in the map with cnt 
== 0.


-- 
To view, visit http://gerrit.cloudera.org:8080/3817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3832: test invalid data handling in lzo text scanner

2016-08-23 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3832: test invalid data handling in lzo text scanner
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4096/3/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

PS3, Line 143: corrupt LZO index file
should we file a JIRA for this?  technically we should skip the file when 
abort_on_error=0 in this case, right?


-- 
To view, visit http://gerrit.cloudera.org:8080/4096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib707014c1fcfb80cb8076f644fc2b62a5ae758d7
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4009: qgen: add docs, utility for installing Oracle as ref database

2016-08-23 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4009: qgen: add docs, utility for installing Oracle as 
ref database
..


Patch Set 2:

Thanks for the reviews. In patch set 2, I added a small utility to verify 
installation and connectivity. Thomas, let me address your comments in the next 
patch set.

-- 
To view, visit http://gerrit.cloudera.org:8080/4095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3bf84b7afa8851c49a8d0f0a1ceded94f4de158
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4009: qgen: add docs, utility for installing Oracle as ref database

2016-08-23 Thread Michael Brown (Code Review)
Hello Matthew Jacobs,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4095

to look at the new patch set (#2).

Change subject: IMPALA-4009: qgen: add docs, utility for installing Oracle as 
ref database
..

IMPALA-4009: qgen: add docs, utility for installing Oracle as ref database

This patch adds documentation and a small utility for users wishing to
run the random query generator against Oracle as a reference database.
In particular, the patch

* points users toward installing Oracle Database.
* provides directions on incorporating a Python Oracle client
  (cx_Oracle) into our impala-python virtual environment.
* provides a small utility that will verify installation and
  connectivity between client and server

This documentation isn't meant to be a complete guide. Other patches
will add documentation for actually loading random query generator test
data into Oracle, for example.

Change-Id: Ib3bf84b7afa8851c49a8d0f0a1ceded94f4de158
---
A tests/comparison/ORACLE.txt
M tests/comparison/README
A tests/comparison/util/verify-oracle-connection.py
3 files changed, 176 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/4095/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib3bf84b7afa8851c49a8d0f0a1ceded94f4de158
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-4009: qgen: add documentation for installing Oracle as ref database

2016-08-23 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-4009: qgen: add documentation for installing Oracle as 
ref database
..


Patch Set 1:

(1 comment)

I performed the steps (with the docker instructions), and it all works, modulo 
the few things specific to my environment that instructions like these can't 
really anticipate.

http://gerrit.cloudera.org:8080/#/c/4095/1/tests/comparison/ORACLE.txt
File tests/comparison/ORACLE.txt:

Line 68: 1. Refer to the instant-client installation instructions. As of this
I'm confused by the structure here, since these instructions are referring to 
the same step as the "1. Get Oracle instant-client" above.

I get that its "pre-reqs" vs. "installation instructions", but I think it would 
be clearer if there was just one ordered list of instructions, with installing 
the pre-reqs as the first couple of bullet points.


-- 
To view, visit http://gerrit.cloudera.org:8080/4095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3bf84b7afa8851c49a8d0f0a1ceded94f4de158
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550 Introduce query-wide execution context.

2016-08-23 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-2550 Introduce query-wide execution context.
..


Patch Set 9:

(3 comments)

In this patch the following is addressed:
- DescriptorTbl is not shared anymore. I've moved it back to RuntimeState as we 
still need to change how the serialization/deserialization is done before we 
can make it shared. It will be done in a separate patch.

- QueryExecMgr now starts fragment threads so that refcount management stays in 
the scope of QueryExecMgr and is not passed to a lower level.

- FragmentInstanceExecState refcount mechanism removed. Now these objects have 
the same lifetime as QueryExecState and are destroyed when QueryExecState is 
destroyed.

- fragment_inst_exec_state_map_ is now indexed by fragment instance index and 
not fragment instance IDs. (Rebased on top of IMPALA-3988)

http://gerrit.cloudera.org:8080/#/c/3817/7/be/src/service/query-exec-mgr.cc
File be/src/service/query-exec-mgr.cc:

Line 85: 
> how is this possible? it looks like an invariant that anything in the map w
No that's not how it works.
Everytime a new fragment comes in, FindOrCreateQueryExecState() is called (in 
RegisterFragmentInstanceWithQuery()) which increases the refcount by 1.
Everytime a fragment completes (in QEM->ExecInstance()), ReleaseQES() is called 
which decrements the refcount by 1.

This means that this refcount will ultimately reach 0.

Alternatively CancelPlanFragment() and PublishFilter() use QESGuard() which 
calls a GetQES() at the beginning of its scope and a ReleaseQES() when it goes 
out of scope which will effectively balance the refcount.


http://gerrit.cloudera.org:8080/#/c/3817/8/be/src/service/query-exec-mgr.cc
File be/src/service/query-exec-mgr.cc:

Line 54:   QueryExecState* query_exec_state = 
FindOrCreateQueryExecState(exec_params.query_ctx);
> comment seems redundant with the code (especially if you rename the method 
Done


PS8, Line 100: 
> FindOrCreateQueryExecState() given that it creates (as opposed to inserting
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550 Introduce query-wide execution context.

2016-08-23 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new patch set (#9).

Change subject: IMPALA-2550 Introduce query-wide execution context.
..

IMPALA-2550 Introduce query-wide execution context.

This patch is a header preview of a query wide execution state for
a backend.

Changes in design are as follows:

FragmentMgr -> QueryExecMgr
  The QueryExecMgr now receives incoming fragments, creates a new
  QueryExecState if it is the first fragment to arrive for that
  query, and passes this received fragment along to the QueryExecState.
  This class is responsible for cleaning up the QueryExecState.

QueryExecState (new):
  This is the query wide state for the backend. It is initialized by
  the first fragment to arrive to the QueryExecMgr. This class is now
  responsible for creating FragmentExecStates and executing them.
  Its life is protected by a ref counting mechanism and it is
  scheduled for destruction once the ref count reaches zero.
  Once scheduled for destruction, a thread in the QueryExecMgr will
  destroy the QueryExecState.
  Every user of the QueryExecState must access it within the scope of
  ScopedQueryExecStateRef which guarantees ref count incrementing and
  decrementing at entry/exit of the scope.

QueryExecState (old) -> ClientRequestExecState:
  This is just a class name change.

  We still do not include shared state into the QueryExecState because
  there needs to be changes to DescriptorTbl, etc. before we can
  incorporate them into the QueryExecState. Will be done as separate
  patches.

P.S: This only shows what the changes would look like and as of yet
still does not compile.

Excludes:
 - Some class renames in places not important to the core patch logic.
 - Most .cc files to make use of changes made.

Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
---
M be/src/runtime/CMakeLists.txt
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
R be/src/runtime/fragment-instance-exec-state.cc
R be/src/runtime/fragment-instance-exec-state.h
R be/src/runtime/fragment-instance-executor.cc
R be/src/runtime/fragment-instance-executor.h
A be/src/runtime/query-exec-state.cc
A be/src/runtime/query-exec-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/service/CMakeLists.txt
R be/src/service/client-request-exec-state.cc
R be/src/service/client-request-exec-state.h
D be/src/service/fragment-mgr.cc
D be/src/service/fragment-mgr.h
M be/src/service/impala-internal-service.h
A be/src/service/query-exec-mgr.cc
A be/src/service/query-exec-mgr.h
M common/thrift/ImpalaInternalService.thrift
20 files changed, 735 insertions(+), 360 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/3817/9
-- 
To view, visit http://gerrit.cloudera.org:8080/3817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner

2016-08-23 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3629: Codegen TransferScratchTuples() in 
hdfs-parquet-scanner
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic327e437c7cd2b3f92cdb11c1e907bfee2d44ee8
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner

2016-08-23 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3629: Codegen TransferScratchTuples() in 
hdfs-parquet-scanner
..


IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner

We currently don't do any codegen in the hdfs-parquet scanner,
which limits performance. This patch creates a new function,
ProcessScratchBatch, which contains the inner loop of
TransferScratchTuples, allowing us to cross-compile only this
performance critical part. It also uses CodegenEvalConjuncts to
replace the call to EvalConjuncts in ProcessScratchBatch with a
codegen'd version.

Additionally, it modifies the Codegen functions in
hdfs-avro/text/sequence-scanner to take an output parameter for
the codegen'd function and return a Status in order to improve
logging around codegen failures.

Change-Id: Ic327e437c7cd2b3f92cdb11c1e907bfee2d44ee8
Reviewed-on: http://gerrit.cloudera.org:8080/4093
Reviewed-by: Thomas Tauber-Marshall 
Reviewed-by: Matthew Jacobs 
Tested-by: Internal Jenkins
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/CMakeLists.txt
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
A be/src/exec/hdfs-parquet-scanner-ir.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
15 files changed, 276 insertions(+), 128 deletions(-)

Approvals:
  Matthew Jacobs: Looks good to me, approved
  Internal Jenkins: Verified
  Thomas Tauber-Marshall: Looks good to me, but someone else must approve



-- 
To view, visit http://gerrit.cloudera.org:8080/4093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ic327e437c7cd2b3f92cdb11c1e907bfee2d44ee8
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-2932: Extend DistributedPlanner to account for hash table build cost

2016-08-23 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-2932: Extend DistributedPlanner to account for hash 
table build cost
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4098/1/fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java
File fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java:

Line 463: LOG.debug("partition: cost=" + Long.toString(partitionCost));
> this isn't correct, because rhsCost is the network cost (which can be 0 if 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I03a0f56f69c8deae68d48dfdb9dc95b71aec11f1
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2932: Extend DistributedPlanner to account for hash table build cost

2016-08-23 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new patch set (#2).

Change subject: IMPALA-2932: Extend DistributedPlanner to account for hash 
table build cost
..

IMPALA-2932: Extend DistributedPlanner to account for hash table build cost

When deciding between a broadcast or repartition join, Impala calculates
the cost of each join as the total amount of data that is sent over the
network. This ignores some relevant costs, and can lead to bad plans.

One such relevant cost is the work to create the hash table used in the
join. This patch accounts for this by adding the amount of data inserted
into the hash table (the size of the right side of the join) to the
previous cost.

This generally increases the estimated cost of broadcast joins relative
to repartitioning joins, as the broadcast join must build the hash table
on each node the data was broadcast to, so its effect will be to make
repartitioning joins more likely to be chosen, especially in large
clusters.

This patch has not yet been performance tested.

Change-Id: I03a0f56f69c8deae68d48dfdb9dc95b71aec11f1
---
M fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-planner/queries/PlannerTest/views.test
8 files changed, 150 insertions(+), 129 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/4098/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I03a0f56f69c8deae68d48dfdb9dc95b71aec11f1
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 


[Impala-ASF-CR] IMPALA-3945: Forbid create text table with nonsensical delimiter combinations.

2016-08-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3945: Forbid create text table with nonsensical 
delimiter combinations.
..


Patch Set 3:

Can we abandon this for now if we don't have any intent to merge it?

-- 
To view, visit http://gerrit.cloudera.org:8080/3839
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I836b45f930e680ea03274405b90b1ee14822152b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Yuanhao Luo 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yuanhao Luo 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4006: dangerous rm -rf statements in scripts

2016-08-23 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4006: dangerous rm -rf statements in scripts
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4078/1/bin/clean.sh
File bin/clean.sh:

PS1, Line 49: $IMPALA_ALL_LOGS_DIRS
> Yes and no. On one hand, this is supposed to contain multiple log, so havin
May be I misunderstood your comment but for all practical purposes, we don't 
usually use multiple different log directories.


-- 
To view, visit http://gerrit.cloudera.org:8080/4078
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7503794180dee99eeb979e67f34e3b2edade70fe
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or AVX2.

2016-08-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or 
AVX2.
..


Patch Set 47:

Should I start the merge for this?

-- 
To view, visit http://gerrit.cloudera.org:8080/3081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I392ed5a8d5683f30f161282c228c1aedd7b648c1
Gerrit-PatchSet: 47
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Youwei Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Youwei Wang 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4006: dangerous rm -rf statements in scripts

2016-08-23 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4006: dangerous rm -rf statements in scripts
..


Patch Set 3:

(5 comments)

It appears that you are already fixing some of the places which are considered 
"non-dangerous". Can you please just bite the bullet and fix up the rest 
pointed out below ?

http://gerrit.cloudera.org:8080/#/c/4078/3/bin/impala-config.sh
File bin/impala-config.sh:

PS3, Line 45: $IMPALA_HOME/toolchain
missing quote ?


http://gerrit.cloudera.org:8080/#/c/4078/3/bin/run-all-tests.sh
File bin/run-all-tests.sh:

PS3, Line 111:  ${IMPALA_HOME}/testdata/bin/split-hbase.sh
missing quote ?


PS3, Line 137: ${IMPALA_HOME}/bin/run-workload.py
missing quote ?


PS3, Line 173: ${IMPALA_HOME}/bin/start-impala-cluster.py
missing quote ?


PS3, Line 177: ${IMPALA_HOME}/bin/mvn-quiet.sh
missing quote ?


-- 
To view, visit http://gerrit.cloudera.org:8080/4078
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7503794180dee99eeb979e67f34e3b2edade70fe
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-2700: ASCII NUL characters are doubled on insert into text tables

2016-08-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has abandoned this change.

Change subject: IMPALA-2700: ASCII NUL characters are doubled on insert into 
text tables
..


Abandoned

This was already merged to master: https://gerrit.cloudera.org/#/c/3876/

-- 
To view, visit http://gerrit.cloudera.org:8080/3703
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: Ia30fa314d1ee1e99f9e7598466eb1570ca7940fc
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-3832: test invalid data handling in lzo text scanner

2016-08-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3832: test invalid data handling in lzo text scanner
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4096/2/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

Line 61:   pytest.skip()
> why?
We don't have decimal_tbl for these file formats. Added comment.


Line 144:   table_format.compression_codec != 'none'):
> what errors do we produce?
Good point.  I see a lot of:

   Error while reading index file: 
hdfs://localhost:20500/test-warehouse/test_fuzz_alltypes_c35099ab.db/alltypes/year=2010/month=2/22_0.lzo.index

Updated comment


-- 
To view, visit http://gerrit.cloudera.org:8080/4096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib707014c1fcfb80cb8076f644fc2b62a5ae758d7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3832: test invalid data handling in lzo text scanner

2016-08-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#3).

Change subject: IMPALA-3832: test invalid data handling in lzo text scanner
..

IMPALA-3832: test invalid data handling in lzo text scanner

This adds the lzo text scanner to the fuzz testing.

I ran the test in a loop under ASAN overnight and didn't see any
failures.

Change-Id: Ib707014c1fcfb80cb8076f644fc2b62a5ae758d7
---
M tests/query_test/test_scanners_fuzz.py
1 file changed, 16 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4096/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib707014c1fcfb80cb8076f644fc2b62a5ae758d7
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 


[Impala-ASF-CR] IMPALA-3938: Disallow implicit references between nested collections.

2016-08-23 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3938: Disallow implicit references between nested 
collections.
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4079/3/fe/src/test/java/com/cloudera/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/com/cloudera/impala/analysis/AnalyzeStmtsTest.java:

Line 578: // Tests that an attempted implicit match must be succeeded by an 
explicit match.
Please adjust this comment. It is no longer accurate after your changes. Same 
for the comments in this test cluster below.

This is basically the same case as the one you fixed below (implicit references 
cannot go through collections)


Line 589: // - however, we are not allowed to implicitly skip 'item' again, 
since we have
Please adjust this comment also.


Line 591: // - the rule is: an implicit match must be followed by an 
explicit one
Please adjust this comment also. What is the rule?


Line 594: AnalysisError("select 1 from d.t8.s1.s2.a, a.f",
you could you add a few tests below that are similar to this? i.e, the invalid 
reference is in the FROM clause


Line 597: // Test to validate that implicit references between collection 
types cannot be used.
Tests that implicit references are not allowed through collection types.


-- 
To view, visit http://gerrit.cloudera.org:8080/4079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I703c7e9c8c9c95c9fee714767367373eec10cd0c
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Christopher Channing 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Christopher Channing 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3832: test invalid data handling in lzo text scanner

2016-08-23 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3832: test invalid data handling in lzo text scanner
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4096/2/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

Line 61:   pytest.skip()
why?


Line 144:   table_format.compression_codec != 'none'):
what errors do we produce?


-- 
To view, visit http://gerrit.cloudera.org:8080/4096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib707014c1fcfb80cb8076f644fc2b62a5ae758d7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550 Introduce query-wide execution context.

2016-08-23 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-2550 Introduce query-wide execution context.
..


Patch Set 8:

(3 comments)

Just to summarize some key points from the discussion today so we don't forget:

- We agreed that ref counting is the way to go and explicit ref counting avoids 
"accidental" leakage of shared_ptr copies.
- We agreed to restructure the fragment instance thread code so that the 
reference held by that thread would be done from the top-level thread method, 
rather than the reference being passed implicitly between classes like in the 
current patchset.
- We agreed to try to have a hierarchical structure to the code to make 
ownership and lifetime as clear as possible.
- We agreed that we only need reference counting at the QES level.  Everything 
"below" would have the same lifetime as QES and does not need separate ref 
counting.

http://gerrit.cloudera.org:8080/#/c/3817/8/be/src/service/query-exec-mgr.cc
File be/src/service/query-exec-mgr.cc:

Line 54:   // Create or retrieve the QueryExecState for this query id.
comment seems redundant with the code (especially if you rename the method as 
suggested below).


Line 85:   if (i->second->num_current_references_ == 0) return NULL;
how is this possible? it looks like an invariant that anything in the map will 
have refcnt > 0.


PS8, Line 100: Insert
FindOrCreateQueryExecState() given that it creates (as opposed to inserting one 
that was created by the caller).


-- 
To view, visit http://gerrit.cloudera.org:8080/3817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-(3895,3859): Don't log file data on parse errors

2016-08-23 Thread Dan Hecht (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4020

to look at the new patch set (#5).

Change subject: IMPALA-(3895,3859): Don't log file data on parse errors
..

IMPALA-(3895,3859): Don't log file data on parse errors

Logging file or table data is a bad idea, and doing it by default is
particularly bad. This patch changes HdfsScanNode::LogRowParseError() to
log a file and offset only.

Testing: See rewritten tests.

To support testing this change, we also fix IMPALA-3895, by introducing
a canonical string __HDFS_FILENAME__ that all Hadoop filenames in the ERROR
output are replaced with before comparing with the expected
results. This fixes a number of issues with the old way of matching
filenames which purported to be a regex, but really wasn't. In
particular, we can now match the rest of an ERROR line after the
filename, which was not possible before.

In some cases, we don't want to substitute filenames because the ERROR
output is looking for a very specific output. In that case we can write:

$NAMENODE/

and this patch will not perform _any_ filename substitutions on ERROR
sections that contain the $NAMENODE string.

Finally, this patch fixes a bug where a test that had an ERRORS section
but no RESULTS section would silently pass without testing anything.

Change-Id: I5a604f8784a9ff7b4bf878f82ee7f56697df3272
---
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
M testdata/workloads/functional-query/queries/DataErrorsTest/avro-errors.test
M 
testdata/workloads/functional-query/queries/DataErrorsTest/hbase-scan-node-errors.test
M 
testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-rcfile-scan-node-errors.test
M 
testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-scan-node-errors.test
M 
testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-sequence-scan-errors.test
M 
testdata/workloads/functional-query/queries/QueryTest/parquet-continue-on-error.test
M testdata/workloads/functional-query/queries/QueryTest/strict-mode-abort.test
M testdata/workloads/functional-query/queries/QueryTest/strict-mode.test
M tests/common/impala_test_suite.py
M tests/common/test_result_verifier.py
M tests/util/filesystem_utils.py
M tests/util/hdfs_util.py
19 files changed, 393 insertions(+), 408 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/4020/5
-- 
To view, visit http://gerrit.cloudera.org:8080/4020
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a604f8784a9ff7b4bf878f82ee7f56697df3272
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-(3895,3859): Don't log file data on parse errors

2016-08-23 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-(3895,3859): Don't log file data on parse errors
..


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4020
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5a604f8784a9ff7b4bf878f82ee7f56697df3272
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-(3895,3859): Don't log file data on parse errors

2016-08-23 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-(3895,3859): Don't log file data on parse errors
..


Patch Set 4:

> This passes a core test run. Running exhaustive, and local (to test
 > FS changes) now.

Did those tests pass?  If not, do you want me to take over fixing the tests 
since you are on PTO this week?

-- 
To view, visit http://gerrit.cloudera.org:8080/4020
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5a604f8784a9ff7b4bf878f82ee7f56697df3272
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2932: Extend DistributedPlanner to account for hash table build cost

2016-08-23 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2932: Extend DistributedPlanner to account for hash 
table build cost
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4098/1/fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java
File fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java:

Line 463:   partitionCost = Math.round(lhsCost + 2 * rhsCost);
this isn't correct, because rhsCost is the network cost (which can be 0 if the 
rhs is already partitioned in the desired way). you need to account for the 
hash table build cost independently.


-- 
To view, visit http://gerrit.cloudera.org:8080/4098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I03a0f56f69c8deae68d48dfdb9dc95b71aec11f1
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4009: qgen: add documentation for installing Oracle as ref database

2016-08-23 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4009: qgen: add documentation for installing Oracle as 
ref database
..


Patch Set 1: Code-Review+1

This looks great, thanks for writing up the readme. I didn't walk through the 
steps but it makes sense.

-- 
To view, visit http://gerrit.cloudera.org:8080/4095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3bf84b7afa8851c49a8d0f0a1ceded94f4de158
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3938: Disallow implicit references between nested collections.

2016-08-23 Thread Christopher Channing (Code Review)
Christopher Channing has posted comments on this change.

Change subject: IMPALA-3938: Disallow implicit references between nested 
collections.
..


Patch Set 2:

(4 comments)

Done

http://gerrit.cloudera.org:8080/#/c/4079/2/fe/src/main/java/com/cloudera/impala/analysis/Path.java
File fe/src/main/java/com/cloudera/impala/analysis/Path.java:

Line 227:   && 
isOptionalFieldCollectionType((CollectionStructType)structType);
> Based on this latest version I think the code can be simplified to not even
Done


http://gerrit.cloudera.org:8080/#/c/4079/2/fe/src/test/java/com/cloudera/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/com/cloudera/impala/analysis/AnalyzeStmtsTest.java:

Line 597: // Test implicit key/value reference is not abused when using a 
map within an array.
> can you explain which resolution is not allowed?
Done


Line 598: addTestTable("create table d.t9 (c1 array>)");
> nit: use 'c' instead of 'c1' for consistency with the other tests
Based on the comment below I've left this as c1 and added a new column c2 
(adhered to the tests above).


Line 599: AnalysisError("select key from d.t9.c1",
> add similar tests for a map nested inside a map (we can run into the same i
As discussed, I've added a map>


-- 
To view, visit http://gerrit.cloudera.org:8080/4079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I703c7e9c8c9c95c9fee714767367373eec10cd0c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Christopher Channing 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Christopher Channing 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3938: Disallow implicit references between nested collections.

2016-08-23 Thread Christopher Channing (Code Review)
Christopher Channing has uploaded a new patch set (#3).

Change subject: IMPALA-3938: Disallow implicit references between nested 
collections.
..

IMPALA-3938: Disallow implicit references between nested collections.

The Bug: Prior to this patch Impala would accept and execute a query of
the form "select pos, key, value from tbl.arr_map_col order by pos asc",
where "arr_map_col" is an array of map. In this example, the map
is not referenced as a collection and therefore the implicit rule of
using "item" should not be applied. The results produced from this
erroneous query are arbitrary. The semantically correct way of
expressing the query would be of the form "select a.pos, m.key, m.value
from tbl t, t.arr_map_col a, a.item m order by a.pos asc".

The Fix: Removed the existing 'expectExplicitMatch' flag to simplify the
code. During query analysis (more specifically: path resolution), if
part of a path cannot be explicitly matched then the planner will fall
back to an implicit field check. As part of this check, the type of the
nested field inside the collection will be inspected to verify that it
does not match that of a collection type. If so, the path resolution will
fail and an AnalysisException will be thrown indicating that the path
cannot be resolved. For example, trying to execute the query "select pos,
key, value from tbl.arr_map_col order by pos asc" would yield the
following error "Could not resolve column/field reference: 'key'".
Regarding test coverage, the existing test suite "AnalyzeStmtsTest" has
been augmented with a set of new tests that will verify that implicit
references between collections cannot be used.

Change-Id: I703c7e9c8c9c95c9fee714767367373eec10cd0c
---
M fe/src/main/java/com/cloudera/impala/analysis/Path.java
M fe/src/test/java/com/cloudera/impala/analysis/AnalyzeStmtsTest.java
2 files changed, 31 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/4079/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I703c7e9c8c9c95c9fee714767367373eec10cd0c
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Christopher Channing 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Christopher Channing 


[Impala-ASF-CR] Infomation schema (preview for frontend)

2016-08-23 Thread Kathy Sun (Code Review)
Kathy Sun has posted comments on this change.

Change subject: Infomation_schema (preview for frontend)
..


Patch Set 8:

(13 comments)

Please skip patch 7, it's a wrong reformat. 
Patch 8 is a new reformat using .clang_format

below is the reply I was working on, those will take effect in next patch.

http://gerrit.cloudera.org:8080/#/c/3863/6/be/src/exec/info-schema-scan-node.cc
File be/src/exec/info-schema-scan-node.cc:

PS6, Line 100: alue = m
> Constants should be named like: COL_NAMES
Done


PS6, Line 136: xt** ctxs 
> I think we can get rid of the 'tuple_pool' variable and just use row_batch-
Done


Line 153: 
> This loop looks sufficient to me to materialise any number of rows.
unfortunately... I think it's not. I'm still testing this. discuss offline.


Line 168: 
> Our coding standard says to use '++idx'. Not a big deal but I think it's ea
Done


Line 192
> You can actually remove DebugString(). We don't have it for all scan nodes.
nice, that's a relief. :P


http://gerrit.cloudera.org:8080/#/c/3863/6/fe/src/main/java/com/cloudera/impala/analysis/AnalysisContext.java
File fe/src/main/java/com/cloudera/impala/analysis/AnalysisContext.java:

Line 532: case VIEW_METADATA:
> Can you add a TODO like:
Done


Line 533: case SELECT:
> There are some tabs instead of spaces here. Let's just figure out how to se
Done


http://gerrit.cloudera.org:8080/#/c/3863/6/fe/src/main/java/com/cloudera/impala/catalog/InformationSchemaTable.java
File fe/src/main/java/com/cloudera/impala/catalog/InformationSchemaTable.java:

PS6, Line 34: InformationSchemaTable
> I spent some time thinking about this and I'm thinking we should call this 
to discuss in 1o1


Line 45: 
> Your text editor is leaving in trailing whitespace - it should be possible 
Done


PS6, Line 52: Name
> We should keep the column names lower-case (they're converted to lower-case
Done


Line 89:   /**
> Is this ever called? I think since isLoaded() is always true, this should n
Done


Line 109: TResultSetMetadata resultSchema = new TResultSetMetadata();
> Comment not needed.
Done


http://gerrit.cloudera.org:8080/#/c/3863/6/fe/src/main/java/com/cloudera/impala/planner/InfoSchemaScanNode.java
File fe/src/main/java/com/cloudera/impala/planner/InfoSchemaScanNode.java:

PS6, Line 42: InfoSchemaScanNode
> Maybe this should be SystemTableScanNode (assuming we do the rename I sugge
discuss offline


-- 
To view, visit http://gerrit.cloudera.org:8080/3863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7adbeb45220c468e43b424d70c30b952f6cec2cd
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Kathy Sun 
Gerrit-Reviewer: Kathy Sun 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Infomation schema (preview for frontend)

2016-08-23 Thread Kathy Sun (Code Review)
Kathy Sun has uploaded a new patch set (#8).

Change subject: Infomation_schema (preview for frontend)
..

Infomation_schema (preview for frontend)

Since this is just a preview and not the final patch, request reviewers
to please focus their comments more on the design aspect.

This is to expose metadata (current status of impala, e.g impala-metrics) into 
virtual
table, so that users could query them via sql.

You can run impala-shell.sh and type query like:
select * from information_schema.scheduler;
select name, value, description from information_schema.scheduler;

Change-Id: I7adbeb45220c468e43b424d70c30b952f6cec2cd
---
M be/src/exec/CMakeLists.txt
M be/src/exec/exec-node.cc
A be/src/exec/info-schema-scan-node.cc
A be/src/exec/info-schema-scan-node.h
M be/src/runtime/coordinator.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/scheduling/simple-scheduler.cc
M common/thrift/CatalogObjects.thrift
M common/thrift/Descriptors.thrift
M common/thrift/PlanNodes.thrift
M fe/src/main/java/com/cloudera/impala/analysis/AnalysisContext.java
M fe/src/main/java/com/cloudera/impala/analysis/Analyzer.java
M fe/src/main/java/com/cloudera/impala/analysis/DescriptorTable.java
M fe/src/main/java/com/cloudera/impala/analysis/TupleDescriptor.java
M fe/src/main/java/com/cloudera/impala/catalog/Catalog.java
A fe/src/main/java/com/cloudera/impala/catalog/InformationSchemaDb.java
A fe/src/main/java/com/cloudera/impala/catalog/InformationSchemaTable.java
A fe/src/main/java/com/cloudera/impala/planner/InfoSchemaScanNode.java
M fe/src/main/java/com/cloudera/impala/planner/SingleNodePlanner.java
M fe/src/main/java/com/cloudera/impala/service/Frontend.java
M fe/src/test/java/com/cloudera/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/com/cloudera/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/info-schema.test
24 files changed, 709 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/3863/8
-- 
To view, visit http://gerrit.cloudera.org:8080/3863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7adbeb45220c468e43b424d70c30b952f6cec2cd
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Kathy Sun 
Gerrit-Reviewer: Kathy Sun 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2932: Extend DistributedPlanner to account for hash table build cost

2016-08-23 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4098

Change subject: IMPALA-2932: Extend DistributedPlanner to account for hash 
table build cost
..

IMPALA-2932: Extend DistributedPlanner to account for hash table build cost

When deciding between a broadcast or repartition join, Impala calculates
the cost of each join as the total amount of data that is sent over the
network. This ignores some relevant costs, and can lead to bad plans.

One such relevant cost is the work to create the hash table used in the
join. This patch accounts for this by adding the amount of data inserted
into the hash table (the size of the right side of the join) to the
previous cost.

This generally increases the estimated cost of broadcast joins relative
to repartitioning joins, as the broadcast join must build the hash table
on each node the data was broadcast to, so its effect will be to make
repartitioning joins more likely to be chosen, especially in large
clusters.

This patch has not yet been performance tested.

Change-Id: I03a0f56f69c8deae68d48dfdb9dc95b71aec11f1
---
M fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-planner/queries/PlannerTest/views.test
8 files changed, 148 insertions(+), 125 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/4098/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I03a0f56f69c8deae68d48dfdb9dc95b71aec11f1
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 


[Impala-ASF-CR] Add .clang-format for Impala's C++ style

2016-08-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Add .clang-format for Impala's C++ style
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3886/6/.clang-format
File .clang-format:

Line 1: BasedOnStyle: Google
Google's style has DerivePointerAlignment = true, which means that if pointers 
are formatted like int *p in a file instead of int* p, it doesn't update them.

I think we should override this to false in this patch or a follow-on.


-- 
To view, visit http://gerrit.cloudera.org:8080/3886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1731,IMPALA-3868: Float values are not parsed correctly

2016-08-23 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-1731,IMPALA-3868: Float values are not parsed correctly
..


Patch Set 5: Code-Review+2

Carry +2 forward.

-- 
To view, visit http://gerrit.cloudera.org:8080/3791
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9e17d0f051b300a22a520ce34e276c2d4460d35e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1731,IMPALA-3868: Float values are not parsed correctly

2016-08-23 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#5).

Change subject: IMPALA-1731,IMPALA-3868: Float values are not parsed correctly
..

IMPALA-1731,IMPALA-3868: Float values are not parsed correctly

Fixed StringToFloatInternal() not to parse strings like "1.23inf"
and "infinite" with leading/trailing garbage as Infinity. These
strings are now rejected with PARSE_FAILURE.
Only "inf" and "infinity" are accepted, parsing is case-insensitive.

"NaN" values are handled similarly: strings with leading/trailing
garbage like "nana" are rejected, parsing is case-insensitive.

Other changes:
- StringToFloatInternal() was cleaned up a bit. Parsing inf and NaN
strings was moved out of the main loop.
- Use std::numeric_limits::infinity() instead of INFINITY macro
and std::numeric_limits::quiet_NaN() instead of NAN macro.
- Fixed another minor bug: multiple dots are allowed when parsing
float values (e.g. "2.1..6" is interpreted as 2.16).
- New BE and E2E tests were added.

Change-Id: I9e17d0f051b300a22a520ce34e276c2d4460d35e
---
M be/src/exprs/expr-test.cc
M be/src/util/string-parser-test.cc
M be/src/util/string-parser.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
4 files changed, 205 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/3791/5
-- 
To view, visit http://gerrit.cloudera.org:8080/3791
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9e17d0f051b300a22a520ce34e276c2d4460d35e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-2550 Introduce query-wide execution context.

2016-08-23 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-2550 Introduce query-wide execution context.
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3817/7/be/src/service/query-exec-mgr.cc
File be/src/service/query-exec-mgr.cc:

Line 85:   if (i->second->num_current_references_ == 0) return NULL;
how is this possible? it looks like an invariant that anything in the map will 
have refcnt > 0.


http://gerrit.cloudera.org:8080/#/c/3817/7/be/src/service/query-exec-mgr.h
File be/src/service/query-exec-mgr.h:

PS7, Line 127:   /// The thread pool that is in charge of destroying a 
QueryExecState when it is no
 :   /// longer in use.
 :   ThreadPool destroy_thread_;
> Currently I've made it that way because if a fragment is the last user of t
The QES doesn't delete itself. The QEM does it, so there isn't a problem of 
doing it directly.


-- 
To view, visit http://gerrit.cloudera.org:8080/3817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3832: test invalid data handling in lzo text scanner

2016-08-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4096

Change subject: IMPALA-3832: test invalid data handling in lzo text scanner
..

IMPALA-3832: test invalid data handling in lzo text scanner

This adds the lzo text scanner to the fuzz testing.

I ran the test in a loop under ASAN overnight and didn't see any
failures.

Change-Id: Ib707014c1fcfb80cb8076f644fc2b62a5ae758d7
---
M tests/query_test/test_scanners_fuzz.py
1 file changed, 13 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4096/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib707014c1fcfb80cb8076f644fc2b62a5ae758d7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose

2016-08-23 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has submitted this change and it was merged.

Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect 
coordinator address, overly verbose
..


IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, 
overly verbose

The webserver address was always configured as 0.0.0.0 (meaning that
the webserver could be reached on any IP for that machine) unless
otherwise specified. This is not a correct value to dispay to the
user. This patch returns the hostname of the node, when requested,
if the webserver host address is 0.0.0.0.

This patch also does not print the coordinator link for very simple
queries, as it's not necessary and is unnecessarily verbose.

This patch also does away with pinging the impalad an extra time per
query for finding the host time and webserver address. It instead
remembers the webserver address at connect time and displays client
local time for every query instead.

Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Reviewed-on: http://gerrit.cloudera.org:8080/3994
Tested-by: Internal Jenkins
Reviewed-by: Sailesh Mukil 
Tested-by: Sailesh Mukil 
---
M be/src/service/impala-beeswax-server.cc
M be/src/util/webserver.cc
M common/thrift/ImpalaService.thrift
M shell/impala_client.py
M shell/impala_shell.py
M tests/shell/test_shell_commandline.py
6 files changed, 69 insertions(+), 30 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Sailesh Mukil: Looks good to me, approved; Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/3994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose

2016-08-23 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect 
coordinator address, overly verbose
..


Patch Set 6: Verified+1

> Patch Set 6: Code-Review+2

Carrying +2 again.

This ran through GVM and got verified+1, but for some reason the CR+2 status 
got dropped. Since this passed GVM, submitting manually.

-- 
To view, visit http://gerrit.cloudera.org:8080/3994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose

2016-08-23 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect 
coordinator address, overly verbose
..


Patch Set 6: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/3994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] Infomation schema (preview for frontend)

2016-08-23 Thread Kathy Sun (Code Review)
Kathy Sun has posted comments on this change.

Change subject: Infomation_schema (preview for frontend)
..


Patch Set 7:

(1 comment)

Patch 7 is only for reformatting. you can skip reviewing this

http://gerrit.cloudera.org:8080/#/c/3863/5/fe/src/main/java/com/cloudera/impala/analysis/TupleDescriptor.java
File fe/src/main/java/com/cloudera/impala/analysis/TupleDescriptor.java:

PS5, Line 206: Table table = getTable();
 :   TableId id = table.getId();
TODO: delete this


-- 
To view, visit http://gerrit.cloudera.org:8080/3863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7adbeb45220c468e43b424d70c30b952f6cec2cd
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Kathy Sun 
Gerrit-Reviewer: Kathy Sun 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4009: qgen: add documentation for installing Oracle as ref database

2016-08-23 Thread Michael Brown (Code Review)
Michael Brown has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4095

Change subject: IMPALA-4009: qgen: add documentation for installing Oracle as 
ref database
..

IMPALA-4009: qgen: add documentation for installing Oracle as ref database

This patch adds documentation for users wishing to run the random query
generator against Oracle as a reference database. In particular, the
patch points users toward installing Oracle Database, and it provides
directions on incorporating a Python Oracle client into our
impala-python virtual environment.

This documentation isn't meant to be a complete guide. Other patches
will add documentation for actually loading random query generator test
data into Oracle, for example.

Change-Id: Ib3bf84b7afa8851c49a8d0f0a1ceded94f4de158
---
A tests/comparison/ORACLE.txt
M tests/comparison/README
2 files changed, 110 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/4095/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib3bf84b7afa8851c49a8d0f0a1ceded94f4de158
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 


[Impala-ASF-CR] Infomation schema (preview for frontend)

2016-08-23 Thread Kathy Sun (Code Review)
Kathy Sun has uploaded a new patch set (#7).

Change subject: Infomation_schema (preview for frontend)
..

Infomation_schema (preview for frontend)

Since this is just a preview and not the final patch, request reviewers
to please focus their comments more on the design aspect.

This is to expose metadata (current status of impala, e.g impala-metrics) into 
virtual
table, so that users could query them via sql.

You can run impala-shell.sh and type query like:
select * from information_schema.scheduler;
select name, value, description from information_schema.scheduler;

Change-Id: I7adbeb45220c468e43b424d70c30b952f6cec2cd
---
M be/src/exec/CMakeLists.txt
M be/src/exec/exec-node.cc
A be/src/exec/info-schema-scan-node.cc
A be/src/exec/info-schema-scan-node.h
M be/src/runtime/coordinator.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/scheduling/simple-scheduler.cc
M common/thrift/CatalogObjects.thrift
M common/thrift/Descriptors.thrift
M common/thrift/PlanNodes.thrift
M fe/src/main/java/com/cloudera/impala/analysis/AnalysisContext.java
M fe/src/main/java/com/cloudera/impala/analysis/Analyzer.java
M fe/src/main/java/com/cloudera/impala/analysis/DescriptorTable.java
M fe/src/main/java/com/cloudera/impala/analysis/TupleDescriptor.java
M fe/src/main/java/com/cloudera/impala/catalog/Catalog.java
A fe/src/main/java/com/cloudera/impala/catalog/InformationSchemaDb.java
A fe/src/main/java/com/cloudera/impala/catalog/InformationSchemaTable.java
A fe/src/main/java/com/cloudera/impala/planner/InfoSchemaScanNode.java
M fe/src/main/java/com/cloudera/impala/planner/SingleNodePlanner.java
M fe/src/main/java/com/cloudera/impala/service/Frontend.java
M fe/src/test/java/com/cloudera/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/com/cloudera/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/info-schema.test
24 files changed, 739 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/3863/7
-- 
To view, visit http://gerrit.cloudera.org:8080/3863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7adbeb45220c468e43b424d70c30b952f6cec2cd
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Kathy Sun 
Gerrit-Reviewer: Kathy Sun 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3973: add position and occurrence to instr()

2016-08-23 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4094

Change subject: IMPALA-3973: add position and occurrence to instr()
..

IMPALA-3973: add position and occurrence to instr()

Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd
---
M be/src/exprs/expr-test.cc
M be/src/exprs/string-functions-ir.cc
M be/src/exprs/string-functions.h
M common/function-registry/impala_functions.py
4 files changed, 82 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/4094/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4094
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 


[Impala-ASF-CR] IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner

2016-08-23 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3629: Codegen TransferScratchTuples() in 
hdfs-parquet-scanner
..


Patch Set 1: Code-Review+2

Carrying the +2

-- 
To view, visit http://gerrit.cloudera.org:8080/4093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic327e437c7cd2b3f92cdb11c1e907bfee2d44ee8
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3938: Prevent illegal implicit collection references being used.

2016-08-23 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3938: Prevent illegal implicit collection references 
being used.
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4079/2/fe/src/main/java/com/cloudera/impala/analysis/Path.java
File fe/src/main/java/com/cloudera/impala/analysis/Path.java:

Line 227:   && 
isOptionalFieldCollectionType((CollectionStructType)structType);
Based on this latest version I think the code can be simplified to not even use 
expectExplicitMatch at all, as follows:

// Map all remaining raw-path elements to field types and positions.
while (rawPathIdx < rawPath_.size()) {
  if (!currentType.isComplexType()) return false;
  StructType structType = getTypeAsStruct(currentType);
  // Resolve explicit path.
  StructField field = structType.getField(rawPath_.get(rawPathIdx));
  if (field == null) {
// Resolve implicit path.
if (structType instanceof CollectionStructType) {
  field = ((CollectionStructType) structType).getOptionalField();
  // Collections must be matched explicitly.
  if (field.getType().isCollectionType()) return false;
} else {
  // Failed to resolve implicit or explicit path.
  return false;
}
// Update the physical types/positions.
matchedTypes_.add(field.getType());
matchedPositions_.add(field.getPosition());
currentType = field.getType();
// Do not consume a raw-path element.
continue;
  }
  matchedTypes_.add(field.getType());
  matchedPositions_.add(field.getPosition());
  if (field.getType().isCollectionType() && firstCollectionPathIdx_ == -1) {
Preconditions.checkState(firstCollectionTypeIdx_ == -1);
firstCollectionPathIdx_ = rawPathIdx;
firstCollectionTypeIdx_ = matchedTypes_.size() - 1;
  }
  currentType = field.getType();
  ++rawPathIdx;
}


http://gerrit.cloudera.org:8080/#/c/4079/2/fe/src/test/java/com/cloudera/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/com/cloudera/impala/analysis/AnalyzeStmtsTest.java:

Line 597: // Test implicit key/value reference is not abused when using a 
map within an array.
can you explain which resolution is not allowed?

"is not abused" is not very clear

In fact, could you rephrase the comment in L578 and followin as well? The 
underlying issue seems to be that collections must always be matched 
explicitly. Do you agree?


Line 598: addTestTable("create table d.t9 (c1 array>)");
nit: use 'c' instead of 'c1' for consistency with the other tests


Line 599: AnalysisError("select key from d.t9.c1",
add similar tests for a map nested inside a map (we can run into the same issue)


-- 
To view, visit http://gerrit.cloudera.org:8080/4079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I703c7e9c8c9c95c9fee714767367373eec10cd0c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Christopher Channing 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Christopher Channing 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata

2016-08-23 Thread Greg Rahn (Code Review)
Greg Rahn has posted comments on this change.

Change subject: IMPALA-1659: Netezza compatibility functions: metadata
..


Patch Set 2:

I'm in favor of implementing CURRENT_SESSION then adding CURRENT_SID as an 
alias.

-- 
To view, visit http://gerrit.cloudera.org:8080/4063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4006: dangerous rm -rf statements in scripts

2016-08-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4006: dangerous rm -rf statements in scripts
..


Patch Set 3: Code-Review+1

Will wait for Michael's +1 too.

-- 
To view, visit http://gerrit.cloudera.org:8080/4078
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7503794180dee99eeb979e67f34e3b2edade70fe
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata

2016-08-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1659: Netezza compatibility functions: metadata
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4063/2/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 2702:   TestStringValue("current_user()", "impala_test_user");
We should extend tests/custom_cluster/test_delegation.py to test that these 
return the right results when delegation is enabled.


http://gerrit.cloudera.org:8080/#/c/4063/2/be/src/exprs/utility-functions-ir.cc
File be/src/exprs/utility-functions-ir.cc:

Line 121:   return AnyValUtil::FromString(
This should fit in a single 90-character line.


http://gerrit.cloudera.org:8080/#/c/4063/2/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

Line 550:   [['current_sid'], 'STRING', [], 
'impala::UtilityFunctions::CurrentSid'],
I spoke with Greg offline and he pointed out that there wasn't a clear use-case 
for current_sid, since there's nothing you can really do with the session id 
once you have it, e.g. cancel the session.

Greg, do you see any risks in adding this now?

My feeling is that it's ok to add it, but that we should call it 
current_session/CurrentSession and make current_sid an alias for it only.


Line 551:   [['user', 'current_user'], 'STRING', [], 
'impala::UtilityFunctions::User'],
I think you have current_user and session_user the wrong way around. 
session_user would be the user that initiated the session, so session_user 
doesn't change with impersonation, but current_user does change.

E.g. see 
http://www.ibm.com/support/knowledgecenter/SSULQD_7.1.0/com.ibm.nz.adv.doc/c_advsec_masquerading.html


-- 
To view, visit http://gerrit.cloudera.org:8080/4063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner

2016-08-23 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3629: Codegen TransferScratchTuples() in 
hdfs-parquet-scanner
..


Patch Set 1: Code-Review+1

Replaces: https://gerrit.cloudera.org/#/c/3774/

-- 
To view, visit http://gerrit.cloudera.org:8080/4093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic327e437c7cd2b3f92cdb11c1e907bfee2d44ee8
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner

2016-08-23 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has abandoned this change.

Change subject: IMPALA-3629: Codegen TransferScratchTuples() in 
hdfs-parquet-scanner
..


Abandoned

Replaced with: https://gerrit.cloudera.org/#/c/4093/

-- 
To view, visit http://gerrit.cloudera.org:8080/3774
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: Ic327e437c7cd2b3f92cdb11c1e907bfee2d44ee8
Gerrit-PatchSet: 11
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner

2016-08-23 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4093

Change subject: IMPALA-3629: Codegen TransferScratchTuples() in 
hdfs-parquet-scanner
..

IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner

We currently don't do any codegen in the hdfs-parquet scanner,
which limits performance. This patch creates a new function,
ProcessScratchBatch, which contains the inner loop of
TransferScratchTuples, allowing us to cross-compile only this
performance critical part. It also uses CodegenEvalConjuncts to
replace the call to EvalConjuncts in ProcessScratchBatch with a
codegen'd version.

Additionally, it modifies the Codegen functions in
hdfs-avro/text/sequence-scanner to take an output parameter for
the codegen'd function and return a Status in order to improve
logging around codegen failures.

Change-Id: Ic327e437c7cd2b3f92cdb11c1e907bfee2d44ee8
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/CMakeLists.txt
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
A be/src/exec/hdfs-parquet-scanner-ir.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
15 files changed, 276 insertions(+), 128 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/4093/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic327e437c7cd2b3f92cdb11c1e907bfee2d44ee8
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata

2016-08-23 Thread Greg Rahn (Code Review)
Greg Rahn has posted comments on this change.

Change subject: IMPALA-1659: Netezza compatibility functions: metadata
..


Patch Set 2:

> > My recommendation would be to skip non-obvious
 > > implementations (like CURRENT_SCHEMA) for now until we can
 > discuss
 > > further.
 > 
 > I removed CURRENT_SCHEMA() as you suggested. Should I remove any
 > other function?

I think we're good as the others are straightforward.  Thanks!

-- 
To view, visit http://gerrit.cloudera.org:8080/4063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] Infomation schema (preview for frontend)

2016-08-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Infomation_schema (preview for frontend)
..


Patch Set 6:

(18 comments)

Did a pass over it and made some general comments. I think we should do some 
cleanup and renaming of things before getting more eyes on it, so that the 
review will go as smoothly as possible for everyone.

http://gerrit.cloudera.org:8080/#/c/3863/6/be/src/exec/info-schema-scan-node.cc
File be/src/exec/info-schema-scan-node.cc:

PS6, Line 100: ColNames
Constants should be named like: COL_NAMES


PS6, Line 136: tuple_pool
I think we can get rid of the 'tuple_pool' variable and just use 
row_batch->tuple_data_pool() directly.

(this pattern appears in other scanners for historical reasons)


Line 153: while (!ReachedLimit() && !row_batch->AtCapacity()
This loop looks sufficient to me to materialise any number of rows.


Line 168:   idx++;
Our coding standard says to use '++idx'. Not a big deal but I think it's 
easiest to fix now.


Line 192:   //  TODO need to discuss what should be print here
You can actually remove DebugString(). We don't have it for all scan nodes. 
E.g. HdfsScanNode doesn't have an implementation.


http://gerrit.cloudera.org:8080/#/c/3863/6/fe/src/main/java/com/cloudera/impala/analysis/AnalysisContext.java
File fe/src/main/java/com/cloudera/impala/analysis/AnalysisContext.java:

Line 532: case VIEW_METADATA:
Can you add a TODO like:

  // TODO before committing: implement authorization for system tables

So that other reviewers know we intend to fix this.


Line 533:   case SELECT:
There are some tabs instead of spaces here. Let's just figure out how to set up 
your text editor to use spaces instead of tabs, it will save time overall.


http://gerrit.cloudera.org:8080/#/c/3863/6/fe/src/main/java/com/cloudera/impala/catalog/InformationSchemaTable.java
File fe/src/main/java/com/cloudera/impala/catalog/InformationSchemaTable.java:

PS6, Line 34: InformationSchemaTable
I spent some time thinking about this and I'm thinking we should call this 
SystemTable/SystemDatabase. Since information_schema is a special kind of 
system database that shows catalog metadata, and probably shouldn't include 
other non-catalog metadata.


Line 45:   
Your text editor is leaving in trailing whitespace - it should be possible to 
configure it to avoid this (it will save time in the long run).


PS6, Line 52: Name
We should keep the column names lower-case (they're converted to lower-case 
anyway).


Line 89: // do nothing since isLoaded already
Is this ever called? I think since isLoaded() is always true, this should never 
be called. If that's correct, we should just have this be:

 Preconditions.checkState(false);

To assert that load() is never called for this table type.


Line 109: // TODO Auto-generated method stub
Comment not needed.


http://gerrit.cloudera.org:8080/#/c/3863/6/fe/src/main/java/com/cloudera/impala/planner/InfoSchemaScanNode.java
File fe/src/main/java/com/cloudera/impala/planner/InfoSchemaScanNode.java:

Let's clean up the formatting in this file before getting others to review it - 
it's hard to read in gerrit.


PS6, Line 42: InfoSchemaScanNode
Maybe this should be SystemTableScanNode (assuming we do the rename I 
suggested).


PS6, Line 45: KuduScanNode
Needs updating


Line 71: // TODO: Does the port matter?
Add a TODO to have one scan range per backend (so other reviewers can see what 
we are planning).


http://gerrit.cloudera.org:8080/#/c/3863/6/fe/src/main/java/com/cloudera/impala/planner/SingleNodePlanner.java
File fe/src/main/java/com/cloudera/impala/planner/SingleNodePlanner.java:

PS6, Line 1256: tblRef.getTable()
Let's do some cleanup here by changing this to 'table'. I was confused why it 
was different.


http://gerrit.cloudera.org:8080/#/c/3863/6/testdata/workloads/functional-planner/queries/PlannerTest/info-schema.test
File testdata/workloads/functional-planner/queries/PlannerTest/info-schema.test:

When I open this file locally there are a bunch of nonsense characters. I think 
that's why it thinks this is a binary file.

   00:SCAN INFORMATION_SCHEMA 
[information_schema.impala_metrics]^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@


-- 
To view, visit http://gerrit.cloudera.org:8080/3863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7adbeb45220c468e43b424d70c30b952f6cec2cd
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Kathy Sun 
Gerrit-Reviewer: Kathy Sun 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3938: Prevent illegal implicit collection references being used.

2016-08-23 Thread Christopher Channing (Code Review)
Christopher Channing has uploaded a new patch set (#2).

Change subject: IMPALA-3938: Prevent illegal implicit collection references 
being used.
..

IMPALA-3938: Prevent illegal implicit collection references being used.

The Bug: Prior to this patch Impala would accept and execute a query of
the form "select pos, key, value from tbl.arr_map_col order by pos asc",
where "arr_map_col" is an array of map. In this example, the map
is not referenced as a collection and therefore the implicit rule of
using "item" should not be applied. The results produced from this
erroneous query are arbitrary. The semantically correct way of
expressing the query would be of the form "select a.pos, m.key, m.value
from tbl t, t.arr_map_col a, a.item m order by a.pos asc".

The Fix: During query analysis (more specifically: path resolution), if
part of a path cannot be explicitly matched then the planner will fall
back to an implicit field check. As part of this check, the type of the
nested field inside the collection will be inspected to verify that it
does not match that of a MapType. If so, the path resolution will fail
and an AnalysisException will be thrown indicating that the path cannot
be resolved. For example, trying to execute the query "select pos, key,
value from tbl.arr_map_col order by pos asc" would yield the following
error "Could not resolve column/field reference: 'key'". Regarding test
coverage, the existing test suite "AnalyzeStmtsTest" has been augmented
with a set of new tests that will verify that the illegal implicit
collection reference from Array->Map cannot be used.

Change-Id: I703c7e9c8c9c95c9fee714767367373eec10cd0c
---
M fe/src/main/java/com/cloudera/impala/analysis/Path.java
M fe/src/test/java/com/cloudera/impala/analysis/AnalyzeStmtsTest.java
2 files changed, 33 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/4079/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I703c7e9c8c9c95c9fee714767367373eec10cd0c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Christopher Channing 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Christopher Channing 


[Impala-ASF-CR] IMPALA-3938: Prevent illegal implicit collection references being used.

2016-08-23 Thread Christopher Channing (Code Review)
Christopher Channing has posted comments on this change.

Change subject: IMPALA-3938: Prevent illegal implicit collection references 
being used.
..


Patch Set 1:

(3 comments)

I made the changes that you requested.

http://gerrit.cloudera.org:8080/#/c/4079/1/fe/src/main/java/com/cloudera/impala/analysis/Path.java
File fe/src/main/java/com/cloudera/impala/analysis/Path.java:

Line 232:   if (!structType.isMapType() && field.getType() instanceof 
MapType) {
> how could structType every be a map type? do you mean to call isMapStruct()
Sorry, not sure how I missed that. I tried out your suggestion, but it did not 
catch the following broken query: "select key from test_table.locations" (same 
behaviour that the bug exhibits). For this broken scenario, the flag @ L219 
could be set if the type is a collection and we look at the optional type, but 
it would mean duplicating a little code to recreate the structure (in order to 
get the optional field). Instead of that approach, I've reworked this a little 
to populate expectExplicitMatch inside the loop when we're dealing with a 
hierarchy of nested collections.


http://gerrit.cloudera.org:8080/#/c/4079/1/fe/src/test/java/com/cloudera/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/com/cloudera/impala/analysis/AnalyzeStmtsTest.java:

Line 438: // Array of structs and maps with name conflicts. Both implicit 
and explicit
> your new tests don't really belong in this test 'cluster' because there are
Done


Line 467: "Could not resolve column/field reference: 'm.value'");
> whitespace
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I703c7e9c8c9c95c9fee714767367373eec10cd0c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Christopher Channing 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Christopher Channing 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Infomation schema (preview for frontend)

2016-08-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Infomation_schema (preview for frontend)
..


Patch Set 6:

Could you run clang-format on the backend to clean up the formatting? 

If you get .clang_format from https://gerrit.cloudera.org/#/c/3886/ and add 
these directories to your path:

 export 
PATH=$IMPALA_TOOLCHAIN/llvm-3.8.0-p1/bin:$IMPALA_TOOLCHAIN/llvm-3.8.0-p1/share/clang/:$PATH


Then you can run this command to reformat your last commit:
  git diff -U0 HEAD^ | clang-format-diff.py -i -p1

-- 
To view, visit http://gerrit.cloudera.org:8080/3863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7adbeb45220c468e43b424d70c30b952f6cec2cd
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Kathy Sun 
Gerrit-Reviewer: Kathy Sun 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3820: Handle linkage errors while loading Java UDFs in Catalog

2016-08-23 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3820: Handle linkage errors while loading Java UDFs in 
Catalog
..


IMPALA-3820: Handle linkage errors while loading Java UDFs in Catalog

Currently Catalog aborts at startup if the class loader loading Java
UDFs throws a LinkageError (e.g., NoClassDefError). The code has been
designed to catch a generic "Exception" and ignore any incompatible
or erroneous UDFs but it can't handle any "Error"s. Java generally
doesn't encourage to handle all types of "Error"s, however given the
Catalog tries to load thirdparty udfs, it is possible that there might
be missing dependencies that the Catalog is expected to handle.

Testing: Added an e-e test that loads a corrupt Java UDF from Hive and
restarts Catalog to make sure its up and the function count tallies.

Change-Id: I1109614f1617caa9bb27e8c151e902aae71a6b41
Reviewed-on: http://gerrit.cloudera.org:8080/4092
Reviewed-by: Bharath Vissapragada 
Tested-by: Internal Jenkins
---
M fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/com/cloudera/impala/hive/executor/UdfExecutor.java
M tests/custom_cluster/test_permanent_udfs.py
M tests/test-hive-udfs/pom.xml
A tests/test-hive-udfs/src/main/java/com/cloudera/impala/UnresolvedUdf.java
5 files changed, 77 insertions(+), 2 deletions(-)

Approvals:
  Bharath Vissapragada: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I1109614f1617caa9bb27e8c151e902aae71a6b41
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Internal Jenkins


[Impala-ASF-CR] IMPALA-3820: Handle linkage errors while loading Java UDFs in Catalog

2016-08-23 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3820: Handle linkage errors while loading Java UDFs in 
Catalog
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1109614f1617caa9bb27e8c151e902aae71a6b41
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4006: dangerous rm -rf statements in scripts

2016-08-23 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has uploaded a new patch set (#3).

Change subject: IMPALA-4006: dangerous rm -rf statements in scripts
..

IMPALA-4006: dangerous rm -rf statements in scripts

Quoted variable substitutions in rm -rf commands and in many other
places. This prevents disasters if those variables contain whitespace.

Redirected output of the cd commands to /dev/null. This prevents
polluting the target variable with the directory name when the CDPATH
environment variable is set.

Change-Id: I7503794180dee99eeb979e67f34e3b2edade70fe
---
M bin/clean.sh
M bin/impala-config.sh
M bin/impala-python
M bin/impala-python-common.sh
M bin/run-all-tests.sh
M bin/set-classpath.sh
M buildall.sh
M infra/python/deps/download_requirements
8 files changed, 88 insertions(+), 88 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4078/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4078
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7503794180dee99eeb979e67f34e3b2edade70fe
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 


[Impala-ASF-CR] IMPALA-4006: impala-config.sh contains dangerous rm -rf statements

2016-08-23 Thread Zoltan Ivanfi (Code Review)
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-4006: impala-config.sh contains dangerous rm -rf 
statements
..


Patch Set 1:

(1 comment)

> I believe you can avoid the spaces-in-lists-of-files problems using
 > bash arrays but I don't know if it's worth the effort

The problem with bash arrays is that they can not be exported, their scope is 
limited to the current shell process.

http://gerrit.cloudera.org:8080/#/c/4078/1/bin/run-all-tests.sh
File bin/run-all-tests.sh:

PS1, Line 101: LOG_DIR="${IMPALA_EE_TEST_LOGS_DIR}"
> It's used by run-step
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4078
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7503794180dee99eeb979e67f34e3b2edade70fe
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose

2016-08-23 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect 
coordinator address, overly verbose
..


Patch Set 6: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3986: Python hive-client may fail silently while dropping partitions

2016-08-23 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3986: Python hive-client may fail silently while 
dropping partitions
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4083
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4241225d19c75a79179a61fef0c75c06a9a2ac39
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3986: Python hive-client may fail silently while dropping partitions

2016-08-23 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3986: Python hive-client may fail silently while 
dropping partitions
..


IMPALA-3986: Python hive-client may fail silently while dropping partitions

It was noticed during a test run that a partition existed in the HMS
when it shouldn't have. The only possible explanation for that is that
the python based hive client failed to drop the partition when asked
to and we did not check for an error when it did so.

This patch just adds an assert to make sure that the partition is
dropped when instructed to, else it will cause the test to fail in
the right place.

Change-Id: I4241225d19c75a79179a61fef0c75c06a9a2ac39
Reviewed-on: http://gerrit.cloudera.org:8080/4083
Reviewed-by: Matthew Jacobs 
Tested-by: Internal Jenkins
---
M tests/common/impala_test_suite.py
1 file changed, 2 insertions(+), 1 deletion(-)

Approvals:
  Matthew Jacobs: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4083
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I4241225d19c75a79179a61fef0c75c06a9a2ac39
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3820: Handle linkage errors while loading Java UDFs in Catalog

2016-08-23 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-3820: Handle linkage errors while loading Java UDFs in 
Catalog
..


Patch Set 1:

Carried +2 from Alex.

-- 
To view, visit http://gerrit.cloudera.org:8080/4092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1109614f1617caa9bb27e8c151e902aae71a6b41
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3820: Handle linkage errors while loading Java UDFs in Catalog

2016-08-23 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-3820: Handle linkage errors while loading Java UDFs in 
Catalog
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1109614f1617caa9bb27e8c151e902aae71a6b41
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3820: Handle linkage errors while loading Java UDFs in Catalog

2016-08-23 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has abandoned this change.

Change subject: IMPALA-3820: Handle linkage errors while loading Java UDFs in 
Catalog
..


Abandoned

Thanks Alex. Moving this to asf. http://gerrit.cloudera.org:8080/4092

-- 
To view, visit http://gerrit.cloudera.org:8080/3567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I1109614f1617caa9bb27e8c151e902aae71a6b41
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 


[Impala-ASF-CR] IMPALA-1731,IMPALA-3868: Float values are not parsed correctly

2016-08-23 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-1731,IMPALA-3868: Float values are not parsed correctly
..


Patch Set 4: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/119/

-- 
To view, visit http://gerrit.cloudera.org:8080/3791
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9e17d0f051b300a22a520ce34e276c2d4460d35e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3820: Handle linkage errors while loading Java UDFs in Catalog

2016-08-23 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4092

Change subject: IMPALA-3820: Handle linkage errors while loading Java UDFs in 
Catalog
..

IMPALA-3820: Handle linkage errors while loading Java UDFs in Catalog

Currently Catalog aborts at startup if the class loader loading Java
UDFs throws a LinkageError (e.g., NoClassDefError). The code has been
designed to catch a generic "Exception" and ignore any incompatible
or erroneous UDFs but it can't handle any "Error"s. Java generally
doesn't encourage to handle all types of "Error"s, however given the
Catalog tries to load thirdparty udfs, it is possible that there might
be missing dependencies that the Catalog is expected to handle.

Testing: Added an e-e test that loads a corrupt Java UDF from Hive and
restarts Catalog to make sure its up and the function count tallies.

Change-Id: I1109614f1617caa9bb27e8c151e902aae71a6b41
---
M fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/com/cloudera/impala/hive/executor/UdfExecutor.java
M tests/custom_cluster/test_permanent_udfs.py
M tests/test-hive-udfs/pom.xml
A tests/test-hive-udfs/src/main/java/com/cloudera/impala/UnresolvedUdf.java
5 files changed, 77 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/4092/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I1109614f1617caa9bb27e8c151e902aae71a6b41
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada