[Impala-ASF-CR] IMPALA-4671: (part-2) Replace kudu::ServicePool with one that uses Impala threads

2017-11-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8472 )

Change subject: IMPALA-4671: (part-2) Replace kudu::ServicePool with one that 
uses Impala threads
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc
File be/src/rpc/impala-service-pool.cc:

http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@255
PS3, Line 255:   value->AddMember("idle_threads", idle_threads, 
document->GetAllocator());
should we also display the total number of service threads even though it's a 
constant, similar to having max queue size?



--
To view, visit http://gerrit.cloudera.org:8080/8472
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3e0fa9ae22c4803b0cea736eb35e333e2671448
Gerrit-Change-Number: 8472
Gerrit-PatchSet: 3
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 15 Nov 2017 20:42:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4671: (part-2) Replace kudu::ServicePool with one that uses Impala threads

2017-11-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8472 )

Change subject: IMPALA-4671: (part-2) Replace kudu::ServicePool with one that 
uses Impala threads
..


Patch Set 3:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc
File be/src/rpc/impala-service-pool.cc:

http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@182
PS3, Line 182: MonoTime
> how about using impala util/time.h routines so we don't have to read about
Hmm, okay I see we are fishing into the timing_ struct and doing math against 
those values, so let's leave this computation using MonoTime. However, how 
about instead of doing our own Now(), shouldn't we move this computation down 
and write it in terms of timing() fields?


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@186
PS3, Line 186: internal
internal is too vague. Let's "to update the InboundCall timing".  timing() is a 
method on InboundCall so we can talk about it.


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@187
PS3, Line 187: unused_histogram_
> there's nothing useful to get out of this thing?
I guess you're basically doing this manually via time_in_queue computation.

But the handler_latency_histogram is inside a method_info_ thing that they 
allow to be null. Why isn't this histogram also in there?


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@212
PS3, Line 212: method_name
there's no enumeration for the methods? i guess this is okay though.


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@213
PS3, Line 213:   method->num_in_handlers.Add(1);
is that purposely protected by the lock?


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@220
PS3, Line 220: uint64_t time_to_handle = (MonoTime::Now() - 
monotime_after_dequeue).ToMilliseconds();
this is a bit decieving becuase we haven't necessarily actually handled the RPC 
at this point. It may be deferred. Do we have something equivalent to the 
method_info_->handler_latency_histogram?


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@223
PS3, Line 223:   // rid of the spin lock.
why not just do that?


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@240
PS3, Line 240:   }
doesn't krpc already have these histograms? is all we want to get out is the 
json representation? how hard would it be to just use the kudu ones and get the 
json out of those?

are there other histograms/metrics that Kudu has that would be useful to 
expose? if so it seems like duplicating them like this will be a pain.



--
To view, visit http://gerrit.cloudera.org:8080/8472
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3e0fa9ae22c4803b0cea736eb35e333e2671448
Gerrit-Change-Number: 8472
Gerrit-PatchSet: 3
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 15 Nov 2017 20:35:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 8: Code-Review+2

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8449/8/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/8/be/src/statestore/statestore.h@381
PS8, Line 381: in microseconds since epoch)
or just say "in Unix time"


http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.h@385
PS3, Line 385: strationId of the registered subscriber
> Discussed this a little more with Dimitris, leaving it as-is for now. We di
Thanks. yes, let's avoid shared_ptrs and especially weak_ptrs, and move toward 
single ownership when possible.


http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.cc
File be/src/statestore/statestore.cc:

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.cc@278
PS3, Line 278:   lock_guard l(subscribers_lock_);
 :   lock_guard t(topic_lock_);
> Fair point, I'll revert the spinlock change. Maybe we can address it again
Note that SpinLock is not a traditional spin-lock -- it's adaptive and will 
block like a mutex after attempting to spin for a while. So, it's pretty 
general-purpose.


http://gerrit.cloudera.org:8080/#/c/8449/8/be/src/statestore/statestore.cc
File be/src/statestore/statestore.cc:

http://gerrit.cloudera.org:8080/#/c/8449/8/be/src/statestore/statestore.cc@415
PS8, Line 415: RegisteredSubscriberExists
Seems like this should just be "FindSubscriber()" or 
"FindRegisteredSubscriber()" but okay to leave if you prefer the "exists" name. 
 The "exists" naming makes it a bit surprising that it also returns the pointer.



--
To view, visit http://gerrit.cloudera.org:8080/8449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 8
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 15 Nov 2017 19:12:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache

2017-11-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8408 )

Change subject: IMPALA-6121: remove I/O mgr request context cache
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8408/8/be/src/runtime/disk-io-mgr-reader-context.h
File be/src/runtime/disk-io-mgr-reader-context.h:

http://gerrit.cloudera.org:8080/#/c/8408/8/be/src/runtime/disk-io-mgr-reader-context.h@27
PS8, Line 27:  but some clients may need to include this header, e.g. to make 
the
: /// unique_ptr destructor work correctly.
: ///
> I think exposing it will make more sense after a follow-on patch. There are
Yeah, but then making sure this abstraction makes sense becomes more important, 
which is why I was getting a little nervous about what a DiskIoRequestContext 
actually represents. But we can deal with that in that change.



--
To view, visit http://gerrit.cloudera.org:8080/8408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
Gerrit-Change-Number: 8408
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Nov 2017 19:04:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4671: (part-2) Replace kudu::ServicePool with one that uses Impala threads

2017-11-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8472 )

Change subject: IMPALA-4671: (part-2) Replace kudu::ServicePool with one that 
uses Impala threads
..


Patch Set 3:

(21 comments)

http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.h
File be/src/rpc/impala-service-pool.h:

http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.h@52
PS3, Line 52:   kudu::rpc::RpcMethodInfo* LookupMethod(const 
kudu::rpc::RemoteMethod& method) override;
:
:   virtual kudu::Status
:   QueueInboundCall(gscoped_ptr call) 
OVERRIDE;
:
:   const std::string service_name() const;
:
:
let's add header function comments to these (and other methods), per usual.


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.h@69
PS3, Line 69: Method
What is this referring to? RPC service methods, or something different?


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.h@72
PS3, Line 72: std::unique_ptr handling_time;
what is "time" in these cases? wallclock, cpu, ...?


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.h@74
PS3, Line 74: AtomicInt32 num_in_handlers;
what's that? some comments here would help.


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.h@92
PS3, Line 92:   boost::mutex shutdown_lock_;
what does that protect?


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc
File be/src/rpc/impala-service-pool.cc:

http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@182
PS3, Line 182: MonoTime
how about using impala util/time.h routines so we don't have to read about 
another set of time functions?


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@186
PS3, Line 186: // We need to call RecordHandlingStarted() to update some 
internal InboundCall state.
That seems weird. why is that?


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@187
PS3, Line 187: unused_histogram_
there's nothing useful to get out of this thing?


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@207
PS3, Line 207:
note to self: finish here


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-mgr.h
File be/src/rpc/rpc-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-mgr.h@a160
PS3, Line 160:
is that not still true? i.e. since we need to inherit from kudu::RpcService


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-mgr.cc
File be/src/rpc/rpc-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-mgr.cc@129
PS3, Line 129: this->
delete


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-mgr.cc@138
PS3, Line 138: inbound
shouldn't that be outbound?

let's make sure we test this change somehow.


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-mgr.cc@146
PS3, Line 146:   for (auto service_pool : service_pools_) {
is there a danger this could run concurrently with RegisterService()?


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.h
File be/src/rpc/rpc-trace.h:

http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.h@38
PS3, Line 38: per ThriftServer.
does this need updating?


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.h@129
PS3, Line 129: /// Initialises rpc event tracing, must be called before any 
RpcEventHandlers are created.
document the rpc_mgr parameter


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.cc
File be/src/rpc/rpc-trace.cc:

http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.cc@86
PS3, Line 86: rpc_mgr
nit: != NULL (or nullptr and adjust below)


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.cc@91
PS3, Line 91: webserver->RegisterUrlCallback("/rpcz", "rpcz.tmpl", json);
I guess the whole point in integrating with this code is to share the /rpcz 
page? if that's the case, how about being more explicit about that?  or rather, 
that the point of InitRpcEventTracing() is to register callbacks for these 
paths.


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.cc@95
PS3, Line 95: "/rpcz_reset"
should something happen for krpc stuff in this case?


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.cc@100
PS3, Line 100: )
nit: != nullptr


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.cc@122
PS3, Line 122: )
same


http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/service/impalad-main.cc
File be/src/service/impalad-main.cc:

http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/service/impalad-main.cc@86
PS3, Line 86: InitRpcEventTracing(exec_env.webserver(), exec_env.rpc_mgr());
couldn't we just do that in either case?



--
To view, visit http://gerrit.cloudera.org:8080/84

[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-13 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8449/7/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/7/be/src/statestore/statestore.h@394
PS7, Line 394:   typedef std::pair 
ScheduledSubscriberUpdate;
I meant flatten both pairs -- i.e. turn ScheduledSubscriberUpdate into a struct 
(with three fields).



--
To view, visit http://gerrit.cloudera.org:8080/8449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 7
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 14 Nov 2017 05:40:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu

2017-11-13 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7793 )

Change subject: IMPALA-4252: Min-max runtime filters for Kudu
..


Patch Set 14: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7793/14/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/7793/14/common/thrift/ImpalaInternalService.thrift@203
PS14, Line 203:   41: optional i32 max_num_runtime_filters = 10
> so this (and two options below) only apply to bloom but not min/max filters
I see you addressed this in a doc you had already linked:
https://docs.google.com/document/d/1G-SPZelateebNxzVb67urEVtjc5Itw-B_jjfS85bSCE/edit?usp=sharing

That reasoning seems okay with me.



--
To view, visit http://gerrit.cloudera.org:8080/7793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c
Gerrit-Change-Number: 7793
Gerrit-PatchSet: 14
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Anonymous Coward #345
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 13 Nov 2017 22:09:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache

2017-11-13 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8408 )

Change subject: IMPALA-6121: remove I/O mgr request context cache
..


Patch Set 8: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8408/8/be/src/runtime/disk-io-mgr-reader-context.h
File be/src/runtime/disk-io-mgr-reader-context.h:

http://gerrit.cloudera.org:8080/#/c/8408/8/be/src/runtime/disk-io-mgr-reader-context.h@27
PS8, Line 27:  but some clients may need to include this header, e.g. to make 
the
: /// unique_ptr destructor work correctly.
: ///
that's really unfortunate, but the change is a net win.


http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h@654
PS7, Line 654:   std::unique_ptr 
RegisterContext(MemTracker* reader_mem_tracker);
> I think it's an implementation detail the RegisterContext() doesn't registe
I wasn't thinking in terms of implementation. I was thinking in terms of the 
interface -- how can a context be "registered" when the caller doesn't pass a 
context to be registered?

Anyway, I'm fine with leaving the terminology alone in this change.


http://gerrit.cloudera.org:8080/#/c/8408/8/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

http://gerrit.cloudera.org:8080/#/c/8408/8/be/src/runtime/runtime-state.h@39
PS8, Line 39: class DiskIoRequestContext;
delete



--
To view, visit http://gerrit.cloudera.org:8080/8408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
Gerrit-Change-Number: 8408
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 13 Nov 2017 22:05:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-13 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8449/6/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/6/be/src/statestore/statestore.h@383
PS6, Line 383: std::pair
once we have two level pair, I think it's time to start naming the fields. How 
about defining a struct for this thing instead?



--
To view, visit http://gerrit.cloudera.org:8080/8449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 6
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 13 Nov 2017 21:13:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu

2017-11-13 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7793 )

Change subject: IMPALA-4252: Min-max runtime filters for Kudu
..


Patch Set 14:

(2 comments)

looked through the headers and thrift and it looks fine to me. Just a question 
about the query options.

http://gerrit.cloudera.org:8080/#/c/7793/14/be/src/exec/kudu-util.cc
File be/src/exec/kudu-util.cc:

http://gerrit.cloudera.org:8080/#/c/7793/14/be/src/exec/kudu-util.cc@116
PS14, Line 116: Status ConvertTimestampValue(const TimestampValue* tv, int64_t* 
ts_micros) {
static


http://gerrit.cloudera.org:8080/#/c/7793/14/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/7793/14/common/thrift/ImpalaInternalService.thrift@203
PS14, Line 203:   41: optional i32 max_num_runtime_filters = 10
so this (and two options below) only apply to bloom but not min/max filters? is 
that going to be clear to a user? hopefully we don't expect most to use these 
options, so maybe we should rename them?



--
To view, visit http://gerrit.cloudera.org:8080/7793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c
Gerrit-Change-Number: 7793
Gerrit-PatchSet: 14
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Anonymous Coward #345
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 13 Nov 2017 18:59:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache

2017-11-09 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8408 )

Change subject: IMPALA-6121: remove I/O mgr request context cache
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.h
File be/src/runtime/disk-io-mgr-reader-context.h:

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.h@25
PS7, Line 25: /// Internal per request-context state. This object maintains a 
lot of state that is
so no one should include this file except for disk io mgr, is that right? i.e. 
this class is opaque to everyone else?

I think that's less obvious now that it's not in an -internal.h header.


http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.cc
File be/src/runtime/disk-io-mgr-reader-context.cc:

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.cc@108
PS7, Line 108:   state_ = Inactive;
why is it that Cancel() has to do so much more work than this, when this 
supposedly does a Cancel in addition to marking it inactive?  Is this really a 
different kind of cancel?


http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h@654
PS7, Line 654:   std::unique_ptr 
RegisterContext(MemTracker* reader_mem_tracker);
I'm not really sure what it means to "register" a context. Should this just be 
Create()?  (And I guess we have these next three methods to keep the context 
opaque outside of disk-io-mgr, is that right?)


http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h@662
PS7, Line 662:   void UnregisterContext(DiskIoRequestContext* context);
> I don't feel strongly. I think your argument has merit but destroying the d
Yeah, except that the statement "context cannot be used after this call" is 
what contradicts the control structure lifetime being tied to QueryState.  I'm 
fine with leaving it around.

But, what does it mean to "unregister" the context? The whole 
register/unregister terminology seems confusing. Does the lifecycle of these 
things mirror the lifecycle that we're moving toward and does it make sense to 
use consistent terminology?



--
To view, visit http://gerrit.cloudera.org:8080/8408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
Gerrit-Change-Number: 8408
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 09 Nov 2017 18:58:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-11-09 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8023/13/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/13/be/src/runtime/krpc-data-stream-recvr.cc@163
PS13, Line 163: condition_variable_any
these should be changed to impala ConditionVariables now, but you can do that 
as a follow on



--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-Change-Number: 8023
Gerrit-PatchSet: 13
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 09 Nov 2017 18:24:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-11-08 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 10: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-Change-Number: 8023
Gerrit-PatchSet: 10
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 08 Nov 2017 20:22:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.

2017-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8202 )

Change subject: IMPALA-4704: Turns on client connections when local catalog 
initialized.
..


Patch Set 22: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
Gerrit-Change-Number: 8202
Gerrit-PatchSet: 22
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 08 Nov 2017 00:55:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 9:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-mgr.cc@62
PS9, Line 62:   min(FLAGS_datastream_service_num_deserialization_threads, 
CpuInfo::num_cores()),
why disallow the setting of this greater than num_cores? Given that it takes 
locks, there could be some benefit to doing so, right?


http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@67
PS9, Line 67: data
the resources


http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@195
PS9, Line 195: cur_batch_
current_batch_


http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@225
PS9, Line 225: DCHECK
shoudl we also DCHECK that num_pending_enqueue_ == 0 (otherwise, we could have 
acquired the lock while it was dropped for deserialization and there is still a 
row batch)?


http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@248
PS9, Line 248: batch_queue_.pop_front();
this is part of the same operation as line 244 (both are adjusting the queue 
state), so mind moving it to be adjacent?


http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@252
PS9, Line 252: // Don't hold lock when calling EnqueueDeserializeTask() as 
it may block.
how about moving this to line 250 now that it's that scope that drops the lock. 
Also, it's important that this happens after batch_queue_.pop_front(), right? 
maybe note that.



--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-Change-Number: 8023
Gerrit-PatchSet: 9
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 08 Nov 2017 01:32:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache

2017-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8408 )

Change subject: IMPALA-6121: remove I/O mgr request context cache
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.h
File be/src/runtime/disk-io-mgr-reader-context.h:

PS7:
I assume this was a wholesale move of the class without modifications, except 
for the new methods defined in the .cc file?


http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h@662
PS7, Line 662:   void UnregisterContext(DiskIoRequestContext* context);
Given that the context cannot be used after this call, should we move the 
unique_ptr into this call and delete it then?



--
To view, visit http://gerrit.cloudera.org:8080/8408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
Gerrit-Change-Number: 8408
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:56:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources

2017-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8323 )

Change subject: IMPALA-1575: part 2: yield admission control resources
..


Patch Set 8:

> It's still technically accurate

Yeah, I guess that's true.


--
To view, visit http://gerrit.cloudera.org:8080/8323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac
Gerrit-Change-Number: 8323
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:34:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.

2017-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8202 )

Change subject: IMPALA-4704: Turns on client connections when local catalog 
initialized.
..


Patch Set 20:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h@89
PS19, Line 89: must call the following
 : /// methods:
> good question-- I've kept it the same as before so not sure what the reason
That DCHECK:

  DCHECK(exec_env.process_mem_tracker() != nullptr)
  << "ExecEnv::StartServices() must be called before starting RPC services";

looks stale since the process_mem_tracker() is set in exec_env_.Init() which is 
called earlier in main (not to mention StartServices() doesn't exist anymore). 
I also don't see what the specific dependency is that this is trying to point 
out (this DCHECK should have been closer to the code that requires this to be 
true).

If you want to keep the dcheck, I see no reason it couldn't be moved before 
both Init and Start. Then we could combine them, since there doesn't appear to 
be any logic behind the separation of those two.  But you can defer cleaning 
this up if you prefer (though the dcheck wording should be updated at least).


http://gerrit.cloudera.org:8080/#/c/8202/20/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8202/20/be/src/service/impala-server.cc@1936
PS20, Line 1936:
I think a quick comment is worth it to highlight the dependency between these 
two lines:
// Subscribe to the catalog topic and then wait for the initial catalog update.
(or whatever is accurate)



--
To view, visit http://gerrit.cloudera.org:8080/8202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
Gerrit-Change-Number: 8202
Gerrit-PatchSet: 20
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:04:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8461 )

Change subject: IMPALA-6151: add query-level fragment/backend counters
..


Patch Set 5: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8461/5/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8461/5/be/src/service/impala-server.cc@953
PS5, Line 953:   ImpaladMetrics::NUM_QUERIES_REGISTERED->Increment(1L);
On the /queries webpage, I think "num_in_flight_queries" is this same number 
(the count of client_request_state_map_ entries). Should we make the names 
consistent? Probably "registered" is a better name than "in-flight" (since 
internally, we use in-flight to mean something slightly different and in-flight 
seems more ambiguous). But I'm not sure if we can change the webserver names 
without breaking something.



-- 
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 17:32:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources

2017-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8323 )

Change subject: IMPALA-1575: part 2: yield admission control resources
..


Patch Set 8:

Maybe this tooltip should be updated?

  document->AddMember("waiting-tooltip", "These queries are no longer 
executing, either "
  "because they encountered an error or because they have returned all of 
their "
  "results, but they are still active so that their results can be 
inspected. To "
  "free the resources they are using, they must be closed.",


--
To view, visit http://gerrit.cloudera.org:8080/8323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac
Gerrit-Change-Number: 8323
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Tue, 07 Nov 2017 17:28:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@211
PS8, Line 211: deferred_rpcs_.empty() && batch_queue_.empty()
> why not write this condition as:
Actually, I think the loop exiting condition is not quite right, which led to 
this confusin conditional.  The loop exiting condition for "we're done" should 
check that there are no more senders, and that there's nothing left to drain 
from the deferred_rpcs_ queue and that there's no pending insert into 
batch_queue_.

So, the third wait loop condition should be something like:

(num_remaining_senders > 0 || !deferred_rpcs_.empty() || num_pending_enqueue_ > 
0)

and then this if-stmt conditional can just be:
if (batch_queue_.empty())

and then the DCHECK can be the negation of that third wait loop conditional:

// Wait loop is exited with an empty batch_queue_ only when there will be no 
more batches.
DCHECK(num_remaining_senders == 0 && deferred_rpcs_.empty() && 
num_pending_enqueue_ == 0);

And then you can get rid of the outer loop. That outer loop should be removed 
since it's effectively a busy wait (and I think we could get into a busy wait 
state in the previous patchsets).



--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-Change-Number: 8023
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 07 Nov 2017 15:58:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8461 )

Change subject: IMPALA-6151: add query-level fragment/backend counters
..


Patch Set 4:

> That's a good point, will make that change. Cancelled merge in the
 > meantime.

A test might be good (that it reaches 0).


--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 00:22:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding

2017-11-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8267 )

Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
..


Patch Set 12:

(1 comment)

Looked through the headers and they look okay to me, but don't have time to do 
a full review of the implementation.

http://gerrit.cloudera.org:8080/#/c/8267/12/be/src/exec/parquet-column-readers.h
File be/src/exec/parquet-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/8267/12/be/src/exec/parquet-column-readers.h@96
PS12, Line 96: for different encodings.
what do we mean by that? Isn't this just for RLE?



--
To view, visit http://gerrit.cloudera.org:8080/8267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6
Gerrit-Change-Number: 8267
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 00:09:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8461 )

Change subject: IMPALA-6151: add query-level fragment/backend counters
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8461/4/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8461/4/be/src/service/impala-server.cc@871
PS4, Line 871: // Metric is decremented in UnregisterQuery().
but UnregisterQuery() is called for more paths then just this one. So why isn't 
this increment inside of RegisterQuery() so that they match up?



--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 00:00:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu

2017-11-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7793 )

Change subject: IMPALA-4252: Min-max runtime filters for Kudu
..


Patch Set 11:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/exec/filter-context.h
File be/src/exec/filter-context.h:

http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/exec/filter-context.h@118
PS11, Line 118: Materialize filter values.
what does that mean?


http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/runtime/runtime-filter-bank.h
File be/src/runtime/runtime-filter-bank.h:

http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/runtime/runtime-filter-bank.h@78
PS11, Line 78:   /// may both be NULL, representing a filter that allows all 
rows to pass.
is it the case that at most one of bloom_filter and min_max_filter should be 
non-NULL? could you clarify that (including in the class comment)?


http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/util/min-max-filter.h
File be/src/util/min-max-filter.h:

http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/util/min-max-filter.h@62
PS11, Line 62: Materialize filter values
what does that mean?



--
To view, visit http://gerrit.cloudera.org:8080/7793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c
Gerrit-Change-Number: 7793
Gerrit-PatchSet: 11
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Anonymous Coward #345
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 06 Nov 2017 23:54:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources

2017-11-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8323 )

Change subject: IMPALA-1575: part 2: yield admission control resources
..


Patch Set 6: Code-Review+2

(1 comment)

If you do decide to migrate the test to HS2, have someone review those changes 
before carrying forward the +2.

http://gerrit.cloudera.org:8080/#/c/8323/5/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/8323/5/tests/custom_cluster/test_admission_controller.py@91
PS5, Line 91: ]
> Yeah I was checking the coverage bottom-up to make sure that all of the cod
We really should switch our pytests to use HS2 by default rather than beeswax, 
but that's clearly out of scope for this change.

Given how closely tied lifecycle is to the client API, it might be a good idea 
to see if you can easily move this test to HS2. I'll let you decide what the 
threshold for "easy" is, though. As you've reasoned, those last two cases are 
tied pretty closely to Close() and that's unlikely to change.



--
To view, visit http://gerrit.cloudera.org:8080/8323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac
Gerrit-Change-Number: 8323
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Mon, 06 Nov 2017 23:31:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.

2017-11-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8202 )

Change subject: IMPALA-4704: Turns on client connections when local catalog 
initialized.
..


Patch Set 19:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/frontend.h
File be/src/service/frontend.h:

http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/frontend.h@162
PS19, Line 162: Sets the FE catalog to be initialized.
I don't understand what that's trying to say.


http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h@79
PS19, Line 79: The executor role starts ImpalaInternalService API's
technically the coordinator also has to start the ImpalaInternalService API in 
order to run the coordinator fragment instance.


http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h@89
PS19, Line 89: must call the following
 : /// methods:
why is this split into two steps? Is there something the caller would want to 
do between opening the ports and starting the services?


http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.cc@1518
PS19, Line 1518:
nit: indentation glitch


http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.cc@1937
PS19, Line 1937:   RETURN_IF_ERROR(exec_env_->StartServices());
why does this have to be moved so early? i'm not sure if this will cause 
problems when krpc is enabled.



--
To view, visit http://gerrit.cloudera.org:8080/8202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
Gerrit-Change-Number: 8202
Gerrit-PatchSet: 19
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 06 Nov 2017 23:22:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable

2017-11-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8428 )

Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable
..


Patch Set 2: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9
Gerrit-Change-Number: 8428
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 06 Nov 2017 23:08:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources

2017-11-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8323 )

Change subject: IMPALA-1575: part 2: yield admission control resources
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8323/5/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/8323/5/be/src/runtime/coordinator.h@454
PS5, Line 454: probably released
probably have been released


http://gerrit.cloudera.org:8080/#/c/8323/5/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/8323/5/be/src/runtime/coordinator.cc@831
PS5, Line 831:
nit: here and elsewhere, we seem to have gratuitous blank lines which makes it 
harder to read code since you can't fit as much on a single screen. could you 
condense the places you've touched a bit?


http://gerrit.cloudera.org:8080/#/c/8323/5/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/8323/5/tests/custom_cluster/test_admission_controller.py@91
PS5, Line 91: ]
there is also session close and session timeout. but those should both result 
in close, so are we considering them covered by CLIENT_CLOSE?



--
To view, visit http://gerrit.cloudera.org:8080/8323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac
Gerrit-Change-Number: 8323
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Mon, 06 Nov 2017 23:00:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-11-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 8:

(23 comments)

http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.h@435
PS8, Line 435: a request
'num_request' requests


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.cc@62
PS8, Line 62: 1
how was that chosen? do we have a test case that causes this queue to fill up?


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.cc@109
PS8, Line 109:   // Transfer the early senders into 'deferred_rpcs_' queue 
of the corresponding
 :   // sender queue. This makes sure new incoming RPCs won't 
pass these early senders,
 :   // leading to starvation.
this comment seems out of place. this is more an implementation detail of the 
receiver and handled properly inside ProcessEarlySender().  You could 
incorporate this in the comment for ProcessEarlySender() (to motivate why it 
uses the deferred_rpcs_ queue).


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h
File be/src/runtime/krpc-data-stream-recvr.h:

http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h@129
PS8, Line 129: .
and start a deserialization task to process it asynchronously.


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h@130
PS8, Line 130: Transfer
this "transfer" is in the oppose direction of how our "Transfer" methods 
usually go (e.g. src->TransferResourcesOwnership(dest)). Maybe call this 
ProcessEarlySender() (though I don't love "process" either since it's so vague).


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h@197
PS8, Line 197: time
cpu time or wall time?


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h@200
PS8, Line 200: time
same question


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@67
PS8, Line 67: data
the resources


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@72
PS8, Line 72: 'payload'
RPC state


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@73
PS8, Line 73: deferred_rpcs_
we shouldn't normally refer to private fields in public class comments, but 
given this is an internal class to the recvr, we can leave this.


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@87
PS8, Line 87:   void TransferEarlySender(std::unique_ptr ctx);
same comment as recvr header comment.


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@195
PS8, Line 195: while (current_batch_.get() == nullptr) {
I don't think we need this loop. see other comments in this function.


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@197
PS8, Line 197: !is_cancelled_ && batch_queue_.empty()
nit: consider swapping the order of these so that the fast case comes first 
(!batch_queue_.empty()) but also to match the comment ("or we know we're done" 
corresponds to the is_cancelled_ and num_remaining_senders_ == 0 cases).


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@201
PS8, Line 201:   CANCEL_SAFE_SCOPED_TIMER(recvr_->data_arrival_timer_, 
&is_cancelled_);
 :   CANCEL_SAFE_SCOPED_TIMER(recvr_->inactive_timer_, 
&is_cancelled_);
 :   CANCEL_SAFE_SCOPED_TIMER(
 :   received_first_batch_ ? nullptr : 
recvr_->first_batch_wait_total_timer_,
 :   &is_cancelled_);
there's got to be a cleaner way to do this but ignore for now


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@210
PS8, Line 210:
nit: i think we could do without some of the blank lines in this method to make 
more code fit on a screen


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@211
PS8, Line 211: deferred_rpcs_.empty() && batch_queue_.empty()
why not write this condition as:
num_renaming_senders_ == 0

then, it's more clear that these three conditions correspond to the loop exit 
conditions.


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@218
PS8, Line 218: !batch_queue_.empty()
given that we just checked the other two loop exit conditions, isn't this 
definitely true? i

[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-11-03 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 7:

(37 comments)

http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.h@102
PS7, Line 102: the batch is added
 : /// to the receiver's 'deferred_batches_'
it's not really the batch added. and it's not just a single structure for the 
receiver (it may go into one of many queues for merging exchange). So how about 
saying:
... the RPC state is saved into a deferred queue.


http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.h@104
PS7, Line 104: from the pending sender queue
how about: ... from a deferred RPC queue and the row batch is deserialized.


http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.h@393
PS7, Line 393:
quick comment for why we define a move constructor and move operator=, since we 
don't typically want to define those.


http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.cc@199
PS7, Line 199:   DeserializeTask payload =
 :   {DeserializeTaskType::EARLY_SENDERS, finst_id, 
dest_node_id, 0};
 :   deserialize_pool_.Offer(move(payload));
doesn't this mean we make early sender draining single threaded? shoudl we 
instead use the sender_id in this case as well and offer work per sender? or do 
we think this doesn't matter?


http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.cc@217
PS7, Line 217: already_unregistered
that shouldn't be possible in the DEFERRED_BATCHES case, right? so i'd probably 
move this DCHECK into the cases below so you can tighten it up.


http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.cc@235
PS7, Line 235: for (const unique_ptr& ctx : 
early_senders.waiting_sender_ctxs) {
shouldn't we process waiting_sender_ctxs before closed_sender_ctxs? Otherwise, 
if the same sender is in both lists we'll process those RPCs out of order. I 
guess that can't really happen given the current implementation of not 
responding to early RPCs and that senders only let one in flight, but it still 
seems to make more sense to do it the other way around.


http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.cc@236
PS7, Line 236: already_unregistered
why is this possible in the waiting_sender_ctxs case but not the 
closed_sender_ctxs case?


http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.cc@247
PS7, Line 247: already_unregistered
why is that possible?


http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.cc@248
PS7, Line 248: recvr->AddDeferredBatches(task.sender_id);
So I guess we no longer multithread within a single sender queue (and for 
non-merging, within a single receiver) doing it this way. I think that's okay 
but was it intentional?


http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-recvr.h
File be/src/runtime/krpc-data-stream-recvr.h:

http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-recvr.h@78
PS7, Line 78: The caller must acquire data from the
:   /// returned batch
is that talking about calling TransferAllResources(), or can the caller do it 
directly?


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@127
PS6, Line 127:   // If true, the receiver fragment for this stream got 
cancelled.
> For the non-merging case, there is essentially only one queue.
As mentioned elsewhere, I'm not totally convinced yet that this is the right 
way to do it but, yes, we can think about it more and change it later if 
necessary.


http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-recvr.cc@72
PS7, Line 72: s
typo


http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-recvr.cc@77
PS7, Line 77: Adds as many deferred batches as possible
hmm I'm still not convinced this is the right thing to do (in the merging 
case). It seems like it's left up to chance as to the order that deferred 
batches are drained across the sender queues. But we can think about this more 
and address it later.


http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-recvr.cc@97
PS7, Line 97: (1) 'batch_queue' is empty and there is no p

[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-03 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 5: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8438/5/be/src/exec/hdfs-text-scanner.h
File be/src/exec/hdfs-text-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8438/5/be/src/exec/hdfs-text-scanner.h@144
PS5, Line 144: in
if?



--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 03 Nov 2017 18:10:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-03 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h
File be/src/exec/hdfs-text-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h@90
PS4, Line 90:   /// CheckForSplitDelimiter(). Valid if 'byte_buffer_filled_' is 
true.
Maybe briefly explain why we have it: the character is needed after the byte 
buffer is returned, or whatever.


http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h@144
PS4, Line 144: virtual
I'm not following why we need a second virtual method. Is the wrapper meant to 
always apply to all subclasses? If so, it shouldn't need (or want it) to be 
virtual right?



--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 03 Nov 2017 16:52:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5564: Release lock during planning. (wip)

2017-11-03 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8434 )

Change subject: IMPALA-5564: Release lock during planning. (wip)
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8434/1/be/src/service/client-request-state.h
File be/src/service/client-request-state.h:

http://gerrit.cloudera.org:8080/#/c/8434/1/be/src/service/client-request-state.h@342
PS1, Line 342: is_planning_
> I started out with this being planning_finished_ (or some such). I ran into
I see - the ExecuteMetadataOp() cases.



--
To view, visit http://gerrit.cloudera.org:8080/8434
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e3fc4c28d7a5ad8546d48bcd22c03fbce502e5b
Gerrit-Change-Number: 8434
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 03 Nov 2017 16:33:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-03 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8447 )

Change subject: IMPALA-2181: Add query option levels for display
..


Patch Set 5:

Sorry I didn't see the discussion on the JIRA about "advanced". I don't think 
that solves the problem that the JIRA was meaning to solve (though I don't 
think implementation would have to change too much).  I'll add a comment to the 
JIRA.


--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 03 Nov 2017 16:17:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.

2017-11-02 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8215 )

Change subject: IMPALA-5142 EventSequence displays negative elapsed time.
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896
Gerrit-Change-Number: 8215
Gerrit-PatchSet: 7
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Thu, 02 Nov 2017 19:24:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.

2017-11-02 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8215 )

Change subject: IMPALA-5142 EventSequence displays negative elapsed time.
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896
Gerrit-Change-Number: 8215
Gerrit-PatchSet: 6
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Thu, 02 Nov 2017 19:24:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-11-01 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 6:

(8 comments)

Here's my last set of comments for this round.

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.cc@59
PS6, Line 59: boost::bind
I don't have a strong preference either way, but it'd be nice to be consistent 
between either using bind or [], rather than both...


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.cc@322
PS6, Line 322: RespondToTimedOutSenders
shouldn't be plural


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@52
PS6, Line 52: overflows the queue
it's not clear what that means just from reading the comment. It'd be nice to 
briefly explain that this is talking about the soft limit of the number of 
bytes across all sender queues.


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@105
PS6, Line 105:   // number of pending row batch insertion.
 :   int num_pending_enqueue_ = 0;
it's not clear what a "pending insertion" means or why we have this.


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@113
PS6, Line 113: RowBatch*
how about using unique_ptr since this owns the row batch (until it's 
transferred to current_batch_)?


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@127
PS6, Line 127:   queue> blocked_senders_;
given that a single soft limit is imposed across all sender queues, does it 
make sense that the blocked_senders_ are maintained per sender? Why don't we 
maintain a single blocked_senders_ list per datastream recvr?

Hmm, I guess we need to know if this sender has a blocked sender in GetBatch(). 
But given the single limit, it seems wrong that one sender's row batches can 
bypass another sender once we get into the blocked sender situation. i.e. the 
flow of batches across senders seems quite different depending on when the 
limit was reached.


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@231
PS6, Line 231: proto_batch
update


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@322
PS6, Line 322: payload->rpc_context->RespondSuccess();
doing this in Close() goes against the paradigm that Close() is only about 
releasing (local) resources. We've been going that way because there might be 
no where to bubble up a status from Close().  At least RespondSuccess() doesn't 
return a status, I suppose.  But is there any place sooner we could do this? 
Does it make sense to do in during Cancel instead?



--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-Change-Number: 8023
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 02 Nov 2017 00:10:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.

2017-11-01 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8215 )

Change subject: IMPALA-5142 EventSequence displays negative elapsed time.
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8215/5/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

http://gerrit.cloudera.org:8080/#/c/8215/5/be/src/util/runtime-profile-counters.h@327
PS5, Line 327:int64_t prev = 0L;
 : for (const Event& event: events_) {
rather than creating a tmp eventlist_sorted, you should create a temp for this 
lamba (you can use "auto" for that), since it's used twice and must be 
consistent in order for this code to make sense.


http://gerrit.cloudera.org:8080/#/c/8215/5/be/src/util/runtime-profile-counters.h@330
PS5, Line 330:  = false;
our style uses ! rather than == false


http://gerrit.cloudera.org:8080/#/c/8215/4/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

http://gerrit.cloudera.org:8080/#/c/8215/4/be/src/util/runtime-profile-counters.h@326
PS4, Line 326: bool eventlist_sorted = true;
 : int64_t prev = 0L;
 : for (const Event& event: events_) {
 :   if (event.second < prev) {
 : eventlist_sorted = false;
 : break;
 :   }
 :   prev = event.second;
 : }
 :
> Since the list is sorted in most of the times calling sort() may be expensi
the flipside is that this becomes a cliff in the sense that normally it will be 
fast, but every once in a while (when the race occurs), we'll pay the cost of 
sort(). If we always call sort(), then the performance is more predictable.

I don't think the event list will ever be so long that this matters one way or 
another, so simplest seems best. That said, I don't feel strongly if you really 
want to keep the check.



--
To view, visit http://gerrit.cloudera.org:8080/8215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896
Gerrit-Change-Number: 8215
Gerrit-PatchSet: 4
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Wed, 01 Nov 2017 22:31:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-11-01 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 6:

(14 comments)

Note to self: remaining files: krpc-data-stream-{mgr,recvr}.cc

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/exec-env.cc@89
PS6, Line 89: "Number of datastream service processing threads");
how are these defaults chosen?


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@50
PS6, Line 50: outbound
I think we should say something about KRPC to at least give that hint. maybe:

A KRPC outbound row batch...


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@60
PS6, Line 60: sizeof(int32_t)
sizeof(tuple_offsets_[0]) seems clearer and more robust


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@354
PS6, Line 354:   /// it is ignored. This function does not Reset().
we should preserve this comment when removing the thrift variant. So you could 
just put the new decl here now so we don't forget that.


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@424
PS6, Line 424:   ///
nit: i don't think we generally have all these line breaks between parameter 
comments.


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@426
PS6, Line 426:  .
delete space


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@444
PS6, Line 444: nput_
delete


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@447
PS6, Line 447: input_
delete


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@537
PS6, Line 537:   std::string compression_scratch_;
this seems like a hack and we could do something simpler, but let's leave it 
alone for now.


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.cc@241
PS6, Line 241:   // as sidecars to the RpcController.
this comment was probably meant to be deleted?


http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/data_stream_service.proto
File common/protobuf/data_stream_service.proto:

http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/data_stream_service.proto@29
PS6, Line 29: fragment
isn't this the id of the instance?  The comment in KrpcDataStreamSender is 
clearer, let's copy that:
  /// Sender instance id, unique within a fragment.
  int sender_id_;


http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/data_stream_service.proto@59
PS6, Line 59:   // Id of this fragment in its role as a sender.
same


http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto
File common/protobuf/row_batch.proto:

http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto@32
PS3, Line 32: = 2;
> That's the tuple data sent as sidecar. Clarified in the new comments.
My point is that writing it like 'tuple_data' doesn't make sense since it's not 
a field in this struct. You should just write:
Size of the tuple data (sent as a sidecar) in bytes ...


http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/row_batch.proto
File common/protobuf/row_batch.proto:

http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/row_batch.proto@32
PS6, Line 32: epeated int32 row_tuples = 2;
why is this needed? i don't see it used. The size of it is used, though it 
seems like even that can be inferred from the descriptors.



--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-Change-Number: 8023
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 01 Nov 2017 21:48:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-01 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc@717
PS2, Line 717: byte_buffer_last_byte_
> The byte_buffer_read_size_ check above prevents using a stale value from a
Oops, didn't read line 712 ;)

That does seem like a possibility, and seems worth addressing. But is there a 
way we can exercise that?



--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 01 Nov 2017 21:20:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-01 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc@717
PS2, Line 717: byte_buffer_last_byte_
given that this doesn't happen when readsize is 0, is it possible to use a 
stale value here?



--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 01 Nov 2017 20:49:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2494: Support for byte array encoded decimals in Parquet scanner

2017-11-01 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7822 )

Change subject: IMPALA-2494: Support for byte array encoded decimals in Parquet 
scanner
..


Patch Set 7:

> Looks good to me and it looks like you addressed Dan's comments.
 > Dan, did you want to take another look?

No, I don't plan to look through the details.

How confident are we that we have sufficient test coverage with the added 
tests? If we're confident then, Tim, please do the +2 (or ask for a second 
review from another reviewer if you think it's warranted).


--
To view, visit http://gerrit.cloudera.org:8080/7822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2c0e881045109f337fecba53fec21f9cfb9e619e
Gerrit-Change-Number: 7822
Gerrit-PatchSet: 7
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 01 Nov 2017 20:40:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.

2017-11-01 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8215 )

Change subject: IMPALA-5142 EventSequence displays negative elapsed time.
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8215/4/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

http://gerrit.cloudera.org:8080/#/c/8215/4/be/src/util/runtime-profile-counters.h@326
PS4, Line 326: bool eventlist_sorted = true;
 : int64_t prev = 0L;
 : for (const Event& event: events_) {
 :   if (event.second < prev) {
 : eventlist_sorted = false;
 : break;
 :   }
 :   prev = event.second;
 : }
 :
why bother with this? the list is pretty short and this isn't a critical path, 
so I think it should be fine to always call sort().
(and if you were to do that, you might as well use std::is_sorted())


http://gerrit.cloudera.org:8080/#/c/8215/4/be/src/util/runtime-profile-counters.h@336
PS4, Line 336: [](Event const &event1,
nit: how about breaking the line before the [] instead of in the middle of the 
lambda param list?



--
To view, visit http://gerrit.cloudera.org:8080/8215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896
Gerrit-Change-Number: 8215
Gerrit-PatchSet: 4
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Wed, 01 Nov 2017 20:17:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5564: Release lock during planning. (wip)

2017-11-01 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8434 )

Change subject: IMPALA-5564: Release lock during planning. (wip)
..


Patch Set 1:

Also, besides the RPCs, let's consider how to handle the webserver cancelation 
path. Maybe we need to not show the cancel link until we can actually cancel?


--
To view, visit http://gerrit.cloudera.org:8080/8434
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e3fc4c28d7a5ad8546d48bcd22c03fbce502e5b
Gerrit-Change-Number: 8434
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 01 Nov 2017 17:56:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5564: Release lock during planning. (wip)

2017-11-01 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8434 )

Change subject: IMPALA-5564: Release lock during planning. (wip)
..


Patch Set 1:

(2 comments)

I think this path is worth continuing down. The alternate solution would be to 
have a profile copy that can be returned before the actual profile is ready, 
but if releasing the lock works out, that seems simplest.  We should check how 
this minimal profile looks in CM -- there might be some other fields that they 
require?

http://gerrit.cloudera.org:8080/#/c/8434/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8434/1//COMMIT_MSG@17
PS1, Line 17: whereas the web UI would previously block on the lock.
> Looking over https://gerrit.cloudera.org/c/6707/12/be/src/service/impala-se
Right, this JIRA is a follow up to 1972 to improve things further so that the 
(partial) query profile can be retrieved.


http://gerrit.cloudera.org:8080/#/c/8434/1/be/src/service/client-request-state.h
File be/src/service/client-request-state.h:

http://gerrit.cloudera.org:8080/#/c/8434/1/be/src/service/client-request-state.h@342
PS1, Line 342: is_planning_
maybe make that is_planning_finished_ (or similar), and initialize it to false 
and just set to true after planning? Actually, since the goal is to guard 
against other RPCs to execute until the handle is returned, maybe just make it 
is_execute_stmt_finished?



--
To view, visit http://gerrit.cloudera.org:8080/8434
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1e3fc4c28d7a5ad8546d48bcd22c03fbce502e5b
Gerrit-Change-Number: 8434
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 01 Nov 2017 17:54:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6136: Part 1: Query duration should not be normally negative.

2017-11-01 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8430 )

Change subject: IMPALA-6136: Part 1: Query duration should not be normally 
negative.
..


Patch Set 2: Code-Review+2

Please update the comment as suggested by Phil.


--
To view, visit http://gerrit.cloudera.org:8080/8430
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6f971ebd5f0f00f3e38df0495692ffe9d52ef90
Gerrit-Change-Number: 8430
Gerrit-PatchSet: 2
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Wed, 01 Nov 2017 16:49:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-10-31 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 6:

(37 comments)

Next batch.

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h@103
PS6, Line 103: processed
deserialized?


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h@104
PS6, Line 104: respectively
what is this "respectively" referring to?


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h@249
PS6, Line 249: proto batch
unclear what that means, maybe stale comment? And this should say something 
about the row batch being contained in 'request'.


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h@258
PS6, Line 258: the actual row batch
isn't that part of "memory pointed to by 'request'"? If so and you want to 
explicitly mention row batch, maybe say "including the serialized row batch"?


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h@294
PS6, Line 294:   typedef std::unique_ptr DeserializeWorkItem;
How about getting rid of this typedef? The code seems easier to understand if 
the unique_ptr is visible in the fn decls. it's a bit harder than necessary to 
reasonable about DeserializeWorkItem& and &&, given that this is now directly a 
unique_ptr.


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h@428
PS6, Line 428: senders
a sender


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.h
File be/src/runtime/krpc-data-stream-recvr.h:

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.h@119
PS6, Line 119: 'row_batch'
row batch is deserialized

('row_batch' isn't a variable in this context)


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.h
File be/src/runtime/krpc-data-stream-sender.h:

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.h@57
PS6, Line 57: per_channel_buffer_size' is the buffer size in bytes allocated to 
each channel's
:   /// accumulated row batch.
still not clear what that means. This isn't really the size of a buffer, is it? 
How about something like:

... is a soft limit on the buffering, in bytes, into the channel's accumulating 
row batch before it will be sent.


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.h@111
PS6, Line 111: cached protobuf
serialized


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.h@130
PS6, Line 130: cached proto
outbound row


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.h@133
PS6, Line 133: Two OutboundRowBatch reused across RPCs
Maybe say:
The outbound row batches are double-buffered.


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@67
PS6, Line 67: can
it looks like there's a third interface now: SerializeAndSendBatch() that takes 
an Impala rowbatch.

But now that we have the outbound batch on the sender, why not just use that 
for the RANDOM case and do SerializeBatch() then TransmitData(), so that we can 
simplify the abstraction?


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@81
PS6, Line 81: TearDown
Teardown()


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@82
PS6, Line 82: TearDown
Teardown()


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@126
PS6, Line 126: .
or if the preceding RPC failed.


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@141
PS6, Line 141:   // Flushes any buffered row batches and sends the EOS RPC to 
close the channel.
Returns error status if...
Also indicate that it blocks until the EOS RPC is complete.


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@157
PS6, Line 157: below
delete.

If that's all we need it for, maybe just remember the capacity?


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@166
PS6, Line 166:   RuntimeState* runtime_state_ = nullptr;
is that actually used?


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@176
PS6, Line 176: current_outbound_batch_
the name of this is confusing because it's so similar to current_batch_idx_, 
but it means something different (and often the opposite).  How about calling 
this:

rpc_outbound_batch_ or rpc_in_flight_batch_?

You could even get rid of the r

[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding

2017-10-31 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8267 )

Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
..


Patch Set 12:

> Dan, did you want to have a look or should I find someone else to
 > +2?

I plan to at least look at the new class abstractions, but maybe not all the 
implementation details.


--
To view, visit http://gerrit.cloudera.org:8080/8267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6
Gerrit-Change-Number: 8267
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 31 Oct 2017 20:19:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4964: Fix Decimal modulo overflow

2017-10-31 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8329 )

Change subject: IMPALA-4964: Fix Decimal modulo overflow
..


Patch Set 3:

Tim, do you want a second pair of eyes on this? If not, could you do the +2 
review for this?


--
To view, visit http://gerrit.cloudera.org:8080/8329
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5420201d4440d421e33e443df005cdcc16b8a6cd
Gerrit-Change-Number: 8329
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Tue, 31 Oct 2017 18:23:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-10-31 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8172 )

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..


Patch Set 8:

(3 comments)

Michael, can you do the +2 for this one too?

http://gerrit.cloudera.org:8080/#/c/8172/8/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/8172/8/be/src/exec/hdfs-scan-node.cc@238
PS8, Line 238:
nit: a lot of unnecessary blank lines. maybe condense a bit so more code fits 
on a screen.


http://gerrit.cloudera.org:8080/#/c/8172/8/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

http://gerrit.cloudera.org:8080/#/c/8172/8/be/src/exec/scanner-context.h@a184
PS8, Line 184:
nice to see this get simplified


http://gerrit.cloudera.org:8080/#/c/8172/8/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

http://gerrit.cloudera.org:8080/#/c/8172/8/be/src/runtime/row-batch.h@a215
PS8, Line 215:
great to see this go



--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 31 Oct 2017 18:21:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-10-30 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 6:

(5 comments)

Some initial comments for this round.

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/exec/kudu-util.h
File be/src/exec/kudu-util.h:

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/exec/kudu-util.h@45
PS6, Line 45: status_
nit: use same mangling between this and KUDU_RETURN_IF_ERROR (prepend seems 
better too since it's not a member of a class and status_ is a comment member 
name).


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h@97
PS6, Line 97: two things:
: /// process it immediately, add it to a fixed-size 'batch queue' 
for later processing,
: /// or defer processing the RPC if the 'batch-queue' is full
i'm confused what the "two" cases are from this comment.  Also, I think it's 
kind of confusing what "processing" means. Should it read something like:

two things: deserialize it immediately adding it to the a 'batch queue', or 
defer deserializing and respond to the RPC later if the 'batch queue' is full.

Also, the batch queue isn't really "fixed size", right?


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@171
PS6, Line 171: if (!blocked_senders_.empty()) {
we talked about this in person, but want to note it so I don't forget: this 
loop will deque a "random" number of blocked senders, until the first one 
happens to finish deserializing and shows up in batch_queue_. That seems wrong.


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@220
PS6, Line 220: DCHECK_GT(num_remaining_senders_, 0);
why do we have this DCHECK (i.e. why is this condition important here), and 
could it be violated with out of order RPCs?


http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@237
PS6, Line 237: recvr_->ExceedsLimit(batch_size)
it seems like we should be ORing that with !block_senders_.empty(), or 
something. Otherwise, stuff that's been in the block_senders_ queue for a while 
can be passed by new stuff. i.e. block_senders_ senders can get starved -- 
maybe that explains longer than expected RPC times we've seen in some cases?



--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-Change-Number: 8023
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 30 Oct 2017 19:35:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-27 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8303 )

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..


Patch Set 14: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8303/14/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/8303/14/be/src/runtime/runtime-state.cc@87
PS14, Line 87: profile_(RuntimeProfile::Create(obj_pool(), "")) {
maybe explain that this is the test constructor



--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 22:21:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: Part 2: copy out strings in uncompressed Avro

2017-10-27 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8146 )

Change subject: IMPALA-5307: Part 2: copy out strings in uncompressed Avro
..


Patch Set 14: Code-Review+1

Thanks. This approach looks okay to me, please see if Michael can do the +2 
review.


--
To view, visit http://gerrit.cloudera.org:8080/8146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If1fc78790d778c874f5aafa5958c3c045a88d233
Gerrit-Change-Number: 8146
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 20:33:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-27 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8303 )

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..


Patch Set 11:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc@359
PS11, Line 359: was not available
> We do call it on the process MemTracker in QueryState::Init().
Okay, but my original question was whether the comment should be reworded for 
that case:

... if this tracker is part of the query hierarchy.

i.e. "not available" seems misleading now that we don't get this from 'state' 
(which is the thing that wasn't always available).


http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.cc@95
PS11, Line 95:   if (!released_exec_resources_) ReleaseExecResources();
> My initial concern was that handling it in Init() would get messy in the ca
What I proposed was not to handle it in Init(), but let the caller handle it. 
i.e. Init() always gets a resource ref count and QueryExecMgr::StartQuery() has 
to handle the error. It already has to handle this and other errors and release 
ref counts if it won't be passing them along...
Is there a reason to not just do that?


http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/runtime-state.cc@88
PS11, Line 88:   local_query_state_->AcquireExecResourceRefcount(); // 
Decremented in ReleaseResources().
> The resources used for evaluating exprs, etc - this is used in various test
Why are exvaluating exprs query-wide resources. Aren't they per-thread 
resources?



--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 19:37:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: Part 2: copy out strings in uncompressed Avro

2017-10-26 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8146 )

Change subject: IMPALA-5307: Part 2: copy out strings in uncompressed Avro
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8146/13/be/src/runtime/tuple.h
File be/src/runtime/tuple.h:

http://gerrit.cloudera.org:8080/#/c/8146/13/be/src/runtime/tuple.h@175
PS13, Line 175:   /// Materialize the var-len string data in this tuple into 
the provided memory
  :   /// pool.
the name and the comment didn't make it clear that this also rewrites this 
tuple. i.e. in that sense, this is different than MaterializeExprs(). In this 
case, the slots (and strings) are already materialized -- it's re-materializing 
(or copying) the strings, right?



--
To view, visit http://gerrit.cloudera.org:8080/8146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If1fc78790d778c874f5aafa5958c3c045a88d233
Gerrit-Change-Number: 8146
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Oct 2017 23:46:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-26 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8303 )

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..


Patch Set 11:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc@328
PS11, Line 328: RuntimeState* state
definitely not this change, but should we just get rid of this parameter. do we 
really need to explicitly do the LogError() below? If we still rely on that 
then it seems we should fix status propagation.


http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc@359
PS11, Line 359: was not available
does that comment still make sense? isn't the query_tracker==nullptr case now 
mean that it's part of the query hierarchy? though should we even call this 
function in that case?


http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.h@205
PS11, Line 205: backend
what do we mean by "backend" here?  is that the same thing as "execution"?


http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.h@206
PS11, Line 206: Resources for this query
How about: Query wide resources ...

since some FIS-local resources are also resources used for this query but those 
should be owned and released solely by the FIS thread.


http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.h@258
PS11, Line 258: backend
same question


http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.cc@95
PS11, Line 95:   if (!released_exec_resources_) ReleaseExecResources();
we've been trying to move away from doing resource releasing from destructors, 
to make the lifetimes super clear.  shouldn't this case just be handled in the 
caller of Init() when !status.ok() (which already has to handle releasing the 
QS ref count, and the thread creation path already handles this case. we could 
similarly release the resource ref count if we acquired that sooner before Init 
can fail).


http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/runtime-state.cc@88
PS11, Line 88:   local_query_state_->AcquireExecResourceRefcount(); // 
Decremented in ReleaseResources().
why is that needed? what resource needs to be held by the runtime state in this 
case?



--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Oct 2017 23:26:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-26 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8303 )

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8303/10/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/8303/10/be/src/runtime/coordinator.cc@1095
PS10, Line 1095: query_state_ != nullptr)
> That would require that query_state_ was not initialised but  we acquired a
Sure, so what you are saying is that query_state_ being not null is not an 
invariant we are trying to maintain -- we expect to be able to run the cancel 
code even if it didn't get so far as creating a query state. The destructor 
code already says that basically. So, yes, let's leave this.



--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Oct 2017 22:30:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-26 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8303 )

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8303/10/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/8303/10/be/src/runtime/coordinator.cc@1095
PS10, Line 1095: query_state_ != nullptr)
> Your reasoning appears to be correct. My tendency is to keep the null check
But if the query_state_ were null, then we wouldn't release the resources, 
which in itself is a bug. So, wouldn't it be better to know that such a bug 
existed?



--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Oct 2017 22:05:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2494: Support for byte array encoded decimals in Parquet scanner

2017-10-26 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7822 )

Change subject: IMPALA-2494: Support for byte array encoded decimals in Parquet 
scanner
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7822/6/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/7822/6/be/src/exec/parquet-column-readers.cc@207
PS6, Line 207: InternalType, parquet::Type::type PARQUET_TYPE
you should explain these template parameters


http://gerrit.cloudera.org:8080/#/c/7822/6/be/src/exec/parquet-common.h
File be/src/exec/parquet-common.h:

http://gerrit.cloudera.org:8080/#/c/7822/6/be/src/exec/parquet-common.h@88
PS6, Line 88: InternalType
let's explain what that is


http://gerrit.cloudera.org:8080/#/c/7822/6/be/src/exec/parquet-common.h@177
PS6, Line 177:   static int DecodeByParquetType(const uint8_t* buffer, const 
uint8_t* buffer_end, int fixed_len_size,
nit: long line



--
To view, visit http://gerrit.cloudera.org:8080/7822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2c0e881045109f337fecba53fec21f9cfb9e619e
Gerrit-Change-Number: 7822
Gerrit-PatchSet: 6
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Oct 2017 21:42:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2758: Remove BufferedTupleStream::GetRows

2017-10-26 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8226 )

Change subject: IMPALA-2758: Remove BufferedTupleStream::GetRows
..


Patch Set 4:

Tim, can you do the +2 review for this one?


--
To view, visit http://gerrit.cloudera.org:8080/8226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414
Gerrit-Change-Number: 8226
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Oct 2017 17:49:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.

2017-10-25 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8202 )

Change subject: IMPALA-4704: Turns on client connections when local catalog 
initialized.
..


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8202/12/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8202/12/be/src/service/impala-server.cc@2045
PS12, Line 2045:   if (thrift_be_server_.get()) {
> working on them... starting/stopping cluster processes does not interact we
Have you tried a custom_cluster test?



--
To view, visit http://gerrit.cloudera.org:8080/8202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
Gerrit-Change-Number: 8202
Gerrit-PatchSet: 12
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 25 Oct 2017 17:27:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-10-25 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 3:

(16 comments)

Some more comments, still going though.

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@75
PS3, Line 75: cached
see comment in row-batch.h about this terminology.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@89
PS3, Line 89: // safe to free them once the callback has been invoked.
I think we should add a reference to KUDU-2011 somewhere here like:

Note that due to KUDU-2011, timeout cannot be used with outbound sidecars. The 
client has no idea when it is safe to reclaim the sidecar buffer (~RpcSidecar() 
should be the right place, except that's currently called too early).  
RpcController::Cancel(), however, ensures that the callback is called only 
after the RPC layer no longer references the sidecar buffer.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@92
PS3, Line 92: query
query? does it mean fragment instance?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@124
PS3, Line 124: Shutdown the RPC thread
is that still accurate?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@139
PS3, Line 139:   int buffer_size_;
that could use a comment.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@141
PS3, Line 141:   const TNetworkAddress address_;
 :   TUniqueId fragment_instance_id_;
 :   PlanNodeId dest_node_id_;
those could be commented together to say they identify the destination. it's a 
little odd that plan node id is prefixed "dest" when the others are not.

it also seems weird that we need both these and the req_ field since shouldn't 
they just be stored there?  Or seems we should get rid of the req_ and just 
generate it when sending.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@159
PS3, Line 159: num_cached_proto_batches_
caps since constant?
also, can this ever be something other than 2 without writing the code? i.e. 
doesn't the code assume this value is 2 in various ways?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@175
PS3, Line 175: proxy
proxy_


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@197
PS3, Line 197:   bool remote_recvr_closed_ = false;
why is that needed now?
also, shouldn't we do something different, at a higher level, in that case 
(like cancel this instance)?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@200
PS3, Line 200:   Status rpc_status_;
I think it would help to associate this with rpc_in_flight_ - move them 
adjacent and say that rpc_status_ is valid only when rpc_in_flight_ is false, 
or something.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@218
PS3, Line 218: 2
if we have the constant, shouldn't that use it?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@334
PS3, Line 334:   auto pred = [this]() -> bool { return !rpc_in_flight_ || 
ShouldTerminate(); };
 :   auto timeout = std::chrono::system_clock::now() + 
milliseconds(50);
 :   while (!rpc_done_cv_.wait_until(*lock, timeout, pred)) {
 : timeout = system_clock::now() + milliseconds(50);
 :   }
seems simpler to just write:

while (rpc_in_flight_ && !ShouldTerminate()) {
  auto timeout = std::chrono::system_clock::now() + milliseconds(50);
  rpc_done_cv_.wait_until(*lock, timeout);
}

or even better to use wait_for() which takes the relative timeout.
Or should we use our ConditionVariable wrapper? Especially if we want to start 
instrumenting these things better. But if it's work to switch it over, it's 
okay to keep it condition_variable, but let's at least make the code more 
straight forward.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/row-batch.h@52
PS3, Line 52: struct ProtoRowBatch {
I think we should get rid of this structure all together. IMO, the indirection 
just adds confusion.

On the receive side, it seems we can just get the header and sidecars directly 
from the request, which is already threaded through the RPC handler anyway.  
Pulling it into a ProtoRowBatch just makes it unclear where the not yet 
deserialized rowbatch comes from.

On the send side, I think we should just work directly on CachedProtoRowBatch 
(but rename that thing, see below). The indirect

[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.

2017-10-25 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8215 )

Change subject: IMPALA-5142 EventSequence displays negative elapsed time.
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8215/2/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/8215/2/be/src/util/runtime-profile.cc@658
PS2, Line 658:   event_sequence.second->GetEvents(&events);
 :   if (last == 0 && events.size() > 0) last = 
events.back().second;
that looks like it could be inconsistent with the sorted order


http://gerrit.cloudera.org:8080/#/c/8215/2/be/src/util/runtime-profile.cc@672
PS2, Line 672:
runtime-profile-test.cc has a test case that expects GetEvents() to be in 
order. that case isn't correct if we are saying that GetEvents() returns events 
unsorted.

Also, the comment for events_ says the events are in sorted order.

I think we should either:
(a) maintain events in sorted order, or
(b) do this sort inside GetEvents()

either way the property is that GetEvents() always returns events in increasing 
time order, which seems like the most straightforward API.



--
To view, visit http://gerrit.cloudera.org:8080/8215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896
Gerrit-Change-Number: 8215
Gerrit-PatchSet: 2
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Wed, 25 Oct 2017 16:46:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.

2017-10-24 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8202 )

Change subject: IMPALA-4704: Turns on client connections when local catalog 
initialized.
..


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8202/12/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8202/12/be/src/service/impala-server.cc@2045
PS12, Line 2045:   if (thrift_be_server_.get()) {
> I wonder if this can cause problems. A coordinator+executor impalad might h
Yeah, we should make sure we have test cases for this and the various 
coord/exec combinations.



--
To view, visit http://gerrit.cloudera.org:8080/8202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
Gerrit-Change-Number: 8202
Gerrit-PatchSet: 12
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 24 Oct 2017 20:26:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5018: Error on decimal modulo or divide by zero

2017-10-24 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8344 )

Change subject: IMPALA-5018: Error on decimal modulo or divide by zero
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If7a7131e657fcdd293ade78d62f851dac0f1e3eb
Gerrit-Change-Number: 8344
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Tue, 24 Oct 2017 20:20:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5957: print memory address, not memory

2017-10-24 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8371 )

Change subject: IMPALA-5957: print memory address, not memory
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89250646bf683dd2d636dcb37a66ceb0428af8b2
Gerrit-Change-Number: 8371
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Tue, 24 Oct 2017 20:18:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5018: Error on decimal modulo or divide by zero

2017-10-24 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8344 )

Change subject: IMPALA-5018: Error on decimal modulo or divide by zero
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8344/3/be/src/exprs/decimal-operators-ir.cc
File be/src/exprs/decimal-operators-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8344/3/be/src/exprs/decimal-operators-ir.cc@729
PS3, Line 729: divide
> In one of the previous comments, Vuk mentioned that Postgres returns an ide
I'm fine with leaving it as is.



--
To view, visit http://gerrit.cloudera.org:8080/8344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If7a7131e657fcdd293ade78d62f851dac0f1e3eb
Gerrit-Change-Number: 8344
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Tue, 24 Oct 2017 20:12:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src.

2017-10-24 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8305 )

Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src.
..


Patch Set 10: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8305
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c
Gerrit-Change-Number: 8305
Gerrit-PatchSet: 10
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Tue, 24 Oct 2017 17:27:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src.

2017-10-23 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8305 )

Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src.
..


Patch Set 9: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8305
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c
Gerrit-Change-Number: 8305
Gerrit-PatchSet: 9
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Mon, 23 Oct 2017 19:17:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.

2017-10-23 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8202 )

Change subject: IMPALA-4704: Turns on client connections when local catalog 
initialized.
..


Patch Set 11: Code-Review+1

(1 comment)

BE looks fine.

http://gerrit.cloudera.org:8080/#/c/8202/11/be/src/service/frontend.cc
File be/src/service/frontend.cc:

http://gerrit.cloudera.org:8080/#/c/8202/11/be/src/service/frontend.cc@242
PS11, Line 242: Status Frontend::SetCatalogInitialized() {
  :   JNIEnv* jni_env = getJNIEnv();
  :   JniLocalFrame jni_frame;
  :   RETURN_IF_ERROR(jni_frame.push(jni_env));
  :   jni_env->CallObjectMethod(fe_, set_catalog_initialized_id_);
  :   RETURN_ERROR_IF_EXC(jni_env);
  :   return Status::OK();
  : }
is that needed? could it be handled entirely within InProcessImpalaServer? 
Maybe orthogonal to this change to so okay to defer.



--
To view, visit http://gerrit.cloudera.org:8080/8202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
Gerrit-Change-Number: 8202
Gerrit-PatchSet: 11
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 23 Oct 2017 19:01:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src.

2017-10-23 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8305 )

Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src.
..


Patch Set 8: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8305/8/be/src/util/time.h
File be/src/util/time.h:

http://gerrit.cloudera.org:8080/#/c/8305/8/be/src/util/time.h@106
PS8, Line 106: /// Convenience function to return the date-time string of the 
current time
 : /// derived from UnixMicros().
how about we mention that this is in the local time zone.



--
To view, visit http://gerrit.cloudera.org:8080/8305
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c
Gerrit-Change-Number: 8305
Gerrit-PatchSet: 8
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Mon, 23 Oct 2017 18:33:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5018: Error on decimal modulo or divide by zero

2017-10-23 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8344 )

Change subject: IMPALA-5018: Error on decimal modulo or divide by zero
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8344/3/be/src/exprs/decimal-operators-ir.cc
File be/src/exprs/decimal-operators-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8344/3/be/src/exprs/decimal-operators-ir.cc@729
PS3, Line 729: divide
this path will also be taken for modulo. Should the message take that into 
account to avoid confusion?


http://gerrit.cloudera.org:8080/#/c/8344/3/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8344/3/be/src/exprs/expr-test.cc@2264
PS3, Line 2264:  { true, false, 0, 38, 19 }}},
it would be good to test also

10.0 / 0  or  cast(10.0 as decimal) / 0

(i.e. where divisor is integer not decimal). I don't remember what type that 
expressino would have had in decimal v1 though.

Added a comment to pytest about this too, which might be the easier place to 
verify.


http://gerrit.cloudera.org:8080/#/c/8344/3/be/src/exprs/expr-test.cc@2506
PS3, Line 2506: 0/0
it's a bit weird/confusing that we don't do the same thing for integer divide 
by 0, but I guess it's out of scope.


http://gerrit.cloudera.org:8080/#/c/8344/3/testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test
File testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test:

http://gerrit.cloudera.org:8080/#/c/8344/3/testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test@46
PS3, Line 46: cast
is that cast necessary? Let's also verify you get the error without the cast 
there.



--
To view, visit http://gerrit.cloudera.org:8080/8344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If7a7131e657fcdd293ade78d62f851dac0f1e3eb
Gerrit-Change-Number: 8344
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Mon, 23 Oct 2017 16:49:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5018: Error on decimal modulo or divide by zero

2017-10-23 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8344 )

Change subject: IMPALA-5018: Error on decimal modulo or divide by zero
..


Patch Set 3:

> I spoke to Greg and Alex yesterday, and we agreed that decimal_v2
 > should always be in strict mode. This means that when decimal_v2 is
 > enabled, we should always error on overflows and division by zero.
 > This is the behavior that more traditional databases have.

I agree with that.


--
To view, visit http://gerrit.cloudera.org:8080/8344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If7a7131e657fcdd293ade78d62f851dac0f1e3eb
Gerrit-Change-Number: 8344
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Mon, 23 Oct 2017 16:32:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5018: Error on decimal modulo or divide by zero

2017-10-20 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8344 )

Change subject: IMPALA-5018: Error on decimal modulo or divide by zero
..


Patch Set 2:

(1 comment)

In the JIRA, Greg made a comment about tying this to strict mode. Did we reach 
consensus on that with him?

http://gerrit.cloudera.org:8080/#/c/8344/2/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8344/2/be/src/exprs/expr-test.cc@1561
PS2, Line 1561: expected null, scaled_val, precision, scale for V1
  : //expected null, scaled_val, precision, scale for V2 }}
that needs updating



--
To view, visit http://gerrit.cloudera.org:8080/8344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If7a7131e657fcdd293ade78d62f851dac0f1e3eb
Gerrit-Change-Number: 8344
Gerrit-PatchSet: 2
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Fri, 20 Oct 2017 22:37:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src.

2017-10-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8305 )

Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src.
..


Patch Set 4:

What about common-metrics.cc?

And then, I think we should move TimestampValue::{LocalTime(), UtcTime()} to 
expr-test.cc so that it's not even part of the TimestampValue interface.


--
To view, visit http://gerrit.cloudera.org:8080/8305
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c
Gerrit-Change-Number: 8305
Gerrit-PatchSet: 4
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Thu, 19 Oct 2017 19:57:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src.

2017-10-18 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8305 )

Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src.
..


Patch Set 2: Code-Review+1

(1 comment)

Code change looks good.

How did you test it? Please add a Testing section to commit message indicating 
that. Please at least manually inspect each affected field by e.g. looking at 
the profile.

http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-http-handler.cc
File be/src/service/impala-http-handler.cc:

http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-http-handler.cc@303
PS1, Line 303:   int64_t duration_us = record.end_time_us - 
record.start_time_us;
> Please see my response to DanH's comment...
Using monotonic clock for durations is generally the right thing to do, but 
given we also print the start and end times in this case, let's leave it alone 
since the Print() function handles it in a reasonable way,



--
To view, visit http://gerrit.cloudera.org:8080/8305
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c
Gerrit-Change-Number: 8305
Gerrit-PatchSet: 2
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Wed, 18 Oct 2017 23:35:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Allow the SASL protocol service name to be configurable

2017-10-18 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8230 )

Change subject: Allow the SASL protocol service name to be configurable
..


Patch Set 2: Code-Review+2

Okay, thats fine. We can revisit if these conflicts cause future work.


--
To view, visit http://gerrit.cloudera.org:8080/8230
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e
Gerrit-Change-Number: 8230
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 18 Oct 2017 18:42:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet

2017-10-18 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8085 )

Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of 
parquet
..


Patch Set 8: Code-Review+2

(1 comment)

Yes, this is good with me. Could you get Mostafa to sign off on perf 
implications, perhaps recording that in the JIRA if not already 
(issues.apache.org not loading for me)?

http://gerrit.cloudera.org:8080/#/c/8085/8/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/8085/8/be/src/exec/parquet-column-readers.cc@983
PS8, Line 983: data
buffers?



--
To view, visit http://gerrit.cloudera.org:8080/8085
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2
Gerrit-Change-Number: 8085
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 18 Oct 2017 18:16:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src/service.

2017-10-17 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8305 )

Change subject: IMPALA-5599: Clean up references to TimestampValue in 
be/src/service.
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/client-request-state.h
File be/src/service/client-request-state.h:

http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/client-request-state.h@341
PS1, Line 341: time
indicate the clock. e.g. monotonic vs unix time.


http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-http-handler.cc
File be/src/service/impala-http-handler.cc:

http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-http-handler.cc@303
PS1, Line 303:   int64_t duration_us = record.end_time_us - 
record.start_time_us;
that can be negative if settimeofday() occurred in the mean time. What should 
happen in that case?


http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-server.h@622
PS1, Line 622: /// Start and end time of the query
same comment about clock


http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-server.h@640
PS1, Line 640: Unix milliseconds
like that


http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-server.cc@1009
PS1, Line 1009:   int64_t duration_ms = duration_us / MICROS_PER_MILLI;
same question (though old code could go negative).



--
To view, visit http://gerrit.cloudera.org:8080/8305
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c
Gerrit-Change-Number: 8305
Gerrit-PatchSet: 1
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Tue, 17 Oct 2017 23:42:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.

2017-10-17 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8215 )

Change subject: IMPALA-5142 EventSequence displays negative elapsed time.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8215/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8215/1//COMMIT_MSG@22
PS1, Line 22: value.
> In the thread doing ClientRequestState::ClientRequestState()
There is only one call to AddEventSequence("Query Timeline") per query though, 
right?  so how does another thread operate on that event_sequence_ before the 
constructor of CRS() returns?

Are you able to reproduce this problem? If so, can you change your if-stmt to a 
DCHECK() and put the backtrace in the JIRA when the DCHECK fires.



--
To view, visit http://gerrit.cloudera.org:8080/8215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896
Gerrit-Change-Number: 8215
Gerrit-PatchSet: 1
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Tue, 17 Oct 2017 23:30:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.

2017-10-17 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8215 )

Change subject: IMPALA-5142 EventSequence displays negative elapsed time.
..


Patch Set 1:

> (1 comment)

But who is the caller of MarkEvent()? How can that be happening in concurrently 
with the construction of the ClientRequestState()? No one would have a 
reference to it yet...


--
To view, visit http://gerrit.cloudera.org:8080/8215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896
Gerrit-Change-Number: 8215
Gerrit-PatchSet: 1
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Tue, 17 Oct 2017 16:05:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.

2017-10-16 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8215 )

Change subject: IMPALA-5142 EventSequence displays negative elapsed time.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8215/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8215/1//COMMIT_MSG@22
PS1, Line 22: value.
but the code isn't synchronized at all, so you could (in theory) read a garbage 
start_ if it's being modified concurrently.

What are the callsites of Start() and ElapsedTime() in question?



--
To view, visit http://gerrit.cloudera.org:8080/8215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896
Gerrit-Change-Number: 8215
Gerrit-PatchSet: 1
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Comment-Date: Tue, 17 Oct 2017 00:16:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Reduce log spew from rpcz store.cc

2017-10-16 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8285 )

Change subject: Reduce log spew from rpcz_store.cc
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8efe68be0ae7f9ab05937d5c81fb53a709a881f1
Gerrit-Change-Number: 8285
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 16 Oct 2017 23:58:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-10-16 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7938 )

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..


Patch Set 8: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7938/8/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/7938/8/be/src/rpc/authentication.cc@109
PS8, Line 109: "off a kinit process.");
can you file a jira to remove this flag (and the old code) within a release or 
two?



--
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-Change-Number: 7938
Gerrit-PatchSet: 8
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 16 Oct 2017 23:48:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 3:

(34 comments)

Another batch of comments...

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@63
PS3, Line 63: ongoing transmission from a client to a server as a 'stream'
I don't think that's accurate. see questions in krpc-data-stream-recvr.h about 
stream definition.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@93
PS3, Line 93: process it immediately, add it to a fixed-size 'batch queue' for 
later processing
XXX check whether these are really different


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@164
PS3, Line 164: deferred processing
is that because the recvr hasn't showed up yet, or because the stream's queue 
is full, or both?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@235
PS3, Line 235: node_id
dest_node_id?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@239
PS3, Line 239: Ownership of the receiver is shared between this DataStream mgr 
instance and the
 :   /// caller.
that seems unnecessary but don't change it now.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@246
PS3, Line 246:
'proto_batch'?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@248
PS3, Line 248: .
'request'.

Also document what 'response' and 'context' are.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@266
PS3, Line 266: Notifies the receiver
is this an RPC handler? I think we should just be explicit about which of these 
methods are RPC handlers.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@267
PS3, Line 267: The RPC
what RPC is this talking about? If this is a handler, then it's clear.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@274
PS3, Line 274: Closes
Does it close or cancel? (or is there no difference?)


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@284
PS3, Line 284: RPCs which were buffered
To be consistent with terminology used in class comment, maybe say "deferred 
RPCs"


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@340
PS3, Line 340: ragment instance id, 0
what is that saying? is that a misplaced comma or am I reading this wrong?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@341
PS3, Line 341: instance id changes
I don't understand this.  it kinda sounds like we're trying to be able to find 
all instances of this fragment, but then wouldn't we iterate until the fragment 
id changes (not the instance id)?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@349
PS3, Line 349:   struct EarlySendersList {
hmm, I guess we need this now that we can't block the RPC thread?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@358
PS3, Line 358: Time
Monotonic time


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@374
PS3, Line 374: time
monotonic time


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@382
PS3, Line 382:   boost::unordered_set closed_stream_cache_;
all this parallel startup stuff really needs to be revisited (but not for this 
change). it's too complex and brittle.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@386
PS3, Line 386: Deserialize
maybe call it DeserializeDeferred() or DeserializeWorker() to make it clearer 
that this is only for the deferred (slow) path, since the normal path will also 
have to deserialize (but doesn't use this code).


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@404
PS3, Line 404:   void EnqueueRowBatch(DeserializeWorkItem&& payload);
how about grouping this with Deferred function above since it's related. Also, 
I think the name should be less generic. Like maybe EnqueueDeferredBatch() or 
EnqueueDeferredRpc() (does the response happen before or after this deferred 
work?)


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@413
PS3, Line 413: block
what's that?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@414
PS3, Line 414: Status
I think that status is not getting checked by the caller. I thought Tim made 
Status warn on unused result -- why is it not catching this? (Or do we still 
need to annotate each method?).


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-da

[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8255 )

Change subject: IMPALA-5940: Avoid log spew by using Status::Expected()
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
Gerrit-Change-Number: 8255
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 19:08:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8242 )

Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a 
node isn't a coordinator node
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62
Gerrit-Change-Number: 8242
Gerrit-PatchSet: 3
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 18:31:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8242 )

Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a 
node isn't a coordinator node
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8242/2/be/src/runtime/exec-env.h
File be/src/runtime/exec-env.h:

http://gerrit.cloudera.org:8080/#/c/8242/2/be/src/runtime/exec-env.h@186
PS2, Line 186: General
I think this is meant to be a thread pool just for ExecQueryFInstances RPC (and 
that's what's meant by the "exec_rpc" in the name).



--
To view, visit http://gerrit.cloudera.org:8080/8242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62
Gerrit-Change-Number: 8242
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 18:06:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Allow the SASL protocol service name to be configurable

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8230 )

Change subject: Allow the SASL protocol service name to be configurable
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/client_negotiation.cc
File be/src/kudu/rpc/client_negotiation.cc:

http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/client_negotiation.cc@111
PS1, Line 111: const boost::optional& authn_token,
> There was a conflict here since we don't have this patch from the kudu code
Should we just take that patch to make (this and) future cherry picks easier?



--
To view, visit http://gerrit.cloudera.org:8080/8230
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e
Gerrit-Change-Number: 8230
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 17:57:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8242 )

Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a 
node isn't a coordinator node
..


Patch Set 1:

It'd be good to update the header comment for these fields to say that they are 
set only for coordinator role.


--
To view, visit http://gerrit.cloudera.org:8080/8242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62
Gerrit-Change-Number: 8242
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Comment-Date: Wed, 11 Oct 2017 17:53:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] Allow the SASL protocol service name to be configurable

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8230 )

Change subject: Allow the SASL protocol service name to be configurable
..


Patch Set 1:

Is this a conflict free cherry-pick from something already in Kudu codebase?


--
To view, visit http://gerrit.cloudera.org:8080/8230
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e
Gerrit-Change-Number: 8230
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 17:44:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8255 )

Change subject: IMPALA-5940: Avoid log spew by using Status::Expected()
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/rpc/thrift-client.cc
File be/src/rpc/thrift-client.cc:

http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/rpc/thrift-client.cc@71
PS1, Line 71: }
I think we should make it clear why we use Expected() when we do (since we 
don't want to normally use it). So how about a quick comment about that 
explaining under what condition this error is expected.


http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/service/impala-server.cc@1173
PS1, Line 1173: INTERNAL_ERROR
hmm, if this is expected (as the comment claims), then this shouldn't be an 
INTERNAL_ERROR. INTERNAL_ERROR means we've hit a bug of some sort (i.e. some 
invariant was violated). That's not this case.

Anyway, you don't have to fix that now if you don't want.



--
To view, visit http://gerrit.cloudera.org:8080/8255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
Gerrit-Change-Number: 8255
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 17:39:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2758: Change BufferedTupleStream::GetRows to returning multi batches

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8226 )

Change subject: IMPALA-2758: Change BufferedTupleStream::GetRows to returning 
multi batches
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8226/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8226/1//COMMIT_MSG@10
PS1, Line 10: which returns multiple batches instead of one.
> I'm ok with the stream staying pinned, the main goal was to remove the GetR
WFM, agree the important step here is to eliminate GetRows().



--
To view, visit http://gerrit.cloudera.org:8080/8226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414
Gerrit-Change-Number: 8226
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 11 Oct 2017 16:55:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-10-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 3:

(17 comments)

Some initial comments.

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/common/status.cc
File be/src/common/status.cc:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/common/status.cc@246
PS3, Line 246:   }
this is the same as FromThrift() effectively, right? Can we make the two look 
the same to make that obvious?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/common/status.cc@262
PS3, Line 262:
same comment. let's make this and ToThrift look the same so it's obvious they 
do the same things.
nit: also, could we order the functions consistently? We currently have 
ToThrift, FromThrift, FromProto, ToProto, and that ordering just makes it 
slightly slower to read through since it doesn't follow a pattern.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/exec/exchange-node.cc
File be/src/exec/exchange-node.cc:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/exec/exchange-node.cc@109
PS3, Line 109:   RETURN_IF_CANCELLED(state);
why do we do this in some Open() but not all? Should we just do it in 
ExecNode::Open() and remove the ones in the derived classes?  okay to do 
separately from this patch.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/rpc/rpc-mgr.h
File be/src/rpc/rpc-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/rpc/rpc-mgr.h@154
PS3, Line 154:   }
nit: one-liner?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@94
PS3, Line 94: buffer
what buffer? do you mean queue?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@108
PS3, Line 108: During ordinary operation
what does that mean?  Is it saying that during unordinary operation, a sender 
can have both a TransmitData() and EndDataStream() call in-flight 
simultaneously?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@140
PS3, Line 140: it will quietly drop its
what are "it" and "its" here? "the sender" and "the RPC's"?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@141
PS3, Line 141: /// it returns.
is that still true now that we have cancellation of RPCs?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@153
PS3, Line 153: fragment
sending fragment?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@166
PS3, Line 166: request
is that talking about the 'request' field below, or something different?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@175
PS3, Line 175:   const TransmitDataRequestPB* request;
what's the relationship between this and proto_batch?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@178
PS3, Line 178:   /// responded to.
who owns it?


http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/data_stream_service.proto
File common/protobuf/data_stream_service.proto:

http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/data_stream_service.proto@29
PS3, Line 29:   optional int32 sender_id = 2;
:
:   optional int32 dest_node_id = 3;
what are "IDs" in these cases? let's improve the documentation here. Especially 
since type is no longer PlanNodeId (and why is that?).


http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto
File common/protobuf/row_batch.proto:

http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto@30
PS3, Line 30: int32
in thrift we had TTupleId. Is there a reason we aren't defining those types as 
well to make the structure clearer?


http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto@32
PS3, Line 32: tuple_data
what's tuple_data? not a field in this structure...


http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto@39
PS3, Line 39: Size
size of what?


http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto@42
PS3, Line 42: (TODO(KRPC): native enum)
do we plan to fix that?



--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-Change-Number: 8023
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 10 Oct 2017 20:15:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-10-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7938 )

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc@77
PS6, Line 77: "Only used when FLAGS_use_krpc is false");
> Currently all our users use this flag. This needs to be set when we use our
Okay. So in other words, we could have fixed the old code as well, and 
eliminated this flag in that case.  (Let's not bother with that though, given 
we will rip that code out soon).



--
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-Change-Number: 7938
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 10 Oct 2017 18:45:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-10-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7938 )

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc@77
PS6, Line 77: "Only used when FLAGS_use_krpc is false");
in what case would a user need to set this flag? and how is that case handled 
in the kudu kinit case?



--
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-Change-Number: 7938
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 10 Oct 2017 18:35:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-10-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7938 )

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc@842
PS6, Line 842: if (FLAGS_use_krpc_kinit) {
can this path only be used with KRPC, or could it be used with thrift?

If the latter, why have the flag? (Or at least, can we commit to removing the 
flag and the old code after one release?)



--
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-Change-Number: 7938
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 10 Oct 2017 18:21:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5425: Add test for validating input when setting query options

2017-10-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7805 )

Change subject: IMPALA-5425: Add test for validating input when setting query 
options
..


Patch Set 15: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/7805
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7
Gerrit-Change-Number: 7805
Gerrit-PatchSet: 15
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 10 Oct 2017 17:54:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 15: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Tue, 10 Oct 2017 17:48:04 +
Gerrit-HasComments: No


  1   2   3   4   5   6   7   8   9   10   >