[Impala-ASF-CR] IMPALA-4671: (part-2) Replace kudu::ServicePool with one that uses Impala threads
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8472 ) Change subject: IMPALA-4671: (part-2) Replace kudu::ServicePool with one that uses Impala threads .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc File be/src/rpc/impala-service-pool.cc: http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@255 PS3, Line 255: value->AddMember("idle_threads", idle_threads, document->GetAllocator()); should we also display the total number of service threads even though it's a constant, similar to having max queue size? -- To view, visit http://gerrit.cloudera.org:8080/8472 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia3e0fa9ae22c4803b0cea736eb35e333e2671448 Gerrit-Change-Number: 8472 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Wed, 15 Nov 2017 20:42:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4671: (part-2) Replace kudu::ServicePool with one that uses Impala threads
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8472 ) Change subject: IMPALA-4671: (part-2) Replace kudu::ServicePool with one that uses Impala threads .. Patch Set 3: (8 comments) http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc File be/src/rpc/impala-service-pool.cc: http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@182 PS3, Line 182: MonoTime > how about using impala util/time.h routines so we don't have to read about Hmm, okay I see we are fishing into the timing_ struct and doing math against those values, so let's leave this computation using MonoTime. However, how about instead of doing our own Now(), shouldn't we move this computation down and write it in terms of timing() fields? http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@186 PS3, Line 186: internal internal is too vague. Let's "to update the InboundCall timing". timing() is a method on InboundCall so we can talk about it. http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@187 PS3, Line 187: unused_histogram_ > there's nothing useful to get out of this thing? I guess you're basically doing this manually via time_in_queue computation. But the handler_latency_histogram is inside a method_info_ thing that they allow to be null. Why isn't this histogram also in there? http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@212 PS3, Line 212: method_name there's no enumeration for the methods? i guess this is okay though. http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@213 PS3, Line 213: method->num_in_handlers.Add(1); is that purposely protected by the lock? http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@220 PS3, Line 220: uint64_t time_to_handle = (MonoTime::Now() - monotime_after_dequeue).ToMilliseconds(); this is a bit decieving becuase we haven't necessarily actually handled the RPC at this point. It may be deferred. Do we have something equivalent to the method_info_->handler_latency_histogram? http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@223 PS3, Line 223: // rid of the spin lock. why not just do that? http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@240 PS3, Line 240: } doesn't krpc already have these histograms? is all we want to get out is the json representation? how hard would it be to just use the kudu ones and get the json out of those? are there other histograms/metrics that Kudu has that would be useful to expose? if so it seems like duplicating them like this will be a pain. -- To view, visit http://gerrit.cloudera.org:8080/8472 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia3e0fa9ae22c4803b0cea736eb35e333e2671448 Gerrit-Change-Number: 8472 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Wed, 15 Nov 2017 20:35:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8449 ) Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber instances .. Patch Set 8: Code-Review+2 (4 comments) http://gerrit.cloudera.org:8080/#/c/8449/8/be/src/statestore/statestore.h File be/src/statestore/statestore.h: http://gerrit.cloudera.org:8080/#/c/8449/8/be/src/statestore/statestore.h@381 PS8, Line 381: in microseconds since epoch) or just say "in Unix time" http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.h File be/src/statestore/statestore.h: http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.h@385 PS3, Line 385: strationId of the registered subscriber > Discussed this a little more with Dimitris, leaving it as-is for now. We di Thanks. yes, let's avoid shared_ptrs and especially weak_ptrs, and move toward single ownership when possible. http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.cc File be/src/statestore/statestore.cc: http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.cc@278 PS3, Line 278: lock_guard l(subscribers_lock_); : lock_guard t(topic_lock_); > Fair point, I'll revert the spinlock change. Maybe we can address it again Note that SpinLock is not a traditional spin-lock -- it's adaptive and will block like a mutex after attempting to spin for a while. So, it's pretty general-purpose. http://gerrit.cloudera.org:8080/#/c/8449/8/be/src/statestore/statestore.cc File be/src/statestore/statestore.cc: http://gerrit.cloudera.org:8080/#/c/8449/8/be/src/statestore/statestore.cc@415 PS8, Line 415: RegisteredSubscriberExists Seems like this should just be "FindSubscriber()" or "FindRegisteredSubscriber()" but okay to leave if you prefer the "exists" name. The "exists" naming makes it a bit surprising that it also returns the pointer. -- To view, visit http://gerrit.cloudera.org:8080/8449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65 Gerrit-Change-Number: 8449 Gerrit-PatchSet: 8 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 15 Nov 2017 19:12:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8408 ) Change subject: IMPALA-6121: remove I/O mgr request context cache .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/8408/8/be/src/runtime/disk-io-mgr-reader-context.h File be/src/runtime/disk-io-mgr-reader-context.h: http://gerrit.cloudera.org:8080/#/c/8408/8/be/src/runtime/disk-io-mgr-reader-context.h@27 PS8, Line 27: but some clients may need to include this header, e.g. to make the : /// unique_ptr destructor work correctly. : /// > I think exposing it will make more sense after a follow-on patch. There are Yeah, but then making sure this abstraction makes sense becomes more important, which is why I was getting a little nervous about what a DiskIoRequestContext actually represents. But we can deal with that in that change. -- To view, visit http://gerrit.cloudera.org:8080/8408 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108 Gerrit-Change-Number: 8408 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 15 Nov 2017 19:04:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4671: (part-2) Replace kudu::ServicePool with one that uses Impala threads
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8472 ) Change subject: IMPALA-4671: (part-2) Replace kudu::ServicePool with one that uses Impala threads .. Patch Set 3: (21 comments) http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.h File be/src/rpc/impala-service-pool.h: http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.h@52 PS3, Line 52: kudu::rpc::RpcMethodInfo* LookupMethod(const kudu::rpc::RemoteMethod& method) override; : : virtual kudu::Status : QueueInboundCall(gscoped_ptr call) OVERRIDE; : : const std::string service_name() const; : : let's add header function comments to these (and other methods), per usual. http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.h@69 PS3, Line 69: Method What is this referring to? RPC service methods, or something different? http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.h@72 PS3, Line 72: std::unique_ptr handling_time; what is "time" in these cases? wallclock, cpu, ...? http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.h@74 PS3, Line 74: AtomicInt32 num_in_handlers; what's that? some comments here would help. http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.h@92 PS3, Line 92: boost::mutex shutdown_lock_; what does that protect? http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc File be/src/rpc/impala-service-pool.cc: http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@182 PS3, Line 182: MonoTime how about using impala util/time.h routines so we don't have to read about another set of time functions? http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@186 PS3, Line 186: // We need to call RecordHandlingStarted() to update some internal InboundCall state. That seems weird. why is that? http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@187 PS3, Line 187: unused_histogram_ there's nothing useful to get out of this thing? http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/impala-service-pool.cc@207 PS3, Line 207: note to self: finish here http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-mgr.h@a160 PS3, Line 160: is that not still true? i.e. since we need to inherit from kudu::RpcService http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-mgr.cc File be/src/rpc/rpc-mgr.cc: http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-mgr.cc@129 PS3, Line 129: this-> delete http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-mgr.cc@138 PS3, Line 138: inbound shouldn't that be outbound? let's make sure we test this change somehow. http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-mgr.cc@146 PS3, Line 146: for (auto service_pool : service_pools_) { is there a danger this could run concurrently with RegisterService()? http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.h File be/src/rpc/rpc-trace.h: http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.h@38 PS3, Line 38: per ThriftServer. does this need updating? http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.h@129 PS3, Line 129: /// Initialises rpc event tracing, must be called before any RpcEventHandlers are created. document the rpc_mgr parameter http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.cc File be/src/rpc/rpc-trace.cc: http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.cc@86 PS3, Line 86: rpc_mgr nit: != NULL (or nullptr and adjust below) http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.cc@91 PS3, Line 91: webserver->RegisterUrlCallback("/rpcz", "rpcz.tmpl", json); I guess the whole point in integrating with this code is to share the /rpcz page? if that's the case, how about being more explicit about that? or rather, that the point of InitRpcEventTracing() is to register callbacks for these paths. http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.cc@95 PS3, Line 95: "/rpcz_reset" should something happen for krpc stuff in this case? http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.cc@100 PS3, Line 100: ) nit: != nullptr http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/rpc/rpc-trace.cc@122 PS3, Line 122: ) same http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/service/impalad-main.cc File be/src/service/impalad-main.cc: http://gerrit.cloudera.org:8080/#/c/8472/3/be/src/service/impalad-main.cc@86 PS3, Line 86: InitRpcEventTracing(exec_env.webserver(), exec_env.rpc_mgr()); couldn't we just do that in either case? -- To view, visit http://gerrit.cloudera.org:8080/84
[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8449 ) Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber instances .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/8449/7/be/src/statestore/statestore.h File be/src/statestore/statestore.h: http://gerrit.cloudera.org:8080/#/c/8449/7/be/src/statestore/statestore.h@394 PS7, Line 394: typedef std::pair ScheduledSubscriberUpdate; I meant flatten both pairs -- i.e. turn ScheduledSubscriberUpdate into a struct (with three fields). -- To view, visit http://gerrit.cloudera.org:8080/8449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65 Gerrit-Change-Number: 8449 Gerrit-PatchSet: 7 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 14 Nov 2017 05:40:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7793 ) Change subject: IMPALA-4252: Min-max runtime filters for Kudu .. Patch Set 14: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/7793/14/common/thrift/ImpalaInternalService.thrift File common/thrift/ImpalaInternalService.thrift: http://gerrit.cloudera.org:8080/#/c/7793/14/common/thrift/ImpalaInternalService.thrift@203 PS14, Line 203: 41: optional i32 max_num_runtime_filters = 10 > so this (and two options below) only apply to bloom but not min/max filters I see you addressed this in a doc you had already linked: https://docs.google.com/document/d/1G-SPZelateebNxzVb67urEVtjc5Itw-B_jjfS85bSCE/edit?usp=sharing That reasoning seems okay with me. -- To view, visit http://gerrit.cloudera.org:8080/7793 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c Gerrit-Change-Number: 7793 Gerrit-PatchSet: 14 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Anonymous Coward #345 Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 13 Nov 2017 22:09:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8408 ) Change subject: IMPALA-6121: remove I/O mgr request context cache .. Patch Set 8: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/8408/8/be/src/runtime/disk-io-mgr-reader-context.h File be/src/runtime/disk-io-mgr-reader-context.h: http://gerrit.cloudera.org:8080/#/c/8408/8/be/src/runtime/disk-io-mgr-reader-context.h@27 PS8, Line 27: but some clients may need to include this header, e.g. to make the : /// unique_ptr destructor work correctly. : /// that's really unfortunate, but the change is a net win. http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h@654 PS7, Line 654: std::unique_ptr RegisterContext(MemTracker* reader_mem_tracker); > I think it's an implementation detail the RegisterContext() doesn't registe I wasn't thinking in terms of implementation. I was thinking in terms of the interface -- how can a context be "registered" when the caller doesn't pass a context to be registered? Anyway, I'm fine with leaving the terminology alone in this change. http://gerrit.cloudera.org:8080/#/c/8408/8/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: http://gerrit.cloudera.org:8080/#/c/8408/8/be/src/runtime/runtime-state.h@39 PS8, Line 39: class DiskIoRequestContext; delete -- To view, visit http://gerrit.cloudera.org:8080/8408 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108 Gerrit-Change-Number: 8408 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Nov 2017 22:05:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8449 ) Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber instances .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/8449/6/be/src/statestore/statestore.h File be/src/statestore/statestore.h: http://gerrit.cloudera.org:8080/#/c/8449/6/be/src/statestore/statestore.h@383 PS6, Line 383: std::pair once we have two level pair, I think it's time to start naming the fields. How about defining a struct for this thing instead? -- To view, visit http://gerrit.cloudera.org:8080/8449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65 Gerrit-Change-Number: 8449 Gerrit-PatchSet: 6 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Mon, 13 Nov 2017 21:13:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7793 ) Change subject: IMPALA-4252: Min-max runtime filters for Kudu .. Patch Set 14: (2 comments) looked through the headers and thrift and it looks fine to me. Just a question about the query options. http://gerrit.cloudera.org:8080/#/c/7793/14/be/src/exec/kudu-util.cc File be/src/exec/kudu-util.cc: http://gerrit.cloudera.org:8080/#/c/7793/14/be/src/exec/kudu-util.cc@116 PS14, Line 116: Status ConvertTimestampValue(const TimestampValue* tv, int64_t* ts_micros) { static http://gerrit.cloudera.org:8080/#/c/7793/14/common/thrift/ImpalaInternalService.thrift File common/thrift/ImpalaInternalService.thrift: http://gerrit.cloudera.org:8080/#/c/7793/14/common/thrift/ImpalaInternalService.thrift@203 PS14, Line 203: 41: optional i32 max_num_runtime_filters = 10 so this (and two options below) only apply to bloom but not min/max filters? is that going to be clear to a user? hopefully we don't expect most to use these options, so maybe we should rename them? -- To view, visit http://gerrit.cloudera.org:8080/7793 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c Gerrit-Change-Number: 7793 Gerrit-PatchSet: 14 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Anonymous Coward #345 Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 13 Nov 2017 18:59:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8408 ) Change subject: IMPALA-6121: remove I/O mgr request context cache .. Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.h File be/src/runtime/disk-io-mgr-reader-context.h: http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.h@25 PS7, Line 25: /// Internal per request-context state. This object maintains a lot of state that is so no one should include this file except for disk io mgr, is that right? i.e. this class is opaque to everyone else? I think that's less obvious now that it's not in an -internal.h header. http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.cc File be/src/runtime/disk-io-mgr-reader-context.cc: http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.cc@108 PS7, Line 108: state_ = Inactive; why is it that Cancel() has to do so much more work than this, when this supposedly does a Cancel in addition to marking it inactive? Is this really a different kind of cancel? http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h@654 PS7, Line 654: std::unique_ptr RegisterContext(MemTracker* reader_mem_tracker); I'm not really sure what it means to "register" a context. Should this just be Create()? (And I guess we have these next three methods to keep the context opaque outside of disk-io-mgr, is that right?) http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h@662 PS7, Line 662: void UnregisterContext(DiskIoRequestContext* context); > I don't feel strongly. I think your argument has merit but destroying the d Yeah, except that the statement "context cannot be used after this call" is what contradicts the control structure lifetime being tied to QueryState. I'm fine with leaving it around. But, what does it mean to "unregister" the context? The whole register/unregister terminology seems confusing. Does the lifecycle of these things mirror the lifecycle that we're moving toward and does it make sense to use consistent terminology? -- To view, visit http://gerrit.cloudera.org:8080/8408 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108 Gerrit-Change-Number: 8408 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 09 Nov 2017 18:58:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/8023/13/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/8023/13/be/src/runtime/krpc-data-stream-recvr.cc@163 PS13, Line 163: condition_variable_any these should be changed to impala ConditionVariables now, but you can do that as a follow on -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Gerrit-Change-Number: 8023 Gerrit-PatchSet: 13 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Thu, 09 Nov 2017 18:24:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 10: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Gerrit-Change-Number: 8023 Gerrit-PatchSet: 10 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 08 Nov 2017 20:22:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8202 ) Change subject: IMPALA-4704: Turns on client connections when local catalog initialized. .. Patch Set 22: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 Gerrit-Change-Number: 8202 Gerrit-PatchSet: 22 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 08 Nov 2017 00:55:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 9: (6 comments) http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-mgr.cc@62 PS9, Line 62: min(FLAGS_datastream_service_num_deserialization_threads, CpuInfo::num_cores()), why disallow the setting of this greater than num_cores? Given that it takes locks, there could be some benefit to doing so, right? http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@67 PS9, Line 67: data the resources http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@195 PS9, Line 195: cur_batch_ current_batch_ http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@225 PS9, Line 225: DCHECK shoudl we also DCHECK that num_pending_enqueue_ == 0 (otherwise, we could have acquired the lock while it was dropped for deserialization and there is still a row batch)? http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@248 PS9, Line 248: batch_queue_.pop_front(); this is part of the same operation as line 244 (both are adjusting the queue state), so mind moving it to be adjacent? http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@252 PS9, Line 252: // Don't hold lock when calling EnqueueDeserializeTask() as it may block. how about moving this to line 250 now that it's that scope that drops the lock. Also, it's important that this happens after batch_queue_.pop_front(), right? maybe note that. -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Gerrit-Change-Number: 8023 Gerrit-PatchSet: 9 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 08 Nov 2017 01:32:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8408 ) Change subject: IMPALA-6121: remove I/O mgr request context cache .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.h File be/src/runtime/disk-io-mgr-reader-context.h: PS7: I assume this was a wholesale move of the class without modifications, except for the new methods defined in the .cc file? http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h@662 PS7, Line 662: void UnregisterContext(DiskIoRequestContext* context); Given that the context cannot be used after this call, should we move the unique_ptr into this call and delete it then? -- To view, visit http://gerrit.cloudera.org:8080/8408 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108 Gerrit-Change-Number: 8408 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 18:56:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8323 ) Change subject: IMPALA-1575: part 2: yield admission control resources .. Patch Set 8: > It's still technically accurate Yeah, I guess that's true. -- To view, visit http://gerrit.cloudera.org:8080/8323 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac Gerrit-Change-Number: 8323 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Tue, 07 Nov 2017 18:34:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8202 ) Change subject: IMPALA-4704: Turns on client connections when local catalog initialized. .. Patch Set 20: (2 comments) http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h@89 PS19, Line 89: must call the following : /// methods: > good question-- I've kept it the same as before so not sure what the reason That DCHECK: DCHECK(exec_env.process_mem_tracker() != nullptr) << "ExecEnv::StartServices() must be called before starting RPC services"; looks stale since the process_mem_tracker() is set in exec_env_.Init() which is called earlier in main (not to mention StartServices() doesn't exist anymore). I also don't see what the specific dependency is that this is trying to point out (this DCHECK should have been closer to the code that requires this to be true). If you want to keep the dcheck, I see no reason it couldn't be moved before both Init and Start. Then we could combine them, since there doesn't appear to be any logic behind the separation of those two. But you can defer cleaning this up if you prefer (though the dcheck wording should be updated at least). http://gerrit.cloudera.org:8080/#/c/8202/20/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8202/20/be/src/service/impala-server.cc@1936 PS20, Line 1936: I think a quick comment is worth it to highlight the dependency between these two lines: // Subscribe to the catalog topic and then wait for the initial catalog update. (or whatever is accurate) -- To view, visit http://gerrit.cloudera.org:8080/8202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 Gerrit-Change-Number: 8202 Gerrit-PatchSet: 20 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 07 Nov 2017 18:04:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8461 ) Change subject: IMPALA-6151: add query-level fragment/backend counters .. Patch Set 5: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/8461/5/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8461/5/be/src/service/impala-server.cc@953 PS5, Line 953: ImpaladMetrics::NUM_QUERIES_REGISTERED->Increment(1L); On the /queries webpage, I think "num_in_flight_queries" is this same number (the count of client_request_state_map_ entries). Should we make the names consistent? Probably "registered" is a better name than "in-flight" (since internally, we use in-flight to mean something slightly different and in-flight seems more ambiguous). But I'm not sure if we can change the webserver names without breaking something. -- To view, visit http://gerrit.cloudera.org:8080/8461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9 Gerrit-Change-Number: 8461 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 17:32:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8323 ) Change subject: IMPALA-1575: part 2: yield admission control resources .. Patch Set 8: Maybe this tooltip should be updated? document->AddMember("waiting-tooltip", "These queries are no longer executing, either " "because they encountered an error or because they have returned all of their " "results, but they are still active so that their results can be inspected. To " "free the resources they are using, they must be closed.", -- To view, visit http://gerrit.cloudera.org:8080/8323 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac Gerrit-Change-Number: 8323 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Tue, 07 Nov 2017 17:28:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@211 PS8, Line 211: deferred_rpcs_.empty() && batch_queue_.empty() > why not write this condition as: Actually, I think the loop exiting condition is not quite right, which led to this confusin conditional. The loop exiting condition for "we're done" should check that there are no more senders, and that there's nothing left to drain from the deferred_rpcs_ queue and that there's no pending insert into batch_queue_. So, the third wait loop condition should be something like: (num_remaining_senders > 0 || !deferred_rpcs_.empty() || num_pending_enqueue_ > 0) and then this if-stmt conditional can just be: if (batch_queue_.empty()) and then the DCHECK can be the negation of that third wait loop conditional: // Wait loop is exited with an empty batch_queue_ only when there will be no more batches. DCHECK(num_remaining_senders == 0 && deferred_rpcs_.empty() && num_pending_enqueue_ == 0); And then you can get rid of the outer loop. That outer loop should be removed since it's effectively a busy wait (and I think we could get into a busy wait state in the previous patchsets). -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Gerrit-Change-Number: 8023 Gerrit-PatchSet: 8 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 07 Nov 2017 15:58:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8461 ) Change subject: IMPALA-6151: add query-level fragment/backend counters .. Patch Set 4: > That's a good point, will make that change. Cancelled merge in the > meantime. A test might be good (that it reaches 0). -- To view, visit http://gerrit.cloudera.org:8080/8461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9 Gerrit-Change-Number: 8461 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 00:22:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8267 ) Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding .. Patch Set 12: (1 comment) Looked through the headers and they look okay to me, but don't have time to do a full review of the implementation. http://gerrit.cloudera.org:8080/#/c/8267/12/be/src/exec/parquet-column-readers.h File be/src/exec/parquet-column-readers.h: http://gerrit.cloudera.org:8080/#/c/8267/12/be/src/exec/parquet-column-readers.h@96 PS12, Line 96: for different encodings. what do we mean by that? Isn't this just for RLE? -- To view, visit http://gerrit.cloudera.org:8080/8267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6 Gerrit-Change-Number: 8267 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 00:09:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8461 ) Change subject: IMPALA-6151: add query-level fragment/backend counters .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/8461/4/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8461/4/be/src/service/impala-server.cc@871 PS4, Line 871: // Metric is decremented in UnregisterQuery(). but UnregisterQuery() is called for more paths then just this one. So why isn't this increment inside of RegisterQuery() so that they match up? -- To view, visit http://gerrit.cloudera.org:8080/8461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9 Gerrit-Change-Number: 8461 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 00:00:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7793 ) Change subject: IMPALA-4252: Min-max runtime filters for Kudu .. Patch Set 11: (3 comments) http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/exec/filter-context.h File be/src/exec/filter-context.h: http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/exec/filter-context.h@118 PS11, Line 118: Materialize filter values. what does that mean? http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/runtime/runtime-filter-bank.h File be/src/runtime/runtime-filter-bank.h: http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/runtime/runtime-filter-bank.h@78 PS11, Line 78: /// may both be NULL, representing a filter that allows all rows to pass. is it the case that at most one of bloom_filter and min_max_filter should be non-NULL? could you clarify that (including in the class comment)? http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/util/min-max-filter.h File be/src/util/min-max-filter.h: http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/util/min-max-filter.h@62 PS11, Line 62: Materialize filter values what does that mean? -- To view, visit http://gerrit.cloudera.org:8080/7793 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c Gerrit-Change-Number: 7793 Gerrit-PatchSet: 11 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Anonymous Coward #345 Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 06 Nov 2017 23:54:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8323 ) Change subject: IMPALA-1575: part 2: yield admission control resources .. Patch Set 6: Code-Review+2 (1 comment) If you do decide to migrate the test to HS2, have someone review those changes before carrying forward the +2. http://gerrit.cloudera.org:8080/#/c/8323/5/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: http://gerrit.cloudera.org:8080/#/c/8323/5/tests/custom_cluster/test_admission_controller.py@91 PS5, Line 91: ] > Yeah I was checking the coverage bottom-up to make sure that all of the cod We really should switch our pytests to use HS2 by default rather than beeswax, but that's clearly out of scope for this change. Given how closely tied lifecycle is to the client API, it might be a good idea to see if you can easily move this test to HS2. I'll let you decide what the threshold for "easy" is, though. As you've reasoned, those last two cases are tied pretty closely to Close() and that's unlikely to change. -- To view, visit http://gerrit.cloudera.org:8080/8323 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac Gerrit-Change-Number: 8323 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Mon, 06 Nov 2017 23:31:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8202 ) Change subject: IMPALA-4704: Turns on client connections when local catalog initialized. .. Patch Set 19: (5 comments) http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/frontend.h File be/src/service/frontend.h: http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/frontend.h@162 PS19, Line 162: Sets the FE catalog to be initialized. I don't understand what that's trying to say. http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h@79 PS19, Line 79: The executor role starts ImpalaInternalService API's technically the coordinator also has to start the ImpalaInternalService API in order to run the coordinator fragment instance. http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h@89 PS19, Line 89: must call the following : /// methods: why is this split into two steps? Is there something the caller would want to do between opening the ports and starting the services? http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.cc@1518 PS19, Line 1518: nit: indentation glitch http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.cc@1937 PS19, Line 1937: RETURN_IF_ERROR(exec_env_->StartServices()); why does this have to be moved so early? i'm not sure if this will cause problems when krpc is enabled. -- To view, visit http://gerrit.cloudera.org:8080/8202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 Gerrit-Change-Number: 8202 Gerrit-PatchSet: 19 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 06 Nov 2017 23:22:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8428 ) Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9 Gerrit-Change-Number: 8428 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 06 Nov 2017 23:08:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8323 ) Change subject: IMPALA-1575: part 2: yield admission control resources .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/8323/5/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/8323/5/be/src/runtime/coordinator.h@454 PS5, Line 454: probably released probably have been released http://gerrit.cloudera.org:8080/#/c/8323/5/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/8323/5/be/src/runtime/coordinator.cc@831 PS5, Line 831: nit: here and elsewhere, we seem to have gratuitous blank lines which makes it harder to read code since you can't fit as much on a single screen. could you condense the places you've touched a bit? http://gerrit.cloudera.org:8080/#/c/8323/5/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: http://gerrit.cloudera.org:8080/#/c/8323/5/tests/custom_cluster/test_admission_controller.py@91 PS5, Line 91: ] there is also session close and session timeout. but those should both result in close, so are we considering them covered by CLIENT_CLOSE? -- To view, visit http://gerrit.cloudera.org:8080/8323 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac Gerrit-Change-Number: 8323 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Mon, 06 Nov 2017 23:00:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 8: (23 comments) http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.h File be/src/runtime/krpc-data-stream-mgr.h: http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.h@435 PS8, Line 435: a request 'num_request' requests http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.cc@62 PS8, Line 62: 1 how was that chosen? do we have a test case that causes this queue to fill up? http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.cc@109 PS8, Line 109: // Transfer the early senders into 'deferred_rpcs_' queue of the corresponding : // sender queue. This makes sure new incoming RPCs won't pass these early senders, : // leading to starvation. this comment seems out of place. this is more an implementation detail of the receiver and handled properly inside ProcessEarlySender(). You could incorporate this in the comment for ProcessEarlySender() (to motivate why it uses the deferred_rpcs_ queue). http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h File be/src/runtime/krpc-data-stream-recvr.h: http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h@129 PS8, Line 129: . and start a deserialization task to process it asynchronously. http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h@130 PS8, Line 130: Transfer this "transfer" is in the oppose direction of how our "Transfer" methods usually go (e.g. src->TransferResourcesOwnership(dest)). Maybe call this ProcessEarlySender() (though I don't love "process" either since it's so vague). http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h@197 PS8, Line 197: time cpu time or wall time? http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h@200 PS8, Line 200: time same question http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@67 PS8, Line 67: data the resources http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@72 PS8, Line 72: 'payload' RPC state http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@73 PS8, Line 73: deferred_rpcs_ we shouldn't normally refer to private fields in public class comments, but given this is an internal class to the recvr, we can leave this. http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@87 PS8, Line 87: void TransferEarlySender(std::unique_ptr ctx); same comment as recvr header comment. http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@195 PS8, Line 195: while (current_batch_.get() == nullptr) { I don't think we need this loop. see other comments in this function. http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@197 PS8, Line 197: !is_cancelled_ && batch_queue_.empty() nit: consider swapping the order of these so that the fast case comes first (!batch_queue_.empty()) but also to match the comment ("or we know we're done" corresponds to the is_cancelled_ and num_remaining_senders_ == 0 cases). http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@201 PS8, Line 201: CANCEL_SAFE_SCOPED_TIMER(recvr_->data_arrival_timer_, &is_cancelled_); : CANCEL_SAFE_SCOPED_TIMER(recvr_->inactive_timer_, &is_cancelled_); : CANCEL_SAFE_SCOPED_TIMER( : received_first_batch_ ? nullptr : recvr_->first_batch_wait_total_timer_, : &is_cancelled_); there's got to be a cleaner way to do this but ignore for now http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@210 PS8, Line 210: nit: i think we could do without some of the blank lines in this method to make more code fit on a screen http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@211 PS8, Line 211: deferred_rpcs_.empty() && batch_queue_.empty() why not write this condition as: num_renaming_senders_ == 0 then, it's more clear that these three conditions correspond to the loop exit conditions. http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@218 PS8, Line 218: !batch_queue_.empty() given that we just checked the other two loop exit conditions, isn't this definitely true? i
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 7: (37 comments) http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.h File be/src/runtime/krpc-data-stream-mgr.h: http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.h@102 PS7, Line 102: the batch is added : /// to the receiver's 'deferred_batches_' it's not really the batch added. and it's not just a single structure for the receiver (it may go into one of many queues for merging exchange). So how about saying: ... the RPC state is saved into a deferred queue. http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.h@104 PS7, Line 104: from the pending sender queue how about: ... from a deferred RPC queue and the row batch is deserialized. http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.h@393 PS7, Line 393: quick comment for why we define a move constructor and move operator=, since we don't typically want to define those. http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.cc@199 PS7, Line 199: DeserializeTask payload = : {DeserializeTaskType::EARLY_SENDERS, finst_id, dest_node_id, 0}; : deserialize_pool_.Offer(move(payload)); doesn't this mean we make early sender draining single threaded? shoudl we instead use the sender_id in this case as well and offer work per sender? or do we think this doesn't matter? http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.cc@217 PS7, Line 217: already_unregistered that shouldn't be possible in the DEFERRED_BATCHES case, right? so i'd probably move this DCHECK into the cases below so you can tighten it up. http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.cc@235 PS7, Line 235: for (const unique_ptr& ctx : early_senders.waiting_sender_ctxs) { shouldn't we process waiting_sender_ctxs before closed_sender_ctxs? Otherwise, if the same sender is in both lists we'll process those RPCs out of order. I guess that can't really happen given the current implementation of not responding to early RPCs and that senders only let one in flight, but it still seems to make more sense to do it the other way around. http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.cc@236 PS7, Line 236: already_unregistered why is this possible in the waiting_sender_ctxs case but not the closed_sender_ctxs case? http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.cc@247 PS7, Line 247: already_unregistered why is that possible? http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-mgr.cc@248 PS7, Line 248: recvr->AddDeferredBatches(task.sender_id); So I guess we no longer multithread within a single sender queue (and for non-merging, within a single receiver) doing it this way. I think that's okay but was it intentional? http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-recvr.h File be/src/runtime/krpc-data-stream-recvr.h: http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-recvr.h@78 PS7, Line 78: The caller must acquire data from the : /// returned batch is that talking about calling TransferAllResources(), or can the caller do it directly? http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@127 PS6, Line 127: // If true, the receiver fragment for this stream got cancelled. > For the non-merging case, there is essentially only one queue. As mentioned elsewhere, I'm not totally convinced yet that this is the right way to do it but, yes, we can think about it more and change it later if necessary. http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-recvr.cc@72 PS7, Line 72: s typo http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-recvr.cc@77 PS7, Line 77: Adds as many deferred batches as possible hmm I'm still not convinced this is the right thing to do (in the merging case). It seems like it's left up to chance as to the order that deferred batches are drained across the sender queues. But we can think about this more and address it later. http://gerrit.cloudera.org:8080/#/c/8023/7/be/src/runtime/krpc-data-stream-recvr.cc@97 PS7, Line 97: (1) 'batch_queue' is empty and there is no p
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 5: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/8438/5/be/src/exec/hdfs-text-scanner.h File be/src/exec/hdfs-text-scanner.h: http://gerrit.cloudera.org:8080/#/c/8438/5/be/src/exec/hdfs-text-scanner.h@144 PS5, Line 144: in if? -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Fri, 03 Nov 2017 18:10:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h File be/src/exec/hdfs-text-scanner.h: http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h@90 PS4, Line 90: /// CheckForSplitDelimiter(). Valid if 'byte_buffer_filled_' is true. Maybe briefly explain why we have it: the character is needed after the byte buffer is returned, or whatever. http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h@144 PS4, Line 144: virtual I'm not following why we need a second virtual method. Is the wrapper meant to always apply to all subclasses? If so, it shouldn't need (or want it) to be virtual right? -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Fri, 03 Nov 2017 16:52:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5564: Release lock during planning. (wip)
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8434 ) Change subject: IMPALA-5564: Release lock during planning. (wip) .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8434/1/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/8434/1/be/src/service/client-request-state.h@342 PS1, Line 342: is_planning_ > I started out with this being planning_finished_ (or some such). I ran into I see - the ExecuteMetadataOp() cases. -- To view, visit http://gerrit.cloudera.org:8080/8434 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e3fc4c28d7a5ad8546d48bcd22c03fbce502e5b Gerrit-Change-Number: 8434 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 03 Nov 2017 16:33:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8447 ) Change subject: IMPALA-2181: Add query option levels for display .. Patch Set 5: Sorry I didn't see the discussion on the JIRA about "advanced". I don't think that solves the problem that the JIRA was meaning to solve (though I don't think implementation would have to change too much). I'll add a comment to the JIRA. -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 5 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 03 Nov 2017 16:17:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8215 ) Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 7 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Thu, 02 Nov 2017 19:24:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8215 ) Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 6 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Thu, 02 Nov 2017 19:24:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 6: (8 comments) Here's my last set of comments for this round. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.cc@59 PS6, Line 59: boost::bind I don't have a strong preference either way, but it'd be nice to be consistent between either using bind or [], rather than both... http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.cc@322 PS6, Line 322: RespondToTimedOutSenders shouldn't be plural http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@52 PS6, Line 52: overflows the queue it's not clear what that means just from reading the comment. It'd be nice to briefly explain that this is talking about the soft limit of the number of bytes across all sender queues. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@105 PS6, Line 105: // number of pending row batch insertion. : int num_pending_enqueue_ = 0; it's not clear what a "pending insertion" means or why we have this. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@113 PS6, Line 113: RowBatch* how about using unique_ptr since this owns the row batch (until it's transferred to current_batch_)? http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@127 PS6, Line 127: queue> blocked_senders_; given that a single soft limit is imposed across all sender queues, does it make sense that the blocked_senders_ are maintained per sender? Why don't we maintain a single blocked_senders_ list per datastream recvr? Hmm, I guess we need to know if this sender has a blocked sender in GetBatch(). But given the single limit, it seems wrong that one sender's row batches can bypass another sender once we get into the blocked sender situation. i.e. the flow of batches across senders seems quite different depending on when the limit was reached. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@231 PS6, Line 231: proto_batch update http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@322 PS6, Line 322: payload->rpc_context->RespondSuccess(); doing this in Close() goes against the paradigm that Close() is only about releasing (local) resources. We've been going that way because there might be no where to bubble up a status from Close(). At least RespondSuccess() doesn't return a status, I suppose. But is there any place sooner we could do this? Does it make sense to do in during Cancel instead? -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Gerrit-Change-Number: 8023 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Thu, 02 Nov 2017 00:10:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8215 ) Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/8215/5/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/8215/5/be/src/util/runtime-profile-counters.h@327 PS5, Line 327:int64_t prev = 0L; : for (const Event& event: events_) { rather than creating a tmp eventlist_sorted, you should create a temp for this lamba (you can use "auto" for that), since it's used twice and must be consistent in order for this code to make sense. http://gerrit.cloudera.org:8080/#/c/8215/5/be/src/util/runtime-profile-counters.h@330 PS5, Line 330: = false; our style uses ! rather than == false http://gerrit.cloudera.org:8080/#/c/8215/4/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/8215/4/be/src/util/runtime-profile-counters.h@326 PS4, Line 326: bool eventlist_sorted = true; : int64_t prev = 0L; : for (const Event& event: events_) { : if (event.second < prev) { : eventlist_sorted = false; : break; : } : prev = event.second; : } : > Since the list is sorted in most of the times calling sort() may be expensi the flipside is that this becomes a cliff in the sense that normally it will be fast, but every once in a while (when the race occurs), we'll pay the cost of sort(). If we always call sort(), then the performance is more predictable. I don't think the event list will ever be so long that this matters one way or another, so simplest seems best. That said, I don't feel strongly if you really want to keep the check. -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 4 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Wed, 01 Nov 2017 22:31:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 6: (14 comments) Note to self: remaining files: krpc-data-stream-{mgr,recvr}.cc http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/exec-env.cc File be/src/runtime/exec-env.cc: http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/exec-env.cc@89 PS6, Line 89: "Number of datastream service processing threads"); how are these defaults chosen? http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h File be/src/runtime/row-batch.h: http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@50 PS6, Line 50: outbound I think we should say something about KRPC to at least give that hint. maybe: A KRPC outbound row batch... http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@60 PS6, Line 60: sizeof(int32_t) sizeof(tuple_offsets_[0]) seems clearer and more robust http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@354 PS6, Line 354: /// it is ignored. This function does not Reset(). we should preserve this comment when removing the thrift variant. So you could just put the new decl here now so we don't forget that. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@424 PS6, Line 424: /// nit: i don't think we generally have all these line breaks between parameter comments. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@426 PS6, Line 426: . delete space http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@444 PS6, Line 444: nput_ delete http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@447 PS6, Line 447: input_ delete http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.h@537 PS6, Line 537: std::string compression_scratch_; this seems like a hack and we could do something simpler, but let's leave it alone for now. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.cc File be/src/runtime/row-batch.cc: http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/row-batch.cc@241 PS6, Line 241: // as sidecars to the RpcController. this comment was probably meant to be deleted? http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/data_stream_service.proto File common/protobuf/data_stream_service.proto: http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/data_stream_service.proto@29 PS6, Line 29: fragment isn't this the id of the instance? The comment in KrpcDataStreamSender is clearer, let's copy that: /// Sender instance id, unique within a fragment. int sender_id_; http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/data_stream_service.proto@59 PS6, Line 59: // Id of this fragment in its role as a sender. same http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto File common/protobuf/row_batch.proto: http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto@32 PS3, Line 32: = 2; > That's the tuple data sent as sidecar. Clarified in the new comments. My point is that writing it like 'tuple_data' doesn't make sense since it's not a field in this struct. You should just write: Size of the tuple data (sent as a sidecar) in bytes ... http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/row_batch.proto File common/protobuf/row_batch.proto: http://gerrit.cloudera.org:8080/#/c/8023/6/common/protobuf/row_batch.proto@32 PS6, Line 32: epeated int32 row_tuples = 2; why is this needed? i don't see it used. The size of it is used, though it seems like even that can be inferred from the descriptors. -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Gerrit-Change-Number: 8023 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 01 Nov 2017 21:48:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc File be/src/exec/hdfs-text-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc@717 PS2, Line 717: byte_buffer_last_byte_ > The byte_buffer_read_size_ check above prevents using a stale value from a Oops, didn't read line 712 ;) That does seem like a possibility, and seems worth addressing. But is there a way we can exercise that? -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 01 Nov 2017 21:20:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc File be/src/exec/hdfs-text-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc@717 PS2, Line 717: byte_buffer_last_byte_ given that this doesn't happen when readsize is 0, is it possible to use a stale value here? -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 01 Nov 2017 20:49:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2494: Support for byte array encoded decimals in Parquet scanner
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7822 ) Change subject: IMPALA-2494: Support for byte array encoded decimals in Parquet scanner .. Patch Set 7: > Looks good to me and it looks like you addressed Dan's comments. > Dan, did you want to take another look? No, I don't plan to look through the details. How confident are we that we have sufficient test coverage with the added tests? If we're confident then, Tim, please do the +2 (or ask for a second review from another reviewer if you think it's warranted). -- To view, visit http://gerrit.cloudera.org:8080/7822 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2c0e881045109f337fecba53fec21f9cfb9e619e Gerrit-Change-Number: 7822 Gerrit-PatchSet: 7 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 01 Nov 2017 20:40:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8215 ) Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/8215/4/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/8215/4/be/src/util/runtime-profile-counters.h@326 PS4, Line 326: bool eventlist_sorted = true; : int64_t prev = 0L; : for (const Event& event: events_) { : if (event.second < prev) { : eventlist_sorted = false; : break; : } : prev = event.second; : } : why bother with this? the list is pretty short and this isn't a critical path, so I think it should be fine to always call sort(). (and if you were to do that, you might as well use std::is_sorted()) http://gerrit.cloudera.org:8080/#/c/8215/4/be/src/util/runtime-profile-counters.h@336 PS4, Line 336: [](Event const &event1, nit: how about breaking the line before the [] instead of in the middle of the lambda param list? -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 4 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Wed, 01 Nov 2017 20:17:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5564: Release lock during planning. (wip)
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8434 ) Change subject: IMPALA-5564: Release lock during planning. (wip) .. Patch Set 1: Also, besides the RPCs, let's consider how to handle the webserver cancelation path. Maybe we need to not show the cancel link until we can actually cancel? -- To view, visit http://gerrit.cloudera.org:8080/8434 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e3fc4c28d7a5ad8546d48bcd22c03fbce502e5b Gerrit-Change-Number: 8434 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 01 Nov 2017 17:56:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5564: Release lock during planning. (wip)
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8434 ) Change subject: IMPALA-5564: Release lock during planning. (wip) .. Patch Set 1: (2 comments) I think this path is worth continuing down. The alternate solution would be to have a profile copy that can be returned before the actual profile is ready, but if releasing the lock works out, that seems simplest. We should check how this minimal profile looks in CM -- there might be some other fields that they require? http://gerrit.cloudera.org:8080/#/c/8434/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8434/1//COMMIT_MSG@17 PS1, Line 17: whereas the web UI would previously block on the lock. > Looking over https://gerrit.cloudera.org/c/6707/12/be/src/service/impala-se Right, this JIRA is a follow up to 1972 to improve things further so that the (partial) query profile can be retrieved. http://gerrit.cloudera.org:8080/#/c/8434/1/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/8434/1/be/src/service/client-request-state.h@342 PS1, Line 342: is_planning_ maybe make that is_planning_finished_ (or similar), and initialize it to false and just set to true after planning? Actually, since the goal is to guard against other RPCs to execute until the handle is returned, maybe just make it is_execute_stmt_finished? -- To view, visit http://gerrit.cloudera.org:8080/8434 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e3fc4c28d7a5ad8546d48bcd22c03fbce502e5b Gerrit-Change-Number: 8434 Gerrit-PatchSet: 1 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 01 Nov 2017 17:54:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6136: Part 1: Query duration should not be normally negative.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8430 ) Change subject: IMPALA-6136: Part 1: Query duration should not be normally negative. .. Patch Set 2: Code-Review+2 Please update the comment as suggested by Phil. -- To view, visit http://gerrit.cloudera.org:8080/8430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib6f971ebd5f0f00f3e38df0495692ffe9d52ef90 Gerrit-Change-Number: 8430 Gerrit-PatchSet: 2 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Wed, 01 Nov 2017 16:49:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 6: (37 comments) Next batch. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h File be/src/runtime/krpc-data-stream-mgr.h: http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h@103 PS6, Line 103: processed deserialized? http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h@104 PS6, Line 104: respectively what is this "respectively" referring to? http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h@249 PS6, Line 249: proto batch unclear what that means, maybe stale comment? And this should say something about the row batch being contained in 'request'. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h@258 PS6, Line 258: the actual row batch isn't that part of "memory pointed to by 'request'"? If so and you want to explicitly mention row batch, maybe say "including the serialized row batch"? http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h@294 PS6, Line 294: typedef std::unique_ptr DeserializeWorkItem; How about getting rid of this typedef? The code seems easier to understand if the unique_ptr is visible in the fn decls. it's a bit harder than necessary to reasonable about DeserializeWorkItem& and &&, given that this is now directly a unique_ptr. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h@428 PS6, Line 428: senders a sender http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.h File be/src/runtime/krpc-data-stream-recvr.h: http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.h@119 PS6, Line 119: 'row_batch' row batch is deserialized ('row_batch' isn't a variable in this context) http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.h File be/src/runtime/krpc-data-stream-sender.h: http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.h@57 PS6, Line 57: per_channel_buffer_size' is the buffer size in bytes allocated to each channel's : /// accumulated row batch. still not clear what that means. This isn't really the size of a buffer, is it? How about something like: ... is a soft limit on the buffering, in bytes, into the channel's accumulating row batch before it will be sent. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.h@111 PS6, Line 111: cached protobuf serialized http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.h@130 PS6, Line 130: cached proto outbound row http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.h@133 PS6, Line 133: Two OutboundRowBatch reused across RPCs Maybe say: The outbound row batches are double-buffered. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@67 PS6, Line 67: can it looks like there's a third interface now: SerializeAndSendBatch() that takes an Impala rowbatch. But now that we have the outbound batch on the sender, why not just use that for the RANDOM case and do SerializeBatch() then TransmitData(), so that we can simplify the abstraction? http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@81 PS6, Line 81: TearDown Teardown() http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@82 PS6, Line 82: TearDown Teardown() http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@126 PS6, Line 126: . or if the preceding RPC failed. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@141 PS6, Line 141: // Flushes any buffered row batches and sends the EOS RPC to close the channel. Returns error status if... Also indicate that it blocks until the EOS RPC is complete. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@157 PS6, Line 157: below delete. If that's all we need it for, maybe just remember the capacity? http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@166 PS6, Line 166: RuntimeState* runtime_state_ = nullptr; is that actually used? http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-sender.cc@176 PS6, Line 176: current_outbound_batch_ the name of this is confusing because it's so similar to current_batch_idx_, but it means something different (and often the opposite). How about calling this: rpc_outbound_batch_ or rpc_in_flight_batch_? You could even get rid of the r
[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8267 ) Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding .. Patch Set 12: > Dan, did you want to have a look or should I find someone else to > +2? I plan to at least look at the new class abstractions, but maybe not all the implementation details. -- To view, visit http://gerrit.cloudera.org:8080/8267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6 Gerrit-Change-Number: 8267 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 31 Oct 2017 20:19:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4964: Fix Decimal modulo overflow
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8329 ) Change subject: IMPALA-4964: Fix Decimal modulo overflow .. Patch Set 3: Tim, do you want a second pair of eyes on this? If not, could you do the +2 review for this? -- To view, visit http://gerrit.cloudera.org:8080/8329 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5420201d4440d421e33e443df005cdcc16b8a6cd Gerrit-Change-Number: 8329 Gerrit-PatchSet: 3 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Tue, 31 Oct 2017 18:23:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8172 ) Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq .. Patch Set 8: (3 comments) Michael, can you do the +2 for this one too? http://gerrit.cloudera.org:8080/#/c/8172/8/be/src/exec/hdfs-scan-node.cc File be/src/exec/hdfs-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/8172/8/be/src/exec/hdfs-scan-node.cc@238 PS8, Line 238: nit: a lot of unnecessary blank lines. maybe condense a bit so more code fits on a screen. http://gerrit.cloudera.org:8080/#/c/8172/8/be/src/exec/scanner-context.h File be/src/exec/scanner-context.h: http://gerrit.cloudera.org:8080/#/c/8172/8/be/src/exec/scanner-context.h@a184 PS8, Line 184: nice to see this get simplified http://gerrit.cloudera.org:8080/#/c/8172/8/be/src/runtime/row-batch.h File be/src/runtime/row-batch.h: http://gerrit.cloudera.org:8080/#/c/8172/8/be/src/runtime/row-batch.h@a215 PS8, Line 215: great to see this go -- To view, visit http://gerrit.cloudera.org:8080/8172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941 Gerrit-Change-Number: 8172 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 31 Oct 2017 18:21:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 6: (5 comments) Some initial comments for this round. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/exec/kudu-util.h File be/src/exec/kudu-util.h: http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/exec/kudu-util.h@45 PS6, Line 45: status_ nit: use same mangling between this and KUDU_RETURN_IF_ERROR (prepend seems better too since it's not a member of a class and status_ is a comment member name). http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h File be/src/runtime/krpc-data-stream-mgr.h: http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-mgr.h@97 PS6, Line 97: two things: : /// process it immediately, add it to a fixed-size 'batch queue' for later processing, : /// or defer processing the RPC if the 'batch-queue' is full i'm confused what the "two" cases are from this comment. Also, I think it's kind of confusing what "processing" means. Should it read something like: two things: deserialize it immediately adding it to the a 'batch queue', or defer deserializing and respond to the RPC later if the 'batch queue' is full. Also, the batch queue isn't really "fixed size", right? http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@171 PS6, Line 171: if (!blocked_senders_.empty()) { we talked about this in person, but want to note it so I don't forget: this loop will deque a "random" number of blocked senders, until the first one happens to finish deserializing and shows up in batch_queue_. That seems wrong. http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@220 PS6, Line 220: DCHECK_GT(num_remaining_senders_, 0); why do we have this DCHECK (i.e. why is this condition important here), and could it be violated with out of order RPCs? http://gerrit.cloudera.org:8080/#/c/8023/6/be/src/runtime/krpc-data-stream-recvr.cc@237 PS6, Line 237: recvr_->ExceedsLimit(batch_size) it seems like we should be ORing that with !block_senders_.empty(), or something. Otherwise, stuff that's been in the block_senders_ queue for a while can be passed by new stuff. i.e. block_senders_ senders can get starved -- maybe that explains longer than expected RPC times we've seen in some cases? -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Gerrit-Change-Number: 8023 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Mon, 30 Oct 2017 19:35:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8303 ) Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. Patch Set 14: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/8303/14/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: http://gerrit.cloudera.org:8080/#/c/8303/14/be/src/runtime/runtime-state.cc@87 PS14, Line 87: profile_(RuntimeProfile::Create(obj_pool(), "")) { maybe explain that this is the test constructor -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 22:21:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5307: Part 2: copy out strings in uncompressed Avro
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8146 ) Change subject: IMPALA-5307: Part 2: copy out strings in uncompressed Avro .. Patch Set 14: Code-Review+1 Thanks. This approach looks okay to me, please see if Michael can do the +2 review. -- To view, visit http://gerrit.cloudera.org:8080/8146 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If1fc78790d778c874f5aafa5958c3c045a88d233 Gerrit-Change-Number: 8146 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 20:33:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8303 ) Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. Patch Set 11: (3 comments) http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc File be/src/runtime/mem-tracker.cc: http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc@359 PS11, Line 359: was not available > We do call it on the process MemTracker in QueryState::Init(). Okay, but my original question was whether the comment should be reworded for that case: ... if this tracker is part of the query hierarchy. i.e. "not available" seems misleading now that we don't get this from 'state' (which is the thing that wasn't always available). http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.cc@95 PS11, Line 95: if (!released_exec_resources_) ReleaseExecResources(); > My initial concern was that handling it in Init() would get messy in the ca What I proposed was not to handle it in Init(), but let the caller handle it. i.e. Init() always gets a resource ref count and QueryExecMgr::StartQuery() has to handle the error. It already has to handle this and other errors and release ref counts if it won't be passing them along... Is there a reason to not just do that? http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/runtime-state.cc@88 PS11, Line 88: local_query_state_->AcquireExecResourceRefcount(); // Decremented in ReleaseResources(). > The resources used for evaluating exprs, etc - this is used in various test Why are exvaluating exprs query-wide resources. Aren't they per-thread resources? -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 27 Oct 2017 19:37:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5307: Part 2: copy out strings in uncompressed Avro
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8146 ) Change subject: IMPALA-5307: Part 2: copy out strings in uncompressed Avro .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/8146/13/be/src/runtime/tuple.h File be/src/runtime/tuple.h: http://gerrit.cloudera.org:8080/#/c/8146/13/be/src/runtime/tuple.h@175 PS13, Line 175: /// Materialize the var-len string data in this tuple into the provided memory : /// pool. the name and the comment didn't make it clear that this also rewrites this tuple. i.e. in that sense, this is different than MaterializeExprs(). In this case, the slots (and strings) are already materialized -- it's re-materializing (or copying) the strings, right? -- To view, visit http://gerrit.cloudera.org:8080/8146 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If1fc78790d778c874f5aafa5958c3c045a88d233 Gerrit-Change-Number: 8146 Gerrit-PatchSet: 13 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Oct 2017 23:46:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8303 ) Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. Patch Set 11: (7 comments) http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc File be/src/runtime/mem-tracker.cc: http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc@328 PS11, Line 328: RuntimeState* state definitely not this change, but should we just get rid of this parameter. do we really need to explicitly do the LogError() below? If we still rely on that then it seems we should fix status propagation. http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/mem-tracker.cc@359 PS11, Line 359: was not available does that comment still make sense? isn't the query_tracker==nullptr case now mean that it's part of the query hierarchy? though should we even call this function in that case? http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.h@205 PS11, Line 205: backend what do we mean by "backend" here? is that the same thing as "execution"? http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.h@206 PS11, Line 206: Resources for this query How about: Query wide resources ... since some FIS-local resources are also resources used for this query but those should be owned and released solely by the FIS thread. http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.h@258 PS11, Line 258: backend same question http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/query-state.cc@95 PS11, Line 95: if (!released_exec_resources_) ReleaseExecResources(); we've been trying to move away from doing resource releasing from destructors, to make the lifetimes super clear. shouldn't this case just be handled in the caller of Init() when !status.ok() (which already has to handle releasing the QS ref count, and the thread creation path already handles this case. we could similarly release the resource ref count if we acquired that sooner before Init can fail). http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: http://gerrit.cloudera.org:8080/#/c/8303/11/be/src/runtime/runtime-state.cc@88 PS11, Line 88: local_query_state_->AcquireExecResourceRefcount(); // Decremented in ReleaseResources(). why is that needed? what resource needs to be held by the runtime state in this case? -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Oct 2017 23:26:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8303 ) Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/8303/10/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/8303/10/be/src/runtime/coordinator.cc@1095 PS10, Line 1095: query_state_ != nullptr) > That would require that query_state_ was not initialised but we acquired a Sure, so what you are saying is that query_state_ being not null is not an invariant we are trying to maintain -- we expect to be able to run the cancel code even if it didn't get so far as creating a query state. The destructor code already says that basically. So, yes, let's leave this. -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Oct 2017 22:30:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8303 ) Change subject: IMPALA-1575: Part 1: eagerly release query exec resources .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/8303/10/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/8303/10/be/src/runtime/coordinator.cc@1095 PS10, Line 1095: query_state_ != nullptr) > Your reasoning appears to be correct. My tendency is to keep the null check But if the query_state_ were null, then we wouldn't release the resources, which in itself is a bug. So, wouldn't it be better to know that such a bug existed? -- To view, visit http://gerrit.cloudera.org:8080/8303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341 Gerrit-Change-Number: 8303 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Oct 2017 22:05:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2494: Support for byte array encoded decimals in Parquet scanner
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7822 ) Change subject: IMPALA-2494: Support for byte array encoded decimals in Parquet scanner .. Patch Set 6: (3 comments) http://gerrit.cloudera.org:8080/#/c/7822/6/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: http://gerrit.cloudera.org:8080/#/c/7822/6/be/src/exec/parquet-column-readers.cc@207 PS6, Line 207: InternalType, parquet::Type::type PARQUET_TYPE you should explain these template parameters http://gerrit.cloudera.org:8080/#/c/7822/6/be/src/exec/parquet-common.h File be/src/exec/parquet-common.h: http://gerrit.cloudera.org:8080/#/c/7822/6/be/src/exec/parquet-common.h@88 PS6, Line 88: InternalType let's explain what that is http://gerrit.cloudera.org:8080/#/c/7822/6/be/src/exec/parquet-common.h@177 PS6, Line 177: static int DecodeByParquetType(const uint8_t* buffer, const uint8_t* buffer_end, int fixed_len_size, nit: long line -- To view, visit http://gerrit.cloudera.org:8080/7822 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2c0e881045109f337fecba53fec21f9cfb9e619e Gerrit-Change-Number: 7822 Gerrit-PatchSet: 6 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Oct 2017 21:42:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2758: Remove BufferedTupleStream::GetRows
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8226 ) Change subject: IMPALA-2758: Remove BufferedTupleStream::GetRows .. Patch Set 4: Tim, can you do the +2 review for this one? -- To view, visit http://gerrit.cloudera.org:8080/8226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414 Gerrit-Change-Number: 8226 Gerrit-PatchSet: 4 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 26 Oct 2017 17:49:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8202 ) Change subject: IMPALA-4704: Turns on client connections when local catalog initialized. .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/8202/12/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8202/12/be/src/service/impala-server.cc@2045 PS12, Line 2045: if (thrift_be_server_.get()) { > working on them... starting/stopping cluster processes does not interact we Have you tried a custom_cluster test? -- To view, visit http://gerrit.cloudera.org:8080/8202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 Gerrit-Change-Number: 8202 Gerrit-PatchSet: 12 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 25 Oct 2017 17:27:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 3: (16 comments) Some more comments, still going though. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@75 PS3, Line 75: cached see comment in row-batch.h about this terminology. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@89 PS3, Line 89: // safe to free them once the callback has been invoked. I think we should add a reference to KUDU-2011 somewhere here like: Note that due to KUDU-2011, timeout cannot be used with outbound sidecars. The client has no idea when it is safe to reclaim the sidecar buffer (~RpcSidecar() should be the right place, except that's currently called too early). RpcController::Cancel(), however, ensures that the callback is called only after the RPC layer no longer references the sidecar buffer. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@92 PS3, Line 92: query query? does it mean fragment instance? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@124 PS3, Line 124: Shutdown the RPC thread is that still accurate? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@139 PS3, Line 139: int buffer_size_; that could use a comment. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@141 PS3, Line 141: const TNetworkAddress address_; : TUniqueId fragment_instance_id_; : PlanNodeId dest_node_id_; those could be commented together to say they identify the destination. it's a little odd that plan node id is prefixed "dest" when the others are not. it also seems weird that we need both these and the req_ field since shouldn't they just be stored there? Or seems we should get rid of the req_ and just generate it when sending. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@159 PS3, Line 159: num_cached_proto_batches_ caps since constant? also, can this ever be something other than 2 without writing the code? i.e. doesn't the code assume this value is 2 in various ways? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@175 PS3, Line 175: proxy proxy_ http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@197 PS3, Line 197: bool remote_recvr_closed_ = false; why is that needed now? also, shouldn't we do something different, at a higher level, in that case (like cancel this instance)? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@200 PS3, Line 200: Status rpc_status_; I think it would help to associate this with rpc_in_flight_ - move them adjacent and say that rpc_status_ is valid only when rpc_in_flight_ is false, or something. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@218 PS3, Line 218: 2 if we have the constant, shouldn't that use it? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@334 PS3, Line 334: auto pred = [this]() -> bool { return !rpc_in_flight_ || ShouldTerminate(); }; : auto timeout = std::chrono::system_clock::now() + milliseconds(50); : while (!rpc_done_cv_.wait_until(*lock, timeout, pred)) { : timeout = system_clock::now() + milliseconds(50); : } seems simpler to just write: while (rpc_in_flight_ && !ShouldTerminate()) { auto timeout = std::chrono::system_clock::now() + milliseconds(50); rpc_done_cv_.wait_until(*lock, timeout); } or even better to use wait_for() which takes the relative timeout. Or should we use our ConditionVariable wrapper? Especially if we want to start instrumenting these things better. But if it's work to switch it over, it's okay to keep it condition_variable, but let's at least make the code more straight forward. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/row-batch.h File be/src/runtime/row-batch.h: http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/row-batch.h@52 PS3, Line 52: struct ProtoRowBatch { I think we should get rid of this structure all together. IMO, the indirection just adds confusion. On the receive side, it seems we can just get the header and sidecars directly from the request, which is already threaded through the RPC handler anyway. Pulling it into a ProtoRowBatch just makes it unclear where the not yet deserialized rowbatch comes from. On the send side, I think we should just work directly on CachedProtoRowBatch (but rename that thing, see below). The indirect
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8215 ) Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/8215/2/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/8215/2/be/src/util/runtime-profile.cc@658 PS2, Line 658: event_sequence.second->GetEvents(&events); : if (last == 0 && events.size() > 0) last = events.back().second; that looks like it could be inconsistent with the sorted order http://gerrit.cloudera.org:8080/#/c/8215/2/be/src/util/runtime-profile.cc@672 PS2, Line 672: runtime-profile-test.cc has a test case that expects GetEvents() to be in order. that case isn't correct if we are saying that GetEvents() returns events unsorted. Also, the comment for events_ says the events are in sorted order. I think we should either: (a) maintain events in sorted order, or (b) do this sort inside GetEvents() either way the property is that GetEvents() always returns events in increasing time order, which seems like the most straightforward API. -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 2 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Wed, 25 Oct 2017 16:46:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8202 ) Change subject: IMPALA-4704: Turns on client connections when local catalog initialized. .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/8202/12/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8202/12/be/src/service/impala-server.cc@2045 PS12, Line 2045: if (thrift_be_server_.get()) { > I wonder if this can cause problems. A coordinator+executor impalad might h Yeah, we should make sure we have test cases for this and the various coord/exec combinations. -- To view, visit http://gerrit.cloudera.org:8080/8202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 Gerrit-Change-Number: 8202 Gerrit-PatchSet: 12 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 24 Oct 2017 20:26:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5018: Error on decimal modulo or divide by zero
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8344 ) Change subject: IMPALA-5018: Error on decimal modulo or divide by zero .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8344 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If7a7131e657fcdd293ade78d62f851dac0f1e3eb Gerrit-Change-Number: 8344 Gerrit-PatchSet: 4 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Zach Amsden Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Tue, 24 Oct 2017 20:20:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5957: print memory address, not memory
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8371 ) Change subject: IMPALA-5957: print memory address, not memory .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8371 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I89250646bf683dd2d636dcb37a66ceb0428af8b2 Gerrit-Change-Number: 8371 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Tue, 24 Oct 2017 20:18:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5018: Error on decimal modulo or divide by zero
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8344 ) Change subject: IMPALA-5018: Error on decimal modulo or divide by zero .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8344/3/be/src/exprs/decimal-operators-ir.cc File be/src/exprs/decimal-operators-ir.cc: http://gerrit.cloudera.org:8080/#/c/8344/3/be/src/exprs/decimal-operators-ir.cc@729 PS3, Line 729: divide > In one of the previous comments, Vuk mentioned that Postgres returns an ide I'm fine with leaving it as is. -- To view, visit http://gerrit.cloudera.org:8080/8344 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If7a7131e657fcdd293ade78d62f851dac0f1e3eb Gerrit-Change-Number: 8344 Gerrit-PatchSet: 3 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Zach Amsden Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Tue, 24 Oct 2017 20:12:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8305 ) Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src. .. Patch Set 10: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8305 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c Gerrit-Change-Number: 8305 Gerrit-PatchSet: 10 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Tue, 24 Oct 2017 17:27:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8305 ) Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src. .. Patch Set 9: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8305 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c Gerrit-Change-Number: 8305 Gerrit-PatchSet: 9 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Mon, 23 Oct 2017 19:17:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8202 ) Change subject: IMPALA-4704: Turns on client connections when local catalog initialized. .. Patch Set 11: Code-Review+1 (1 comment) BE looks fine. http://gerrit.cloudera.org:8080/#/c/8202/11/be/src/service/frontend.cc File be/src/service/frontend.cc: http://gerrit.cloudera.org:8080/#/c/8202/11/be/src/service/frontend.cc@242 PS11, Line 242: Status Frontend::SetCatalogInitialized() { : JNIEnv* jni_env = getJNIEnv(); : JniLocalFrame jni_frame; : RETURN_IF_ERROR(jni_frame.push(jni_env)); : jni_env->CallObjectMethod(fe_, set_catalog_initialized_id_); : RETURN_ERROR_IF_EXC(jni_env); : return Status::OK(); : } is that needed? could it be handled entirely within InProcessImpalaServer? Maybe orthogonal to this change to so okay to defer. -- To view, visit http://gerrit.cloudera.org:8080/8202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 Gerrit-Change-Number: 8202 Gerrit-PatchSet: 11 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 23 Oct 2017 19:01:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8305 ) Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src. .. Patch Set 8: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/8305/8/be/src/util/time.h File be/src/util/time.h: http://gerrit.cloudera.org:8080/#/c/8305/8/be/src/util/time.h@106 PS8, Line 106: /// Convenience function to return the date-time string of the current time : /// derived from UnixMicros(). how about we mention that this is in the local time zone. -- To view, visit http://gerrit.cloudera.org:8080/8305 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c Gerrit-Change-Number: 8305 Gerrit-PatchSet: 8 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Mon, 23 Oct 2017 18:33:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5018: Error on decimal modulo or divide by zero
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8344 ) Change subject: IMPALA-5018: Error on decimal modulo or divide by zero .. Patch Set 3: (4 comments) http://gerrit.cloudera.org:8080/#/c/8344/3/be/src/exprs/decimal-operators-ir.cc File be/src/exprs/decimal-operators-ir.cc: http://gerrit.cloudera.org:8080/#/c/8344/3/be/src/exprs/decimal-operators-ir.cc@729 PS3, Line 729: divide this path will also be taken for modulo. Should the message take that into account to avoid confusion? http://gerrit.cloudera.org:8080/#/c/8344/3/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/8344/3/be/src/exprs/expr-test.cc@2264 PS3, Line 2264: { true, false, 0, 38, 19 }}}, it would be good to test also 10.0 / 0 or cast(10.0 as decimal) / 0 (i.e. where divisor is integer not decimal). I don't remember what type that expressino would have had in decimal v1 though. Added a comment to pytest about this too, which might be the easier place to verify. http://gerrit.cloudera.org:8080/#/c/8344/3/be/src/exprs/expr-test.cc@2506 PS3, Line 2506: 0/0 it's a bit weird/confusing that we don't do the same thing for integer divide by 0, but I guess it's out of scope. http://gerrit.cloudera.org:8080/#/c/8344/3/testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test File testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test: http://gerrit.cloudera.org:8080/#/c/8344/3/testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test@46 PS3, Line 46: cast is that cast necessary? Let's also verify you get the error without the cast there. -- To view, visit http://gerrit.cloudera.org:8080/8344 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If7a7131e657fcdd293ade78d62f851dac0f1e3eb Gerrit-Change-Number: 8344 Gerrit-PatchSet: 3 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Mon, 23 Oct 2017 16:49:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5018: Error on decimal modulo or divide by zero
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8344 ) Change subject: IMPALA-5018: Error on decimal modulo or divide by zero .. Patch Set 3: > I spoke to Greg and Alex yesterday, and we agreed that decimal_v2 > should always be in strict mode. This means that when decimal_v2 is > enabled, we should always error on overflows and division by zero. > This is the behavior that more traditional databases have. I agree with that. -- To view, visit http://gerrit.cloudera.org:8080/8344 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If7a7131e657fcdd293ade78d62f851dac0f1e3eb Gerrit-Change-Number: 8344 Gerrit-PatchSet: 3 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Mon, 23 Oct 2017 16:32:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5018: Error on decimal modulo or divide by zero
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8344 ) Change subject: IMPALA-5018: Error on decimal modulo or divide by zero .. Patch Set 2: (1 comment) In the JIRA, Greg made a comment about tying this to strict mode. Did we reach consensus on that with him? http://gerrit.cloudera.org:8080/#/c/8344/2/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/8344/2/be/src/exprs/expr-test.cc@1561 PS2, Line 1561: expected null, scaled_val, precision, scale for V1 : //expected null, scaled_val, precision, scale for V2 }} that needs updating -- To view, visit http://gerrit.cloudera.org:8080/8344 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If7a7131e657fcdd293ade78d62f851dac0f1e3eb Gerrit-Change-Number: 8344 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 20 Oct 2017 22:37:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8305 ) Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src. .. Patch Set 4: What about common-metrics.cc? And then, I think we should move TimestampValue::{LocalTime(), UtcTime()} to expr-test.cc so that it's not even part of the TimestampValue interface. -- To view, visit http://gerrit.cloudera.org:8080/8305 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c Gerrit-Change-Number: 8305 Gerrit-PatchSet: 4 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Thu, 19 Oct 2017 19:57:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8305 ) Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src. .. Patch Set 2: Code-Review+1 (1 comment) Code change looks good. How did you test it? Please add a Testing section to commit message indicating that. Please at least manually inspect each affected field by e.g. looking at the profile. http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-http-handler.cc@303 PS1, Line 303: int64_t duration_us = record.end_time_us - record.start_time_us; > Please see my response to DanH's comment... Using monotonic clock for durations is generally the right thing to do, but given we also print the start and end times in this case, let's leave it alone since the Print() function handles it in a reasonable way, -- To view, visit http://gerrit.cloudera.org:8080/8305 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c Gerrit-Change-Number: 8305 Gerrit-PatchSet: 2 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Wed, 18 Oct 2017 23:35:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Allow the SASL protocol service name to be configurable
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8230 ) Change subject: Allow the SASL protocol service name to be configurable .. Patch Set 2: Code-Review+2 Okay, thats fine. We can revisit if these conflicts cause future work. -- To view, visit http://gerrit.cloudera.org:8080/8230 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e Gerrit-Change-Number: 8230 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 18 Oct 2017 18:42:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8085 ) Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet .. Patch Set 8: Code-Review+2 (1 comment) Yes, this is good with me. Could you get Mostafa to sign off on perf implications, perhaps recording that in the JIRA if not already (issues.apache.org not loading for me)? http://gerrit.cloudera.org:8080/#/c/8085/8/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: http://gerrit.cloudera.org:8080/#/c/8085/8/be/src/exec/parquet-column-readers.cc@983 PS8, Line 983: data buffers? -- To view, visit http://gerrit.cloudera.org:8080/8085 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2 Gerrit-Change-Number: 8085 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Oct 2017 18:16:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5599: Clean up references to TimestampValue in be/src/service.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8305 ) Change subject: IMPALA-5599: Clean up references to TimestampValue in be/src/service. .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/client-request-state.h@341 PS1, Line 341: time indicate the clock. e.g. monotonic vs unix time. http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-http-handler.cc@303 PS1, Line 303: int64_t duration_us = record.end_time_us - record.start_time_us; that can be negative if settimeofday() occurred in the mean time. What should happen in that case? http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-server.h@622 PS1, Line 622: /// Start and end time of the query same comment about clock http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-server.h@640 PS1, Line 640: Unix milliseconds like that http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8305/1/be/src/service/impala-server.cc@1009 PS1, Line 1009: int64_t duration_ms = duration_us / MICROS_PER_MILLI; same question (though old code could go negative). -- To view, visit http://gerrit.cloudera.org:8080/8305 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I642a1d713597826bb7c15cd2ecb6638cb813a02c Gerrit-Change-Number: 8305 Gerrit-PatchSet: 1 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Comment-Date: Tue, 17 Oct 2017 23:42:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8215 ) Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8215/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8215/1//COMMIT_MSG@22 PS1, Line 22: value. > In the thread doing ClientRequestState::ClientRequestState() There is only one call to AddEventSequence("Query Timeline") per query though, right? so how does another thread operate on that event_sequence_ before the constructor of CRS() returns? Are you able to reproduce this problem? If so, can you change your if-stmt to a DCHECK() and put the backtrace in the JIRA when the DCHECK fires. -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 1 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Tue, 17 Oct 2017 23:30:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8215 ) Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. Patch Set 1: > (1 comment) But who is the caller of MarkEvent()? How can that be happening in concurrently with the construction of the ClientRequestState()? No one would have a reference to it yet... -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 1 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Tue, 17 Oct 2017 16:05:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5142 EventSequence displays negative elapsed time.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8215 ) Change subject: IMPALA-5142 EventSequence displays negative elapsed time. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8215/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8215/1//COMMIT_MSG@22 PS1, Line 22: value. but the code isn't synchronized at all, so you could (in theory) read a garbage start_ if it's being modified concurrently. What are the callsites of Start() and ElapsedTime() in question? -- To view, visit http://gerrit.cloudera.org:8080/8215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c944396d96473b17b453da3e913ffc56680a896 Gerrit-Change-Number: 8215 Gerrit-PatchSet: 1 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dan Hecht Gerrit-Comment-Date: Tue, 17 Oct 2017 00:16:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Reduce log spew from rpcz store.cc
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8285 ) Change subject: Reduce log spew from rpcz_store.cc .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8285 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8efe68be0ae7f9ab05937d5c81fb53a709a881f1 Gerrit-Change-Number: 8285 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 16 Oct 2017 23:58:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7938 ) Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork .. Patch Set 8: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/7938/8/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/7938/8/be/src/rpc/authentication.cc@109 PS8, Line 109: "off a kinit process."); can you file a jira to remove this flag (and the old code) within a release or two? -- To view, visit http://gerrit.cloudera.org:8080/7938 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7 Gerrit-Change-Number: 7938 Gerrit-PatchSet: 8 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Mon, 16 Oct 2017 23:48:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 3: (34 comments) Another batch of comments... http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h File be/src/runtime/krpc-data-stream-mgr.h: http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@63 PS3, Line 63: ongoing transmission from a client to a server as a 'stream' I don't think that's accurate. see questions in krpc-data-stream-recvr.h about stream definition. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@93 PS3, Line 93: process it immediately, add it to a fixed-size 'batch queue' for later processing XXX check whether these are really different http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@164 PS3, Line 164: deferred processing is that because the recvr hasn't showed up yet, or because the stream's queue is full, or both? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@235 PS3, Line 235: node_id dest_node_id? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@239 PS3, Line 239: Ownership of the receiver is shared between this DataStream mgr instance and the : /// caller. that seems unnecessary but don't change it now. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@246 PS3, Line 246: 'proto_batch'? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@248 PS3, Line 248: . 'request'. Also document what 'response' and 'context' are. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@266 PS3, Line 266: Notifies the receiver is this an RPC handler? I think we should just be explicit about which of these methods are RPC handlers. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@267 PS3, Line 267: The RPC what RPC is this talking about? If this is a handler, then it's clear. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@274 PS3, Line 274: Closes Does it close or cancel? (or is there no difference?) http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@284 PS3, Line 284: RPCs which were buffered To be consistent with terminology used in class comment, maybe say "deferred RPCs" http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@340 PS3, Line 340: ragment instance id, 0 what is that saying? is that a misplaced comma or am I reading this wrong? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@341 PS3, Line 341: instance id changes I don't understand this. it kinda sounds like we're trying to be able to find all instances of this fragment, but then wouldn't we iterate until the fragment id changes (not the instance id)? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@349 PS3, Line 349: struct EarlySendersList { hmm, I guess we need this now that we can't block the RPC thread? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@358 PS3, Line 358: Time Monotonic time http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@374 PS3, Line 374: time monotonic time http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@382 PS3, Line 382: boost::unordered_set closed_stream_cache_; all this parallel startup stuff really needs to be revisited (but not for this change). it's too complex and brittle. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@386 PS3, Line 386: Deserialize maybe call it DeserializeDeferred() or DeserializeWorker() to make it clearer that this is only for the deferred (slow) path, since the normal path will also have to deserialize (but doesn't use this code). http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@404 PS3, Line 404: void EnqueueRowBatch(DeserializeWorkItem&& payload); how about grouping this with Deferred function above since it's related. Also, I think the name should be less generic. Like maybe EnqueueDeferredBatch() or EnqueueDeferredRpc() (does the response happen before or after this deferred work?) http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@413 PS3, Line 413: block what's that? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@414 PS3, Line 414: Status I think that status is not getting checked by the caller. I thought Tim made Status warn on unused result -- why is it not catching this? (Or do we still need to annotate each method?). http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-da
[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8255 ) Change subject: IMPALA-5940: Avoid log spew by using Status::Expected() .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8255 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3 Gerrit-Change-Number: 8255 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 19:08:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8242 ) Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62 Gerrit-Change-Number: 8242 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 18:31:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8242 ) Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8242/2/be/src/runtime/exec-env.h File be/src/runtime/exec-env.h: http://gerrit.cloudera.org:8080/#/c/8242/2/be/src/runtime/exec-env.h@186 PS2, Line 186: General I think this is meant to be a thread pool just for ExecQueryFInstances RPC (and that's what's meant by the "exec_rpc" in the name). -- To view, visit http://gerrit.cloudera.org:8080/8242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62 Gerrit-Change-Number: 8242 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 18:06:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Allow the SASL protocol service name to be configurable
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8230 ) Change subject: Allow the SASL protocol service name to be configurable .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/client_negotiation.cc File be/src/kudu/rpc/client_negotiation.cc: http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/client_negotiation.cc@111 PS1, Line 111: const boost::optional& authn_token, > There was a conflict here since we don't have this patch from the kudu code Should we just take that patch to make (this and) future cherry picks easier? -- To view, visit http://gerrit.cloudera.org:8080/8230 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e Gerrit-Change-Number: 8230 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 17:57:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8242 ) Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node .. Patch Set 1: It'd be good to update the header comment for these fields to say that they are set only for coordinator role. -- To view, visit http://gerrit.cloudera.org:8080/8242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62 Gerrit-Change-Number: 8242 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Comment-Date: Wed, 11 Oct 2017 17:53:21 + Gerrit-HasComments: No
[Impala-ASF-CR] Allow the SASL protocol service name to be configurable
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8230 ) Change subject: Allow the SASL protocol service name to be configurable .. Patch Set 1: Is this a conflict free cherry-pick from something already in Kudu codebase? -- To view, visit http://gerrit.cloudera.org:8080/8230 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e Gerrit-Change-Number: 8230 Gerrit-PatchSet: 1 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 17:44:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8255 ) Change subject: IMPALA-5940: Avoid log spew by using Status::Expected() .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/rpc/thrift-client.cc File be/src/rpc/thrift-client.cc: http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/rpc/thrift-client.cc@71 PS1, Line 71: } I think we should make it clear why we use Expected() when we do (since we don't want to normally use it). So how about a quick comment about that explaining under what condition this error is expected. http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/service/impala-server.cc@1173 PS1, Line 1173: INTERNAL_ERROR hmm, if this is expected (as the comment claims), then this shouldn't be an INTERNAL_ERROR. INTERNAL_ERROR means we've hit a bug of some sort (i.e. some invariant was violated). That's not this case. Anyway, you don't have to fix that now if you don't want. -- To view, visit http://gerrit.cloudera.org:8080/8255 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3 Gerrit-Change-Number: 8255 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 17:39:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2758: Change BufferedTupleStream::GetRows to returning multi batches
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8226 ) Change subject: IMPALA-2758: Change BufferedTupleStream::GetRows to returning multi batches .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8226/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8226/1//COMMIT_MSG@10 PS1, Line 10: which returns multiple batches instead of one. > I'm ok with the stream staying pinned, the main goal was to remove the GetR WFM, agree the important step here is to eliminate GetRows(). -- To view, visit http://gerrit.cloudera.org:8080/8226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414 Gerrit-Change-Number: 8226 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 11 Oct 2017 16:55:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 3: (17 comments) Some initial comments. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/common/status.cc File be/src/common/status.cc: http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/common/status.cc@246 PS3, Line 246: } this is the same as FromThrift() effectively, right? Can we make the two look the same to make that obvious? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/common/status.cc@262 PS3, Line 262: same comment. let's make this and ToThrift look the same so it's obvious they do the same things. nit: also, could we order the functions consistently? We currently have ToThrift, FromThrift, FromProto, ToProto, and that ordering just makes it slightly slower to read through since it doesn't follow a pattern. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/exec/exchange-node.cc File be/src/exec/exchange-node.cc: http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/exec/exchange-node.cc@109 PS3, Line 109: RETURN_IF_CANCELLED(state); why do we do this in some Open() but not all? Should we just do it in ExecNode::Open() and remove the ones in the derived classes? okay to do separately from this patch. http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/rpc/rpc-mgr.h@154 PS3, Line 154: } nit: one-liner? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h File be/src/runtime/krpc-data-stream-mgr.h: http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@94 PS3, Line 94: buffer what buffer? do you mean queue? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@108 PS3, Line 108: During ordinary operation what does that mean? Is it saying that during unordinary operation, a sender can have both a TransmitData() and EndDataStream() call in-flight simultaneously? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@140 PS3, Line 140: it will quietly drop its what are "it" and "its" here? "the sender" and "the RPC's"? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@141 PS3, Line 141: /// it returns. is that still true now that we have cancellation of RPCs? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@153 PS3, Line 153: fragment sending fragment? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@166 PS3, Line 166: request is that talking about the 'request' field below, or something different? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@175 PS3, Line 175: const TransmitDataRequestPB* request; what's the relationship between this and proto_batch? http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@178 PS3, Line 178: /// responded to. who owns it? http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/data_stream_service.proto File common/protobuf/data_stream_service.proto: http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/data_stream_service.proto@29 PS3, Line 29: optional int32 sender_id = 2; : : optional int32 dest_node_id = 3; what are "IDs" in these cases? let's improve the documentation here. Especially since type is no longer PlanNodeId (and why is that?). http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto File common/protobuf/row_batch.proto: http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto@30 PS3, Line 30: int32 in thrift we had TTupleId. Is there a reason we aren't defining those types as well to make the structure clearer? http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto@32 PS3, Line 32: tuple_data what's tuple_data? not a field in this structure... http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto@39 PS3, Line 39: Size size of what? http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto@42 PS3, Line 42: (TODO(KRPC): native enum) do we plan to fix that? -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Gerrit-Change-Number: 8023 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 10 Oct 2017 20:15:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7938 ) Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc@77 PS6, Line 77: "Only used when FLAGS_use_krpc is false"); > Currently all our users use this flag. This needs to be set when we use our Okay. So in other words, we could have fixed the old code as well, and eliminated this flag in that case. (Let's not bother with that though, given we will rip that code out soon). -- To view, visit http://gerrit.cloudera.org:8080/7938 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7 Gerrit-Change-Number: 7938 Gerrit-PatchSet: 6 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 10 Oct 2017 18:45:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7938 ) Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc@77 PS6, Line 77: "Only used when FLAGS_use_krpc is false"); in what case would a user need to set this flag? and how is that case handled in the kudu kinit case? -- To view, visit http://gerrit.cloudera.org:8080/7938 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7 Gerrit-Change-Number: 7938 Gerrit-PatchSet: 6 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 10 Oct 2017 18:35:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7938 ) Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc@842 PS6, Line 842: if (FLAGS_use_krpc_kinit) { can this path only be used with KRPC, or could it be used with thrift? If the latter, why have the flag? (Or at least, can we commit to removing the flag and the old code after one release?) -- To view, visit http://gerrit.cloudera.org:8080/7938 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7 Gerrit-Change-Number: 7938 Gerrit-PatchSet: 6 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 10 Oct 2017 18:21:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5425: Add test for validating input when setting query options
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7805 ) Change subject: IMPALA-5425: Add test for validating input when setting query options .. Patch Set 15: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7805 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7 Gerrit-Change-Number: 7805 Gerrit-PatchSet: 15 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 10 Oct 2017 17:54:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 ) Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala .. Patch Set 15: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7954 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff Gerrit-Change-Number: 7954 Gerrit-PatchSet: 15 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Tue, 10 Oct 2017 17:48:04 + Gerrit-HasComments: No