[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures
Michael Ho has uploaded a new patch set (#4). Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures .. IMPALA-3962: Clean up scratch tuple batch on scan failures The parquet scanner doesn't clean up 'scratch_batch_' properly which causes it to process a partially filled scratch_batch_ if any of the column reader fails. This change cleans up the scratch batch if any of the parquet column readers fails. The clean up involves freeing the mem_pool of scratch_batch_ and setting number of tuples in scratch_batch_ to 0. This change also extends debug action to emulate the behavior of exceeding the query's memory limit. Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5 --- M be/src/exec/exec-node.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M common/thrift/PlanNodes.thrift M tests/failure/test_failpoints.py 10 files changed, 77 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/3991/4 -- To view, visit http://gerrit.cloudera.org:8080/3991 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures
Michael Ho has posted comments on this change. Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures .. Patch Set 3: (4 comments) http://gerrit.cloudera.org:8080/#/c/3991/3/be/src/exec/exec-node.cc File be/src/exec/exec-node.cc: PS3, Line 441: Status::MemLimitExceeded(); > shoudl this use mem_tracker()->SetMemLimitExceeded() since Tim is convertin Done http://gerrit.cloudera.org:8080/#/c/3991/3/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: PS3, Line 350: column_readers_[0]->RowGroupAtEnd() > couldn't we have hit column 0's end of rowgroup, but not column 1's, which Good point. Reverted to the check in the old code. http://gerrit.cloudera.org:8080/#/c/3991/3/be/src/exec/hdfs-scanner.h File be/src/exec/hdfs-scanner.h: PS3, Line 381: current > currently Done PS3, Line 383: nodes > node's Done -- To view, visit http://gerrit.cloudera.org:8080/3991 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-CR](cdh5-trunk) IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or AVX2.
Hello Jim Apple, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3081 to look at the new patch set (#44). Change subject: IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or AVX2. .. IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or AVX2. Using SSSE3/AVX2 intrinsic to accelerate the function "static inline void ByteSwap(void* dst, const void* src, int len)" of BitUtil class, and a scalar byte-swap routine is added as fallback. Also the runtime selector for CPUs of different capacity is included, as well as performance test and data verification. Brief performance comparison is listed here: CPU: Intel(R) Core(TM) i5-4460 CPU@3.20GHz Result: I0725 20:47:02.402506 2078 bswap-benchmark.cc:117] Machine Info: Intel(R) Core(TM) i5-4460 CPU @ 3.20GHz ByteSwap benchmark:Function iters/ms 10%ile 50%ile 90%ile 10%ile 50%ile 90%ile (relative) (relative) (relative) - FastScalar675 725 731 1X 1X 1X SSSE3 6.12e+03 6.2e+03 6.23e+03 9.06X 8.55X 8.53X AVX2 1.87e+04 1.88e+04 1.89e+04 27.7X 25.9X 25.9X SIMD 1.82e+04 1.88e+04 1.89e+04 27X 25.9X 25.9X Change-Id: I392ed5a8d5683f30f161282c228c1aedd7b648c1 --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/bswap-benchmark.cc M be/src/exprs/string-functions-ir.cc M be/src/util/bit-util-test.cc M be/src/util/bit-util.cc M be/src/util/bit-util.h 6 files changed, 372 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/81/3081/44 -- To view, visit http://gerrit.cloudera.org:8080/3081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I392ed5a8d5683f30f161282c228c1aedd7b648c1 Gerrit-PatchSet: 44 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Youwei Wang
[Impala-CR](cdh5-trunk) IMPALA-889: Add support for ISO-SQL trim()
Youwei Wang has posted comments on this change. Change subject: IMPALA-889: Add support for ISO-SQL trim() .. Patch Set 10: Hi Jim. Sorry for this late response. Since Impala 2809: https://gerrit.cloudera.org/#/c/3081/ is my current task of highest priority, I have to resolve it first. I will try my best to re-pick this one as soon as possible. Sorry for this. -- To view, visit http://gerrit.cloudera.org:8080/3213 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 10 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Youwei Wang Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page
Kathy Sun has uploaded a new patch set (#7). Change subject: IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page .. IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page The /memz page tried to add JVM metrics even when they didn't exist for all daemons, not just Impala. This led to a crash when they tried to access ExecEnv::GetInstance() without an initialised ExecEnv at statestored and catalogd To fix, changed the memz handler method to take an optional metric group, provided by the caller. memz handler will check the existence of the metric group. Used C++11 lambdas rather than boost::bind to help simplify the code. Testing: Ran locally and looked at impalad/memz, statestored/memz and catalogd/memz Add a test file test_web_pages.py to test sending request to /memz on impalad / statestored / catalogd Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29 --- M be/src/catalog/catalogd-main.cc M be/src/runtime/data-stream-mgr.cc M be/src/runtime/exec-env.cc M be/src/scheduling/simple-scheduler.cc M be/src/service/impala-server.cc M be/src/statestore/statestore-subscriber.cc M be/src/statestore/statestored-main.cc M be/src/util/default-path-handlers.cc M be/src/util/default-path-handlers.h M be/src/util/memory-metrics.cc M be/src/util/metrics-test.cc M be/src/util/metrics.cc M be/src/util/metrics.h A tests/webserver/test_web_pages.py M www/memz.tmpl 15 files changed, 111 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/3998/7 -- To view, visit http://gerrit.cloudera.org:8080/3998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Kathy Sun Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Kathy Sun Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata
Zoltan Ivanfi has uploaded a new change for review. http://gerrit.cloudera.org:8080/4063 Change subject: IMPALA-1659: Netezza compatibility functions: metadata .. IMPALA-1659: Netezza compatibility functions: metadata Added the SQL functions current_catalog(), current_schema(), current_user() and session_user() as aliases to existing ones and a new SQL function current_sid(). Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca --- M be/src/exprs/expr-test.cc M be/src/exprs/utility-functions-ir.cc M be/src/exprs/utility-functions.h M be/src/runtime/runtime-state.h M common/function-registry/impala_functions.py 5 files changed, 37 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/4063/1 -- To view, visit http://gerrit.cloudera.org:8080/4063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi
[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata
Zoltan Ivanfi has posted comments on this change. Change subject: IMPALA-1659: Netezza compatibility functions: metadata .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4063/1/be/src/exprs/utility-functions-ir.cc File be/src/exprs/utility-functions-ir.cc: Line 120: StringVal UtilityFunctions::CurrentSid(FunctionContext* ctx) { This function was requested in the feature request, but I feel that it should not be added because of the reasons listed below, please let me know your opinion. There is no documentation available for Netazza's current_sid. The docs linked in the issue attachment are Microsoft's and their current_sid has a different meaning: it stands for Security ID there. Although there are no docs about Netazza's current_sid, there are some examples on the net, for example here: http://stackoverflow.com/questions/27402899/netezza-sql-to-find-client-ip Two things are obvious from this example: - current_sid is not a function, but a magic constant - current_sid is a number, not a string Adding a current_sid function will not provide any Netezza-compatibility, because it won't work without () and our session IDs are strings not numbers (although in theory we could use a different representation of our session ID, but a 128-bits may be too much to return as a number). For this reason I suggest to omit this function. -- To view, visit http://gerrit.cloudera.org:8080/4063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: Yes
Re: Contributions to Cloudera Impala
Hi Jim, Thanks a lot for your response. Please see my comments inline. Adding Zhi Zhi, our colleague from China in the thread. Regards, Nishidha From: Sudarshan Jagadale/Austin/Contr/IBM To: Jim Apple Cc: "dev@impala" , Manish Patil/Austin/Contr/IBM@IBMUS, Silvius Rus , Valencia Serrao/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS Date: 08/18/2016 10:45 AM Subject:Re: Contributions to Cloudera Impala Hi Jim, +Nishidha Thank you for your inputs... Thanks and Regards Sudarshan Jagadale Power Open Source Solutions From: Jim Apple To: "dev@impala" Cc: Silvius Rus , Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Manish Patil/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS Date: 08/17/2016 09:43 PM Subject:Re: Contributions to Cloudera Impala > I'm glad to tell you that we are able to build and test Impala on Ubuntu > linux ppc64le with the great support from the Cloudera Community. Excellent! > Our next action is to upstream all our changes to Cloudera Impala. Great! Cloudera has donated Impala to the Apache Software Foundation (aka "ASF"). Cloudera now contributes to the project, and the project is managed by the Impala community. > With > this, our plan is to start building latest Impala on Power8 as we'd been > porting quite an old version (code from cdh5-trunk branch till 23rd March, > 2016). Since then, I know there have been many many changes happened which > are yet to be ported, specially kudu stuff. Yes, there have been many changes. One is that Impala is now hosted on ASF-owned git. Please see https://cwiki.apache.org/confluence/display/IMPALA/How+to+switch+to +Apache-hosted+git [Nishidha] I've read a few pages from this Confluence. Indeed, very useful to start with. >We know we need CLA to be signed to start contributions. We have already >initiated the process and hoping to get it done soon. I think the right thing to do here is use the Apache CLAs. See https://www.apache.org/licenses/cla-corporate.txt and https://www.apache.org/licenses/icla.txt [Nishidha] We'll start with this. > By the time we get CLA signed, we would start porting the changes done >in last 5 months. So, I wanted to know which tag/branch should we take >up for this. This is a question we could all discuss together, and it might end up being a decision made by the Project Management Committee (PMC). This is a big question about how Apache Impala will evolve. Our bylaws state: "Significant, pervasive features may be developed in a speculative branch of the repository. The PMC may grant commit rights on the branch to its consistent contributors for the duration of the initiative. Branch committers are responsible for shepherding their feature into an active release and do not cast binding votes or vetoes in the project." So perhaps this should happen on a separate branch? One question the community should also consider, IMHO, is whether the community will have sufficient resources to maintain a working ppc64le codebase indefinitely into the future. [Nishidha] We found two new source code URLs as one mentioned in Confluence https://git-wip-us.apache.org/repos/asf/incubator-impala.git and another to be https://github.com/apache/incubator-impala. Commits wise both look same, though former one says "wip" in the URL. Please suggest the URL to be forked and worked upon. We didn't know if we could directly work on a separate branch on apache's Impala. We thought of forking first into our repo and then working on it. > Working on cdh5-trunk will put us into an unending loop of >porting as it is being modified everyday. We are thinking to create a >branch from cdh5.8.0-release tag and start working on it. Please suggest >us the best way to do this. Since Impala is now developed on Apache infrastructure, we have switched branching schemas. Our main branch is now "master". We do not have any release branches yet. [Nishidha] Okay. So, after CLA, can we work directly on Apache's Impala or we'll need to fork it into our repository and create a new branch from master, and then generate PRs/Gerrit code reviews from it? >Verifying all the changes on x86 platforms ourself here will also be >time consuming and add potential delays in upstreaming. So, we were >thinking if we can get a job on Cloudera's Continuous integration server >which would simply fetch our branch and verify it on all the supported >platforms and do all the required checks. I'm not sure if this is >feasible but just a thought. Any other suggestions to foster this >activity would be appreciated. We are working on making a publicly-available CI setup, but we aren't done yet. Do you have a CI setup and x86-64 machines that your CI workers can run on? [Nishidha] Which CI do you have or working on? We can setup Jenkins here and can get x86-64
[Impala-CR](cdh5-trunk) IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or AVX2.
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or AVX2. .. Patch Set 44: (2 comments) http://gerrit.cloudera.org:8080/#/c/3081/44/be/src/util/bit-util.cc File be/src/util/bit-util.cc: Line 171: void (*ByteSwapFunc)(const uint8_t* src, uint8_t* dst) = NULL; name formatting (this is a variable name) Line 172: if( TEMPLATE_DATA_WIDTH == 16 ) { parentheses formatting -- To view, visit http://gerrit.cloudera.org:8080/3081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I392ed5a8d5683f30f161282c228c1aedd7b648c1 Gerrit-PatchSet: 44 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Youwei Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Youwei Wang Gerrit-HasComments: Yes
Re: Contributions to Cloudera Impala
> [Nishidha] We found two new source code URLs as one mentioned in Confluence > https://git-wip-us.apache.org/repos/asf/incubator-impala.git and another to > be https://github.com/apache/incubator-impala. > Commits wise both look same, though former one says "wip" in the URL. Please > suggest the URL to be forked and worked upon. I don't know exactly what you mean by "forked" in this context. You should use the repo referred to as "asf-gerrit" in https://cwiki.apache.org/confluence/display/IMPALA/How+to+switch+to+Apache-hosted+git > [Nishidha] Okay. So, after CLA, can we work directly on Apache's Impala or > we'll need to fork it into our repository and create a new branch from > master, and then generate PRs/Gerrit code reviews from it? I don't understand this question, but let me try and clarify some things that may help you answer it. All of Impala's commits get code reviews on gerrit. We don't use PRs. > [Nishidha] Which CI do you have or working on? Cloudera uses Jenkins. > What is the expected timeline for your CI to be publicly available? You can follow https://issues.cloudera.org/browse/IMPALA-3228 to learn about updates. Apache projects are sometimes referred to as "do-ocracies". If you would like to find a CI hosting provider that will support Apache Impala, you are welcome to help - you don't have to be an Impala committer or a CLoudera employee to help. > 1. Would you suggest me to start a new thread for branch discussion, > although I created a gist in github.com > https://gist.github.com/npanpaliya/bd58e554370455babc5c4f290e4b1723 for the > same? Yes. Apache Impala project direction discussions are unlikely to happen on private gists.
Re: Failed to submit/update my patch
It looks like you pushed https://gerrit.cloudera.org/#/c/3081/. How did you fix it? On Thu, Aug 18, 2016 at 7:04 PM, Wang, Youwei A wrote: > Greetings, Jim. > I don't see that Change-Id I8777cf76f04d34a46f53d53005412e0f1d63b5b7 using > following command as you do: > git fetch gerrit && git log gerrit/cdh5-trunk > > The output of "git remote -v" is: > > root@debian:~/Impala# git remote -v > gerrit ssh://hayabusa-in...@gerrit.cloudera.org:29418/Impala (fetch) > gerrit ssh://hayabusa-in...@gerrit.cloudera.org:29418/Impala (push) > origin https://github.com/cloudera/Impala.git (fetch) > origin https://github.com/cloudera/Impala.git (push) > > > -Original Message- > From: Jim Apple [mailto:jbap...@cloudera.com] > Sent: Friday, August 19, 2016 9:07 AM > To: Wang, Youwei A > Cc: dev@impala.incubator.apache.org > Subject: Re: Failed to submit/update my patch > > I don't see this using "git fetch gerrit && git log gerrit/cdh5-trunk". > > What is the output of "git remote -v" for you? > > On Thu, Aug 18, 2016 at 6:03 PM, Wang, Youwei A > wrote: >> Greetings, Jim. Yes, the Change-Id I8777cf76f04d34a46f53d53005412e0f1d63b5b7 >> does appear in my branch. And it also appears in the cdh5-trunk. As a proof, >> the following message is copied from the output of "git log" for cdh5-trunk: >> >> commit 4bc2f0557bce48b04897c0123eba598812931785 >> Author: Tim Armstrong >> Date: Mon May 23 14:09:39 2016 -0700 >> >> IMPALA-3611: track unused Disk IO buffer memory >> >> Track I/O buffers against separate MemTrackers. This gives us better >> visibility into memory consumption from the debug webpage and from >> MemTracker consumption dumps. The immediate motivation was in trying to >> determine whether idle memory consumption of an impalad was caused by a >> memory leak. >> >> We add two trackers: for buffers cached in DiskIoMgr's free list, >> and another for clients that don't provide a MemTracker (the only >> one is BufferedBlockMgr, which will be removed at some point). >> >> The previous code "tracked" the buffers against the process-wide >> tracker, but it was a no-op outside of ASAN builds since the >> process-wide tracker took its value from TCMalloc. >> >> The test code required fixing because it assumed that buffers were >> always credited against the DiskIoMgr's tracker. This only made sense >> when the DiskIoMgr's tracker is the root process-wide tracker. >> >> Fix backend test logging for disk-io-mgr-test. >> >> Testing: >> Ran exhaustive tests. >> >> Change-Id: I8777cf76f04d34a46f53d53005412e0f1d63b5b7 >> Reviewed-on: http://gerrit.cloudera.org:8080/3799 >> Reviewed-by: Dan Hecht >> Tested-by: Internal Jenkins >> (cherry picked from commit >> 17bf14417e3438d772b19111431453bdd537742a) >> >> PS: And I have conducted some experiments last night: >> 1. Create an empty VMWare workstation and install Debian 8.5 (the >> known latest version); 2. Install git; No other unrelated packages are >> installed; 3. git clone https://github.com/cloudera/Impala.git >> 4. cd Impala >> 5. git checkout cdh5-trunk >> 6. Follow every step from this page: >> https://github.com/cloudera/Impala/wiki/Using-Gerrit-to-submit-and-rev >> iew-patches 7. Apply my patch to cdh5-trunk and then commit it; 8. git >> push gerrit HEAD:refs/for/cdh5-trunk >> Result: I got the exact same error as we discussed. >> >> I have also try to substitute the VMWare workstation using a Debian docker >> image in the step 1. >> And the remained steps are identical. The same error still happens. >> >> -Original Message- >> From: Jim Apple [mailto:jbap...@cloudera.com] >> Sent: Thursday, August 18, 2016 8:28 PM >> To: Wang, Youwei A >> Cc: dev@impala.incubator.apache.org >> Subject: Re: Failed to submit/update my patch >> >> Does Change-Id: I8777cf76f04d34a46f53d53005412e0f1d63b5b7 show up in your >> git history when trying to push? If it does, something is wrong, since this >> change should not have been merged into gerrit/cdh5-trunk. >> >> On Thu, Aug 18, 2016 at 12:16 AM, Wang, Youwei A >> wrote: >>> Greetings, everyone. >>> Thank you for providing such great solution, @Todd and @Jim. >>> But I am afraid I have to say the 'rebase -i' approach doesn't work for me >>> and yes, I have made 100% sure that my patch is the only patch that show up >>> in my interactive rebase is my own. >>> >>> I have also tried another approach: I have diffed my branch and the >>> cdh5-trunk, then saved it to another place. After that, I switched to the >>> cdh5-trunk, run git pull, applied my patch, and then finally commit my >>> changes. Then I run "git push gerrit HEAD:refs/for/cdh5-trunk", still got >>> the same error. Since now I am working on the latest cdh5-trunk, I believe >>> the possibility that other branches introduce this error is eliminated. >>> However, this issue still happens. >>> >>> I am trying to use another clean machine to re-commit
[Impala-ASF-CR] IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for Kudu
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for Kudu .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/3986/5/fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java File fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java: Line 118:* Normalizes a 'predicate' consisting of an uncast SlotRef and a constant Expr into what was wrong with the previous version of this function? given just the description in the function comment, it's not clear why this is disabled by casting. you should leave a comment in here, so the next person doesn't try to add the casting logic again. also, doesn't this break other things? i remember we specifically added the casting logic to address a problem. http://gerrit.cloudera.org:8080/#/c/3986/5/fe/src/main/java/com/cloudera/impala/analysis/Expr.java File fe/src/main/java/com/cloudera/impala/analysis/Expr.java: Line 1193:* For children of 'this' that are constant expressions capable of being expressed which constants cannot be expressed as literals? http://gerrit.cloudera.org:8080/#/c/3986/5/fe/src/main/java/com/cloudera/impala/analysis/LiteralExpr.java File fe/src/main/java/com/cloudera/impala/analysis/LiteralExpr.java: Line 55: public static LiteralExpr create(String value, Type type) throws AnalysisException { remove throw Line 154:* Returns null for types that cannot be expressed as literals, e.g. TIMESTAMP. you mean for which don't have a LiteralExpr subclass? 'cannot be expressed' sounds like a sql limitation. -- To view, visit http://gerrit.cloudera.org:8080/3986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iae7612433a2e27f8887abe6624f9ee0f4867b934 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[DISCUSS] Branching for a release soon?
I would like to volunteer to be the next release manager for an upcoming Apache Impala 2.7 (incubating) release. I plan to: 1. Get a vote done adding to our bylaws the procedure for creating a branch. (See thread from yesterday) 2. Create a release branch, possibly backdated a few commits from HEAD to a commit that looks stable. 3. Maintain that branch by cherry-picking bugfix commits from "master" as necessary to get the tests running cleanly. 4. Make a release candidate and call for a vote here, then on the IPMC list (which is required for Incubator releases) Obviously, this glosses over some details about publishing and signing releases, but I would welcome feedback on this outline.
Re: [DISCUSS] Branching for a release soon?
Sounds great to me! Todd On Aug 19, 2016 8:41 AM, "Jim Apple" wrote: > I would like to volunteer to be the next release manager for an > upcoming Apache Impala 2.7 (incubating) release. I plan to: > > 1. Get a vote done adding to our bylaws the procedure for creating a > branch. (See thread from yesterday) > > 2. Create a release branch, possibly backdated a few commits from HEAD > to a commit that looks stable. > > 3. Maintain that branch by cherry-picking bugfix commits from "master" > as necessary to get the tests running cleanly. > > 4. Make a release candidate and call for a vote here, then on the IPMC > list (which is required for Incubator releases) > > Obviously, this glosses over some details about publishing and signing > releases, but I would welcome feedback on this outline. >
[Impala-ASF-CR] Add .clang-format for Impala's C++ style
Tim Armstrong has posted comments on this change. Change subject: Add .clang-format for Impala's C++ style .. Patch Set 5: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/3886 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for Kudu
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for Kudu .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/3986/5/fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java File fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java: Line 118:* Normalizes a 'predicate' consisting of an uncast SlotRef and a constant Expr into > what was wrong with the previous version of this function? For Kudu predicates, at least, we can't have the slotref wrapped by a cast. So this fn now returns the very specific kind of predicate described here if possible. This shouldn't break anything else because its only caller is the KuduScanNode. Would you prefer if I change the name of the fn to normalizeNoCastSlotRefComparison() (or similar)? http://gerrit.cloudera.org:8080/#/c/3986/5/fe/src/main/java/com/cloudera/impala/analysis/Expr.java File fe/src/main/java/com/cloudera/impala/analysis/Expr.java: Line 1193:* For children of 'this' that are constant expressions capable of being expressed > which constants cannot be expressed as literals? DATE/DATETIME/TIMESTAMP don't have LiteralExpr subclasses. I'll change. http://gerrit.cloudera.org:8080/#/c/3986/5/fe/src/main/java/com/cloudera/impala/analysis/LiteralExpr.java File fe/src/main/java/com/cloudera/impala/analysis/LiteralExpr.java: Line 55: public static LiteralExpr create(String value, Type type) throws AnalysisException { > remove throw analyze(), uncheckedCastTo(), and BoolLiteral()/NumericLiteral() all throw AnalysisEx. Line 154:* Returns null for types that cannot be expressed as literals, e.g. TIMESTAMP. > you mean for which don't have a LiteralExpr subclass? 'cannot be expressed' Yes, I'll change the comment -- To view, visit http://gerrit.cloudera.org:8080/3986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iae7612433a2e27f8887abe6624f9ee0f4867b934 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
Re: [DISCUSS] Branching for a release soon?
Thanks for volunteering to do the release Jim! The plan looks fine to me. Tom On Fri, Aug 19, 2016 at 4:42 PM, Todd Lipcon wrote: > Sounds great to me! > > Todd > > On Aug 19, 2016 8:41 AM, "Jim Apple" wrote: > >> I would like to volunteer to be the next release manager for an >> upcoming Apache Impala 2.7 (incubating) release. I plan to: >> >> 1. Get a vote done adding to our bylaws the procedure for creating a >> branch. (See thread from yesterday) >> >> 2. Create a release branch, possibly backdated a few commits from HEAD >> to a commit that looks stable. >> >> 3. Maintain that branch by cherry-picking bugfix commits from "master" >> as necessary to get the tests running cleanly. >> >> 4. Make a release candidate and call for a vote here, then on the IPMC >> list (which is required for Incubator releases) >> >> Obviously, this glosses over some details about publishing and signing >> releases, but I would welcome feedback on this outline. >>
[Impala-ASF-CR] IMPALA-3342: Runtime profile TotalCpuTime should eliminate wait times
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3342: Runtime profile TotalCpuTime should eliminate wait times .. Patch Set 1: Unless I'm missing something, TotalCpuTime at the fragment level isn't correctly updated with the user/sys time. It looks like we try to approximate it with: int64_t cpu_time = cpu_and_wait_time - runtime_state_->total_storage_wait_timer()->value() - runtime_state_->total_network_send_timer()->value() - runtime_state_->total_network_receive_timer()->value(); But it's not clear that that's accurate. I agree that we already do the right thing for ScannerThreadsUserTime and ScannerThreadsSysTime -- To view, visit http://gerrit.cloudera.org:8080/4052 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8edd0c3559968ff12e42f83dd970e424fa4a3c8 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3943: Adhere to abort on error in ProcessFooter().
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3943: Adhere to abort_on_error in ProcessFooter(). .. Patch Set 7: (1 comment) Not this patch, but we should consider changing HdfsScanNode so that it handles the abort_on_error logic at the callsite for Open() if Open() fails. It might simplify this and the other scanners http://gerrit.cloudera.org:8080/#/c/3862/7/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 188: if (!status.ok()) return state_->LogOrReturnError(status.msg()); Don't we want to return early if there was any kind of error? If its a non-aborting error I think we want to return early with Status::OK(), since it's not clear if it's safe to execute the below code. -- To view, visit http://gerrit.cloudera.org:8080/3862 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6aff766a1ce6376efb329bdde51c648149dfe08c Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
Re: [DISCUSS] Branching for a release soon?
How do you plan to choose which commits to cherry-pick? Should we let you know if we think a patch should/shouldn't be part of the release? On Fri, Aug 19, 2016 at 8:44 AM, Tom White wrote: > Thanks for volunteering to do the release Jim! The plan looks fine to me. > > Tom > > On Fri, Aug 19, 2016 at 4:42 PM, Todd Lipcon wrote: > > Sounds great to me! > > > > Todd > > > > On Aug 19, 2016 8:41 AM, "Jim Apple" wrote: > > > >> I would like to volunteer to be the next release manager for an > >> upcoming Apache Impala 2.7 (incubating) release. I plan to: > >> > >> 1. Get a vote done adding to our bylaws the procedure for creating a > >> branch. (See thread from yesterday) > >> > >> 2. Create a release branch, possibly backdated a few commits from HEAD > >> to a commit that looks stable. > >> > >> 3. Maintain that branch by cherry-picking bugfix commits from "master" > >> as necessary to get the tests running cleanly. > >> > >> 4. Make a release candidate and call for a vote here, then on the IPMC > >> list (which is required for Incubator releases) > >> > >> Obviously, this glosses over some details about publishing and signing > >> releases, but I would welcome feedback on this outline. > >> >
[Impala-ASF-CR] IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for Kudu
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for Kudu .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/3986/5/fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java File fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java: Line 118:* Normalizes a 'predicate' consisting of an uncast SlotRef and a constant Expr into > For Kudu predicates, at least, we can't have the slotref wrapped by a cast. maybe we should change the name, to warn potential future callers. the initial thinking behind putting this logic here was that it seems relatively general, so would be useful outside of kudu. i'm not sure that's still the case. -- To view, visit http://gerrit.cloudera.org:8080/3986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iae7612433a2e27f8887abe6624f9ee0f4867b934 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
Re: [DISCUSS] Branching for a release soon?
I hadn't decided yet. What do you think? I could test the release, then look for bug fixes that landed in master after my branch started that fix anything that doesn't pass. Or I could reach out to commit authors and ask them, or I could ask them to email me, like you mentioned. Or I could read commit messages and cherry-pick anything that is a bug fix. On Fri, Aug 19, 2016 at 9:01 AM, Tim Armstrong wrote: > How do you plan to choose which commits to cherry-pick? Should we let you > know if we think a patch should/shouldn't be part of the release? > > On Fri, Aug 19, 2016 at 8:44 AM, Tom White wrote: > >> Thanks for volunteering to do the release Jim! The plan looks fine to me. >> >> Tom >> >> On Fri, Aug 19, 2016 at 4:42 PM, Todd Lipcon wrote: >> > Sounds great to me! >> > >> > Todd >> > >> > On Aug 19, 2016 8:41 AM, "Jim Apple" wrote: >> > >> >> I would like to volunteer to be the next release manager for an >> >> upcoming Apache Impala 2.7 (incubating) release. I plan to: >> >> >> >> 1. Get a vote done adding to our bylaws the procedure for creating a >> >> branch. (See thread from yesterday) >> >> >> >> 2. Create a release branch, possibly backdated a few commits from HEAD >> >> to a commit that looks stable. >> >> >> >> 3. Maintain that branch by cherry-picking bugfix commits from "master" >> >> as necessary to get the tests running cleanly. >> >> >> >> 4. Make a release candidate and call for a vote here, then on the IPMC >> >> list (which is required for Incubator releases) >> >> >> >> Obviously, this glosses over some details about publishing and signing >> >> releases, but I would welcome feedback on this outline. >> >> >>
Re: [DISCUSS] Branching for a release soon?
I think the first sounds easiest. If commit authors really think something should go in the release they can always reach out to you. On Fri, Aug 19, 2016 at 9:15 AM, Jim Apple wrote: > I hadn't decided yet. What do you think? > > I could test the release, then look for bug fixes that landed in > master after my branch started that fix anything that doesn't pass. > > Or I could reach out to commit authors and ask them, or I could ask > them to email me, like you mentioned. > > Or I could read commit messages and cherry-pick anything that is a bug fix. > > On Fri, Aug 19, 2016 at 9:01 AM, Tim Armstrong > wrote: > > How do you plan to choose which commits to cherry-pick? Should we let you > > know if we think a patch should/shouldn't be part of the release? > > > > On Fri, Aug 19, 2016 at 8:44 AM, Tom White wrote: > > > >> Thanks for volunteering to do the release Jim! The plan looks fine to > me. > >> > >> Tom > >> > >> On Fri, Aug 19, 2016 at 4:42 PM, Todd Lipcon wrote: > >> > Sounds great to me! > >> > > >> > Todd > >> > > >> > On Aug 19, 2016 8:41 AM, "Jim Apple" wrote: > >> > > >> >> I would like to volunteer to be the next release manager for an > >> >> upcoming Apache Impala 2.7 (incubating) release. I plan to: > >> >> > >> >> 1. Get a vote done adding to our bylaws the procedure for creating a > >> >> branch. (See thread from yesterday) > >> >> > >> >> 2. Create a release branch, possibly backdated a few commits from > HEAD > >> >> to a commit that looks stable. > >> >> > >> >> 3. Maintain that branch by cherry-picking bugfix commits from > "master" > >> >> as necessary to get the tests running cleanly. > >> >> > >> >> 4. Make a release candidate and call for a vote here, then on the > IPMC > >> >> list (which is required for Incubator releases) > >> >> > >> >> Obviously, this glosses over some details about publishing and > signing > >> >> releases, but I would welcome feedback on this outline. > >> >> > >> >
[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-1659: Netezza compatibility functions: metadata .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/4063/1/be/src/exprs/utility-functions-ir.cc File be/src/exprs/utility-functions-ir.cc: Line 120: StringVal UtilityFunctions::CurrentSid(FunctionContext* ctx) { > This function was requested in the feature request, but I feel that it shou agree, let's leave it out http://gerrit.cloudera.org:8080/#/c/4063/1/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: Line 551: [['user', 'current_user', 'current_schema'], 'STRING', [], 'impala::UtilityFunctions::User'], current_schema is the same as the user? -- To view, visit http://gerrit.cloudera.org:8080/4063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: Yes
[Impala-CR](cdh5-trunk) IMPALA-1654: Support general predicates in most partition DDL operations.
Jim Apple has posted comments on this change. Change subject: IMPALA-1654: Support general predicates in most partition DDL operations. .. Patch Set 9: This is now getting reviewed on https://gerrit.cloudera.org/#/c/3942/. Amos, if you don't need this anymore, can you click "Abandon"? It doesn't delete anything, so you'll always be able to come back and look at it later. -- To view, visit http://gerrit.cloudera.org:8080/1563 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f Gerrit-PatchSet: 9 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Amos Bird Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Amos Bird Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Jim Apple Gerrit-HasComments: No
[Impala-CR](cdh5-trunk) IMPALA-3153: Incorrect behaviour around slash escaping single quotes
Jim Apple has abandoned this change. Change subject: IMPALA-3153: Incorrect behaviour around slash escaping single quotes .. Abandoned This change is small enough that it should be re-sent on the new gerrit project Impala-ASF. See https://cwiki.apache.org/confluence/display/IMPALA/How+to+switch+to+Apache-hosted+git -- To view, visit http://gerrit.cloudera.org:8080/3335 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: I002f75f3330ee319d82213331a5a046ecf198189 Gerrit-PatchSet: 2 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Feng Guangyuan Gerrit-Reviewer: Feng Guangyuan Gerrit-Reviewer: Jim Apple
[Impala-ASF-CR] IMPALA-2581: LIMIT can be propagated down into some aggregations
Jim Apple has posted comments on this change. Change subject: IMPALA-2581: LIMIT can be propagated down into some aggregations .. Patch Set 4: I will be able to look at this again in about 9 days. -- To view, visit http://gerrit.cloudera.org:8080/3822 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I59c5b7af7a73ccdbc5496b28eacb9b6859d202bc Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] Add .clang-format for Impala's C++ style
Henry Robinson has posted comments on this change. Change subject: Add .clang-format for Impala's C++ style .. Patch Set 5: Code-Review+2 Seems like there's consensus on the mailing list. Could you add a page to the wiki explaining what this is, how we expect it to be used and so on? -- To view, visit http://gerrit.cloudera.org:8080/3886 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3662: Don't double allocate tuples' buffer in parquet scanner
Michael Ho has uploaded a new change for review. http://gerrit.cloudera.org:8080/4064 Change subject: IMPALA-3662: Don't double allocate tuples' buffer in parquet scanner .. IMPALA-3662: Don't double allocate tuples' buffer in parquet scanner HdfsScanner::StartNewRowBatch() is called once per row batch by the parquet scanner to allocate a new row batch and tuple buffer. Similarly, a scratch batch is created for each row batch in HdfsParquetScanner::AssembleRows() which also contains the tuple buffer. In reality, only the tuple buffer in the scratch batch is used. So, the tuple buffer allocated by HdfsScanner::StartNewRowBatch() is unused memory for the parquet scanner. This change fixes the problem above by implementing HdfsParquetScanner::StartNewRowBatch() which creates a new row batch without allocating the tuple buffer. With this patch, the memory consumption when materializing very wide tuples is reduced by half. Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scanner.h 3 files changed, 14 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/4064/1 -- To view, visit http://gerrit.cloudera.org:8080/4064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho
[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata
Tim Armstrong has posted comments on this change. Change subject: IMPALA-1659: Netezza compatibility functions: metadata .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4063/1/be/src/exprs/utility-functions-ir.cc File be/src/exprs/utility-functions-ir.cc: Line 120: StringVal UtilityFunctions::CurrentSid(FunctionContext* ctx) { > This function was requested in the feature request, but I feel that it shou I agree that the function lists in the netezza compatibility JIRAs were not thoroughly vetted, thanks for thinking through this. I think some of the same points apply to current_schema, etc as well - it looks like they may be magic constants too. It seems like having functions with the same name as the magic would make porting queries much easier, even though though would still require modifications, so it might be worth considering. We don't have a UDF to return the current session. Maybe that's a useful piece of functionality too even if it's not exactly the same as Netezza. What do you think? Returning it as a string seems ok to me, since that's how we display it everywhere in the system. -- To view, visit http://gerrit.cloudera.org:8080/4063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page
Henry Robinson has posted comments on this change. Change subject: IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page .. Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/3998/7/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: PS7, Line 17: # : # Verification of impalad metrics after a test run. remove PS7, Line 21: from tests.common.errors import Timeout is this used? PS7, Line 27: (\ remove the \, and try and put some of the parameters onto this line. PS7, Line 31: resp.raise_for_status() does this raise an error when status != 200? If so you don't need the checks on lines 41, 43, 45. Maybe you could check if the contents of the webpage has what you're expecting? PS7, Line 35: """request web page /memz at imapalad / statestored / catalogd.""" I'd remove this, it doesn't tell the reader anything that's not in the test comments. -- To view, visit http://gerrit.cloudera.org:8080/3998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Kathy Sun Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Kathy Sun Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-CR](cdh5-trunk) IMPALA-2033: Netezza compatibility functions quote ident
Tim Armstrong has posted comments on this change. Change subject: IMPALA-2033: Netezza compatibility functions quote_ident .. Patch Set 4: Hi Shirish, have you had a chance to look at the comments? -- To view, visit http://gerrit.cloudera.org:8080/3263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idbac4cd10fe08a6bd4871a03ba3b5f2c20fa8ee1 Gerrit-PatchSet: 4 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Shirish Gajera Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Shirish Gajera Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3662: Don't double allocate tuples' buffer in parquet scanner
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3662: Don't double allocate tuples' buffer in parquet scanner .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/4064/1/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: PS1, Line 304: StartNewRowBatch I think we should disambiguate so that it's clear which variant is being used. Maybe StartNewRowBatchNoTupleBuffer() or StartNewParquetRowBatch()? Line 341: RETURN_IF_ERROR(CommitRows(row_batch, num_row_to_commit)); Doesn't CommitRows() call HdfsScanner::StartNewRowBatch() anyway? -- To view, visit http://gerrit.cloudera.org:8080/4064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures
Dan Hecht has posted comments on this change. Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures .. Patch Set 4: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/3991/4/be/src/exec/exec-node.cc File be/src/exec/exec-node.cc: Line 442: "Debug Action: MEM_LIMIT_EXCEEDED"); this is fine with me, but any reason we don't use mem_tracker() (i.e. exec-node's tracker)? http://gerrit.cloudera.org:8080/#/c/3991/4/be/src/exec/hdfs-scan-node.h File be/src/exec/hdfs-scan-node.h: PS4, Line 576: if requested by scanner I think we should delete this part because it sounds like the function itself will check if the scanner requested it, but you're really just saying that this is called by the scanner. -- To view, visit http://gerrit.cloudera.org:8080/3991 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-ASF-CR] REVIEW-ONLY: the results of running clang-format
Marcel Kornacker has posted comments on this change. Change subject: REVIEW-ONLY: the results of running clang-format .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/4046/1/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: Line 261 i'd prefer to keep it this way, because that makes complicated boolean exprs easier to read Line 1038:**next_backend_on_host) != backends_on_host.end()); do you know what the rationale for this indentation is? -- To view, visit http://gerrit.cloudera.org:8080/4046 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If00bcdb7957ec55a3b8568887de161b7f30d58de Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] Add .clang-format for Impala's C++ style
Marcel Kornacker has posted comments on this change. Change subject: Add .clang-format for Impala's C++ style .. Patch Set 5: Code-Review-1 i posted a comment on the preview of the reformatting results you sent out, please address that first. -- To view, visit http://gerrit.cloudera.org:8080/3886 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] Add .clang-format for Impala's C++ style
Internal Jenkins has posted comments on this change. Change subject: Add .clang-format for Impala's C++ style .. Patch Set 5: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/106/ -- To view, visit http://gerrit.cloudera.org:8080/3886 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] Add .clang-format for Impala's C++ style
Jim Apple has posted comments on this change. Change subject: Add .clang-format for Impala's C++ style .. Patch Set 5: > i posted a comment on the preview of the reformatting results you > sent out, please address that first. OK - halted the in-progress verification job to address your comments first. -- To view, visit http://gerrit.cloudera.org:8080/3886 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3662: Don't double allocate tuples buffer in parquet scanner
Michael Ho has uploaded a new patch set (#2). Change subject: IMPALA-3662: Don't double allocate tuples buffer in parquet scanner .. IMPALA-3662: Don't double allocate tuples buffer in parquet scanner HdfsScanner::StartNewRowBatch() is called once per row batch by the parquet scanner to allocate a new row batch and tuple buffer. Similarly, a scratch batch is created for each row batch in HdfsParquetScanner::AssembleRows() which also contains the tuple buffer. In reality, only the tuple buffer in the scratch batch is used. So, the tuple buffer allocated by HdfsScanner::StartNewRowBatch() is unused memory for the parquet scanner. This change fixes the problem above by implementing HdfsParquetScanner::StartNewRowBatch() which creates a new row batch without allocating the tuple buffer. With this patch, the memory consumption when materializing very wide tuples is reduced by half. Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scanner.h 3 files changed, 14 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/4064/2 -- To view, visit http://gerrit.cloudera.org:8080/4064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3662: Don't double allocate tuples buffer in parquet scanner
Michael Ho has posted comments on this change. Change subject: IMPALA-3662: Don't double allocate tuples buffer in parquet scanner .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/4064/1/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: PS1, Line 304: StartNewRowBatch > I think we should disambiguate so that it's clear which variant is being us Done Line 341: RETURN_IF_ERROR(CommitRows(row_batch, num_row_to_commit)); > Doesn't CommitRows() call HdfsScanner::StartNewRowBatch() anyway? I believe it's calling HdfsParquetScanner::CommitRows() which doesn't call StartNewRowBatch(). -- To view, visit http://gerrit.cloudera.org:8080/4064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] REVIEW-ONLY: the results of running clang-format
Jim Apple has posted comments on this change. Change subject: REVIEW-ONLY: the results of running clang-format .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/4046/1/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: Line 261 > i'd prefer to keep it this way, because that makes complicated boolean expr Setting BreakBeforeBinaryOperators to NonAssignment or All fixes this, but increases the total diff from 79697 to 81122 or 85238 lines, respectively. Line 1038:**next_backend_on_host) != backends_on_host.end()); > do you know what the rationale for this indentation is? I think it is ContinuationIndentWidth - 4 spaces, plus the length of the grandparent ('DCHECK') -- To view, visit http://gerrit.cloudera.org:8080/4046 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If00bcdb7957ec55a3b8568887de161b7f30d58de Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3991 to look at the new patch set (#5). Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures .. IMPALA-3962: Clean up scratch tuple batch on scan failures The parquet scanner doesn't clean up 'scratch_batch_' properly which causes it to process a partially filled scratch_batch_ if any of the column reader fails. This change cleans up the scratch batch if any of the parquet column readers fails. The clean up involves freeing the mem_pool of scratch_batch_ and setting number of tuples in scratch_batch_ to 0. This change also extends debug action to emulate the behavior of exceeding the query's memory limit. Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5 --- M be/src/exec/exec-node.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M common/thrift/PlanNodes.thrift M tests/failure/test_failpoints.py 10 files changed, 76 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/3991/5 -- To view, visit http://gerrit.cloudera.org:8080/3991 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures
Michael Ho has posted comments on this change. Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/3991/4/be/src/exec/exec-node.cc File be/src/exec/exec-node.cc: Line 442: "Debug Action: MEM_LIMIT_EXCEEDED"); > this is fine with me, but any reason we don't use mem_tracker() (i.e. exec- Done http://gerrit.cloudera.org:8080/#/c/3991/4/be/src/exec/hdfs-scan-node.h File be/src/exec/hdfs-scan-node.h: PS4, Line 576: if requested by scanner > I think we should delete this part because it sounds like the function itse Done -- To view, visit http://gerrit.cloudera.org:8080/3991 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures
Michael Ho has posted comments on this change. Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures .. Patch Set 5: Code-Review+2 Carry +2 forward. -- To view, visit http://gerrit.cloudera.org:8080/3991 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata
Zoltan Ivanfi has posted comments on this change. Change subject: IMPALA-1659: Netezza compatibility functions: metadata .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/4063/1/be/src/exprs/utility-functions-ir.cc File be/src/exprs/utility-functions-ir.cc: Line 120: StringVal UtilityFunctions::CurrentSid(FunctionContext* ctx) { > I agree that the function lists in the netezza compatibility JIRAs were not I agree that providing a function to query the session ID would be a useful addition. If you are fine with this implementation knowing that it will not provide any Netezza-compatility after all, I am happy to submit it as well - after all, I already wrote the code for it. ;) http://gerrit.cloudera.org:8080/#/c/4063/1/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: Line 551: [['user', 'current_user', 'current_schema'], 'STRING', [], 'impala::UtilityFunctions::User'], > current_schema is the same as the user? Since there is no universal definition of a schema, we can not provide a function that will fit everyone's definition. Based on my short research, according to IBM and Oracle, schemas seem to be conceptually similar to users (although I'm not an expert of their database products, so I may be mistaken). The document attached to the feature request also linked to IBM's page. Based on this it seems like a reasonable compromise to return the user name here. My sources in more details: netezza-functions.pdf [1], which is attached to the issue IMPALA-1660 contains the list of functions to be added with some context. Despite the name of the document, most of the functions are not Netezza functions, but are from other vendors. The info for CURRENT_SCHEMA links to an IBM page (as many other functions in that PDF), but it is a dead link. The link for CURRENT_CATALOG however, leads to another IBM page [2] that lists all functions. This page describes CURRENT_SCHEMA as "Returns the current schema name (user name)". Also a highly rated answer on Stack Overflow [3] states that in Oracle "schema == namespace within database, identical to user account" but also that "The meaning of 'catalog', 'schema' and 'database' vary from one implementation to another." [1] https://issues.cloudera.org/secure/attachment/12199/netezza-functions.pdf [2] http://www.ibm.com/support/knowledgecenter/SSULQD_7.0.3/com.ibm.nz.dbu.doc/r_dbuser_functions.html [3] http://stackoverflow.com/a/7944489/5613485 -- To view, visit http://gerrit.cloudera.org:8080/4063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3662: Don't double allocate tuples buffer in parquet scanner
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3662: Don't double allocate tuples buffer in parquet scanner .. Patch Set 2: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/4064/1/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 341: RETURN_IF_ERROR(CommitRows(row_batch, num_row_to_commit)); > I believe it's calling HdfsParquetScanner::CommitRows() which doesn't call Ah you're right. This is confusing. -- To view, visit http://gerrit.cloudera.org:8080/4064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
Re: [DISCUSS] Branching for a release soon?
I believe the newer versions of gerrit support a label-like concept called 'hashtags' (apparently the authors of Gerrit love Instagram). They allow you to tag a review/commit with arbitrary strings. Perhaps this feature could be used so that patch authors can tag their reviews/commits as "stable-candidate" and Jim can easily query for them? Then when it has been cherry-picked to the branch, a "stable-picked" tag can be added to indicate it has been merged? If this sounds useful, let me know and I can look into enabling the feature on gerrit.cloudera.org. I think some non-trivial config change has to be made, though, so would take a week or two. -Todd On Fri, Aug 19, 2016 at 9:38 AM, Tim Armstrong wrote: > I think the first sounds easiest. If commit authors really think something > should go in the release they can always reach out to you. > > On Fri, Aug 19, 2016 at 9:15 AM, Jim Apple wrote: > > > I hadn't decided yet. What do you think? > > > > I could test the release, then look for bug fixes that landed in > > master after my branch started that fix anything that doesn't pass. > > > > Or I could reach out to commit authors and ask them, or I could ask > > them to email me, like you mentioned. > > > > Or I could read commit messages and cherry-pick anything that is a bug > fix. > > > > On Fri, Aug 19, 2016 at 9:01 AM, Tim Armstrong > > wrote: > > > How do you plan to choose which commits to cherry-pick? Should we let > you > > > know if we think a patch should/shouldn't be part of the release? > > > > > > On Fri, Aug 19, 2016 at 8:44 AM, Tom White wrote: > > > > > >> Thanks for volunteering to do the release Jim! The plan looks fine to > > me. > > >> > > >> Tom > > >> > > >> On Fri, Aug 19, 2016 at 4:42 PM, Todd Lipcon > wrote: > > >> > Sounds great to me! > > >> > > > >> > Todd > > >> > > > >> > On Aug 19, 2016 8:41 AM, "Jim Apple" wrote: > > >> > > > >> >> I would like to volunteer to be the next release manager for an > > >> >> upcoming Apache Impala 2.7 (incubating) release. I plan to: > > >> >> > > >> >> 1. Get a vote done adding to our bylaws the procedure for creating > a > > >> >> branch. (See thread from yesterday) > > >> >> > > >> >> 2. Create a release branch, possibly backdated a few commits from > > HEAD > > >> >> to a commit that looks stable. > > >> >> > > >> >> 3. Maintain that branch by cherry-picking bugfix commits from > > "master" > > >> >> as necessary to get the tests running cleanly. > > >> >> > > >> >> 4. Make a release candidate and call for a vote here, then on the > > IPMC > > >> >> list (which is required for Incubator releases) > > >> >> > > >> >> Obviously, this glosses over some details about publishing and > > signing > > >> >> releases, but I would welcome feedback on this outline. > > >> >> > > >> > > > -- Todd Lipcon Software Engineer, Cloudera
[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata
Zoltan Ivanfi has posted comments on this change. Change subject: IMPALA-1659: Netezza compatibility functions: metadata .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4063/1/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: Line 551: [['user', 'current_user', 'current_schema'], 'STRING', [], 'impala::UtilityFunctions::User'], > Since there is no universal definition of a schema, we can not provide a fu To clarify: Netezza is a subsidiary of IBM, but [2] is from the documentation of IBM's DB2, which I believe to be unrelated apart from the common owner of the products. -- To view, visit http://gerrit.cloudera.org:8080/4063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata
Zoltan Ivanfi has posted comments on this change. Change subject: IMPALA-1659: Netezza compatibility functions: metadata .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4063/1/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: Line 551: [['user', 'current_user', 'current_schema'], 'STRING', [], 'impala::UtilityFunctions::User'], > To clarify: Netezza is a subsidiary of IBM, but [2] is from the documentati Actually [2] is from the Netezza documentation, my mistake. -- To view, visit http://gerrit.cloudera.org:8080/4063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-(3895,3859): Don't log file data on parse errors
Henry Robinson has uploaded a new patch set (#3). Change subject: IMPALA-(3895,3859): Don't log file data on parse errors .. IMPALA-(3895,3859): Don't log file data on parse errors Logging file or table data is a bad idea, and doing it by default is particularly bad. This patch changes HdfsScanNode::LogRowParseError() to log a file and offset only. Testing: See rewritten tests. To support testing this change, we also fix IMPALA-3895, by introducing a canonical string __HDFS_FILENAME__ that all Hadoop filenames in the ERROR output are replaced with before comparing with the expected results. This fixes a number of issues with the old way of matching filenames which purported to be a regex, but really wasn't. In particular, we can now match the rest of an ERROR line after the filename, which was not possible before. In some cases, we don't want to substitute filenames because the ERROR output is looking for a very specific output. In that case we can write: $NAMENODE/ and this patch will not perform _any_ filename substitutions on ERROR sections that contain the $NAMENODE string. Finally, this patch fixes a bug where a test that had an ERRORS section but no RESULTS section would silently pass without testing anything. Change-Id: I5a604f8784a9ff7b4bf878f82ee7f56697df3272 --- M be/src/exec/hdfs-scanner-ir.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-sequence-scanner.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h M testdata/workloads/functional-query/queries/DataErrorsTest/avro-errors.test M testdata/workloads/functional-query/queries/DataErrorsTest/hbase-scan-node-errors.test M testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-rcfile-scan-node-errors.test M testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-scan-node-errors.test M testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-sequence-scan-errors.test M testdata/workloads/functional-query/queries/QueryTest/parquet-continue-on-error.test M testdata/workloads/functional-query/queries/QueryTest/strict-mode-abort.test M testdata/workloads/functional-query/queries/QueryTest/strict-mode.test M tests/common/impala_test_suite.py M tests/common/test_result_verifier.py 17 files changed, 354 insertions(+), 367 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/4020/3 -- To view, visit http://gerrit.cloudera.org:8080/4020 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5a604f8784a9ff7b4bf878f82ee7f56697df3272 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-(3895,3859): Don't log file data on parse errors
Henry Robinson has posted comments on this change. Change subject: IMPALA-(3895,3859): Don't log file data on parse errors .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/4020/2/be/src/exec/hdfs-scanner.cc File be/src/exec/hdfs-scanner.cc: PS2, Line 635: TO > weird caps - should be lower case? Done http://gerrit.cloudera.org:8080/#/c/4020/2/tests/common/impala_test_suite.py File tests/common/impala_test_suite.py: Line 328: replace_filenames_with_placeholder = True > I think like this is getting big enough to be its own function. E.g. verify Done -- To view, visit http://gerrit.cloudera.org:8080/4020 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5a604f8784a9ff7b4bf878f82ee7f56697df3272 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1659: Netezza compatibility functions: metadata
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-1659: Netezza compatibility functions: metadata .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4063/1/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: Line 551: [['user', 'current_user', 'current_schema'], 'STRING', [], 'impala::UtilityFunctions::User'], > Actually [2] is from the Netezza documentation, my mistake. maybe we should leave this out for now until we can get confirmation (from a netezza user) what this really means. intuitively i wouldn't expect user == schema. -- To view, visit http://gerrit.cloudera.org:8080/4063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b5d1009bbf42acc175a942d2df484e1c64822ca Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail
Sailesh Mukil has uploaded a new patch set (#3). Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail .. IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail HdfsTableSink::Close() makes calls to functions that can fail with a Status. However, since the function has a void return type, these error statuses are just logged and we cannot take any action according to the type of error. This patch moves the closing of the partition file into the FlushFinal function from Close(), so that in case of an error on closing the file, the error is propagated up and some action can be taken. We try and close all the partition files in the map in Close() as well because if a query is cancelled, FlushFinal will not be called and we would end up leaking some file descriptors. Also fixed some long lines in this patch. Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b --- M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-sink.h 2 files changed, 14 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4018/3 -- To view, visit http://gerrit.cloudera.org:8080/4018 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail .. Patch Set 3: (1 comment) Passed private build: http://sandbox.jenkins.cloudera.com/job/impala-private-build-and-test/3984/ http://gerrit.cloudera.org:8080/#/c/4018/2/be/src/exec/hdfs-table-sink.cc File be/src/exec/hdfs-table-sink.cc: Line 647: RETURN_IF_ERROR(FinalizePartitionFile(state, cur_partition->second.first)); > I still don't see why this is MergeStatus() instead of RETURN_IF_ERROR. We Right, that makes sense. I've made that change. -- To view, visit http://gerrit.cloudera.org:8080/4018 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail
Sailesh Mukil has uploaded a new patch set (#4). Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail .. IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail HdfsTableSink::Close() makes calls to functions that can fail with a Status. However, since the function has a void return type, these error statuses are just logged and we cannot take any action according to the type of error. This patch moves the closing of the partition file into the FlushFinal function from Close(), so that in case of an error on closing the file, the error is propagated up and some action can be taken. We try and close all the partition files in the map in Close() as well because if a query is cancelled, FlushFinal will not be called and we would end up leaking some file descriptors. Also fixed some long lines in this patch. Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b --- M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-sink.h 2 files changed, 13 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4018/4 -- To view, visit http://gerrit.cloudera.org:8080/4018 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail
Dan Hecht has posted comments on this change. Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail .. Patch Set 4: Does this pass the tests? -- To view, visit http://gerrit.cloudera.org:8080/4018 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail .. Patch Set 4: > Does this pass the tests? Yes, I mentioned about that it passed the private build: http://sandbox.jenkins.cloudera.com/job/impala-private-build-and-test/3984/ -- To view, visit http://gerrit.cloudera.org:8080/4018 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail
Dan Hecht has posted comments on this change. Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail .. Patch Set 4: Code-Review+2 (1 comment) Oops, sorry missed your comment. http://gerrit.cloudera.org:8080/#/c/4018/4/be/src/exec/hdfs-table-sink.cc File be/src/exec/hdfs-table-sink.cc: Line 626: ImpaladMetrics::NUM_FILES_OPEN_FOR_INSERT->Increment(-1); Should we do these before returning error? otherwise, the impalad metric can get stuck at non-zero, but there's nothing we can do about it right? -- To view, visit http://gerrit.cloudera.org:8080/4018 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3662: Don't double allocate tuples buffer in parquet scanner
Dan Hecht has posted comments on this change. Change subject: IMPALA-3662: Don't double allocate tuples buffer in parquet scanner .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page
Kathy Sun has posted comments on this change. Change subject: IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page .. Patch Set 7: (13 comments) http://gerrit.cloudera.org:8080/#/c/3998/6/be/src/util/metrics-test.cc File be/src/util/metrics-test.cc: PS6, Line 350: reinterpret_cast(NULL) > just wondering - does nullptr work here? Yeah, It could work, I write in this way since lines 372 did so. to make them consistent http://gerrit.cloudera.org:8080/#/c/3998/6/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: PS6, Line 27: quest_w > should be: Done PS6, Line 37: "Test all memz/ webpages""" > if you raise here, line 38 will never be executed. I think you don't need t Done PS6, Line 27: def request_web_page(\ : host_name, web_ui_port, relative_url, params={}, timeout_secs=DEFAULT_TIMEOUT): : url = "http://%s:%s%s"; % (host_name, web_ui_port, relative_url) : resp = requests.get(url, params=params, timeout=timeout_secs) : resp.raise_for_status() : return resp : : class TestWebPage(ImpalaTestSuite): : """request web page /memz at imapalad / statestored / catalogd.""" : def test_memz(self): : """Test all memz/ webpages""" : relative_url = "/memz" : > Don't think you need a class for this. Just have: Done PS6, Line 44: ge = request_web_page("lo > comment needs finishing Done PS6, Line 45: assert page.status_cod > I feel like this test can be written like: Done PS6, Line 49: > Does this fail if the web page doesn't exist? I add assert for the status_code, so it will fail now http://gerrit.cloudera.org:8080/#/c/3998/7/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: PS7, Line 17: # : # Verification of impalad metrics after a test run. > remove Done PS7, Line 21: from tests.common.errors import Timeout > is this used? no, removed PS7, Line 21: from tests.common.errors import Timeout > is this used? Done PS7, Line 27: (\ > remove the \, and try and put some of the parameters onto this line. I decided not to wrap requests.get() into another function remove request_web_page() for cleaner code PS7, Line 31: resp.raise_for_status() > does this raise an error when status != 200? If so you don't need the check I think status_code is enough for our purpose of checking whether the system crashed on clicking /memz. PS7, Line 35: """request web page /memz at imapalad / statestored / catalogd.""" > I'd remove this, it doesn't tell the reader anything that's not in the test Done -- To view, visit http://gerrit.cloudera.org:8080/3998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Kathy Sun Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Kathy Sun Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/4018/4/be/src/exec/hdfs-table-sink.cc File be/src/exec/hdfs-table-sink.cc: Line 626: ImpaladMetrics::NUM_FILES_OPEN_FOR_INSERT->Increment(-1); > Should we do these before returning error? otherwise, the impalad metric ca Yes, we would want that to not be decremented in case of an error. So that test_verify_metrics can catch that a file(s) was not closed, in case of our test runs. And otherwise too, users can see that there are some files still open. -- To view, visit http://gerrit.cloudera.org:8080/4018 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page
Kathy Sun has uploaded a new patch set (#8). Change subject: IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page .. IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page The /memz page tried to add JVM metrics even when they didn't exist for all daemons, not just Impala. This led to a crash when they tried to access ExecEnv::GetInstance() without an initialised ExecEnv at statestored and catalogd To fix, changed the memz handler method to take an optional metric group, provided by the caller. memz handler will check the existence of the metric group. Used C++11 lambdas rather than boost::bind to help simplify the code. Testing: Ran locally and looked at impalad/memz, statestored/memz and catalogd/memz Add a test file test_web_pages.py to test sending request to /memz on impalad / statestored / catalogd Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29 --- M be/src/catalog/catalogd-main.cc M be/src/runtime/data-stream-mgr.cc M be/src/runtime/exec-env.cc M be/src/scheduling/simple-scheduler.cc M be/src/service/impala-server.cc M be/src/statestore/statestore-subscriber.cc M be/src/statestore/statestored-main.cc M be/src/util/default-path-handlers.cc M be/src/util/default-path-handlers.h M be/src/util/memory-metrics.cc M be/src/util/metrics-test.cc M be/src/util/metrics.cc M be/src/util/metrics.h A tests/webserver/test_web_pages.py M www/memz.tmpl 15 files changed, 96 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/3998/8 -- To view, visit http://gerrit.cloudera.org:8080/3998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Kathy Sun Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Kathy Sun Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page
Henry Robinson has posted comments on this change. Change subject: IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page .. Patch Set 8: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/3998/8/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: PS8, Line 23: imapalad impalad -- To view, visit http://gerrit.cloudera.org:8080/3998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Kathy Sun Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Kathy Sun Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail
Dan Hecht has posted comments on this change. Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/4018/4/be/src/exec/hdfs-table-sink.cc File be/src/exec/hdfs-table-sink.cc: Line 626: ImpaladMetrics::NUM_FILES_OPEN_FOR_INSERT->Increment(-1); > Yes, we would want that to not be decremented in case of an error. Hmm, but what can a user do about that? and does an error mean the file isn't closed or that there is something else wrong with the file (maybe not even open). -- To view, visit http://gerrit.cloudera.org:8080/4018 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] REVIEW-ONLY: the results of running clang-format
Marcel Kornacker has posted comments on this change. Change subject: REVIEW-ONLY: the results of running clang-format .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/4046/1/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: Line 261 > Setting BreakBeforeBinaryOperators to NonAssignment or All fixes this, but what exactly does BreakBeforeBinaryOperators do? can you limit the scope to logical operators and boolean exprs? Line 1038:**next_backend_on_host) != backends_on_host.end()); > I think it is ContinuationIndentWidth - 4 spaces, plus the length of the gr shouldn't it be one or the other? i don't recall seeing this before. -- To view, visit http://gerrit.cloudera.org:8080/4046 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If00bcdb7957ec55a3b8568887de161b7f30d58de Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3996: Migrate to updated Kudu insert string API
Dan Hecht has posted comments on this change. Change subject: IMPALA-3996: Migrate to updated Kudu insert string API .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4055 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I910c24724d0bc887b2d4a3e62ecdf72420a76f6f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4018 to look at the new patch set (#5). Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail .. IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail HdfsTableSink::Close() makes calls to functions that can fail with a Status. However, since the function has a void return type, these error statuses are just logged and we cannot take any action according to the type of error. This patch moves the closing of the partition file into the FlushFinal function from Close(), so that in case of an error on closing the file, the error is propagated up and some action can be taken. We try and close all the partition files in the map in Close() as well because if a query is cancelled, FlushFinal will not be called and we would end up leaking some file descriptors. Also fixed some long lines in this patch. Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b --- M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-sink.h 2 files changed, 15 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4018/5 -- To view, visit http://gerrit.cloudera.org:8080/4018 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page
Hello Henry Robinson, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3998 to look at the new patch set (#9). Change subject: IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page .. IMPALA-3981: Fix crash when accessing statestored / catalogd /memz page The /memz page tried to add JVM metrics even when they didn't exist for all daemons, not just Impala. This led to a crash when they tried to access ExecEnv::GetInstance() without an initialised ExecEnv at statestored and catalogd To fix, changed the memz handler method to take an optional metric group, provided by the caller. memz handler will check the existence of the metric group. Used C++11 lambdas rather than boost::bind to help simplify the code. Testing: Ran locally and looked at impalad/memz, statestored/memz and catalogd/memz Add a test file test_web_pages.py to test sending request to /memz on impalad / statestored / catalogd Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29 --- M be/src/catalog/catalogd-main.cc M be/src/runtime/data-stream-mgr.cc M be/src/runtime/exec-env.cc M be/src/scheduling/simple-scheduler.cc M be/src/service/impala-server.cc M be/src/statestore/statestore-subscriber.cc M be/src/statestore/statestored-main.cc M be/src/util/default-path-handlers.cc M be/src/util/default-path-handlers.h M be/src/util/memory-metrics.cc M be/src/util/metrics-test.cc M be/src/util/metrics.cc M be/src/util/metrics.h A tests/webserver/test_web_pages.py M www/memz.tmpl 15 files changed, 96 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/3998/9 -- To view, visit http://gerrit.cloudera.org:8080/3998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Kathy Sun Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Kathy Sun Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/4018/4/be/src/exec/hdfs-table-sink.cc File be/src/exec/hdfs-table-sink.cc: Line 626: } > Hmm, but what can a user do about that? and does an error mean the file isn As we spoke, this metric is not to catch errors, but just to count the number of tries to open and close files. http://gerrit.cloudera.org:8080/#/c/4018/5/be/src/exec/hdfs-table-sink.cc File be/src/exec/hdfs-table-sink.cc: PS5, Line 620: partition->tmp_hdfs_file = NULL; Had to move this up too. Else, we could have cases where the metric is decremented twice for the same file. -- To view, visit http://gerrit.cloudera.org:8080/4018 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3786: Replace "cloudera" with "apache" (part 2)
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-3786: Replace "cloudera" with "apache" (part 2) .. Patch Set 3: -Code-Review It passed the packaging build: http://golden.jenkins.cloudera.com/job/CDH5-Packaging-Impala-On-Demand/347/ -- To view, visit http://gerrit.cloudera.org:8080/3937 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0d35fa6602a7fc0c212b2ef5e2b3322b77dde7e2 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3786: Replace "cloudera" with "apache" (part 1)
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-3786: Replace "cloudera" with "apache" (part 1) .. Patch Set 2: It passed the packaging build: http://golden.jenkins.cloudera.com/job/CDH5-Packaging-Impala-On-Demand/348/ (posted the wrong link for this patch in my previous comment) -- To view, visit http://gerrit.cloudera.org:8080/3936 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3767dd1ee86df767075fdf1b371eb6b0b06668db Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] REVIEW-ONLY: the results of running clang-format
Jim Apple has posted comments on this change. Change subject: REVIEW-ONLY: the results of running clang-format .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/4046/1/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: Line 261 > what exactly does BreakBeforeBinaryOperators do? can you limit the scope to > The way to wrap binary operators. > > Possible values: > > BOS_None (in configuration: None) Break after operators. > BOS_NonAssignment (in configuration: NonAssignment) Break before operators > that aren't assignments. > BOS_All (in configuration: All) Break before operators. It looks to me like you cannot limit the scope to logical operators and binary expressions. BTW, I got that description from http://zed0.co.uk/clang-format-configurator/ Line 1038:**next_backend_on_host) != backends_on_host.end()); > shouldn't it be one or the other? i don't recall seeing this before. I guess that's not how clang-format sees it. -- To view, visit http://gerrit.cloudera.org:8080/4046 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If00bcdb7957ec55a3b8568887de161b7f30d58de Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose
Henry Robinson has posted comments on this change. Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose .. Patch Set 2: (6 comments) Were any tests affected? http://gerrit.cloudera.org:8080/#/c/3994/2/be/src/util/webserver.cc File be/src/util/webserver.cc: PS2, Line 223: http_address_.hostname.compare Can you use IsWildcardAddress(http_address_) ? PS2, Line 224: If GetHostname() returns an error, hostname will be 0.0.0.0 um? if GetHostname() returns an error, hostname will be undefined. http://gerrit.cloudera.org:8080/#/c/3994/2/shell/impala_client.py File shell/impala_client.py: PS2, Line 238: esult.version, result.webserver_address Just return result? http://gerrit.cloudera.org:8080/#/c/3994/2/shell/impala_shell.py File shell/impala_shell.py: PS2, Line 686: self.webserver_address.replace("localhost", "127.0.0.1") why? Line 877: self._print_if_verbose("Query submitted at: %s (Coordinator: %s)" % (time.strftime( long line PS2, Line 880: self._print_if_verbose("Query submitted at: %s" % time.strftime( : "%Y-%m-%d %H:%M:%S", time.localtime())) does this print for 'use' statements as well? I still think that's a bit verbose. Maybe remove this branch? -- To view, visit http://gerrit.cloudera.org:8080/3994 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] REVIEW-ONLY: the results of running clang-format
Jim Apple has uploaded a new patch set (#2). Change subject: REVIEW-ONLY: the results of running clang-format .. REVIEW-ONLY: the results of running clang-format See https://gerrit.cloudera.org/#/c/3886 for the configuration time -p (for FILENAME in $(find be/src/ -name '*.cc' -or -name '*.h' \ | grep -v '/gutil/' | grep -v '/thirdparty/'); do echo $FILENAME; \ clang-format-3.6 -i -style=file $FILENAME; done) Change-Id: If00bcdb7957ec55a3b8568887de161b7f30d58de --- M .clang-format M be/src/benchmarks/atod-benchmark.cc M be/src/benchmarks/atof-benchmark.cc M be/src/benchmarks/atoi-benchmark.cc M be/src/benchmarks/bitmap-benchmark.cc M be/src/benchmarks/bloom-filter-benchmark.cc M be/src/benchmarks/expr-benchmark.cc M be/src/benchmarks/hash-benchmark.cc M be/src/benchmarks/in-predicate-benchmark.cc M be/src/benchmarks/int-hash-benchmark.cc M be/src/benchmarks/lock-benchmark.cc M be/src/benchmarks/multiint-benchmark.cc M be/src/benchmarks/network-perf-benchmark.cc M be/src/benchmarks/overflow-benchmark.cc M be/src/benchmarks/parse-timestamp-benchmark.cc M be/src/benchmarks/rle-benchmark.cc M be/src/benchmarks/row-batch-serialize-benchmark.cc M be/src/benchmarks/status-benchmark.cc M be/src/benchmarks/string-benchmark.cc M be/src/benchmarks/string-compare-benchmark.cc M be/src/benchmarks/string-search-benchmark.cc M be/src/benchmarks/thread-create-benchmark.cc M be/src/benchmarks/tuple-layout-benchmark.cc M be/src/bufferpool/buffer-pool.h M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M be/src/catalog/catalog-util.cc M be/src/catalog/catalog-util.h M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M be/src/catalog/catalogd-main.cc M be/src/codegen/codegen-anyval-ir.cc M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/codegen/codegen-symbol-emitter.cc M be/src/codegen/codegen-symbol-emitter.h M be/src/codegen/impala-ir-data.h M be/src/codegen/impala-ir.h M be/src/codegen/instruction-counter-test.cc M be/src/codegen/instruction-counter.cc M be/src/codegen/instruction-counter.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/codegen/mcjit-mem-mgr.h M be/src/common/atomic-test.cc M be/src/common/atomic.h M be/src/common/compiler-util.h M be/src/common/global-flags.cc M be/src/common/global-types.h M be/src/common/hdfs.h M be/src/common/init.cc M be/src/common/init.h M be/src/common/logging.cc M be/src/common/logging.h M be/src/common/names.h M be/src/common/object-pool.h M be/src/common/status.cc M be/src/common/status.h M be/src/exec/aggregation-node-ir.cc M be/src/exec/aggregation-node.cc M be/src/exec/aggregation-node.h M be/src/exec/analytic-eval-node.cc M be/src/exec/analytic-eval-node.h M be/src/exec/base-sequence-scanner.cc M be/src/exec/base-sequence-scanner.h M be/src/exec/blocking-join-node.cc M be/src/exec/blocking-join-node.h M be/src/exec/catalog-op-executor.cc M be/src/exec/catalog-op-executor.h M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/data-source-scan-node.cc M be/src/exec/data-source-scan-node.h M be/src/exec/delimited-text-parser-test.cc M be/src/exec/delimited-text-parser.cc M be/src/exec/delimited-text-parser.h M be/src/exec/delimited-text-parser.inline.h M be/src/exec/empty-set-node.cc M be/src/exec/empty-set-node.h M be/src/exec/exchange-node.cc M be/src/exec/exchange-node.h M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/external-data-source-executor.cc M be/src/exec/external-data-source-executor.h M be/src/exec/filter-context.cc M be/src/exec/filter-context.h M be/src/exec/hash-join-node-ir.cc M be/src/exec/hash-join-node.cc M be/src/exec/hash-join-node.h M be/src/exec/hash-table-ir.cc M be/src/exec/hash-table-test.cc M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/exec/hbase-scan-node.cc M be/src/exec/hbase-scan-node.h M be/src/exec/hbase-table-scanner.cc M be/src/exec/hbase-table-scanner.h M be/src/exec/hbase-table-sink.cc M be/src/exec/hbase-table-sink.h M be/src/exec/hbase-table-writer.cc M be/src/exec/hbase-table-writer.h M be/src/exec/hdfs-avro-scanner-ir.cc M be/src/exec/hdfs-avro-scanner-test.cc M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-avro-table-writer.cc M be/src/exec/hdfs-avro-table-writer.h M be/src/exec/hdfs-lzo-text-scanner.cc M be/src/exec/hdfs-lzo-text-scanner.h M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-rcfile-scanner.cc M be/src/exec/hdfs-rcfile-scanner.h M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/hdfs-scanner-ir.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-sequence-scanner.h M be/src/exec/hdfs-sequence-ta
[Impala-ASF-CR] REVIEW-ONLY: the results of running clang-format
Jim Apple has posted comments on this change. Change subject: REVIEW-ONLY: the results of running clang-format .. Patch Set 2: Verified-1 This version adds NonAssignment and then shows how it behaves on simple-scheduler.cc -- To view, visit http://gerrit.cloudera.org:8080/4046 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If00bcdb7957ec55a3b8568887de161b7f30d58de Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] REVIEW-ONLY: the results of running clang-format
Marcel Kornacker has posted comments on this change. Change subject: REVIEW-ONLY: the results of running clang-format .. Patch Set 2: looks good -- To view, visit http://gerrit.cloudera.org:8080/4046 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If00bcdb7957ec55a3b8568887de161b7f30d58de Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] Add .clang-format for Impala's C++ style
Hello Marcel Kornacker, Henry Robinson, Internal Jenkins, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3886 to look at the new patch set (#6). Change subject: Add .clang-format for Impala's C++ style .. Add .clang-format for Impala's C++ style Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa --- A .clang-format 1 file changed, 13 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/3886/6 -- To view, visit http://gerrit.cloudera.org:8080/3886 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Add .clang-format for Impala's C++ style
Jim Apple has posted comments on this change. Change subject: Add .clang-format for Impala's C++ style .. Patch Set 6: PS6 Added BreakBeforeBinaryOperators: 'NonAssignment' from http://gerrit.cloudera.org:8080/4046 -- To view, visit http://gerrit.cloudera.org:8080/3886 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
Re: [DISCUSS] Branching for a release soon?
Todd, I see this as being useful for future releases, even if it doesn't get used for this one. Thanks for volunteering to look into that. On Fri, Aug 19, 2016 at 11:12 AM, Todd Lipcon wrote: > I believe the newer versions of gerrit support a label-like concept called > 'hashtags' (apparently the authors of Gerrit love Instagram). They allow > you to tag a review/commit with arbitrary strings. Perhaps this feature > could be used so that patch authors can tag their reviews/commits as > "stable-candidate" and Jim can easily query for them? Then when it has been > cherry-picked to the branch, a "stable-picked" tag can be added to indicate > it has been merged? > > If this sounds useful, let me know and I can look into enabling the feature > on gerrit.cloudera.org. I think some non-trivial config change has to be > made, though, so would take a week or two. > > -Todd > > On Fri, Aug 19, 2016 at 9:38 AM, Tim Armstrong > wrote: > >> I think the first sounds easiest. If commit authors really think something >> should go in the release they can always reach out to you. >> >> On Fri, Aug 19, 2016 at 9:15 AM, Jim Apple wrote: >> >> > I hadn't decided yet. What do you think? >> > >> > I could test the release, then look for bug fixes that landed in >> > master after my branch started that fix anything that doesn't pass. >> > >> > Or I could reach out to commit authors and ask them, or I could ask >> > them to email me, like you mentioned. >> > >> > Or I could read commit messages and cherry-pick anything that is a bug >> fix. >> > >> > On Fri, Aug 19, 2016 at 9:01 AM, Tim Armstrong >> > wrote: >> > > How do you plan to choose which commits to cherry-pick? Should we let >> you >> > > know if we think a patch should/shouldn't be part of the release? >> > > >> > > On Fri, Aug 19, 2016 at 8:44 AM, Tom White wrote: >> > > >> > >> Thanks for volunteering to do the release Jim! The plan looks fine to >> > me. >> > >> >> > >> Tom >> > >> >> > >> On Fri, Aug 19, 2016 at 4:42 PM, Todd Lipcon >> wrote: >> > >> > Sounds great to me! >> > >> > >> > >> > Todd >> > >> > >> > >> > On Aug 19, 2016 8:41 AM, "Jim Apple" wrote: >> > >> > >> > >> >> I would like to volunteer to be the next release manager for an >> > >> >> upcoming Apache Impala 2.7 (incubating) release. I plan to: >> > >> >> >> > >> >> 1. Get a vote done adding to our bylaws the procedure for creating >> a >> > >> >> branch. (See thread from yesterday) >> > >> >> >> > >> >> 2. Create a release branch, possibly backdated a few commits from >> > HEAD >> > >> >> to a commit that looks stable. >> > >> >> >> > >> >> 3. Maintain that branch by cherry-picking bugfix commits from >> > "master" >> > >> >> as necessary to get the tests running cleanly. >> > >> >> >> > >> >> 4. Make a release candidate and call for a vote here, then on the >> > IPMC >> > >> >> list (which is required for Incubator releases) >> > >> >> >> > >> >> Obviously, this glosses over some details about publishing and >> > signing >> > >> >> releases, but I would welcome feedback on this outline. >> > >> >> >> > >> >> > >> > > > > -- > Todd Lipcon > Software Engineer, Cloudera
[Impala-ASF-CR] IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail
Dan Hecht has posted comments on this change. Change subject: IMPALA-2988: Refactor HdfsTableSink::Close() so that it cannot fail .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4018 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2546bc68ba136b2713d744c1b920878606a2217b Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1731,IMPALA-3868: Float values are not parsed correctly
Dan Hecht has posted comments on this change. Change subject: IMPALA-1731,IMPALA-3868: Float values are not parsed correctly .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3791 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9e17d0f051b300a22a520ce34e276c2d4460d35e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3221: Copyright / license audit
Henry Robinson has posted comments on this change. Change subject: IMPALA-3221: Copyright / license audit .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/3995/1/LICENSE.txt File LICENSE.txt: Line 461: shell/ext-py/prettytable-0.7.1: 3-clause BSD (see shell/ext-py/prettytable-0.7.1/PKG-INFO) > I noticed that one 3-clause BSD licence is copied in above, while these are There's no license or copyright text included with prettytable etc. that I could find, so nothing to copy here. For d3, note that there are a couple of mentions of the author. -- To view, visit http://gerrit.cloudera.org:8080/3995 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I24a868aec6a4f17f4ccca1b088d2f0de32f75d87 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3221: Copyright / license audit
Henry Robinson has uploaded a new patch set (#2). Change subject: IMPALA-3221: Copyright / license audit .. IMPALA-3221: Copyright / license audit Populates LICENSE.txt with known third-party licenses in the Impala codebase. Change-Id: I24a868aec6a4f17f4ccca1b088d2f0de32f75d87 --- M LICENSE.txt D be/src/gutil/LICENSE.txt M be/src/runtime/string-search.h 3 files changed, 297 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/3995/2 -- To view, visit http://gerrit.cloudera.org:8080/3995 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I24a868aec6a4f17f4ccca1b088d2f0de32f75d87 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Jim Apple
[Impala-ASF-CR] IMPALA-3221: Copyright / license audit
Jim Apple has posted comments on this change. Change subject: IMPALA-3221: Copyright / license audit .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/3995/1/LICENSE.txt File LICENSE.txt: Line 461: shell/ext-py/prettytable-0.7.1: 3-clause BSD (see shell/ext-py/prettytable-0.7.1/PKG-INFO) > There's no license or copyright text included with prettytable etc. that I There is sopy copyright text in shell/ext-py/prettytable-0.7.1/COPYING. sqlparse is also in COPYING. sasl points to toddlipcon's github page, which says the code was borrowed from qpid, which is an apache project. Do you think there needs to be any copying there? -- To view, visit http://gerrit.cloudera.org:8080/3995 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I24a868aec6a4f17f4ccca1b088d2f0de32f75d87 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3999: Embed the query-wide fragment instance index in the instance id
Marcel Kornacker has uploaded a new patch set (#2). Change subject: IMPALA-3999: Embed the query-wide fragment instance index in the instance id .. IMPALA-3999: Embed the query-wide fragment instance index in the instance id This adds utility function in uid-util.h to create query and instance ids and convert between them. It also adapts SimpleScheduler to utilize those functions when creating the instance id (TPlanFragmentInstanceCtx.fragment_instance_id). Change-Id: Ia56a03ed9a1d7e77c72b66a01cd48c5b6bf3624f --- M be/src/scheduling/simple-scheduler.cc M be/src/scheduling/simple-scheduler.h M be/src/service/impala-server.cc M be/src/util/CMakeLists.txt A be/src/util/uid-util-test.cc M be/src/util/uid-util.h M common/thrift/ImpalaInternalService.thrift 7 files changed, 106 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/4065/2 -- To view, visit http://gerrit.cloudera.org:8080/4065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia56a03ed9a1d7e77c72b66a01cd48c5b6bf3624f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel Kornacker
[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose
Sailesh Mukil has uploaded a new patch set (#3). Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose .. IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose The webserver address was always configured as 0.0.0.0 (meaning that the webserver could be reached on any IP for that machine) unless otherwise specified. This is not a correct value to dispay to the user. This patch returns the hostname of the node, when requested, if the webserver host address is 0.0.0.0. This patch also does not print the coordinator link for very simple queries, as it's not necessary and is unnecessarily verbose. This patch also does away with pinging the impalad an extra time per query for finding the host time and webserver address. It instead remembers the webserver address at connect time and displays client local time for every query instead. Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b --- M be/src/service/impala-beeswax-server.cc M be/src/util/webserver.cc M common/thrift/ImpalaService.thrift M shell/impala_client.py M shell/impala_shell.py 5 files changed, 31 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/3994/3 -- To view, visit http://gerrit.cloudera.org:8080/3994 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] Add .clang-format for Impala's C++ style
Marcel Kornacker has posted comments on this change. Change subject: Add .clang-format for Impala's C++ style .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/3886 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/3994/2/be/src/util/webserver.cc File be/src/util/webserver.cc: PS2, Line 223: sWildcardAddress(http_address_ > Can you use IsWildcardAddress(http_address_) ? Done PS2, Line 224: If GetHostname() returns an error, hostname will remain as > um? if GetHostname() returns an error, hostname will be undefined. I assign hostname in L222. So if GetHostname() returns an error, hostname will remain whatever value it was (which in this case is 0.0.0.0). Clarified the comment a bit. http://gerrit.cloudera.org:8080/#/c/3994/2/shell/impala_client.py File shell/impala_client.py: PS2, Line 238: esult > Just return result? Done http://gerrit.cloudera.org:8080/#/c/3994/2/shell/impala_shell.py File shell/impala_shell.py: PS2, Line 686: ult.version > why? Else it will print http://localhost:25000 vs http://127.0.0.1:25000 If you think that's fine, I can remove this line. Line 877: > long line Done PS2, Line 880: (time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()), : self.webserver_address)) > does this print for 'use' statements as well? I still think that's a bit ve Yes it does. I forgot that we never used to print the time before. I've removed it. -- To view, visit http://gerrit.cloudera.org:8080/3994 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose
Henry Robinson has posted comments on this change. Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/3994/2/be/src/util/webserver.cc File be/src/util/webserver.cc: PS2, Line 224: If GetHostname() returns an error, hostname will be 0.0.0.0 > I assign hostname in L222. So if GetHostname() returns an error, hostname w GetHostname() makes no guarantees about what happens to the argument in the error case. -- To view, visit http://gerrit.cloudera.org:8080/3994 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose
Henry Robinson has posted comments on this change. Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/3994/2/shell/impala_shell.py File shell/impala_shell.py: PS2, Line 686: self.webserver_address.replace("localhost", "127.0.0.1") > Else it will print http://localhost:25000 vs http://127.0.0.1:25000 What's wrong with that? They're both the same address. -- To view, visit http://gerrit.cloudera.org:8080/3994 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3999: Embed the query-wide fragment instance index in the instance id
Henry Robinson has posted comments on this change. Change subject: IMPALA-3999: Embed the query-wide fragment instance index in the instance id .. Patch Set 2: (6 comments) http://gerrit.cloudera.org:8080/#/c/4065/2//COMMIT_MSG Commit Message: PS2, Line 7: 3999 nit: please use 3988; 3999 is a duplicate. PS2, Line 9: This adds utility function in uid-util.h to create query and instance ids and convert between them. : It also adapts SimpleScheduler to utilize those functions when creating the instance id : (TPlanFragmentInstanceCtx.fragment_instance_id). Long lines. http://gerrit.cloudera.org:8080/#/c/4065/2/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: PS2, Line 512: CreateInstanceId(schedule->query_id(), instance_idx, &instance_id); not clear that you really need this instead of just keeping a TUniqueId and incrementing id.lo on every time through this loop. http://gerrit.cloudera.org:8080/#/c/4065/2/be/src/util/uid-util.h File be/src/util/uid-util.h: PS2, Line 62: void why not return TUniqueId? Easier to use inline. PS2, Line 69: void again, why not return? http://gerrit.cloudera.org:8080/#/c/4065/2/common/thrift/ImpalaInternalService.thrift File common/thrift/ImpalaInternalService.thrift: Line 257: // The bottom NUM_FRAGMENT_INSTANCE_IDX_BITS are 0. refer to where this is defined -- To view, visit http://gerrit.cloudera.org:8080/4065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia56a03ed9a1d7e77c72b66a01cd48c5b6bf3624f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel Kornacker Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3610: Account for memory used by filters in the coordinator
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/4066 Change subject: IMPALA-3610: Account for memory used by filters in the coordinator .. IMPALA-3610: Account for memory used by filters in the coordinator Before this patch, Impala would not account for the memory used to aggregate runtime filters together in the coordinator. Impala's memory could therefore be silently overcommitted. This patch accounts for aggregated filter memory in a new filter memtracker that is attached to the coordinator's query_mem_tracker(). If the query memory limit is exceeded when a filter update arrives, that update is discarded. If the filter is from a partitioned join, the entire filter can therefore be discarded immediately (to alleviate memory pressure) and a dummy 'always true' filter is sent to backends to unblock them. If the filter is from a broadcast join, no aggregation is done, so there is no tracking. The Thrift input and output filter data structures are not tracked (as we generally don't track RPC objects, but plan to in the future). Memory that is added to a memtracker must always be released. To do this, we need to signal to the coordinator that it is finished, and that there is no point trying to process any future updates that might arrive concurrently. This patch adds Coordinator::Done() which is called from QueryExecState::Done(), and which releases memory from all in-process runtime filters. Finally, this patch increases the upper limit for runtime filters to 512MB. This allows testing on very large datasets. The default maximum is still 16MB, per RUNTIME_FILTER_MAX_SIZE. Testing: Added a new test that triggers the OOM condition on the coordinator. All existing runtime filter tests pass. Change-Id: I3c52c8a1c2e79ef370c77bf264885fc859678d1b --- M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/mem-tracker.h M be/src/runtime/runtime-filter-bank.h M be/src/service/impala-server.h M be/src/service/query-exec-state.cc M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters_phj.test 7 files changed, 138 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/4066/1 -- To view, visit http://gerrit.cloudera.org:8080/4066 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I3c52c8a1c2e79ef370c77bf264885fc859678d1b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson
[Impala-ASF-CR] IMPALA-3610: Account for memory used by filters in the coordinator
Henry Robinson has posted comments on this change. Change subject: IMPALA-3610: Account for memory used by filters in the coordinator .. Patch Set 1: I took Sailesh's suggestion of special-casting the treatment of broadcast filters and not copying them into an intermediate filter in the filter bank. While this is a big simplification, it does make it a bit harder to track the outgoing Thrift filter data structure; the rough argument in favour of that being ok is that the incoming data structure is not tracked (yet) anyhow, so we're really moving the memory used for that to the outgoing data structure. It would be good to fix that for all Thrift structures. -- To view, visit http://gerrit.cloudera.org:8080/4066 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3c52c8a1c2e79ef370c77bf264885fc859678d1b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose
Sailesh Mukil has uploaded a new patch set (#4). Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose .. IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose The webserver address was always configured as 0.0.0.0 (meaning that the webserver could be reached on any IP for that machine) unless otherwise specified. This is not a correct value to dispay to the user. This patch returns the hostname of the node, when requested, if the webserver host address is 0.0.0.0. This patch also does not print the coordinator link for very simple queries, as it's not necessary and is unnecessarily verbose. This patch also does away with pinging the impalad an extra time per query for finding the host time and webserver address. It instead remembers the webserver address at connect time and displays client local time for every query instead. Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b --- M be/src/service/impala-beeswax-server.cc M be/src/util/webserver.cc M common/thrift/ImpalaService.thrift M shell/impala_client.py M shell/impala_shell.py 5 files changed, 30 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/3994/4 -- To view, visit http://gerrit.cloudera.org:8080/3994 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose .. Patch Set 4: (2 comments) Also, all the shell tests pass with these changes. http://gerrit.cloudera.org:8080/#/c/3994/2/be/src/util/webserver.cc File be/src/util/webserver.cc: PS2, Line 224: e).ok()) { > GetHostname() makes no guarantees about what happens to the argument in the Makes sense. Done. http://gerrit.cloudera.org:8080/#/c/3994/2/shell/impala_shell.py File shell/impala_shell.py: PS2, Line 686: ult.version > What's wrong with that? They're both the same address. I was under the impression that some shell environments wouldn't create an automatic hyperlink for an address with "localhost". But changed it in any case as it doesn't matter that much. -- To view, visit http://gerrit.cloudera.org:8080/3994 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3943: Adhere to abort on error in ProcessFooter().
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3943: Adhere to abort_on_error in ProcessFooter(). .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/3862/7/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 188: if (!status.ok()) return state_->LogOrReturnError(status.msg()); > Don't we want to return early if there was any kind of error? If its a non- Please disregard this comment, I was confused. -- To view, visit http://gerrit.cloudera.org:8080/3862 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6aff766a1ce6376efb329bdde51c648149dfe08c Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures .. IMPALA-3962: Clean up scratch tuple batch on scan failures The parquet scanner doesn't clean up 'scratch_batch_' properly which causes it to process a partially filled scratch_batch_ if any of the column reader fails. This change cleans up the scratch batch if any of the parquet column readers fails. The clean up involves freeing the mem_pool of scratch_batch_ and setting number of tuples in scratch_batch_ to 0. This change also extends debug action to emulate the behavior of exceeding the query's memory limit. Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5 Reviewed-on: http://gerrit.cloudera.org:8080/3991 Reviewed-by: Michael Ho Tested-by: Internal Jenkins --- M be/src/exec/exec-node.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M common/thrift/PlanNodes.thrift M tests/failure/test_failpoints.py 10 files changed, 76 insertions(+), 22 deletions(-) Approvals: Michael Ho: Looks good to me, approved Internal Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/3991 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/3991 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3999: Embed the query-wide fragment instance index in the instance id
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3999: Embed the query-wide fragment instance index in the instance id .. Patch Set 2: (6 comments) http://gerrit.cloudera.org:8080/#/c/4065/2//COMMIT_MSG Commit Message: PS2, Line 7: 3999 > nit: please use 3988; 3999 is a duplicate. Done PS2, Line 9: This adds utility function in uid-util.h to create query and instance ids and convert between them. : It also adapts SimpleScheduler to utilize those functions when creating the instance id : (TPlanFragmentInstanceCtx.fragment_instance_id). > Long lines. this is a weird line limit, where did this come from? http://gerrit.cloudera.org:8080/#/c/4065/2/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: PS2, Line 512: CreateInstanceId(schedule->query_id(), instance_idx, &instance_id); > not clear that you really need this instead of just keeping a TUniqueId and this way it's easier to change the layout, if that should ever be needed. and i don't see any downside. http://gerrit.cloudera.org:8080/#/c/4065/2/be/src/util/uid-util.h File be/src/util/uid-util.h: PS2, Line 62: void > why not return TUniqueId? Easier to use inline. we typically only return scalars PS2, Line 69: void > again, why not return? see above http://gerrit.cloudera.org:8080/#/c/4065/2/common/thrift/ImpalaInternalService.thrift File common/thrift/ImpalaInternalService.thrift: Line 257: // The bottom NUM_FRAGMENT_INSTANCE_IDX_BITS are 0. > refer to where this is defined Done -- To view, visit http://gerrit.cloudera.org:8080/4065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia56a03ed9a1d7e77c72b66a01cd48c5b6bf3624f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel Kornacker Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] Add .clang-format for Impala's C++ style
Jim Apple has posted comments on this change. Change subject: Add .clang-format for Impala's C++ style .. Patch Set 6: > Seems like there's consensus on the mailing list. Could you add a > page to the wiki explaining what this is, how we expect it to be > used and so on? Added a brief note: https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=65147115&selectedPageVersions=10&selectedPageVersions=9 The official clang-format documentation I linked to explains how to use clang-format well, and briefly. -- To view, visit http://gerrit.cloudera.org:8080/3886 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No