[Impala-ASF-CR] IMPALA-5513: Fix display message exception when using invalid KEYVAL
Donghui Xu has posted comments on this change. Change subject: IMPALA-5513: Fix display message exception when using invalid KEYVAL .. Patch Set 6: (1 comment) I have modified comit message. http://gerrit.cloudera.org:8080/#/c/7187/5//COMMIT_MSG Commit Message: PS5, Line 9: > Prose lines in commit messages should neither begin nor end with spaces. Done -- To view, visit http://gerrit.cloudera.org:8080/7187 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5513: Fix display message exception when using invalid KEYVAL
Donghui Xu has uploaded a new patch set (#6). Change subject: IMPALA-5513: Fix display message exception when using invalid KEYVAL .. IMPALA-5513: Fix display message exception when using invalid KEYVAL Function print_to_stderr() has a syntax error when an error message is displayed. I solved this problem by exchanging the position of variable and the subsequent strings in function print_to_stderr(). Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6 --- M shell/impala_shell.py M tests/shell/test_shell_commandline.py 2 files changed, 8 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/7187/6 -- To view, visit http://gerrit.cloudera.org:8080/7187 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Dan Hecht has posted comments on this change. Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. Patch Set 12: (25 comments) http://gerrit.cloudera.org:8080/#/c/7223/12/be/src/exec/blocking-join-node.cc File be/src/exec/blocking-join-node.cc: PS12, Line 192: have_async_build_thread_token_ do we still need this? i guess the idea is to acquire the thread resource in Open() as well? Or is there another reason? http://gerrit.cloudera.org:8080/#/c/7223/11/be/src/exec/blocking-join-node.h File be/src/exec/blocking-join-node.h: PS11, Line 111: or for Line 119: /// SendBuildInputToSink is called to allocate resources for this ExecNode. once we do true multithreading, does this go away? since we'll be able to open the build child inside of this Open(), right? Maybe leave a todo about that to clarify why this is needed. http://gerrit.cloudera.org:8080/#/c/7223/11/be/src/exec/exec-node.h File be/src/exec/exec-node.h: PS11, Line 87: o nit extra space http://gerrit.cloudera.org:8080/#/c/7223/12/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: PS12, Line 395: // Total of minimum memory reservations for a host in bytes. Higher than : // per_host_min_reservation if not all reservations are used concurrently. I can't figure out what this means from just reading this comment. Are there more than one per_host_min_reservation for each host? And why would this be higher if not all reservations are used concurrently (seems like that condition would mean the needed reservation is lower). are one of these values related to all the fragment's instances? or are they really only about the fragment (i.e. what single instance would consume)? Also, in the comment before, it says "buffer reservation" but here we say "memory reservation" - are these the same? http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/JoinNode.java File fe/src/main/java/org/apache/impala/planner/JoinNode.java: PS12, Line 651: build I guess here you mean build as a noun rather than a verb. i.e. this is not the memory used when performing the build. maybe reword to clarify as I had to read the code to understand the comment. Either the next comment is sufficient, or you could reword this to explain the case of when no additional resources are needed. Alternatively, can't we think of this as the resources of this node (but not its children) during probing? I.e. call this probePhaseProfile and rename probePhaseProfile to execProfile below. (See comment in ExecPhaseResourceProfiles about naming). PS12, Line 654: !BackendConfig.INSTANCE.isPartitionedHashJoinEnabled() why is that? the old ones don't consume bufferpool reservations. but also, i think we override this flag for certain build types that aren't supported by the old join. PS12, Line 667: . ... because of the async thread. ? http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java File fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java: PS12, Line 85: joinIds where is that used? http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/PlanFragment.java File fe/src/main/java/org/apache/impala/planner/PlanFragment.java: PS12, Line 108: plan fragment per host is that for all instances of this fragment running on a host? or something else? PS12, Line 139: nodes isn't that always empty? if so, how about just allocating it here rather than taking it as a param? PS12, Line 222: join build sinks but what about the resources of the fragment itself? PS12, Line 233: peakResources maybe call that fInstanceResources or instanceResources or perFInstanceResources, etc. http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/PlanNode.java File fe/src/main/java/org/apache/impala/planner/PlanNode.java: PS12, Line 637: between Open() is that inclusive of Open()? if so, the name postOpenProfile seems a bit misleading. Maybe duringOpenProfile should just be called openProfile and postOpenProfile should be execProfile? PS12, Line 645: computeResourceProfile computeNodeResourceProfile PS12, Line 658: child is open Maybe say "until after the child's Open() returns." It ultimately means the same thing, but it took me a few minutes to follow what this was trying to tell me about the equation below. http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/Planner.java File fe/src/main/java/org/apache/impala/planner/Planner.java: Line 62: // estimates of zero, even if the contained PlanNodes have estimates of zero. how is this chosen and why do we need it? PS12, Line 353: Peak it seems like we're using peak and sum to mean the same thing in this function. or is there a subtle distinction? Line 354: // Total of per-host min
[Impala-ASF-CR] IMPALA-5507: Add clear description to help information of KEYVAL option
Donghui Xu has posted comments on this change. Change subject: IMPALA-5507: Add clear description to help information of KEYVAL option .. Patch Set 6: (1 comment) I hava added description about underscores. Than you. http://gerrit.cloudera.org:8080/#/c/7179/5/shell/option_parser.py File shell/option_parser.py: Line 164: " contains alphanumeric characters or underscores.") > This actually is incorrect - it can also contain underscores. Sorry for missing underline. -- To view, visit http://gerrit.cloudera.org:8080/7179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5507: Add clear description to help information of KEYVAL option
Donghui Xu has uploaded a new patch set (#6). Change subject: IMPALA-5507: Add clear description to help information of KEYVAL option .. IMPALA-5507: Add clear description to help information of KEYVAL option Help information of KEYVAL option in impala-shell is not clear enough. I fix this issue by adding clear description to help information of KEYVAL option. Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63 --- M shell/option_parser.py 1 file changed, 7 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/7179/6 -- To view, visit http://gerrit.cloudera.org:8080/7179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple
[Impala-ASF-CR] IMPALA-5514: Throw an error when --ldap password cmd is used without LDAP auth
Donghui Xu has posted comments on this change. Change subject: IMPALA-5514: Throw an error when --ldap_password_cmd is used without LDAP auth .. Patch Set 3: (3 comments) Thank you for the suggestion. I have modified the code. http://gerrit.cloudera.org:8080/#/c/7188/2//COMMIT_MSG Commit Message: PS2, Line 7: Throw an error when --ldap_password_cmd is used with > Sorry for nitpicking, I don't think judgement is the right word here. How a Done PS2, Line 11: > nit: Please remove trailing spaces (here and below) Done http://gerrit.cloudera.org:8080/#/c/7188/2/shell/impala_shell.py File shell/impala_shell.py: PS2, Line 1333: Option --ldap_password_cmd requires using LDAP authentication m > IMO something like "Option --ldap_password_cmd requires using LDAP authenti Done -- To view, visit http://gerrit.cloudera.org:8080/7188 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5514: Throw an error when --ldap password cmd is used without LDAP auth
Donghui Xu has uploaded a new patch set (#3). Change subject: IMPALA-5514: Throw an error when --ldap_password_cmd is used without LDAP auth .. IMPALA-5514: Throw an error when --ldap_password_cmd is used without LDAP auth When only with ldap_password_cmd option, impala-shell runs successfully. I solved this problem by throwing an error when --ldap_password_cmd is used without LDAP auth, that is, ldap_password_cmd option will only take effect if ldap option presents. Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e --- M shell/impala_shell.py 1 file changed, 4 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/7188/3 -- To view, visit http://gerrit.cloudera.org:8080/7188 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-5333: [DOCS] Document Impala ADLS support
John Russell has uploaded a new patch set (#2). Change subject: IMPALA-5333: [DOCS] Document Impala ADLS support .. IMPALA-5333: [DOCS] Document Impala ADLS support Change-Id: Id5a98217741e5d540d9874e9b30e36f01644ef14 --- M docs/impala.ditamap M docs/shared/impala_common.xml A docs/topics/impala_adls.xml M docs/topics/impala_parquet_file_size.xml 4 files changed, 721 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/7175/2 -- To view, visit http://gerrit.cloudera.org:8080/7175 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id5a98217741e5d540d9874e9b30e36f01644ef14 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: David Knupp Gerrit-Reviewer: John Russell Gerrit-Reviewer: Laurel Hale Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-5513: Fix display message exception when using invalid KEYVAL
Jim Apple has posted comments on this change. Change subject: IMPALA-5513: Fix display message exception when using invalid KEYVAL .. Patch Set 5: (1 comment) Bharath, if you're OK with the test, then once my comment is addressed I will be OK with this change. http://gerrit.cloudera.org:8080/#/c/7187/5//COMMIT_MSG Commit Message: PS5, Line 9: Prose lines in commit messages should neither begin nor end with spaces. -- To view, visit http://gerrit.cloudera.org:8080/7187 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5513: Fix display message exception when using invalid KEYVAL
Donghui Xu has posted comments on this change. Change subject: IMPALA-5513: Fix display message exception when using invalid KEYVAL .. Patch Set 5: (4 comments) I have modified the code according to your opinion and added the test case. http://gerrit.cloudera.org:8080/#/c/7187/4//COMMIT_MSG Commit Message: PS4, Line 9: print_to_stderr > Please fix as you did below. Done PS4, Line 10: d > The usual commit message style is that prose lines do not begin with spaces Done http://gerrit.cloudera.org:8080/#/c/7187/4/shell/impala_shell.py File shell/impala_shell.py: PS4, Line 1252: > Don't think this is needed. Done Line 1253: 'It must follow the pattern "KEY=VALUE".') > Mind adding a quick test in test_var_replacement() to cover this? Done -- To view, visit http://gerrit.cloudera.org:8080/7187 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5513: Fix display message exception when using invalid KEYVAL
Donghui Xu has uploaded a new patch set (#5). Change subject: IMPALA-5513: Fix display message exception when using invalid KEYVAL .. IMPALA-5513: Fix display message exception when using invalid KEYVAL Function print_to_stderr() has a syntax error when an error message is displayed. I solved this problem by exchanging the position of variable and the subsequent strings in function print_to_stderr(). Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6 --- M shell/impala_shell.py M tests/shell/test_shell_commandline.py 2 files changed, 8 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/7187/5 -- To view, visit http://gerrit.cloudera.org:8080/7187 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple
[Impala-ASF-CR] IMPALA-5507: Add clear description to help information of KEYVAL option
Jim Apple has posted comments on this change. Change subject: IMPALA-5507: Add clear description to help information of KEYVAL option .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/7179/5/shell/option_parser.py File shell/option_parser.py: Line 164: " contains only alphanumeric characters.") This actually is incorrect - it can also contain underscores. -- To view, visit http://gerrit.cloudera.org:8080/7179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5333: [DOCS] Document Impala ADLS support
John Russell has posted comments on this change. Change subject: IMPALA-5333: [DOCS] Document Impala ADLS support .. Patch Set 1: (7 comments) http://gerrit.cloudera.org:8080/#/c/7175/1/docs/topics/impala_adls.xml File docs/topics/impala_adls.xml: Line 154: > L154 has trailing white space. It's best to remove trailing white space fro Absolutely. I've gotten spoiled by a pre-commit hook that auto-removes any trailing spaces. Wow, looking at the hook I see that check & fixup has been commented out for some time now! I see that a few more have gotten into other files but I'll fix those separately. PS1, Line 156: As an alternative, specify the credentials in environment variables before starting the impalad : daemon. > Is it possible to list a mapping between core-site.xml settings above and e Yes. Need to consult with Sailesh or I think there is some CDH doc for ADLS I can adapt. Line 219: > This is still empty. :) Need to consult with Sailesh. I haven't had hands-on experience with ADLS yet the way I have with S3. PS1, Line 261: You point > Maybe just an imperative "Point" ? Almost. I'll word it as "To X, do Y". (Although now the wording on the S3 page is slightly different. Too much trouble to track down every instance in upstream and downstream docs to reconcile them all.) PS1, Line 279: For example, > Because the paragraph starting at L269 mentions a database with a LOCATION, Done PS1, Line 313: !??? ls adl://impala-demo/dir1/dir2/dir3 --recursive; What's the ADLS command line syntax to do this? (I'm basing this on an S3 example with 'aws s3' commands. PS1, Line 329: !??? ls adl://impala-demo/dir1/dir2/dir3 --recursive; Same question as on line 313. -- To view, visit http://gerrit.cloudera.org:8080/7175 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id5a98217741e5d540d9874e9b30e36f01644ef14 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: David Knupp Gerrit-Reviewer: John Russell Gerrit-Reviewer: Laurel Hale Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5507: Add clear description to help information of KEYVAL option
Donghui Xu has posted comments on this change. Change subject: IMPALA-5507: Add clear description to help information of KEYVAL option .. Patch Set 5: (3 comments) I have solved the suggestion you have mentioned. http://gerrit.cloudera.org:8080/#/c/7179/4/shell/option_parser.py File shell/option_parser.py: PS4, Line 88: If the argument to -f is "-", then queries are read from stdin" : " and terminated with ctrl-d > Try "If the argument to -f is "-", then queries are read from stdin and ter Done PS4, Line 163: starts with an alphabetic character and" > "starts with an alphabetic character" Done PS4, Line 164: only alphanumeric characters.") > "only alphanumeric characters." Done -- To view, visit http://gerrit.cloudera.org:8080/7179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5507: Add clear description to help information of KEYVAL option
Donghui Xu has uploaded a new patch set (#5). Change subject: IMPALA-5507: Add clear description to help information of KEYVAL option .. IMPALA-5507: Add clear description to help information of KEYVAL option Help information of KEYVAL option in impala-shell is not clear enough. I fix this issue by adding clear description to help information of KEYVAL option. Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63 --- M shell/option_parser.py 1 file changed, 7 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/7179/5 -- To view, visit http://gerrit.cloudera.org:8080/7179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. Patch Set 12: rebase -- To view, visit http://gerrit.cloudera.org:8080/7223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Tim Armstrong has uploaded a new patch set (#12). Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. IMPALA-4862: make resource profile consistent with backend behaviour This moves away from the PipelinedPlanNodeSet approach of enumerating sets of concurrently-executing nodes because unions would force creating many overlapping sets of nodes. The new approach computes the peak resources during Open() and the peak resources between Open() and Close() (i.e. while calling GetNext()) bottom-up for each plan node in a fragment. The fragment resources are then combined to produce the query resources. The basic assumptions for the new resource estimates are: * resources are acquired during or after the first call to Open() and released in Close(). * Blocking nodes call Open() on their child before acquiring their own resources (this required some backend changes). * Blocking nodes call Close() on their children before returning from Open(). * The peak resource consumption of the query is the sum of the independent fragments (except for the parallel join build plans where we can assume there will be synchronisation). This is conservative but we don't synchronise fragment Open() and Close() across exchanges so can't make stronger assumptions in general. Also compute the sum of minimum reservations. This will be useful in the backend to determine exactly when all of the initial reservations have been claimed from a shared pool of initial reservations. Testing: * Updated planner tests to reflect behavioural changes. * Added extra resource requirement planner tests for unions, subplans, pipelines of blocking operators, and bushy join plans. * Added single-node plans to resource-requirements tests. These have more complex plan trees inside a single fragment, which is useful for testing the peak resource requirement logic. Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d --- M be/src/exec/blocking-join-node.cc M be/src/exec/blocking-join-node.h M be/src/exec/exec-node.h M be/src/exec/hash-join-node.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/sort-node.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/common/TreeNode.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataSink.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java D fe/src/main/java/org/apache/impala/planner/PipelinedPlanNodeSet.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java M fe/src/main/java/org/apache/impala/planner/SelectNode.java M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/SubplanNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/planner/UnnestNode.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test M testdata/workloads/functional-planner/queries/PlannerTest/tabl
[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5036: Parquet count star optimization .. Patch Set 11: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6812 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-5036: Parquet count star optimization .. IMPALA-5036: Parquet count star optimization Instead of materializing empty rows when computing count star, we use the data stored in the Parquet RowGroup.num_rows field. The Parquet scanner tuple is modified to have one slot into which we will write the num rows statistic. The aggregate function is changed from count to a special sum function that gets initialized to 0. We also add a rewrite rule so that count() is rewritten to count(*) in order to make sure that this optimization is applied in all cases. Testing: - Added functional and planner tests Change-Id: I536b85c014821296aed68a0c68faadae96005e62 Reviewed-on: http://gerrit.cloudera.org:8080/6812 Reviewed-by: Taras Bobrovytsky Tested-by: Impala Public Jenkins --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/FunctionParams.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java A fe/src/main/java/org/apache/impala/rewrite/NormalizeCountStarRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test A testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-query/queries/QueryTest/aggregation.test A testdata/workloads/functional-query/queries/QueryTest/parquet-stats-agg.test R testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test M tests/query_test/test_aggregation.py M tests/query_test/test_parquet_stats.py 27 files changed, 873 insertions(+), 82 deletions(-) Approvals: Impala Public Jenkins: Verified Taras Bobrovytsky: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/6812 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-5618: buffered-tuple-stream-v2 fixes
Tim Armstrong has uploaded a new patch set (#2). Change subject: IMPALA-5618: buffered-tuple-stream-v2 fixes .. IMPALA-5618: buffered-tuple-stream-v2 fixes This fixes two issues: * AddRowCustom() caused a performance regression when the function was heap-allocated. This is solved by splitting the API into two separate calls. This imposes an additional burden on the caller but it is easier to reason about its performance. * Allow re-reading streams with 'delete_on_read_' set so long as no rows were read from the stream. This is necessary for some spilling ExecNodes that prepare the stream for reading in order to acquire the buffer, but then need to spill the stream to free memory before they actually are able to read the stream. Change-Id: Ibab0d774f66be632f17376a56abf302821cca047 --- M be/src/runtime/buffered-tuple-stream-v2-test.cc M be/src/runtime/buffered-tuple-stream-v2.cc M be/src/runtime/buffered-tuple-stream-v2.h M be/src/runtime/buffered-tuple-stream-v2.inline.h 4 files changed, 81 insertions(+), 73 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/7358/2 -- To view, visit http://gerrit.cloudera.org:8080/7358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibab0d774f66be632f17376a56abf302821cca047 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. Patch Set 10: (10 comments) http://gerrit.cloudera.org:8080/#/c/7223/10/be/src/exec/blocking-join-node.cc File be/src/exec/blocking-join-node.cc: Line 150: DCHECK(status != nullptr); > DCHECK(have_async_build_trhead_); Done Line 198: RETURN_IF_ERROR(AcquireResourcesForBuild(state)); > why do we do the build Open() here, rather than keep it in ProcessBuildInpu Done. The no-sink case will go away with the old aggs and joins fortunately. http://gerrit.cloudera.org:8080/#/c/7223/10/be/src/exec/blocking-join-node.h File be/src/exec/blocking-join-node.h: PS10, Line 69: it > nit: period Done PS10, Line 70: it it is started, > garbled Done PS10, Line 109: can be safely closed early > additionally, aren't we now requiring that it be closed early (i.e. the fro Yeah the planner depends on this. I updated the comment to mention that and hopefully be a bit clearer. Line 121: /// Processes the build-side input. > let's also say that the build side should already have been opened (though Done Line 141: Status SendBuildInputToSink(RuntimeState* state, DataSink* build_sink); > shouldn't this be private? i.e. we don't expect a derived class to call it Done Line 222: /// is used for the build. Otherwise, ProcessBuildInput() is called on the subclass. > explain that this opens the build Done http://gerrit.cloudera.org:8080/#/c/7223/10/be/src/exec/exec-node.h File be/src/exec/exec-node.h: PS10, Line 87: in general > why? are there cases this isn't true? I think the only exception is subplans. Updated. http://gerrit.cloudera.org:8080/#/c/7223/10/be/src/exec/partitioned-hash-join-node.cc File be/src/exec/partitioned-hash-join-node.cc: PS10, Line 181: right away > what do we mean by "right away". It seemed like we were acquiring them righ The important this is that it's before the build. Updated the comment accordingly. -- To view, visit http://gerrit.cloudera.org:8080/7223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Tim Armstrong has uploaded a new patch set (#11). Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. IMPALA-4862: make resource profile consistent with backend behaviour This moves away from the PipelinedPlanNodeSet approach of enumerating sets of concurrently-executing nodes because unions would force creating many overlapping sets of nodes. The new approach computes the peak resources during Open() and the peak resources between Open() and Close() (i.e. while calling GetNext()) bottom-up for each plan node in a fragment. The fragment resources are then combined to produce the query resources. The basic assumptions for the new resource estimates are: * resources are acquired during or after the first call to Open() and released in Close(). * Blocking nodes call Open() on their child before acquiring their own resources (this required some backend changes). * Blocking nodes call Close() on their children before returning from Open(). * The peak resource consumption of the query is the sum of the independent fragments (except for the parallel join build plans where we can assume there will be synchronisation). This is conservative but we don't synchronise fragment Open() and Close() across exchanges so can't make stronger assumptions in general. Also compute the sum of minimum reservations. This will be useful in the backend to determine exactly when all of the initial reservations have been claimed from a shared pool of initial reservations. Testing: * Updated planner tests to reflect behavioural changes. * Added extra resource requirement planner tests for unions, subplans, pipelines of blocking operators, and bushy join plans. * Added single-node plans to resource-requirements tests. These have more complex plan trees inside a single fragment, which is useful for testing the peak resource requirement logic. Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d --- M be/src/exec/blocking-join-node.cc M be/src/exec/blocking-join-node.h M be/src/exec/exec-node.h M be/src/exec/hash-join-node.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/sort-node.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/common/TreeNode.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataSink.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java D fe/src/main/java/org/apache/impala/planner/PipelinedPlanNodeSet.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java M fe/src/main/java/org/apache/impala/planner/SelectNode.java M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/SubplanNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/planner/UnnestNode.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test M testdata/workloads/functional-planner/queries/PlannerTest/tabl
[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option
John Russell has uploaded a new patch set (#2). Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode query option .. IMPALA-5583: [DOCS] Document default_join_distribution_mode query option New page for the query option. Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84 --- M docs/impala.ditamap M docs/impala_keydefs.ditamap A docs/topics/impala_default_join_distribution_mode.xml 3 files changed, 132 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/7300/2 -- To view, visit http://gerrit.cloudera.org:8080/7300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: John Russell
[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option
John Russell has posted comments on this change. Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode query option .. Patch Set 1: (7 comments) http://gerrit.cloudera.org:8080/#/c/7300/1/docs/impala.ditamap File docs/impala.ditamap: Line 179: > Why mention IMPALA-5583 also? In the past I've referred both to the "code implementation" JIRA and "document the new feature" JIRA in this kind of context. Just for ease of future maintenance and tracing if something is wrong or missing on the doc side. I guess that's less important when the doc one is a subtask of the code one. I'll take it out. http://gerrit.cloudera.org:8080/#/c/7300/1/docs/topics/impala_default_join_distribution_mode.xml File docs/topics/impala_default_join_distribution_mode.xml: Line 40: This option determines the join strategy that Impala uses when any of the tables > We deliberately did not use "join strategy" in the option name because stra Can you elaborate a little on the meaning of "join distribution mode" then? That's not terminology we've used elsewhere in the docs. Line 47: Hive ANALYZE TABLE statement. > Sure you want to keep the ANALYZE TABLE part? In most situations we cannot Done Line 48: By default, when a table involved in the join query does not have statistics, > Accuracy could be improved. What if both tables do not have stats? Clarify What is the answer if both tables are missing stats? Does Impala make a deduction about which is smaller and that one gets broadcast while the other doesn't? Line 58: might be missing statistics due to the overhead involved in calculating them, > I wouldn't suppose a particular reason for not having stats. Done Line 61: of a table involved in a join query and only transmits a portion of the table > Not very accurate, both tables are transferred across the network. Not sure I'd prefer to prepare and fine-tune a brief explanation so I could reuse that wording in places where such terminology is mentioned to a reader that might not have seen it before. Anyone who needs detailed background info can follow the "related info" links at the end of the page. Line 67: recommended when setting up and deploying new clusters. This setting is > We should mention why we recommend this. SHUFFLE is generally a safer optio Done -- To view, visit http://gerrit.cloudera.org:8080/7300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: John Russell Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5612: join inversion should factor in parallelism
Tim Armstrong has uploaded a new patch set (#2). Change subject: IMPALA-5612: join inversion should factor in parallelism .. IMPALA-5612: join inversion should factor in parallelism The join inversion optimisation did not factor in the degree of parallelism that the join executed with after inversion. In some cases this lead to bad decisions, e.g. executing a join on a single node instead of 20 nodes. This patch adds a more sophisticated cost model that factors degree of parallelism into the join inversion decision. The behaviour is unchanged if inversion does not change the degree of parallelism. Change-Id: Icacea4565ce25ef15aaab014684c9440dd501d4e --- M fe/src/main/java/org/apache/impala/planner/Planner.java M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test M testdata/workloads/functional-planner/queries/PlannerTest/joins.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu-update.test M testdata/workloads/functional-planner/queries/PlannerTest/order.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test 8 files changed, 944 insertions(+), 850 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/7351/2 -- To view, visit http://gerrit.cloudera.org:8080/7351 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icacea4565ce25ef15aaab014684c9440dd501d4e Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-5612: join inversion should factor in parallelism
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5612: join inversion should factor in parallelism .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/7351/1/fe/src/main/java/org/apache/impala/planner/Planner.java File fe/src/main/java/org/apache/impala/planner/Planner.java: Line 425: // Invert the join if doing so reduces the estimated time to execute the join. > Possibly move to a separate method? Done Line 434: // * it costs the same to process each build and probe row. > The hash table build is a two pass while the probe is one. Done Line 472: perRowCost = rhsCard * rhsAvgRowSize; > Should check for overflows whenever multiplication is used. It's floating point arithmetic so it will just overflow to "infinity", which is ok. -- To view, visit http://gerrit.cloudera.org:8080/7351 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icacea4565ce25ef15aaab014684c9440dd501d4e Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Complete guide to important environment variables for build, test, and mini-cluster operations.
Jim Apple has posted comments on this change. Change subject: Complete guide to important environment variables for build, test, and mini-cluster operations. .. Patch Set 1: > (2 comments) > > > > I personally abhor Google docs as they just seem to get lost in > > the > > > Wiki-sea over time. If we are going to take time to document > the > > > various set of environment variables controlling important > > aspects > > > of the build, why not make that accessible to both the internal > > and > > > external developers by including it in the codebase. > > > > > > This is my first draft of the standard build settings. > > > > What about the Wiki, like this page: > > https://cwiki.apache.org/confluence/display/IMPALA/Bootstrapping+an+Impala+Development+Environment+From+Scratch > > Our wikis are pretty nice, and for things like ASAN, totally worth > documenting separately, but for non-major features, keeping this up > to date still requires updating things in two places. In > experience, this isn't something people will do reliably unless > forced. I think we could force at least minimal documentation by > keeping a white-list of exported build variables and enforcing it > with a pre-checkin lint rule. Do you plan to add that lint rule to this patch? -- To view, visit http://gerrit.cloudera.org:8080/7350 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] Complete guide to important environment variables for build, test, and mini-cluster operations.
Zach Amsden has posted comments on this change. Change subject: Complete guide to important environment variables for build, test, and mini-cluster operations. .. Patch Set 1: (2 comments) > > I personally abhor Google docs as they just seem to get lost in > the > > Wiki-sea over time. If we are going to take time to document the > > various set of environment variables controlling important > aspects > > of the build, why not make that accessible to both the internal > and > > external developers by including it in the codebase. > > > > This is my first draft of the standard build settings. > > What about the Wiki, like this page: > https://cwiki.apache.org/confluence/display/IMPALA/Bootstrapping+an+Impala+Development+Environment+From+Scratch Our wikis are pretty nice, and for things like ASAN, totally worth documenting separately, but for non-major features, keeping this up to date still requires updating things in two places. In experience, this isn't something people will do reliably unless forced. I think we could force at least minimal documentation by keeping a white-list of exported build variables and enforcing it with a pre-checkin lint rule. http://gerrit.cloudera.org:8080/#/c/7350/1/bin/impala-config.sh File bin/impala-config.sh: Line 440 > Speaking confidently as the only team member with a fine arts education, I' Glad to know that my alternative career as the artist Deth Leprikon remains a well kept secret. http://gerrit.cloudera.org:8080/#/c/7350/1/docs/Developer-Guide/Build-Environment.txt File docs/Developer-Guide/Build-Environment.txt: Line 1: Environment Variables: > How many of these are set in impala-config.sh, and how many are set by othe Additional docs will be forthcoming -- To view, visit http://gerrit.cloudera.org:8080/7350 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5036: Parquet count star optimization .. Patch Set 11: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/824/ -- To view, visit http://gerrit.cloudera.org:8080/6812 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization
Lars Volker has posted comments on this change. Change subject: IMPALA-5036: Parquet count star optimization .. Patch Set 11: > The build was failing because the return type of some new getters > was a "const bool" (instead of "bool"), which caused a compiler > warning. Strangely the private build that uses internal Jenkins > passed successfully. Forwarding the +2. I think clang-tidy usually trips over the "const bool" return types. -- To view, visit http://gerrit.cloudera.org:8080/6812 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-5036: Parquet count star optimization .. Patch Set 11: Code-Review+2 The build was failing because the return type of some new getters was a "const bool" (instead of "bool"), which caused a compiler warning. Strangely the private build that uses internal Jenkins passed successfully. Forwarding the +2. -- To view, visit http://gerrit.cloudera.org:8080/6812 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization
Taras Bobrovytsky has uploaded a new patch set (#11). Change subject: IMPALA-5036: Parquet count star optimization .. IMPALA-5036: Parquet count star optimization Instead of materializing empty rows when computing count star, we use the data stored in the Parquet RowGroup.num_rows field. The Parquet scanner tuple is modified to have one slot into which we will write the num rows statistic. The aggregate function is changed from count to a special sum function that gets initialized to 0. We also add a rewrite rule so that count() is rewritten to count(*) in order to make sure that this optimization is applied in all cases. Testing: - Added functional and planner tests Change-Id: I536b85c014821296aed68a0c68faadae96005e62 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/FunctionParams.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java A fe/src/main/java/org/apache/impala/rewrite/NormalizeCountStarRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test A testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-query/queries/QueryTest/aggregation.test A testdata/workloads/functional-query/queries/QueryTest/parquet-stats-agg.test R testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test M tests/query_test/test_aggregation.py M tests/query_test/test_parquet_stats.py 27 files changed, 873 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6812/11 -- To view, visit http://gerrit.cloudera.org:8080/6812 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization
Hello Impala Public Jenkins, Alex Behm, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6812 to look at the new patch set (#11). Change subject: IMPALA-5036: Parquet count star optimization .. IMPALA-5036: Parquet count star optimization Instead of materializing empty rows when computing count star, we use the data stored in the Parquet RowGroup.num_rows field. The Parquet scanner tuple is modified to have one slot into which we will write the num rows statistic. The aggregate function is changed from count to a special sum function that gets initialized to 0. We also add a rewrite rule so that count() is rewritten to count(*) in order to make sure that this optimization is applied in all cases. Testing: - Added functional and planner tests Change-Id: I536b85c014821296aed68a0c68faadae96005e62 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/FunctionParams.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java A fe/src/main/java/org/apache/impala/rewrite/NormalizeCountStarRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test A testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-query/queries/QueryTest/aggregation.test A testdata/workloads/functional-query/queries/QueryTest/parquet-stats-agg.test R testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test M tests/query_test/test_aggregation.py M tests/query_test/test_parquet_stats.py 27 files changed, 873 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6812/11 -- To view, visit http://gerrit.cloudera.org:8080/6812 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Dan Hecht has posted comments on this change. Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. Patch Set 10: (10 comments) I need to read through the frontend code again before commenting on that, but here's some comments on the backend part. http://gerrit.cloudera.org:8080/#/c/7223/10/be/src/exec/blocking-join-node.cc File be/src/exec/blocking-join-node.cc: Line 150: DCHECK(status != nullptr); DCHECK(have_async_build_trhead_); Line 198: RETURN_IF_ERROR(AcquireResourcesForBuild(state)); why do we do the build Open() here, rather than keep it in ProcessBuildInputAndOpenProbe() like the other cases? between async/sync and sink/no-sink and subplan/no-subplan, there are a lot of cases to understand. http://gerrit.cloudera.org:8080/#/c/7223/10/be/src/exec/blocking-join-node.h File be/src/exec/blocking-join-node.h: PS10, Line 69: it nit: period PS10, Line 70: it it is started, garbled PS10, Line 109: can be safely closed early additionally, aren't we now requiring that it be closed early (i.e. the frontend assumes this)? hmm, or is this just an optimization in the async case? Line 121: /// Processes the build-side input. let's also say that the build side should already have been opened (though this code is going away soon). Line 141: Status SendBuildInputToSink(RuntimeState* state, DataSink* build_sink); shouldn't this be private? i.e. we don't expect a derived class to call it directly, right? Line 222: /// is used for the build. Otherwise, ProcessBuildInput() is called on the subclass. explain that this opens the build http://gerrit.cloudera.org:8080/#/c/7223/10/be/src/exec/exec-node.h File be/src/exec/exec-node.h: PS10, Line 87: in general why? are there cases this isn't true? http://gerrit.cloudera.org:8080/#/c/7223/10/be/src/exec/partitioned-hash-join-node.cc File be/src/exec/partitioned-hash-join-node.cc: PS10, Line 181: right away what do we mean by "right away". It seemed like we were acquiring them right away before (but too soon), no? -- To view, visit http://gerrit.cloudera.org:8080/7223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5514: Add the joint judgment of ldap password cmd and ldap
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-5514: Add the joint judgment of ldap_password_cmd and ldap .. Patch Set 2: (3 comments) Thanks for raising the jira and submitting a quick fix. The fix looks good mostly apart from a few nits. Mind fixing them? http://gerrit.cloudera.org:8080/#/c/7188/2//COMMIT_MSG Commit Message: PS2, Line 7: Add the joint judgment of ldap_password_cmd and ldap Sorry for nitpicking, I don't think judgement is the right word here. How about, "Throw an error when --ldap_password_cmd is used without LDAP auth" ? (Also please update the description below accordingly). PS2, Line 11: nit: Please remove trailing spaces (here and below) http://gerrit.cloudera.org:8080/#/c/7188/2/shell/impala_shell.py File shell/impala_shell.py: PS2, Line 1333: Must use LDAP authentication while retrieving the LDAP password IMO something like "Option --ldap_password_cmd requires using LDAP authentication mechanism (-l)" gives a more detailed description to the end user as to what the actual problem is. What do you think? -- To view, visit http://gerrit.cloudera.org:8080/7188 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5611: KuduPartitionExpr holds onto memory unnecessarily
Dan Hecht has posted comments on this change. Change subject: IMPALA-5611: KuduPartitionExpr holds onto memory unnecessarily .. Patch Set 5: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/7346/4/be/src/runtime/data-stream-sender.cc File be/src/runtime/data-stream-sender.cc: Line 482: RETURN_IF_ERROR(state->CheckQueryState()); > Done thanks, looks good to me. we really have to clean up how memory management works with UDFs to make it less brittle, but this is good for now. -- To view, visit http://gerrit.cloudera.org:8080/7346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia661eb8bed114070728a1497ccf7ed6893237e5e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5513: Fix display message exception when using invalid KEYVAL
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-5513: Fix display message exception when using invalid KEYVAL .. Patch Set 4: (2 comments) Couple more minor comments in addition to Jim's review. http://gerrit.cloudera.org:8080/#/c/7187/4/shell/impala_shell.py File shell/impala_shell.py: PS4, Line 1252: \ Don't think this is needed. Line 1253: 'It must follow the pattern "KEY=VALUE".') Mind adding a quick test in test_var_replacement() to cover this? -- To view, visit http://gerrit.cloudera.org:8080/7187 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5611: KuduPartitionExpr holds onto memory unnecessarily
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-5611: KuduPartitionExpr holds onto memory unnecessarily .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/7346/4/be/src/runtime/data-stream-sender.cc File be/src/runtime/data-stream-sender.cc: Line 482: RETURN_IF_ERROR(state->CheckQueryState()); > this is fine, but any reason to not just hoist it from the if-stmt to here, Done -- To view, visit http://gerrit.cloudera.org:8080/7346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia661eb8bed114070728a1497ccf7ed6893237e5e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5507: Add clear description to help information of KEYVAL option
Jim Apple has posted comments on this change. Change subject: IMPALA-5507: Add clear description to help information of KEYVAL option .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/7179/4/shell/option_parser.py File shell/option_parser.py: PS4, Line 88: Queries read from stdin and end with ctrl+d " : "if the argument to -f is -. Try "If the argument to -f is "-", then queries are read from stdin and terminated with ctrl-d." PS4, Line 163: has character of a-z or A-Z at the beginning "starts with an alphabetic character" PS4, Line 164: characters of a-z or A-Z and/or numbers. "only alphanumeric characters." -- To view, visit http://gerrit.cloudera.org:8080/7179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5611: KuduPartitionExpr holds onto memory unnecessarily
Hello Matthew Jacobs, Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7346 to look at the new patch set (#5). Change subject: IMPALA-5611: KuduPartitionExpr holds onto memory unnecessarily .. IMPALA-5611: KuduPartitionExpr holds onto memory unnecessarily IMPALA-3742 introduced KuduPartitionExpr, which takes a row and passes it to the Kudu client to determine what partitionit belongs to. The DataStreamSender never frees the local allocations for the Kudu partition exprs causing it to hang on to memory longer than it needs to. This patch also fixes two other related issues: - DataStreamSender was dropping the Status from AddRow in the Kudu branch. Adds 'RETURN_IF_ERROR' and 'WARN_UNUSED_RESULT' - Changes the HASH case in DataStreamSender to call FreeLocalAllocations on a per-batch basis, instead of a per-row basis. Testing: - Added an e2e test that runs a large insert with a mem limit that failed with oom previously. Change-Id: Ia661eb8bed114070728a1497ccf7ed6893237e5e --- M be/src/runtime/data-stream-sender.cc M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test 2 files changed, 13 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/7346/5 -- To view, visit http://gerrit.cloudera.org:8080/7346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia661eb8bed114070728a1497ccf7ed6893237e5e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5513: Fix display message exception when using invalid KEYVAL
Jim Apple has posted comments on this change. Change subject: IMPALA-5513: Fix display message exception when using invalid KEYVAL .. Patch Set 4: (2 comments) > (2 comments) > > I have modified the code according to your opinion. > I think your advice is fine, but how do i know who is familiar with > that file? git log $FILENAME http://gerrit.cloudera.org:8080/#/c/7187/4//COMMIT_MSG Commit Message: PS4, Line 9: Print_to_stderr Please fix as you did below. PS4, Line 10: The usual commit message style is that prose lines do not begin with spaces. -- To view, visit http://gerrit.cloudera.org:8080/7187 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-HasComments: Yes
[Impala-ASF-CR] Complete guide to important environment variables for build, test, and mini-cluster operations.
David Knupp has posted comments on this change. Change subject: Complete guide to important environment variables for build, test, and mini-cluster operations. .. Patch Set 1: (2 comments) Whether it's a .txt file in the code base, or a wiki entry, how can we make sure that this doc stays in sync with impala-config.sh (and/or any other place where env vars get set)? We'd have multiple sources of truth, wouldn't we? http://gerrit.cloudera.org:8080/#/c/7350/1/bin/impala-config.sh File bin/impala-config.sh: Line 440 Speaking confidently as the only team member with a fine arts education, I'm shocked that I did not know this setting existed. :-) http://gerrit.cloudera.org:8080/#/c/7350/1/docs/Developer-Guide/Build-Environment.txt File docs/Developer-Guide/Build-Environment.txt: Line 1: Environment Variables: How many of these are set in impala-config.sh, and how many are set by other means? -- To view, visit http://gerrit.cloudera.org:8080/7350 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5036: Parquet count star optimization .. Patch Set 10: Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/823/ -- To view, visit http://gerrit.cloudera.org:8080/6812 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No