[native-toolchain-CR] IMPALA-13121: Switch to ccache 3.7.12

2024-05-31 Thread Joe McDonnell (Code Review)
Hello Laszlo Gaal,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21473

to look at the new patch set (#2).

Change subject: IMPALA-13121: Switch to ccache 3.7.12
..

IMPALA-13121: Switch to ccache 3.7.12

The docker images currently build and use ccache 3.3.3.
Recently, we ran into a case where debuginfo was being
generated even though the cflags ended with -g0. The
ccache release history has this note for 3.3.5:
 - Fixed a regression where the original order of
   debug options could be lost.

This upgrades ccache to 3.7.12 to address this issue.

Ccache 3.7.12 is the last ccache release that builds
using autotools. Ccache 4 moves to build with CMake.
Adding a CMake dependency would be complicated at this
stage, because some of the older OSes don't provide a
new enough CMake in the package repositories. Since we
don't really need the new features of Ccache 4+, this
sticks with 3.7.12 for now.

This reenables the check_ccache_works() logic in
assert-dependencies-present.py.

Testing:
 - Built docker images and ran a toolchain build
 - The newer ccache resolves the unexpected debuginfo issue

Change-Id: I90d751445daa0dc298b634c1049d637a14afac40
---
M docker/all/assert-dependencies-present.py
M docker/all/postinstall.sh
2 files changed, 5 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/73/21473/2
--
To view, visit http://gerrit.cloudera.org:8080/21473
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I90d751445daa0dc298b634c1049d637a14afac40
Gerrit-Change-Number: 21473
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 


[Impala-ASF-CR] IMPALA-13111: Fix the calculation of fragment ids for impala-gdb.py

2024-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21472 )

Change subject: IMPALA-13111: Fix the calculation of fragment ids for 
impala-gdb.py
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21472
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59798407e99ee0e9100cac6b4b082cdb85ed43d1
Gerrit-Change-Number: 21472
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 31 May 2024 22:57:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13111: Fix the calculation of fragment ids for impala-gdb.py

2024-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21472 )

Change subject: IMPALA-13111: Fix the calculation of fragment ids for 
impala-gdb.py
..

IMPALA-13111: Fix the calculation of fragment ids for impala-gdb.py

The gdb helpers in impala-gdb.py provide functions to look on
the stack for the information added in IMPALA-6416 and get the
fragment/query ids. Right now, it is incorrectly using a signed
integer, which leads to incorrect ids like this:
-3cbda1606b3ade7c:f170c4bd

This changes the logic to AND the integer with an 0xFF* sequence
of the right length. This forces the integer to be unsigned,
producing the right query id.

Testing:
 - Ran this on a minidump and verified the the listed query ids
   were valid (and existed in the profile log)

Change-Id: I59798407e99ee0e9100cac6b4b082cdb85ed43d1
Reviewed-on: http://gerrit.cloudera.org:8080/21472
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M lib/python/impala_py_lib/gdb/impala-gdb.py
1 file changed, 8 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21472
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I59798407e99ee0e9100cac6b4b082cdb85ed43d1
Gerrit-Change-Number: 21472
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 


[native-toolchain-CR] IMPALA-13121: Switch to ccache 3.7.12

2024-05-31 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21473


Change subject: IMPALA-13121: Switch to ccache 3.7.12
..

IMPALA-13121: Switch to ccache 3.7.12

The docker images currently build and use ccache 3.3.3.
Recently, we ran into a case where debuginfo was being
included even though the cflags ended with -g0. The
ccache release history has this note for 3.3.5:
 - Fixed a regression where the original order of
   debug options could be lost.

It's possible that this is the cause for the unexpected
behavior, so this upgrades ccache to 3.7.12.

Ccache 3.7.12 is the last ccache release that builds
using autotools. Later releases move to cmake. It is
hard for us to build things with cmake at this stage,
because some older OSes don't provide a new enough cmake.
Since we don't really need the new features of ccache 4+,
this just sticks with 3.7.12 for now.

Testing:
 - Built docker images and ran a toolchain build

Change-Id: I90d751445daa0dc298b634c1049d637a14afac40
---
M docker/all/postinstall.sh
1 file changed, 3 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/73/21473/1
--
To view, visit http://gerrit.cloudera.org:8080/21473
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I90d751445daa0dc298b634c1049d637a14afac40
Gerrit-Change-Number: 21473
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[native-toolchain-CR] IMPALA-13072: Increase verbosity to print compilation commands

2024-05-31 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21470 )

Change subject: IMPALA-13072: Increase verbosity to print compilation commands
..


Patch Set 2: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21470
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1a27bfe401c5def341b2e52f7d5e77d7f0e3a19e
Gerrit-Change-Number: 21470
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 31 May 2024 21:32:39 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-12686: Build the toolchain with basic debug information (-g1)

2024-05-31 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20870 )

Change subject: IMPALA-12686: Build the toolchain with basic debug information 
(-g1)
..


Patch Set 3: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/20870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee2e264b281f83ebc226d9bf7d4e5a99a52f1fc6
Gerrit-Change-Number: 20870
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 31 May 2024 21:32:28 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-12686: Build the toolchain with basic debug information (-g1)

2024-05-31 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20870 )

Change subject: IMPALA-12686: Build the toolchain with basic debug information 
(-g1)
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20870/2/source/cmake/build.sh
File source/cmake/build.sh:

http://gerrit.cloudera.org:8080/#/c/20870/2/source/cmake/build.sh@27
PS2, Line 27: # Drop the -g1/-gz flags for CMake, as it is just a build utility 
and does not
> I think it might be related to ccache. We use ccache 3.3.3, but 3.3.5 relea
Also, adding CMAKE_BUILD_TYPE=Release for mold seems to matter in my local 
tests, so I'm adding that.



--
To view, visit http://gerrit.cloudera.org:8080/20870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee2e264b281f83ebc226d9bf7d4e5a99a52f1fc6
Gerrit-Change-Number: 20870
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 31 May 2024 21:25:12 +
Gerrit-HasComments: Yes


[native-toolchain-CR] IMPALA-12686: Build the toolchain with basic debug information (-g1)

2024-05-31 Thread Joe McDonnell (Code Review)
Hello Michael Smith,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20870

to look at the new patch set (#3).

Change subject: IMPALA-12686: Build the toolchain with basic debug information 
(-g1)
..

IMPALA-12686: Build the toolchain with basic debug information (-g1)

Basic debug information (e.g. -g1) is useful for getting better
stack traces. Compressing debug information (-gz) reduces the size
of this debug information dramatically. This adds both -g1 and -gz
to the default flags for the toolchain.

In order to have the flags apply uniformly, components need to
respect the existing CFLAGS and CXXFLAGS. Several components were
setting their own CFLAGS and CXXFLAGS without keeping the existing
flags, so this either fixes the components to keep existing flags
or removes the custom CFLAGS/CXXFLAGS.

Some components build with an extra -g after our flags and these
continue to build this way. Specifically, the following using -g:
 - Thrift (-g added in this change)
 - Kudu
 - ORC
 - CCTZ
 - bzip2
These keep the same debug information they had before, except that
-gz compresses it now. This change should not reduce the debug
information for any component.

This skips generating debug information for CMake and Mold as they
are just build tools.

Testing:
 - Ran a build and examined what happened to the package sizes.
 - Added verbosity to print the compiler commands and verified
   the flags were present

Change-Id: Iee2e264b281f83ebc226d9bf7d4e5a99a52f1fc6
---
M init-compiler.sh
M source/arrow/build.sh
M source/bzip2/build.sh
M source/calloncehack/calloncehack/CMakeLists.txt
M source/cloudflarezlib/build.sh
M source/cmake/build.sh
M source/curl/build.sh
M source/flatbuffers/build.sh
M source/gflags/build.sh
M source/glog/build.sh
M source/libpfm/build.sh
M source/llvm/build-source-tarball.sh
M source/lz4/build.sh
M source/mold/build.sh
M source/orc/build.sh
M source/python/build.sh
M source/thrift/build.sh
M source/zlib/build.sh
18 files changed, 43 insertions(+), 40 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/70/20870/3
--
To view, visit http://gerrit.cloudera.org:8080/20870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iee2e264b281f83ebc226d9bf7d4e5a99a52f1fc6
Gerrit-Change-Number: 20870
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 


[native-toolchain-CR] IMPALA-13072: Increase verbosity to print compilation commands

2024-05-31 Thread Joe McDonnell (Code Review)
Hello Laszlo Gaal, Michael Smith,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21470

to look at the new patch set (#2).

Change subject: IMPALA-13072: Increase verbosity to print compilation commands
..

IMPALA-13072: Increase verbosity to print compilation commands

It is useful to be able to examine the compilation commands
to verify that custom flags are set (e.g. some components
use -fno-omit-frame-pointer, etc). Components that use
CMake often do not print the compilation command.

This modifies make invocations to pass in VERBOSE=1, which
prints the command being executed. This only passes in
VERBOSE=1 for make invocations that are doing compilation,
so it skips some locations that are only doing "make install"
after already compiling the project (or are header-only
projects).

Some projects don't respect VERBOSE=1. Boost has a special
-d+2 flag that turns on verbose output. Other projects like
CURL use configure's --disable-silent-rules flag to enable
verbose output.

The output is not too big and it gets redirected to a
file, so this does not make it configurable.

Testing:
 - Ran a toolchain build and verified that the compilation
   commands are in the output

Change-Id: I1a27bfe401c5def341b2e52f7d5e77d7f0e3a19e
---
M source/abseil-cpp/build.sh
M source/avro/build-cpp.sh
M source/avro/build.sh
M source/binutils/build.sh
M source/boost/build.sh
M source/breakpad/build.sh
M source/bzip2/build.sh
M source/calloncehack/build.sh
M source/cctz/build.sh
M source/cloudflarezlib/build.sh
M source/cmake/build.sh
M source/crcutil/build.sh
M source/curl/build.sh
M source/flatbuffers/build.sh
M source/gdb/build.sh
M source/gflags/build.sh
M source/glog/build.sh
M source/googlebenchmark/build.sh
M source/googletest/build.sh
M source/gperftools/build.sh
M source/gtest/build.sh
M source/kudu/build.sh
M source/libev/build.sh
M source/libpfm/build.sh
M source/libunwind/build.sh
M source/llvm/build-source-tarball.sh
M source/lz4/build.sh
M source/mold/build.sh
M source/openldap/build.sh
M source/protobuf/build.sh
M source/python/build.sh
M source/re2/build.sh
M source/snappy/build.sh
M source/thrift/build.sh
M source/tpc-ds/build.sh
M source/tpc-h/build.sh
M source/zlib/build.sh
M source/zstd/build.sh
38 files changed, 59 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/70/21470/2
--
To view, visit http://gerrit.cloudera.org:8080/21470
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1a27bfe401c5def341b2e52f7d5e77d7f0e3a19e
Gerrit-Change-Number: 21470
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 


[native-toolchain-CR] IMPALA-12686: Build the toolchain with basic debug information (-g1)

2024-05-31 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20870 )

Change subject: IMPALA-12686: Build the toolchain with basic debug information 
(-g1)
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20870/2/source/cmake/build.sh
File source/cmake/build.sh:

http://gerrit.cloudera.org:8080/#/c/20870/2/source/cmake/build.sh@27
PS2, Line 27: # Drop the -g1/-gz flags for CMake, as it is just a build utility 
and does not
> Oddly enough, it didn't, but now that I'm looking at the new binaries, this
I think it might be related to ccache. We use ccache 3.3.3, but 3.3.5 release 
notes mentions this:
"Fixed a regression where the original order of debug options could be lost."

https://ccache.dev/releasenotes.html#_ccache_3_3_5

So, I'm going to change this back to adding -g0 and file a separate ticket for 
upgrading ccache.



--
To view, visit http://gerrit.cloudera.org:8080/20870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee2e264b281f83ebc226d9bf7d4e5a99a52f1fc6
Gerrit-Change-Number: 20870
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 31 May 2024 21:16:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-31 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 5: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21455/4/testdata/workloads/tpcds_partitioned/queries
File testdata/workloads/tpcds_partitioned/queries:

http://gerrit.cloudera.org:8080/#/c/21455/4/testdata/workloads/tpcds_partitioned/queries@1
PS4, Line 1: ../tpcds/queries
> get_workload() along with table format dimension dictate which database the
Thanks, that makes sense to me.



--
To view, visit http://gerrit.cloudera.org:8080/21455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 31 May 2024 20:03:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16256/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 31 May 2024 19:54:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-31 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21455/2/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/21455/2/be/src/exec/hash-table.cc@373
PS2, Line 373: rocessor dependent
> Ack. Maybe also adds this to the code comment to explain why 64 is picked?
Done


http://gerrit.cloudera.org:8080/#/c/21455/4/testdata/workloads/tpcds_partitioned/queries
File testdata/workloads/tpcds_partitioned/queries:

http://gerrit.cloudera.org:8080/#/c/21455/4/testdata/workloads/tpcds_partitioned/queries@1
PS4, Line 1: ../tpcds/queries
> I don't quite understand the difference between using testdata/workloads/tp
get_workload() along with table format dimension dictate which database the 
query will run. For example,

tpcds_partitioned + 'parquet/snap/block' = tpcds_partitioned_parquet_snap 
database.
tpcds + 'parquet/none' = tpcds_parquet database.

My intention is to run the same TPC-DS Q97 that exist in tpcds workload dir to 
query tpcds_partitioned_parquet_snap instead of tpcds_parquet, because all 
facts table in tpcds_partitioned_parquet_snap are all partitioned. In 
tpcds_parquet, only store_sales are partitioned.

All test files under testdata/workloads/tpcds/queries/ should be compatible 
against tpcds_partitioned_parquet_snap. So I just create symlink instead 
copying all test files into separate 
testdata/workloads/tpcds_partitioned/queries/. Eventually, tpcds dataset should 
be changed to follow schema from tpcds_partitioned and we can keep just one of 
them.



--
To view, visit http://gerrit.cloudera.org:8080/21455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 31 May 2024 19:30:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-31 Thread Riza Suminto (Code Review)
Hello Yida Wu, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21455

to look at the new patch set (#5).

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..

IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

ExprValuesCache uses BATCH_SIZE as a deciding factor to set its
capacity. It bounds the capacity such that expr_values_array_ memory
usage stays below 256KB. This patch tightens that limit to include all
memory usage from ExprValuesCache::MemUsage() instead of
expr_values_array_ only. Therefore, setting a very high BATCH_SIZE will
not push the total memory usage of ExprValuesCache beyond 256KB.

Testing:
- Add test_join_queries.py::TestExprValueCache.
- Pass core tests.

Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
---
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M bin/rat_exclude_files.txt
A testdata/workloads/tpcds_partitioned/queries
M tests/common/test_dimensions.py
M tests/query_test/test_join_queries.py
6 files changed, 53 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/21455/5
--
To view, visit http://gerrit.cloudera.org:8080/21455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-31 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21455/2/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/21455/2/be/src/exec/hash-table.cc@373
PS2, Line 373: t int sample_size
> Done. Keeping it local for this Init method.
Ack. Maybe also adds this to the code comment to explain why 64 is picked?


http://gerrit.cloudera.org:8080/#/c/21455/2/be/src/exec/hash-table.cc@378
PS2, Line 378: // TODO: Add 'mem_
> capacity_ is capped to 1 at minimum. It is unlikely, but we want to know if
Done


http://gerrit.cloudera.org:8080/#/c/21455/4/testdata/workloads/tpcds_partitioned/queries
File testdata/workloads/tpcds_partitioned/queries:

http://gerrit.cloudera.org:8080/#/c/21455/4/testdata/workloads/tpcds_partitioned/queries@1
PS4, Line 1: ../tpcds/queries
I don't quite understand the difference between using 
testdata/workloads/tpcds/queries and the new workload tpcds_partitioned in 
test_join_queries.py. Is there a specific reason why test_join_queries.py uses 
tpcds_partitioned, which seems to be same to tpcds?



--
To view, visit http://gerrit.cloudera.org:8080/21455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 31 May 2024 18:57:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13111: Fix the calculation of fragment ids for impala-gdb.py

2024-05-31 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21472 )

Change subject: IMPALA-13111: Fix the calculation of fragment ids for 
impala-gdb.py
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21472
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59798407e99ee0e9100cac6b4b082cdb85ed43d1
Gerrit-Change-Number: 21472
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 31 May 2024 17:51:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13111: Fix the calculation of fragment ids for impala-gdb.py

2024-05-31 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21472 )

Change subject: IMPALA-13111: Fix the calculation of fragment ids for 
impala-gdb.py
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21472
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59798407e99ee0e9100cac6b4b082cdb85ed43d1
Gerrit-Change-Number: 21472
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 31 May 2024 17:51:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13106: Support larger imported query profile sizes through compression

2024-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21463 )

Change subject: IMPALA-13106: Support larger imported query profile sizes 
through compression
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16255/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21463
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c4f31beb9cac89051460bf764b6d50c3933bd03
Gerrit-Change-Number: 21463
Gerrit-PatchSet: 3
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 31 May 2024 17:55:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13111: Fix the calculation of fragment ids for impala-gdb.py

2024-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21472 )

Change subject: IMPALA-13111: Fix the calculation of fragment ids for 
impala-gdb.py
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10682/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21472
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59798407e99ee0e9100cac6b4b082cdb85ed43d1
Gerrit-Change-Number: 21472
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 31 May 2024 17:52:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13111: Fix the calculation of fragment ids for impala-gdb.py

2024-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21472 )

Change subject: IMPALA-13111: Fix the calculation of fragment ids for 
impala-gdb.py
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21472
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59798407e99ee0e9100cac6b4b082cdb85ed43d1
Gerrit-Change-Number: 21472
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 31 May 2024 17:52:09 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-12686: Build the toolchain with basic debug information (-g1)

2024-05-31 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20870 )

Change subject: IMPALA-12686: Build the toolchain with basic debug information 
(-g1)
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20870/2/source/cmake/build.sh
File source/cmake/build.sh:

http://gerrit.cloudera.org:8080/#/c/20870/2/source/cmake/build.sh@27
PS2, Line 27: # Drop the -g1/-gz flags for CMake, as it is just a build utility 
and does not
> Did adding -g0 not work?
Oddly enough, it didn't, but now that I'm looking at the new binaries, this 
didn't work either. I'll have to take another pass at how to do this.



--
To view, visit http://gerrit.cloudera.org:8080/20870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee2e264b281f83ebc226d9bf7d4e5a99a52f1fc6
Gerrit-Change-Number: 20870
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 31 May 2024 17:46:43 +
Gerrit-HasComments: Yes


[native-toolchain-CR] IMPALA-13072: Increase verbosity to print compilation commands

2024-05-31 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21470 )

Change subject: IMPALA-13072: Increase verbosity to print compilation commands
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21470/1/source/gdb/build.sh
File source/gdb/build.sh:

http://gerrit.cloudera.org:8080/#/c/21470/1/source/gdb/build.sh@41
PS1, Line 41:   wrap make VERBOSE=1 -j${BUILD_THREADS:-4} install
> Do you know if there's a pattern to when we separate 'make && make install'
I don't think there is a pattern. It just happened organically. There may be 
some locations where "make install" is separate because there are concurrency 
issues if it uses -j (or maybe not).



--
To view, visit http://gerrit.cloudera.org:8080/21470
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1a27bfe401c5def341b2e52f7d5e77d7f0e3a19e
Gerrit-Change-Number: 21470
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 31 May 2024 17:40:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13111: Fix the calculation of fragment ids for impala-gdb.py

2024-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21472 )

Change subject: IMPALA-13111: Fix the calculation of fragment ids for 
impala-gdb.py
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16254/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21472
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59798407e99ee0e9100cac6b4b082cdb85ed43d1
Gerrit-Change-Number: 21472
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 31 May 2024 17:40:13 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-13072: Increase verbosity to print compilation commands

2024-05-31 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21470 )

Change subject: IMPALA-13072: Increase verbosity to print compilation commands
..


Patch Set 1: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21470/1/source/gdb/build.sh
File source/gdb/build.sh:

http://gerrit.cloudera.org:8080/#/c/21470/1/source/gdb/build.sh@41
PS1, Line 41:   wrap make VERBOSE=1 -j${BUILD_THREADS:-4} install
Do you know if there's a pattern to when we separate 'make && make install' or 
do them as one?



--
To view, visit http://gerrit.cloudera.org:8080/21470
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1a27bfe401c5def341b2e52f7d5e77d7f0e3a19e
Gerrit-Change-Number: 21470
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 31 May 2024 17:33:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13106: Support larger imported query profile sizes through compression

2024-05-31 Thread Surya Hebbar (Code Review)
Surya Hebbar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21463 )

Change subject: IMPALA-13106: Support larger imported query profile sizes 
through compression
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21463/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21463/2//COMMIT_MSG@14
PS2, Line 14: pako
> What version is pako.min.js in this patch? Please mention it explicitly in 
Done


http://gerrit.cloudera.org:8080/#/c/21463/2//COMMIT_MSG@16
PS2, Line 16:
> Should we make 'compression' as option to be enabled?
For small profiles, compression is very fast.
For big files, there is no other way to insert them into indexedDB without 
compression due to the size limit.

So, I believe users should not be able to toogle this.


http://gerrit.cloudera.org:8080/#/c/21463/2/www/query_stmt.tmpl
File www/query_stmt.tmpl:

http://gerrit.cloudera.org:8080/#/c/21463/2/www/query_stmt.tmpl@68
PS2, Line 68:  db.transaction("profiles", "readonly").objectStore("profiles");
> Can you contain this into its own function? Maybe also add console.log() ab
Done



--
To view, visit http://gerrit.cloudera.org:8080/21463
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c4f31beb9cac89051460bf764b6d50c3933bd03
Gerrit-Change-Number: 21463
Gerrit-PatchSet: 3
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 31 May 2024 17:32:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13106: Support larger imported query profile sizes through compression

2024-05-31 Thread Surya Hebbar (Code Review)
Surya Hebbar has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/21463 )

Change subject: IMPALA-13106: Support larger imported query profile sizes 
through compression
..

IMPALA-13106: Support larger imported query profile sizes through compression

Imported query profiles are currently being stored in IndexedDB.
Although IndexedDB does not have storage limitations like other
browser storage APIs, there is a storage limit for a single
attribute / field.

For supporting larger query profiles, 'pako' compression library's
v2.1.0 has been added along with its associated license.

Before adding query profile JSON to indexedDB, it undergoes compression
using this library.

As compression is a long running process that can block the main thread,
this has been delegated to a worker script running in the background.
The worker script returns all input data sent to it in compressed form.

The process of compression consumes time; hence, an alert message is
displayed on the queries page warning user to refrain from closing or
reloading the page. On completion, the raw total size, compressed
total size, and compression time are logged to the browser console.

When multiple profiles are chosen, after each query profile insertion,
the subsequent one is not triggered until compression and insertion
are finished.

The inserted query profile field is decompressed before parsing on
the query plan, query profile, query statement, and query timeline page.

Added tests for the compression library methods utilized by
the worker script.

Change-Id: I8c4f31beb9cac89051460bf764b6d50c3933bd03
---
M .gitattributes
M LICENSE.txt
M bin/rat_exclude_files.txt
M www/common-header.tmpl
A www/pako.min.js
M www/queries.tmpl
M www/query_plan_text.tmpl
M www/query_profile.tmpl
M www/query_stmt.tmpl
M www/query_timeline.tmpl
R www/scripts/common_util.js
A www/scripts/compression_util.js
A www/scripts/queries/compressionWorker.js
M www/scripts/query_timeline/chart_commons.js
M www/scripts/query_timeline/fragment_diagram.js
A www/scripts/tests/queries/compressionWorker.test.js
16 files changed, 230 insertions(+), 80 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/21463/3
--
To view, visit http://gerrit.cloudera.org:8080/21463
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c4f31beb9cac89051460bf764b6d50c3933bd03
Gerrit-Change-Number: 21463
Gerrit-PatchSet: 3
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[native-toolchain-CR] IMPALA-12541: Build GCC with --enable-linker-build-id

2024-05-31 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21469 )

Change subject: IMPALA-12541: Build GCC with --enable-linker-build-id
..


Patch Set 1: Code-Review+1

Seems fine. Prior discussion suggests a handful of projects might have to 
update some build scripts - 
https://patchwork.ozlabs.org/project/gcc/patch/20101005050958.3841d40...@magilla.sf.frob.com/
 - but presumably that doesn't affect us.


--
To view, visit http://gerrit.cloudera.org:8080/21469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieb2017ba1a348a9e9e549fa3268635afa94ae6d0
Gerrit-Change-Number: 21469
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 31 May 2024 17:25:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13111: Fix the calculation of fragment ids for impala-gdb.py

2024-05-31 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21472


Change subject: IMPALA-13111: Fix the calculation of fragment ids for 
impala-gdb.py
..

IMPALA-13111: Fix the calculation of fragment ids for impala-gdb.py

The gdb helpers in impala-gdb.py provide functions to look on
the stack for the information added in IMPALA-6416 and get the
fragment/query ids. Right now, it is incorrectly using a signed
integer, which leads to incorrect ids like this:
-3cbda1606b3ade7c:f170c4bd

This changes the logic to AND the integer with an 0xFF* sequence
of the right length. This forces the integer to be unsigned,
producing the right query id.

Testing:
 - Ran this on a minidump and verified the the listed query ids
   were valid (and existed in the profile log)

Change-Id: I59798407e99ee0e9100cac6b4b082cdb85ed43d1
---
M lib/python/impala_py_lib/gdb/impala-gdb.py
1 file changed, 8 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/21472/1
--
To view, visit http://gerrit.cloudera.org:8080/21472
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I59798407e99ee0e9100cac6b4b082cdb85ed43d1
Gerrit-Change-Number: 21472
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-12732: Add support for MERGE statements for Iceberg tables

2024-05-31 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21423 )

Change subject: IMPALA-12732: Add support for MERGE statements for Iceberg 
tables
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21423/6/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-merge.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-merge.test:

http://gerrit.cloudera.org:8080/#/c/21423/6/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-merge.test@324
PS6, Line 324: when not matched then insert(id, user) values(source.id, 
source.user)
I admit I haven't checked the code, but is there a precedence between these 
cases set programatically or is it up to the order the cases were provided. 
What I mean is for instance we have a 'when matched and ' and also a 
more general 'when matched' case too. In this order the first covers the 
narrower case and then comes the more general case to cover the rows not 
covered by the first case. If we reverse the order of these would we get the 
same results?



--
To view, visit http://gerrit.cloudera.org:8080/21423
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3416a79740eddc446c87f72bf1a85ed3f71af268
Gerrit-Change-Number: 21423
Gerrit-PatchSet: 6
Gerrit-Owner: Peter Rozsa 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 31 May 2024 14:26:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12732: Add support for MERGE statements for Iceberg tables

2024-05-31 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21423 )

Change subject: IMPALA-12732: Add support for MERGE statements for Iceberg 
tables
..


Patch Set 6:

(7 comments)

Thanks for this huge work, Peti! I'm only at the beginning of reviewing, only 
checked the commit msg and the tests, just dumping what I have now.

http://gerrit.cloudera.org:8080/#/c/21423/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21423/6//COMMIT_MSG@7
PS6, Line 7: IMPALA-12732: Add support for MERGE statements for Iceberg tables
I know MERGE is a known SQL functionality but I'd find it useful to add a few 
sentences about the functionality itself.


http://gerrit.cloudera.org:8080/#/c/21423/6//COMMIT_MSG@11
PS6, Line 11: same syntax as Hive, Spark, and Trino
Would you mind adding some examples?


http://gerrit.cloudera.org:8080/#/c/21423/6/fe/src/test/java/org/apache/impala/analysis/AnalyzeModifyStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeModifyStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/21423/6/fe/src/test/java/org/apache/impala/analysis/AnalyzeModifyStmtsTest.java@264
PS6, Line 264: AnalysisError("merge into "
for me this test seems identical to the one right above. Or do I miss something?


http://gerrit.cloudera.org:8080/#/c/21423/6/fe/src/test/java/org/apache/impala/analysis/AnalyzeModifyStmtsTest.java@384
PS6, Line 384: Column permutation
Is this error msg introduced by this patch or did it exist before? I found the 
word permutation weird here. 'Column list' could be more nature IMO


http://gerrit.cloudera.org:8080/#/c/21423/6/fe/src/test/java/org/apache/impala/analysis/AnalyzeModifyStmtsTest.java@386
PS6, Line 386: // Fewer columns in VALUES
Actually, this is also a 'More columns in VALUES' case similarly to above.


http://gerrit.cloudera.org:8080/#/c/21423/6/fe/src/test/java/org/apache/impala/analysis/AnalyzeModifyStmtsTest.java@410
PS6, Line 410: AnalyzesOk("merge into 
functional_parquet.iceberg_partitioned target "
nit: This test could go above the error cases. Also some comment about the 
intention could be great like 'multiple MATCHED and NOT MATCHED cases'


http://gerrit.cloudera.org:8080/#/c/21423/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-merge.test
File testdata/workloads/functional-query/queries/QueryTest/iceberg-merge.test:

http://gerrit.cloudera.org:8080/#/c/21423/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-merge.test@115
PS6, Line 115:  TYPES
Do you intentionally not asserting on some profile metrics from this test on?



--
To view, visit http://gerrit.cloudera.org:8080/21423
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3416a79740eddc446c87f72bf1a85ed3f71af268
Gerrit-Change-Number: 21423
Gerrit-PatchSet: 6
Gerrit-Owner: Peter Rozsa 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 31 May 2024 14:23:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [PROTOTYPE] IMPALA-12686: Switch to toolchain with basic debug info

2024-05-31 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21471 )

Change subject: [PROTOTYPE] IMPALA-12686: Switch to toolchain with basic debug 
info
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21471/1/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/21471/1/bin/impala-config.sh@94
PS1, Line 94: f93e2c9a865c80cafd76b872ad04400877766a2f
I think this Git hash should be updated as well (although we don't seem to 
check it)



--
To view, visit http://gerrit.cloudera.org:8080/21471
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7b962c350cc5f1f2b24ca7a52b940ec9e87a7745
Gerrit-Change-Number: 21471
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Fri, 31 May 2024 09:07:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

2024-05-31 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21452 )

Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
..


Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/21452/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21452/2//COMMIT_MSG@33
PS2, Line 33: build fragments
aren't these 'probe' fragments that are blocked on the builder?


http://gerrit.cloudera.org:8080/#/c/21452/2//COMMIT_MSG@37
PS2, Line 37: The
nit: typo


http://gerrit.cloudera.org:8080/#/c/21452/2//COMMIT_MSG@57
PS2, Line 57: lowered
reduced?


http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/iceberg-delete-builder.h
File be/src/exec/iceberg-delete-builder.h:

http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/iceberg-delete-builder.h@147
PS2, Line 147:   Status FinalizeBuildImplParallel(int num_threads);
Since this one can return error codes too, could you add a comment in what 
cases can we except errors?


http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/iceberg-delete-builder.cc
File be/src/exec/iceberg-delete-builder.cc:

http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/iceberg-delete-builder.cc@314
PS2, Line 314: IcebergDeleteBuilder::DeleteRowVector 
IcebergDeleteBuilder::GetFinalDeleteRowVector(
This and the other PR made me think about whther it makes sense to create a 
separate class for this DeleteRowVector where we can encapsulate all the logic 
that is needed to populate the vector of vectors, allocating new vectors with 
increasing size, and also to flatten the vector of vectors into one vector. It 
would reduce code complexity from the builder code and would give the 
responsibility of representing the positions in a more efficient way to another 
code-level. What do you think?


http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/iceberg-delete-builder.cc@320
PS2, Line 320:   int64_t capacity = 0;
indentation


http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/join-builder.h
File be/src/exec/join-builder.h:

http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/join-builder.h@197
PS2, Line 197:   int probes_waiting_on_builder_ = 0;
The variable name is self-explanatory, but some additional comment that is a 
bit more verbose could be useful.



--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 31 May 2024 08:51:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12705: Add /catalog ha info page on Statestore to show catalog HA information

2024-05-31 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21418 )

Change subject: IMPALA-12705: Add /catalog_ha_info page on Statestore to show 
catalog HA information
..


Patch Set 14:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/21418/14/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

http://gerrit.cloudera.org:8080/#/c/21418/14/tests/webserver/test_web_pages.py@1141
PS14, Line 1141: ImpalaTestSuite
This is not CustomClusterTestSuite. The Impala cluster is not restarted for 
each test case.


http://gerrit.cloudera.org:8080/#/c/21418/14/tests/webserver/test_web_pages.py@1171
PS14, Line 1171: disable_catalog_ha
function name should be started with test_


http://gerrit.cloudera.org:8080/#/c/21418/14/tests/webserver/test_web_pages.py@1174
PS14, Line 1174: 
@CustomClusterTestSuite.with_args(start_args="--enable_catalogd_ha")
Could we set parameters if the class is not CustomClusterTestSuite? Is Impala 
cluster restarted with given parameter?


http://gerrit.cloudera.org:8080/#/c/21418/14/tests/webserver/test_web_pages.py@1175
PS14, Line 1175: enable_catalog_ha
function name should be started with test_



--
To view, visit http://gerrit.cloudera.org:8080/21418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If85f6a827ae8180d13caac588b92af0511ac35e3
Gerrit-Change-Number: 21418
Gerrit-PatchSet: 14
Gerrit-Owner: ttz <2433038...@qq.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Reviewer: ttz <2433038...@qq.com>
Gerrit-Comment-Date: Fri, 31 May 2024 06:28:52 +
Gerrit-HasComments: Yes