Thanks for Riza's help! I have backported the three commits:
- IMPALA-13152: Avoid NaN, infinite, and negative ProcessingCost
- IMPALA-13119: Fix cost_ initialization at CostingSegment.java
- IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column
I also added a test fix and some DOC patches:
- IMPALA-13276: Revise the documentation of 'RUNTIME_FILTER_WAIT_TIME_MS'
- IMPALA-13250: [DOCS] Document ENABLED_RUNTIME_FILTER_TYPES query option
- IMPALA-13167: Fix Workload Management Tests Timing out

The current branch is https://github.com/stiga-huang/impala/commits/branch-4.4.1

My plan is to merge the DOC patch for the UNNEST() function and then start the vote for the 4.4.1 release.
- https://gerrit.cloudera.org/c/21651/

Any suggestions are welcome!

Thanks,
Quanlong

On Wed, Aug 7, 2024 at 12:13 PM Quanlong Huang <huangquanl...@gmail.com> wrote:
>
> Thanks for all the inputs!
>
> I'm building the branch in my repo:
> https://github.com/stiga-huang/impala/commits/branch-4.4.1
> Here are the commits so far:
>
> 53ee6536c IMPALA-13036: Document Iceberg metadata tables
> b3c964b57 IMPALA-11328: [DOCS] Fix incorrect default value for max_errors
> 88b3d6bea IMPALA-13071: Update the doc of Impala components
> 342da2f78 IMPALA-13252: (Addendum) PrintId cancel query
> fbf61484b IMPALA-13271: Correct the documentation with respect to granting
> privileges on URI
> 53a0b8964 IMPALA-13272: Analytic function of collections can lead to crash
> a24b98bd6 IMPALA-13252: Always use PrintId for TUniqueId
> 3a3b828a3 IMPALA-13018: Block push down of conjuncts with implicit casting on
> base columns for jdbc tables
> 87479fa49 IMPALA-13256: Support more than 2G rows for COUNT(*) on jdbc table
> bf1c74c04 IMPALA-10451: Fix avro table loading failures caused by HIVE-24157
> 3a9b60427 IMPALA-13159: Fix query cancellation caused by statestore failover
> 3050e0086 IMPALA-12712: Invalidate metadata on table should set better
> createEventId
> bd7070198 IMPALA-13034: Add logs and counters for HTTP profile requests
> blocking client fetches
> 65ee0ffea IMPALA-13035: Querying metadata tables from non-Iceberg tables
> throws IllegalArgumentException
> 48e81a210 IMPALA-13040: (addendum) Inject larger delay for sanitized build
> 3c939f09a IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
> 7d4a8537e IMPALA-13058: Init first_arrival_time_ and completion_time_ with -1
> c3fff3723 IMPALA-13076 Add pstack and jstack to Impala Redhat docker images
> 221d4f1e2 IMPALA-13077: Fix selectivity estimation for SEMI JOIN
> 5d3d41e5c IMPALA-13143: Fix flaky test_catalogd_failover_with_sync_ddl
> a45dd963a IMPALA-13134: DDL hang with SYNC_DDL enabled when Catalogd is
> changed to standby status
> 5eb3187b3 IMPALA-13270: Fix IllegalStateException on runtime filter
> 51661d335 IMPALA-12800: Add cache for isTrueWithNullSlots() evaluation
> 224029f6d IMPALA-12800: Use HashMap for ExprSubstitutionMap lookups
> 88bc00ccd IMPALA-12800: Skip O(n^2) ExprSubstitutionMap::verify() for release
> builds
> 0140a15a0 IMPALA-12680: Fix NullPointerException during
> AlterTableAddPartitions
> b4670a863 IMPALA-13028: Strip dynamic link libraries in Linux DEB/RPM packages
> cdead47d3 IMPALA-9441,IMPALA-13170: Ops listing dbs/tables should handle db
> not exists
> cb37ad441 IMPALA-13252: Consistently use PrintId to print TUniqueId
> 9eb43fba0 IMPALA-13203: Rewrite 'id = 0 OR false' as expected
> f8f3dd0ec IMPALA-13057: Incorporate tuple/slot information into tuple cache
> key
> 724df776e IMPALA-13150: Possible buffer overflow in StringVal::CopyFrom()
> a1f89131c IMPALA-13161: Fix column index overflow in DelimitedTextParser
> bb9df269a IMPALA-13130: Prioritize EndDataStream messages
> 294e4aeb1 IMPALA-13129: Move runtime filter skipping at registerRuntimeFilter
> 0ba5403ea IMPALA-13107: Don't start query on executor if instance number
> equals 0
> efa26f354 IMPALA-13138: Never smallify existing StringValue objects, only new
> ones during DeepCopy
> bafce5c9f (tag: 4.4.0-rc2, tag: 4.4.0, origin/branch-4.4.0, branch-4.4.0)
> Update GIT_HASH for version 4.4.0
>
> Verified till commit 53a0b8964 in the CORE tests in this job:
> https://jenkins.impala.io/job/parallel-all-tests-ub2004/1403/
> I added some more DOC commits after I launched the job. But I think they
> won't introduce test failures. We will verify the branch again in the release
> votes.
>
> Note that two commits are still missing:
>
> 753ee9b8a IMPALA-13119: Fix cost_ initialization at CostingSegment.java
> 5d1bd8062 IMPALA-13152: Avoid NaN, infinite, and negative ProcessingCost
>
> To backport them, I tried to backport another commit to resolve conflicts:
>
> d0237fbe4 IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column
>
> However, that introduces some test failures that I haven't had time to dig
> into yet:
> https://jenkins.impala.io/job/ubuntu-20.04-from-scratch/3092/
> https://jenkins.impala.io/job/ubuntu-20.04-dockerised-tests/2094/
>
> So I plan to skip them in this release and move them to the plan of 4.4.2.
> Please let me know if you have any concerns.
>
> Thanks,
> Quanlong
>
>
> On Mon, Aug 5, 2024 at 11:52 PM Riza Suminto <riza.sumi...@cloudera.com>
> wrote:
>>
>> I think IMPALA-13272: Analytic function of collections can lead to crash,
>> should be included as well.
>>
>> On Fri, Aug 2, 2024 at 11:06 AM Michael Smith <michael.sm...@cloudera.com>
>> wrote:
>>
>> > I'd also like to add
>> > IMPALA-13270: Bug when comparing ExprSubstitutionMap.size()
>> >
>> > On Thu, Aug 1, 2024 at 9:29 PM Quanlong Huang <huangquanl...@gmail.com>
>> > wrote:
>> >
>> > > Impala 4.4.0 was released 2 months ago on 2024-05-25. There are
>> > > several bugs that block it from being used in production. I think we
>> > > should make a maintenance release of 4.4.1 to fix them. Here is the
>> > > list of issues:
>> > >
>> > > Critical Fixes
>> > > IMPALA-13107: Don't start query on executor if instance number equals 0
>> > > IMPALA-13129: Move runtime filter skipping at registerRuntimeFilter
>> > > IMPALA-13130: Prioritize EndDataStream messages
>> > > IMPALA-13138: Never smallify existing StringValue objects, only new
>> > > ones during DeepCopy
>> > > IMPALA-13161: Fix column index overflow in DelimitedTextParser
>> > > IMPALA-13152: Avoid NaN, infinite, and negative ProcessingCost
>> > > IMPALA-13150: Possible buffer overflow in StringVal::CopyFrom()
>> > > IMPALA-13057: Incorporate tuple/slot information into tuple cache key
>> > >
>> > > Nice-to-have
>> > > IMPALA-13203: Rewrite 'id = 0 OR false' as expected
>> > > IMPALA-13252: Consistently use PrintId to print TUniqueId
>> > > IMPALA-9441,IMPALA-13170: Ops listing dbs/tables should handle db not
>> > > exists
>> > > IMPALA-13028: Strip dynamic link libraries in Linux DEB/RPM packages
>> > > IMPALA-13134: DDL hang with SYNC_DDL enabled when Catalogd is changed
>> > > to standby status
>> > > IMPALA-13143: Fix flaky test_catalogd_failover_with_sync_ddl
>> > > IMPALA-13119: Fix cost_ initialization at CostingSegment.java
>> > > IMPALA-13077: Fix selectivity estimation for SEMI JOIN
>> > > IMPALA-13076 Add pstack and jstack to Impala Redhat docker images
>> > > IMPALA-13058: Init first_arrival_time_ and completion_time_ with -1
>> > > IMPALA-13040: Add waiting mechanism in UpdateFilterFromRemote
>> > > IMPALA-13040: (addendum) Inject larger delay for sanitized build
>> > > IMPALA-13035: Querying metadata tables from non-Iceberg tables throws
>> > > IllegalArgumentException
>> > > IMPALA-13034: Add logs and counters for HTTP profile requests blocking
>> > > client fetches
>> > > IMPALA-12712: Invalidate metadata on table should set better
>> > > createEventId
>> > > IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions
>> > > IMPALA-13159: Fix query cancellation caused by statestore failover
>> > > IMPALA-10451: Fix avro table loading failures caused by HIVE-24157
>> > >
>> > > I propose that we release 4.4.1 soon. I'm willing to volunteer as the
>> > > release manager. I'm interested to hear what the community thinks
>> > > about doing a release. All feedback is welcome!
>> > >
>> > > Thanks,
>> > > Quanlong
>> > >
>> >