This is an automated email from the ASF dual-hosted git repository.
kou pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/main by this push:
new 851288c1d0 GH-50109: [C++][Gandiva] Fix incorrect error messages
(#50110)
851288c1d0 is described below
commit 851288c1d01a82597ce9f20a7f24cc85a5c3f56b
Author: Logan Riggs <[email protected]>
AuthorDate: Tue Jun 9 14:43:37 2026 -0700
GH-50109: [C++][Gandiva] Fix incorrect error messages (#50110)
### Rationale for this change
Three small, independent bugs in Gandiva's runtime error messages: in two
cases the message misrepresents what actually failed, and in one case a
copy-paste error left the wrong function name in a message.
### What changes are included in this PR?
#### 1. `levenshtein` reported OOM as a length error
[cpp/src/gandiva/precompiled/string_ops.cc:1738-1742](cpp/src/gandiva/precompiled/string_ops.cc#L1738-L1742)
— when `gdv_fn_context_arena_malloc` returned null (i.e. memory allocation
failed), the function set the error to `"String length must be greater than
0"`. The condition is an OOM, not a length issue. Now reports `"LEVENSHTEIN:
could not allocate working memory"`.
#### 2. `levenshtein` length-bound message was off by one
[cpp/src/gandiva/precompiled/string_ops.cc:1702-1708](cpp/src/gandiva/precompiled/string_ops.cc#L1702-L1708)
— the precondition is `in1_len < 0 || in2_len < 0`, which rejects only
negative lengths (zero is valid). The message said `"must be greater than 0"`,
implying zero was rejected too. Corrected to `"LEVENSHTEIN: input lengths must
be non-negative, got <a> and <b>"`, which now also echoes the offending values.
#### 3. `gdv_fn_aes_decrypt` reported the wrong direction
[cpp/src/gandiva/gdv_function_stubs.cc:411](cpp/src/gandiva/gdv_function_stubs.cc#L411)
— inside the AES **decrypt** function, the OOM message said `"Could not
allocate memory for returning aes encrypt cypher text"`. Pure copy-paste from
the sibling encrypt function. Corrected the direction wording.
#### Test updates
[cpp/src/gandiva/precompiled/string_ops_test.cc:1133-1135](cpp/src/gandiva/precompiled/string_ops_test.cc#L1133-L1135)
— the levenshtein test was asserting `HasSubstr("String length must be greater
than 0")`, which was matching the *bug message* — i.e. the test was effectively
verifying the wrong behavior. Updated to assert the corrected message substring.
### Are these changes tested?
```bash
cd cpp/debug
cmake --build . --target gandiva_shared gandiva-precompiled-test
gandiva-internals-test gandiva-projector-test -j4
./debug/gandiva-precompiled-test # 130/130 pass
./debug/gandiva-internals-test # 156/156 pass
./debug/gandiva-projector-test # 218/218 pass
```
### Are there any user-facing changes?
Yes, improved error message accuracy.
* GitHub Issue: #50109
Authored-by: [email protected] <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
---
cpp/src/gandiva/gdv_function_stubs.cc | 3 +--
cpp/src/gandiva/precompiled/string_ops.cc | 9 +++++++--
cpp/src/gandiva/precompiled/string_ops_test.cc | 2 +-
3 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/cpp/src/gandiva/gdv_function_stubs.cc
b/cpp/src/gandiva/gdv_function_stubs.cc
index 6fe3fa9a57..ddd5df6d39 100644
--- a/cpp/src/gandiva/gdv_function_stubs.cc
+++ b/cpp/src/gandiva/gdv_function_stubs.cc
@@ -407,8 +407,7 @@ const char* gdv_fn_aes_decrypt(int64_t context, const char*
data, int32_t data_l
static_cast<int32_t>(arrow::bit_util::RoundUpToPowerOf2(data_len,
kAesBlockSize));
char* ret = reinterpret_cast<char*>(gdv_fn_context_arena_malloc(context,
*out_len));
if (ret == nullptr) {
- std::string err_msg =
- "Could not allocate memory for returning aes encrypt cypher text";
+ std::string err_msg = "Could not allocate memory for returning aes decrypt
plaintext";
gdv_fn_context_set_error_msg(context, err_msg.data());
*out_len = 0;
return nullptr;
diff --git a/cpp/src/gandiva/precompiled/string_ops.cc
b/cpp/src/gandiva/precompiled/string_ops.cc
index 035d3c8c62..9a2d8c8415 100644
--- a/cpp/src/gandiva/precompiled/string_ops.cc
+++ b/cpp/src/gandiva/precompiled/string_ops.cc
@@ -1700,7 +1700,11 @@ FORCE_INLINE
gdv_int32 levenshtein(int64_t context, const char* in1, int32_t in1_len, const
char* in2,
int32_t in2_len) {
if (in1_len < 0 || in2_len < 0) {
- gdv_fn_context_set_error_msg(context, "String length must be greater than
0");
+ char err_msg[128];
+ snprintf(err_msg, sizeof(err_msg),
+ "LEVENSHTEIN: input lengths must be non-negative, got %d and %d",
in1_len,
+ in2_len);
+ gdv_fn_context_set_error_msg(context, err_msg);
return 0;
}
@@ -1736,7 +1740,8 @@ gdv_int32 levenshtein(int64_t context, const char* in1,
int32_t in1_len, const c
int* ptr = reinterpret_cast<int*>(
gdv_fn_context_arena_malloc(context, (len_smaller + 1) * 2 *
sizeof(int)));
if (ptr == nullptr) {
- gdv_fn_context_set_error_msg(context, "String length must be greater than
0");
+ gdv_fn_context_set_error_msg(context,
+ "LEVENSHTEIN: could not allocate working
memory");
return 0;
}
diff --git a/cpp/src/gandiva/precompiled/string_ops_test.cc
b/cpp/src/gandiva/precompiled/string_ops_test.cc
index d57eb43753..a31683c65a 100644
--- a/cpp/src/gandiva/precompiled/string_ops_test.cc
+++ b/cpp/src/gandiva/precompiled/string_ops_test.cc
@@ -1132,7 +1132,7 @@ TEST(TestStringOps, TestLevenshtein) {
EXPECT_EQ(levenshtein(ctx_ptr, "book", -5, "back", 4), 0);
EXPECT_TRUE(ctx.has_error());
EXPECT_THAT(ctx.get_error(),
- ::testing::HasSubstr("String length must be greater than 0"));
+ ::testing::HasSubstr("LEVENSHTEIN: input lengths must be
non-negative"));
ctx.Reset();
}