vvellanki commented on a change in pull request #11180:
URL: https://github.com/apache/arrow/pull/11180#discussion_r741641131



##########
File path: cpp/src/gandiva/gdv_function_stubs.cc
##########
@@ -794,6 +794,56 @@ const char* gdv_fn_initcap_utf8(int64_t context, const 
char* data, int32_t data_
   *out_len = out_idx;
   return out;
 }
+
+GANDIVA_EXPORT
+const char* gdv_fn_concat_ws_utf8(int64_t context, const char* separator,
+                                  int32_t separator_len, const char* data,
+                                  int32_t data_len, int32_t* out_len) {
+  if (data_len <= 0) {
+    gdv_fn_context_set_error_msg(context, "Data can not be null.");
+    *out_len = 0;
+    return "";
+  }
+
+  std::string concat = "";
+  std::string data_str(data, data_len);
+  std::string sep(separator, separator_len);
+
+  size_t pos_str = 0;
+  std::string token;
+  while ((pos_str = data_str.find(',')) != std::string::npos) {

Review comment:
       The Hive function documentation has nothing about splitting the input by 
','. Why are we doing this?

##########
File path: cpp/src/gandiva/gdv_function_stubs.cc
##########
@@ -794,6 +794,56 @@ const char* gdv_fn_initcap_utf8(int64_t context, const 
char* data, int32_t data_
   *out_len = out_idx;
   return out;
 }
+
+GANDIVA_EXPORT
+const char* gdv_fn_concat_ws_utf8(int64_t context, const char* separator,

Review comment:
       This function should take at least 2 const char* as input and concat 
those. This is taking only one input (data)

##########
File path: cpp/src/gandiva/gdv_function_stubs.cc
##########
@@ -794,6 +794,56 @@ const char* gdv_fn_initcap_utf8(int64_t context, const 
char* data, int32_t data_
   *out_len = out_idx;
   return out;
 }
+
+GANDIVA_EXPORT
+const char* gdv_fn_concat_ws_utf8(int64_t context, const char* separator,
+                                  int32_t separator_len, const char* data,
+                                  int32_t data_len, int32_t* out_len) {
+  if (data_len <= 0) {
+    gdv_fn_context_set_error_msg(context, "Data can not be null.");
+    *out_len = 0;
+    return "";
+  }
+
+  std::string concat = "";
+  std::string data_str(data, data_len);

Review comment:
       Please do not use this pattern - this does a memcpy and allocates memory
   
   We should allocate memory for the output and then iterate over all the input 
buffers and copy them/separator to the output buffer




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to