MichaelChirico opened a new pull request #553: [HIVE-13482][UDF] Explicitly define str_to_map args as regex
URL: https://github.com/apache/hive/pull/553
 
 
   Successor to https://github.com/apache/spark/pull/23888
   
   See the discussion there for more details about the Hive side of this, in particular [my comment here](https://github.com/apache/spark/pull/23888#issuecomment-467742127) about existing Stack Overflow answers and [here](https://github.com/apache/spark/pull/23888#issuecomment-467747788):
   
   > My conclusion is that it's eminently ambiguous whether the _intended_ behavior in either Hive or SparkSQL is to treat the delimiters as regular expressions.
   
   > BUT the behavior has been around for [8 years](https://github.com/apache/hive/commit/4f8294e578db449294a1186f0ac4efb041445dcb) and at least going off of the SO answers, it seems to be accepted as "known" behavior so things will probably break if we change it.
   
   Thus, this PR intends to solidify the interpretation of `delimiter1` and 
`delimiter2` as regular expressions once and for all.
   
   If the non-regexp behavior is strongly desired, a `fixed: bool` argument could eventually be added that behaves like the identically named argument in R's regular expression functions such as [`gsub`](http://astrostatistics.psu.edu/su07/R/html/base/html/grep.html) and [`strsplit`](http://astrostatistics.psu.edu/su07/R/html/base/html/strsplit.html)...
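
To make the distinction concrete, here is a minimal Python sketch of the two interpretations. It is not Hive's implementation: the function below is a hypothetical analogue that splits on `delimiter1` and then on `delimiter2` as regular expressions, and the `fixed` flag is the speculative argument mentioned above, emulated here with `re.escape`. The default delimiters `,` and `:` are assumed to match Hive's defaults. Python's `re.split` and Java's `String.split` also differ in some edge cases (for example, patterns that match the empty string), so treat the outputs as illustrative only.

```python
import re


def str_to_map(text, delimiter1=",", delimiter2=":", fixed=False):
    """Hypothetical analogue of str_to_map, not Hive's actual code.

    delimiter1 separates key/value pairs and delimiter2 separates a key
    from its value. Both are treated as regular expressions unless
    fixed=True, in which case they are matched literally.
    """
    if fixed:
        # Assumption: `fixed` mirrors R's fixed=TRUE by escaping metacharacters.
        delimiter1 = re.escape(delimiter1)
        delimiter2 = re.escape(delimiter2)

    result = {}
    for pair in re.split(delimiter1, text):
        # Split each pair at most once so values may contain the delimiter.
        key_value = re.split(delimiter2, pair, maxsplit=1)
        result[key_value[0]] = key_value[1] if len(key_value) == 2 else None
    return result


if __name__ == "__main__":
    # Plain delimiters behave the same under either interpretation.
    print(str_to_map("a:1,b:2"))
    # "." is a regex metacharacter, so the unescaped form matches any character.
    print(str_to_map("a.1,b.2", ",", "."))
    # Escaping by hand, or using the hypothetical fixed flag, restores literal matching.
    print(str_to_map("a.1,b.2", ",", r"\."))
    print(str_to_map("a.1,b.2", ",", ".", fixed=True))
```

Under the regex interpretation, callers who want a literal `.` or `|` as a delimiter have to escape it themselves, which is what the existing answers referenced above appear to rely on.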

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
