[ https://issues.apache.org/jira/browse/CALCITE-5580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699918#comment-17699918 ]
Julian Hyde commented on CALCITE-5580:
--------------------------------------

You might find it interesting that, with the addition of the {{SPLIT}} function, the [WordCount|http://blog.hydromatic.net/2020/03/31/word-count-revisited.html] problem can be solved in pure SQL.

> Add SPLIT() Function (Enabled for BigQuery)
> -------------------------------------------
>
>                 Key: CALCITE-5580
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5580
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: Tanner Clary
>            Assignee: Tanner Clary
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> BigQuery offers the {{SPLIT()}} function, which splits a string at an
> optionally specified delimiter into a string array. If no delimiter is
> specified, it defaults to a comma. If the string is empty, an array of a
> single empty string is returned. If the delimiter is not found in the string,
> an array with a single element (the string) is returned.
> In BigQuery, the function can also accept bytes. In order to implement this,
> I think some modifications to ByteString.java may be required. I will
> probably not do this, at least for my initial draft. If anyone has any
> suggestions or guidance on whether or not it should be supported, I would
> appreciate it.
> Documentation and example cases may be found below.
> EXAMPLE: {{SPLIT('h,e,l,l,o')}} would return: {{[h, e, l, l, o]}}.
> EXAMPLE: {{SPLIT('h-e-l-l-o', '-')}} would return: {{[h, e, l, l, o]}}.
> [BigQuery docs|https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#split]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
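
For illustration only, the string behaviour described in the issue can be sketched in Python. This is not the Calcite implementation; the name {{split_sketch}} is hypothetical, the bytes variant is left out, and a non-empty delimiter is assumed.

```python
def split_sketch(value: str, delimiter: str = ",") -> list[str]:
    """Hypothetical model of the SPLIT() semantics described in the issue.

    Assumes a non-empty delimiter; the bytes variant is not covered.
    """
    # An empty input yields an array holding a single empty string.
    if value == "":
        return [""]
    # If the delimiter never occurs, str.split returns [value] unchanged,
    # which matches the "single element (the string)" case.
    return value.split(delimiter)


print(split_sketch("h,e,l,l,o"))       # ['h', 'e', 'l', 'l', 'o']
print(split_sketch("h-e-l-l-o", "-"))  # ['h', 'e', 'l', 'l', 'o']
print(split_sketch(""))                # ['']
print(split_sketch("hello"))           # ['hello']
```

The explicit empty-string check is redundant in Python, since {{"".split(",")}} already returns a single empty string, but it keeps the rule from the issue visible.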