[ 
https://issues.apache.org/jira/browse/CALCITE-5580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699918#comment-17699918
 ] 

Julian Hyde commented on CALCITE-5580:
--------------------------------------

Interestingly, with the addition of the {{SPLIT}} function, the 
[WordCount|http://blog.hydromatic.net/2020/03/31/word-count-revisited.html] 
problem can be solved in pure SQL.

> Add SPLIT() Function (Enabled for BigQuery)
> -------------------------------------------
>
>                 Key: CALCITE-5580
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5580
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: Tanner Clary
>            Assignee: Tanner Clary
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> BigQuery offers the {{SPLIT()}} function, which splits a string at an 
> optionally specified delimiter and returns a string array. If no delimiter 
> is specified, it defaults to a comma. If the string is empty, an array 
> containing a single empty string is returned. If the delimiter is not found 
> in the string, an array with a single element (the string itself) is 
> returned. 
> In BigQuery, the function can also accept bytes. Implementing this may 
> require some modifications to ByteString.java, so I will probably leave it 
> out of at least my initial draft. If anyone has suggestions or guidance on 
> whether it should be supported, I would appreciate it. 
> Documentation and example cases may be found below.
> EXAMPLE: {{SPLIT('h,e,l,l,o')}} would return: {{[h, e, l, l, o]}}.
> EXAMPLE: {{SPLIT('h-e-l-l-o', '-')}} would return: {{[h, e, l, l, o]}}.
> [BigQuery 
> docs|https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#split]
>  
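
The string semantics described above can be illustrated with a minimal Java sketch. This is not Calcite's implementation; the class name {{SplitSketch}} and both method signatures are hypothetical, bytes input is not covered, and rejecting an empty delimiter is an assumption made only to keep the loop well defined (the description does not say what should happen in that case).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the SPLIT semantics described in the issue. */
final class SplitSketch {
  private SplitSketch() {}

  /** One-argument form: the delimiter defaults to a comma. */
  public static List<String> split(String s) {
    return split(s, ",");
  }

  /** Splits s at each occurrence of delimiter (literal match, no regex). */
  public static List<String> split(String s, String delimiter) {
    // Assumption: the issue description does not cover an empty delimiter,
    // so this sketch rejects it rather than guess at the behavior.
    if (delimiter.isEmpty()) {
      throw new IllegalArgumentException("empty delimiter not handled in this sketch");
    }
    List<String> parts = new ArrayList<>();
    int start = 0;
    int idx;
    while ((idx = s.indexOf(delimiter, start)) >= 0) {
      parts.add(s.substring(start, idx));
      start = idx + delimiter.length();
    }
    // The remainder is always added, so an empty input yields [""] and an
    // input without the delimiter yields a single-element list.
    parts.add(s.substring(start));
    return parts;
  }
}
```

With this sketch, {{SplitSketch.split("h,e,l,l,o")}} and {{SplitSketch.split("h-e-l-l-o", "-")}} both give the five single-letter elements from the examples, while {{SplitSketch.split("")}} gives a list holding one empty string.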



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
