[ 
https://issues.apache.org/jira/browse/CALCITE-5644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tanner Clary updated CALCITE-5644:
----------------------------------
    Description: 
BigQuery offers the 
[CONTAINS_SUBSTR|https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#contains_substr]
 function that returns {{TRUE}} if a substring is present in an expression and 
{{FALSE}} if it is not. A basic example of this may be seen in [1]. 

The expression can take many forms (more info in the linked doc) which makes 
its implementation more complex than other string functions that only accept 
arguments of type {{STRING}}. For instance, the expression to be searched can 
be a column or table reference. 

The function also has an optional third argument called {{json_scope}} where 
the user can indicate the scope of JSON data (keys, values, or both) to be 
searched for the substring.

 I am curious if anyone has thoughts on how the search of rows or tables could 
be implemented. I have a basic implementation (that supports expressions of 
type {{STRING}} and nothing else) that I will open a draft PR for as a starting 
point. To me, the challenge is implementing the additional features like the 
{{JSON_SCOPE}} argument (seen in [2]) and performing a cross field search as 
seen in [3]. 

[1] {{SELECT CONTAINS_SUBSTR("hello", "he");}} would return {{TRUE}}. 

[2] {{SELECT CONTAINS_SUBSTR(JSON '{"lunch":"soup"}', "lunch",
                       json_scope=>"JSON_VALUES") AS result;}} would return 
{{FALSE}}.

[3] 
{{SELECT *
FROM Recipes
WHERE CONTAINS_SUBSTR(
  (SELECT AS STRUCT Recipes.* EXCEPT (Lunch, Dinner)),
  'potato'
);}} would return: 


||Breakfast||Lunch||Dinner||
|Potato pancakes|Toasted cheese sandwich|Beef stroganoff|





  was:
BigQuery offers the 
[CONTAINS_SUBSTR|https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#contains_substr]
 function that returns {{TRUE}} if a substring is present in an expression and 
{{FALSE}} if it is not. A basic example of this may be seen in [1]. 

The expression can take many forms (more info in the linked doc) which makes 
its implementation more complex than other string functions that only accept 
arguments of type {{STRING}}. For instance, the expression to be searched can 
be a column or table reference. 

The function also has an optional third argument called {{json_scope}} where 
the user can indicate the scope of JSON data (keys, values, or both) to be 
searched for the substring.

 I am curious if anyone has thoughts on how the search of rows or tables could 
be implemented. I have a basic implementation (that supports expressions of 
type {{STRING}} and nothing else) that I will open a draft PR for as a starting 
point. To me, the challenge is implementing the additional features like the 
{{JSON_SCOPE}} argument (seen in [2]) and performing a cross field search as 
seen in [3]. 

[1] {{SELECT CONTAINS_SUBSTR("hello", "he");}} would return {{TRUE}}. 

[2] {{SELECT CONTAINS_SUBSTR(JSON '{"lunch":"soup"}',
                       "lunch",
                       json_scope=>"JSON_VALUES") AS result;}} would return 
{{FALSE}}.

[3] 
{{SELECT *
FROM Recipes
WHERE CONTAINS_SUBSTR(
  (SELECT AS STRUCT Recipes.* EXCEPT (Lunch, Dinner)),
  'potato'
);}} would return: 


||Breakfast||Lunch||Dinner||
|Potato pancakes|Toasted cheese sandwich|Beef stroganoff|






> Implement BigQuery CONTAINS_SUBSTR
> ----------------------------------
>
>                 Key: CALCITE-5644
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5644
>             Project: Calcite
>          Issue Type: Task
>            Reporter: Tanner Clary
>            Priority: Major
>
> BigQuery offers the 
> [CONTAINS_SUBSTR|https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#contains_substr]
>  function that returns {{TRUE}} if a substring is present in an expression 
> and {{FALSE}} if it is not. A basic example of this may be seen in [1]. 
> The expression can take many forms (more info in the linked doc) which makes 
> its implementation more complex than other string functions that only accept 
> arguments of type {{STRING}}. For instance, the expression to be searched can 
> be a column or table reference. 
> The function also has an optional third argument called {{json_scope}} where 
> the user can indicate the scope of JSON data (keys, values, or both) to be 
> searched for the substring.
>  I am curious if anyone has thoughts on how the search of rows or tables 
> could be implemented. I have a basic implementation (that supports 
> expressions of type {{STRING}} and nothing else) that I will open a draft PR 
> for as a starting point. To me, the challenge is implementing the additional 
> features like the {{JSON_SCOPE}} argument (seen in [2]) and performing a 
> cross field search as seen in [3]. 
> [1] {{SELECT CONTAINS_SUBSTR("hello", "he");}} would return {{TRUE}}. 
> [2] {{SELECT CONTAINS_SUBSTR(JSON '{"lunch":"soup"}', "lunch",
>                        json_scope=>"JSON_VALUES") AS result;}} would return 
> {{FALSE}}.
> [3] 
> {{SELECT *
> FROM Recipes
> WHERE CONTAINS_SUBSTR(
>   (SELECT AS STRUCT Recipes.* EXCEPT (Lunch, Dinner)),
>   'potato'
> );}} would return: 
> ||Breakfast||Lunch||Dinner||
> |Potato pancakes|Toasted cheese sandwich|Beef stroganoff|



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to