Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead merged PR #455: URL: https://github.com/apache/datafusion-comet/pull/455 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
codecov-commenter commented on PR #455:
URL: https://github.com/apache/datafusion-comet/pull/455#issuecomment-2148675725

## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/455?dropdown=coverage=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache) Report
All modified and coverable lines are covered by tests :white_check_mark:
> Project coverage is 34.23%. Comparing base [(`9ca63a2`)](https://app.codecov.io/gh/apache/datafusion-comet/commit/9ca63a23edf67033e4f4eba5a9d004aa472743d2?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache) to head [(`e8f3b77`)](https://app.codecov.io/gh/apache/datafusion-comet/commit/e8f3b77596ebe1617c28f812cc63d4206ed064a1?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache).
> Report is 27 commits behind head on main.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##              main     #455     +/-   ##
+ Coverage    34.18%   34.23%   +0.04%
+ Complexity     851      806     -45
  Files          116      105     -11
  Lines        38570    38488     -82
  Branches      8531     8562     +31
- Hits         13187    13175     -12
+ Misses       22612    22554     -58
+ Partials      2771     2759     -12
```

[:umbrella: View full report in Codecov by Sentry](https://app.codecov.io/gh/apache/datafusion-comet/pull/455?dropdown=coverage=pr=continue_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache).
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
andygrove commented on code in PR #455:
URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1626637845

## docs/source/user-guide/overview.md: ##
@@ -29,7 +29,7 @@ Comet aims to support:
 - a native Parquet implementation, including both reader and writer
 - full implementation of Spark operators, including Filter/Project/Aggregation/Join/Exchange etc.
-- full implementation of Spark built-in expressions
+- [full implementation](../../../docs/spark_expressions_support.md) of Spark built-in expressions.

Review Comment:
```suggestion
- full implementation of Spark built-in expressions.
```
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
andygrove commented on code in PR #455:
URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1626637123

## docs/source/user-guide/overview.md: ##
@@ -29,7 +29,7 @@ Comet aims to support:
 - a native Parquet implementation, including both reader and writer
 - full implementation of Spark operators, including Filter/Project/Aggregation/Join/Exchange etc.
-- full implementation of Spark built-in expressions
+- [full implementation](../../../docs/spark_expressions_support.md) of Spark built-in expressions.

Review Comment:
This won't build correctly:

```
/Users/andy/git/apache/datafusion-comet/docs/temp/user-guide/overview.md:32: WARNING: Unknown source document '../spark_expressions_support' [myst.xref_missing]
```

Let's revert this change for this PR and handle where we publish (user guide vs contributor guide) in a follow-up PR.
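The broken relative link flagged in the review comment above is the kind of problem a small pre-build check can catch. The following is a minimal sketch, not how MyST or Sphinx resolves references: the helper name `broken_relative_links`, the POSIX-style paths, and the `known_files` set are all assumptions made for illustration.

```python
import posixpath
import re

# Matches inline Markdown links of the form [text](target).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Matches targets that start with a URL scheme such as "https:".
SCHEME_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_relative_links(md_path, md_text, known_files):
    """Return link targets in md_text that do not resolve to a known file.

    md_path and every entry of known_files are POSIX-style paths relative
    to the documentation source root (a hypothetical layout).
    """
    base_dir = posixpath.dirname(md_path)
    broken = []
    for target in LINK_RE.findall(md_text):
        if SCHEME_RE.match(target) or target.startswith("#"):
            continue  # external URL or in-page anchor: nothing to resolve
        path = target.split("#", 1)[0]  # drop any trailing fragment
        resolved = posixpath.normpath(posixpath.join(base_dir, path))
        if resolved not in known_files:
            broken.append(target)
    return broken
```

Called with `md_path="user-guide/overview.md"` and the link from the diff, the target resolves to a location outside the source root, so it is reported as broken.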
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455:
URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1626633782

## docs/spark_expressions_support.md: ##
@@ -0,0 +1,475 @@
+
+
+# Supported Spark Expressions
+
+### agg_funcs
+ - [x] any
+ - [x] any_value
+ - [ ] approx_count_distinct
+ - [ ] approx_percentile
+ - [ ] array_agg
+ - [x] avg
+ - [x] bit_and
+ - [x] bit_or
+ - [x] bit_xor
+ - [x] bool_and
+ - [x] bool_or
+ - [ ] collect_list
+ - [ ] collect_set
+ - [ ] corr
+ - [x] count
+ - [x] count_if
+ - [ ] count_min_sketch
+ - [x] covar_pop
+ - [x] covar_samp
+ - [x] every
+ - [x] first
+ - [x] first_value
+ - [ ] grouping
+ - [ ] grouping_id
+ - [ ] histogram_numeric
+ - [ ] kurtosis
+ - [x] last
+ - [x] last_value
+ - [x] max
+ - [ ] max_by
+ - [x] mean
+ - [ ] median
+ - [x] min
+ - [ ] min_by
+ - [ ] mode
+ - [ ] percentile
+ - [ ] percentile_approx
+ - [x] regr_avgx
+ - [x] regr_avgy
+ - [x] regr_count
+ - [ ] regr_intercept
+ - [ ] regr_r2
+ - [ ] regr_slope
+ - [ ] regr_sxx
+ - [ ] regr_sxy
+ - [ ] regr_syy
+ - [ ] skewness
+ - [x] some
+ - [x] std
+ - [x] stddev
+ - [x] stddev_pop
+ - [x] stddev_samp
+ - [x] sum
+ - [ ] try_avg
+ - [ ] try_sum
+ - [x] var_pop
+ - [x] var_samp
+ - [x] variance
+
+### array_funcs
+ - [ ] array
+ - [ ] array_append
+ - [ ] array_compact
+ - [ ] array_contains
+ - [ ] array_distinct
+ - [ ] array_except
+ - [ ] array_insert
+ - [ ] array_intersect
+ - [ ] array_join
+ - [ ] array_max
+ - [ ] array_min
+ - [ ] array_position
+ - [ ] array_remove
+ - [ ] array_repeat
+ - [ ] array_union
+ - [ ] arrays_overlap
+ - [ ] arrays_zip
+ - [ ] flatten
+ - [ ] get
+ - [ ] sequence
+ - [ ] shuffle
+ - [ ] slice
+ - [ ] sort_array
+
+### bitwise_funcs
+ - [x] &
+ - [x] ^
+ - [ ] bit_count
+ - [ ] bit_get
+ - [ ] getbit
+ - [x] shiftright
+ - [ ] shiftrightunsigned
+ - [x] |
+ - [x] ~
+
+### collection_funcs
+ - [ ] array_size
+ - [ ] cardinality
+ - [ ] concat
+ - [x] reverse
+ - [ ] size
+
+### conditional_funcs
+ - [x] coalesce
+ - [x] if
+ - [x] ifnull
+ - [ ] nanvl
+ - [x] nullif
+ - [x] nvl
+ - [x] nvl2
+ - [ ] when
+
+### conversion_funcs
+ - [ ] bigint
+ - [ ] binary
+ - [ ] boolean
+ - [ ] cast
+ - [ ] date
+ - [ ] decimal
+ - [ ] double
+ - [ ] float
+ - [ ] int
+ - [ ] smallint
+ - [ ] string
+ - [ ] timestamp
+ - [ ] tinyint
+
+### csv_funcs
+ - [ ] from_csv
+ - [ ] schema_of_csv
+ - [ ] to_csv
+
+### datetime_funcs
+ - [ ] add_months
+ - [ ] convert_timezone
+ - [x] curdate
+ - [x] current_date
+ - [ ] current_timestamp
+ - [x] current_timezone
+ - [ ] date_add
+ - [ ] date_diff
+ - [ ] date_format
+ - [ ] date_from_unix_date
+ - [x] date_part
+ - [ ] date_sub
+ - [ ] date_trunc
+ - [ ] dateadd
+ - [ ] datediff
+ - [x] datepart
+ - [ ] day
+ - [ ] dayofmonth
+ - [ ] dayofweek
+ - [ ] dayofyear
+ - [x] extract
+ - [ ] from_unixtime
+ - [ ] from_utc_timestamp
+ - [ ] hour
+ - [ ] last_day
+ - [ ] localtimestamp
+ - [ ] make_date
+ - [ ] make_dt_interval
+ - [ ] make_interval
+ - [ ] make_timestamp
+ - [ ] make_timestamp_ltz
+ - [ ] make_timestamp_ntz
+ - [ ] make_ym_interval
+ - [ ] minute
+ - [ ] month
+ - [ ] months_between
+ - [ ] next_day
+ - [ ] now
+ - [ ] quarter
+ - [ ] second
+ - [ ] timestamp_micros
+ - [ ] timestamp_millis
+ - [ ] timestamp_seconds
+ - [ ] to_date
+ - [ ] to_timestamp
+ - [ ] to_timestamp_ltz
+ - [ ] to_timestamp_ntz
+ - [ ] to_unix_timestamp
+ - [ ] to_utc_timestamp
+ - [ ] trunc
+ - [ ] try_to_timestamp
+ - [ ] unix_date
+ - [ ] unix_micros
+ - [ ] unix_millis
+ - [ ] unix_seconds
+ - [ ] unix_timestamp
+ - [ ] weekday
+ - [ ] weekofyear
+ - [ ] year
+
+### generator_funcs
+ - [ ] explode
+ - [ ] explode_outer
+ - [ ] inline
+ - [ ] inline_outer
+ - [ ] posexplode
+ - [ ] posexplode_outer
+ - [ ] stack
+
+### hash_funcs
+ - [ ] crc32
+ - [ ] hash
+ - [x] md5
+ - [ ] sha
+ - [ ] sha1
+ - [ ] sha2
+ - [ ] xxhash64
+
+### json_funcs
+ - [ ] from_json
+ - [ ] get_json_object
+ - [ ] json_array_length
+ - [ ] json_object_keys
+ - [ ] json_tuple
+ - [ ] schema_of_json
+ - [ ] to_json
+
+### lambda_funcs
+ - [ ] aggregate
+ - [ ] array_sort
+ - [ ] exists
+ - [ ] filter
+ - [ ] forall
+ - [ ] map_filter
+ - [ ] map_zip_with
+ - [ ] reduce
+ - [ ] transform
+ - [ ] transform_keys
+ - [ ] transform_values
+ - [ ] zip_with
+
+### map_funcs
+ - [ ] element_at
+ - [ ] map
+ - [ ] map_concat
+ - [ ] map_contains_key
+ - [ ] map_entries
+ - [ ] map_from_arrays
+ - [ ] map_from_entries
+ - [ ] map_keys
+ - [ ] map_values
+ - [ ] str_to_map
+ - [ ] try_element_at
+
+### math_funcs
+ - [x] %
+ - [x] *
+ - [x] +
+ - [x] -
+ - [x] /
+ - [x] abs
+ - [x] acos
+ - [ ] acosh
+ - [x] asin
+ - [ ] asinh
+ - [x] atan
+ - [x] atan2
+ - [ ] atanh
+ - [ ] bin
+ - [ ] bround
+ - [ ] cbrt
+ - [x] ceil
+ - [x] ceiling
+ - [ ] conv
+ - [x] cos
+ - [ ] cosh
+ - [ ] cot
+ - [ ] csc
+ - [ ] degrees
+ - [ ] div
+ - [ ] e
+ - [x] exp
+ - [ ] expm1
+ - [ ] factorial
+ - [x] floor
+ - [ ] greatest
+ - [ ] hex
+ - [ ] hypot
+ - [ ] least
+ - [x] ln
+ - [
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
andygrove commented on code in PR #455:
URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1626623804

## docs/spark_expressions_support.md: ##
@@ -0,0 +1,475 @@
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on PR #455:
URL: https://github.com/apache/datafusion-comet/pull/455#issuecomment-2141027572

@andygrove I addressed all the comments. However, you are right: sometimes we support a function only partially, meaning that part of the syntax or some value range is not supported. That suggests an idea for a follow-up PR: introduce a "partially supported" status (or similar), with the reason why the function is only partially supported.
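One way to picture the "partially supported" status proposed above is a three-state record rendered into the same checklist format as the generated file. This is only a sketch under assumptions: the names `Status`, `ExprSupport`, and `render_group` are hypothetical, the `example_fn` entry and its reason are placeholders, and nothing here reflects how the PR actually generates the Markdown.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    UNSUPPORTED = "unsupported"
    PARTIAL = "partial"
    SUPPORTED = "supported"


@dataclass(frozen=True)
class ExprSupport:
    name: str
    status: Status
    reason: str = ""  # only meaningful when status is PARTIAL


def render_group(group, entries):
    """Render one '### group' section as a Markdown checklist string."""
    lines = [f"### {group}"]
    for entry in sorted(entries, key=lambda e: e.name):
        # Assumption: a partially supported function still gets a ticked box,
        # with the limitation spelled out next to its name.
        box = "[ ]" if entry.status is Status.UNSUPPORTED else "[x]"
        note = f" (partial: {entry.reason})" if entry.status is Status.PARTIAL else ""
        lines.append(f" - {box} {entry.name}{note}")
    return "\n".join(lines)
```

Whether a partial entry should render as ticked or unticked is a design choice; keeping the reason on the same line means the existing checklist layout would not need a second column.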
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455:
URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1619544584

## docs/spark_expressions_support.md: ##
@@ -0,0 +1,475 @@
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455:
URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1619538503

## docs/spark_expressions_support.md: ##
@@ -0,0 +1,475 @@
+
+
+# Supported Spark Expressions
+
+### agg_funcs
+ - [x] any
+ - [x] any_value
+ - [ ] approx_count_distinct
+ - [ ] approx_percentile
+ - [ ] array_agg
+ - [x] avg
+ - [x] bit_and
+ - [x] bit_or
+ - [x] bit_xor
+ - [x] bool_and
+ - [x] bool_or
+ - [ ] collect_list
+ - [ ] collect_set
+ - [ ] corr
+ - [x] count
+ - [x] count_if
+ - [ ] count_min_sketch
+ - [x] covar_pop
+ - [x] covar_samp
+ - [x] every
+ - [x] first
+ - [x] first_value
+ - [ ] grouping
+ - [ ] grouping_id
+ - [ ] histogram_numeric
+ - [ ] kurtosis
+ - [x] last
+ - [x] last_value
+ - [x] max
+ - [ ] max_by
+ - [x] mean
+ - [ ] median
+ - [x] min
+ - [ ] min_by
+ - [ ] mode
+ - [ ] percentile
+ - [ ] percentile_approx
+ - [x] regr_avgx
+ - [x] regr_avgy
+ - [x] regr_count

Review Comment:
regr_avgx is supported by DF:

```
> SELECT regr_avgx(1, 2);
+------------------------------+
| REGR_AVGX(Int64(1),Int64(2)) |
+------------------------------+
| 2.0                          |
+------------------------------+
```

so I think all is fair here.
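For readers unfamiliar with `regr_avgx`, the query result quoted above matches the standard SQL definition: `regr_avgx(y, x)` averages the independent variable `x` over the rows where both `y` and `x` are non-null. The plain-Python sketch below illustrates that definition only; it is not DataFusion's or Comet's implementation, and the function and argument layout are chosen for illustration.

```python
def regr_avgx(pairs):
    """Average of x over (y, x) pairs in which neither value is None.

    Returns None when no pair qualifies, mirroring a SQL NULL result.
    """
    xs = [x for y, x in pairs if y is not None and x is not None]
    if not xs:
        return None
    return sum(xs) / len(xs)
```

With the single pair `(1, 2)` from the query, the only qualifying `x` is 2, so the result is 2.0.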
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455:
URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1619533113

## docs/spark_expressions_support.md: ##
@@ -0,0 +1,475 @@
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1619530851 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,475 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [x] any + - [x] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [x] avg + - [x] bit_and + - [x] bit_or + - [x] bit_xor + - [x] bool_and + - [x] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [x] count + - [x] count_if + - [ ] count_min_sketch + - [x] covar_pop + - [x] covar_samp + - [x] every + - [x] first + - [x] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [x] last + - [x] last_value + - [x] max + - [ ] max_by + - [x] mean + - [ ] median + - [x] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [x] regr_avgx + - [x] regr_avgy + - [x] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [x] some + - [x] std + - [x] stddev + - [x] stddev_pop + - [x] stddev_samp + - [x] sum + - [ ] try_avg + - [ ] try_sum + - [x] var_pop + - [x] var_samp + - [x] variance + +### array_funcs + - [ ] array + - [ ] array_append + - [ ] array_compact + - [ ] array_contains + - [ ] array_distinct + - [ ] array_except + - [ ] array_insert + - [ ] array_intersect + - [ ] array_join + - [ ] array_max + - [ ] array_min + - [ ] array_position + - [ ] array_remove + - [ ] array_repeat + - [ ] array_union + - [ ] arrays_overlap + - [ ] arrays_zip + - [ ] flatten + - [ ] get + - [ ] sequence + - [ ] shuffle + - [ ] slice + - [ ] sort_array + +### bitwise_funcs + - [x] & + - [x] ^ + - [ ] bit_count + - [ ] bit_get + - [ ] getbit + - [x] shiftright + - [ ] shiftrightunsigned + - [x] | + - [x] ~ + +### collection_funcs + - [ ] array_size + - [ ] cardinality + - [ ] concat + - [x] reverse + - [ ] size + +### conditional_funcs + - [x] coalesce + - 
[x] if + - [x] ifnull + - [ ] nanvl + - [x] nullif + - [x] nvl + - [x] nvl2 + - [ ] when + +### conversion_funcs + - [ ] bigint + - [ ] binary + - [ ] boolean + - [ ] cast + - [ ] date + - [ ] decimal + - [ ] double + - [ ] float + - [ ] int + - [ ] smallint + - [ ] string + - [ ] timestamp + - [ ] tinyint + +### csv_funcs + - [ ] from_csv + - [ ] schema_of_csv + - [ ] to_csv + +### datetime_funcs + - [ ] add_months + - [ ] convert_timezone + - [x] curdate + - [x] current_date + - [ ] current_timestamp + - [x] current_timezone + - [ ] date_add + - [ ] date_diff + - [ ] date_format + - [ ] date_from_unix_date + - [x] date_part + - [ ] date_sub + - [ ] date_trunc + - [ ] dateadd + - [ ] datediff + - [x] datepart + - [ ] day + - [ ] dayofmonth + - [ ] dayofweek + - [ ] dayofyear + - [x] extract + - [ ] from_unixtime + - [ ] from_utc_timestamp + - [ ] hour + - [ ] last_day + - [ ] localtimestamp + - [ ] make_date + - [ ] make_dt_interval + - [ ] make_interval + - [ ] make_timestamp + - [ ] make_timestamp_ltz + - [ ] make_timestamp_ntz + - [ ] make_ym_interval + - [ ] minute + - [ ] month + - [ ] months_between + - [ ] next_day + - [ ] now + - [ ] quarter + - [ ] second + - [ ] timestamp_micros + - [ ] timestamp_millis + - [ ] timestamp_seconds + - [ ] to_date + - [ ] to_timestamp + - [ ] to_timestamp_ltz + - [ ] to_timestamp_ntz + - [ ] to_unix_timestamp + - [ ] to_utc_timestamp + - [ ] trunc + - [ ] try_to_timestamp + - [ ] unix_date + - [ ] unix_micros + - [ ] unix_millis + - [ ] unix_seconds + - [ ] unix_timestamp + - [ ] weekday + - [ ] weekofyear + - [ ] year + +### generator_funcs + - [ ] explode + - [ ] explode_outer + - [ ] inline + - [ ] inline_outer + - [ ] posexplode + - [ ] posexplode_outer + - [ ] stack + +### hash_funcs + - [ ] crc32 + - [ ] hash + - [x] md5 + - [ ] sha + - [ ] sha1 + - [ ] sha2 + - [ ] xxhash64 + +### json_funcs + - [ ] from_json + - [ ] get_json_object + - [ ] json_array_length + - [ ] json_object_keys + - [ ] json_tuple + - [ ] 
schema_of_json + - [ ] to_json + +### lambda_funcs + - [ ] aggregate + - [ ] array_sort + - [ ] exists + - [ ] filter + - [ ] forall + - [ ] map_filter + - [ ] map_zip_with + - [ ] reduce + - [ ] transform + - [ ] transform_keys + - [ ] transform_values + - [ ] zip_with + +### map_funcs + - [ ] element_at + - [ ] map + - [ ] map_concat + - [ ] map_contains_key + - [ ] map_entries + - [ ] map_from_arrays + - [ ] map_from_entries + - [ ] map_keys + - [ ] map_values + - [ ] str_to_map + - [ ] try_element_at + +### math_funcs + - [x] % + - [x] * + - [x] + + - [x] - + - [x] / + - [x] abs + - [x] acos + - [ ] acosh + - [x] asin + - [ ] asinh + - [x] atan + - [x] atan2 + - [ ] atanh + - [ ] bin + - [ ] bround + - [ ] cbrt + - [x] ceil + - [x] ceiling + - [ ] conv + - [x] cos + - [ ] cosh + - [ ] cot + - [ ] csc + - [ ] degrees + - [ ] div + - [ ] e + - [x] exp + - [ ] expm1 + - [ ] factorial + - [x] floor + - [ ] greatest + - [ ] hex + - [ ] hypot + - [ ] least + - [x] ln + - [
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455:
URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1619529446

## docs/spark_expressions_support.md: ##
@@ -0,0 +1,475 @@
+
+
+# Supported Spark Expressions
+
+### agg_funcs
+ - [x] any
+ - [x] any_value
+ - [ ] approx_count_distinct
+ - [ ] approx_percentile
+ - [ ] array_agg
+ - [x] avg
+ - [x] bit_and
+ - [x] bit_or
+ - [x] bit_xor
+ - [x] bool_and
+ - [x] bool_or
+ - [ ] collect_list
+ - [ ] collect_set
+ - [ ] corr
+ - [x] count
+ - [x] count_if
+ - [ ] count_min_sketch
+ - [x] covar_pop
+ - [x] covar_samp
+ - [x] every
+ - [x] first
+ - [x] first_value
+ - [ ] grouping
+ - [ ] grouping_id
+ - [ ] histogram_numeric
+ - [ ] kurtosis
+ - [x] last
+ - [x] last_value
+ - [x] max
+ - [ ] max_by
+ - [x] mean
+ - [ ] median
+ - [x] min
+ - [ ] min_by
+ - [ ] mode
+ - [ ] percentile
+ - [ ] percentile_approx
+ - [x] regr_avgx
+ - [x] regr_avgy
+ - [x] regr_count

Review Comment:
```
test("regr_avgx") {
  Seq(false, true).foreach { dictionary =>
    withSQLConf(
      "parquet.enable.dictionary" -> dictionary.toString,
      "spark.comet.exec.shuffle.enabled" -> "true",
      CometConf.COMET_ENABLED.key -> "true",
      CometConf.COMET_EXEC_ENABLED.key -> "true",
      CometConf.COMET_SHUFFLE_ENFORCE_MODE_ENABLED.key -> "true",
      CometConf.COMET_EXEC_ALL_OPERATOR_ENABLED.key -> "true",
    ) {
      val table = "test"
      withTable(table) {
        sql(s"create table $table(a int, b int) using parquet")
        sql(s"insert into $table VALUES (1, 2), (2, 2), (2, 3), (2, 4)")
        checkSparkAnswerAndOperator(s"SELECT regr_avgx(a, b) FROM $table")
      }
    }
  }
}
```
regr_avgx test passes
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1619525532 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,475 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [x] any + - [x] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [x] avg + - [x] bit_and + - [x] bit_or + - [x] bit_xor + - [x] bool_and + - [x] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [x] count + - [x] count_if + - [ ] count_min_sketch + - [x] covar_pop + - [x] covar_samp + - [x] every + - [x] first + - [x] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [x] last + - [x] last_value + - [x] max + - [ ] max_by + - [x] mean + - [ ] median + - [x] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [x] regr_avgx + - [x] regr_avgy + - [x] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [x] some + - [x] std + - [x] stddev + - [x] stddev_pop + - [x] stddev_samp + - [x] sum + - [ ] try_avg + - [ ] try_sum + - [x] var_pop + - [x] var_samp + - [x] variance + +### array_funcs + - [ ] array + - [ ] array_append + - [ ] array_compact + - [ ] array_contains + - [ ] array_distinct + - [ ] array_except + - [ ] array_insert + - [ ] array_intersect + - [ ] array_join + - [ ] array_max + - [ ] array_min + - [ ] array_position + - [ ] array_remove + - [ ] array_repeat + - [ ] array_union + - [ ] arrays_overlap + - [ ] arrays_zip + - [ ] flatten + - [ ] get + - [ ] sequence + - [ ] shuffle + - [ ] slice + - [ ] sort_array + +### bitwise_funcs + - [x] & + - [x] ^ + - [ ] bit_count + - [ ] bit_get + - [ ] getbit + - [x] shiftright + - [ ] shiftrightunsigned + - [x] | + - [x] ~ + +### collection_funcs + - [ ] array_size + - [ ] cardinality + - [ ] concat + - [x] reverse + - [ ] size + +### conditional_funcs + - [x] coalesce + - 
[x] if + - [x] ifnull + - [ ] nanvl + - [x] nullif + - [x] nvl + - [x] nvl2 + - [ ] when + +### conversion_funcs + - [ ] bigint + - [ ] binary + - [ ] boolean + - [ ] cast + - [ ] date + - [ ] decimal + - [ ] double + - [ ] float + - [ ] int + - [ ] smallint + - [ ] string + - [ ] timestamp + - [ ] tinyint + +### csv_funcs + - [ ] from_csv + - [ ] schema_of_csv + - [ ] to_csv + +### datetime_funcs + - [ ] add_months + - [ ] convert_timezone + - [x] curdate + - [x] current_date + - [ ] current_timestamp + - [x] current_timezone + - [ ] date_add + - [ ] date_diff + - [ ] date_format + - [ ] date_from_unix_date + - [x] date_part + - [ ] date_sub + - [ ] date_trunc + - [ ] dateadd + - [ ] datediff + - [x] datepart + - [ ] day + - [ ] dayofmonth + - [ ] dayofweek + - [ ] dayofyear + - [x] extract + - [ ] from_unixtime + - [ ] from_utc_timestamp + - [ ] hour + - [ ] last_day + - [ ] localtimestamp + - [ ] make_date + - [ ] make_dt_interval + - [ ] make_interval + - [ ] make_timestamp + - [ ] make_timestamp_ltz + - [ ] make_timestamp_ntz + - [ ] make_ym_interval + - [ ] minute + - [ ] month + - [ ] months_between + - [ ] next_day + - [ ] now + - [ ] quarter + - [ ] second + - [ ] timestamp_micros + - [ ] timestamp_millis + - [ ] timestamp_seconds + - [ ] to_date + - [ ] to_timestamp + - [ ] to_timestamp_ltz + - [ ] to_timestamp_ntz + - [ ] to_unix_timestamp + - [ ] to_utc_timestamp + - [ ] trunc + - [ ] try_to_timestamp + - [ ] unix_date + - [ ] unix_micros + - [ ] unix_millis + - [ ] unix_seconds + - [ ] unix_timestamp + - [ ] weekday + - [ ] weekofyear + - [ ] year + +### generator_funcs + - [ ] explode + - [ ] explode_outer + - [ ] inline + - [ ] inline_outer + - [ ] posexplode + - [ ] posexplode_outer + - [ ] stack + +### hash_funcs + - [ ] crc32 + - [ ] hash + - [x] md5 + - [ ] sha + - [ ] sha1 + - [ ] sha2 + - [ ] xxhash64 + +### json_funcs + - [ ] from_json + - [ ] get_json_object + - [ ] json_array_length + - [ ] json_object_keys + - [ ] json_tuple + - [ ] 
schema_of_json + - [ ] to_json + +### lambda_funcs + - [ ] aggregate + - [ ] array_sort + - [ ] exists + - [ ] filter + - [ ] forall + - [ ] map_filter + - [ ] map_zip_with + - [ ] reduce + - [ ] transform + - [ ] transform_keys + - [ ] transform_values + - [ ] zip_with + +### map_funcs + - [ ] element_at + - [ ] map + - [ ] map_concat + - [ ] map_contains_key + - [ ] map_entries + - [ ] map_from_arrays + - [ ] map_from_entries + - [ ] map_keys + - [ ] map_values + - [ ] str_to_map + - [ ] try_element_at + +### math_funcs + - [x] % + - [x] * + - [x] + + - [x] - + - [x] / + - [x] abs + - [x] acos + - [ ] acosh + - [x] asin + - [ ] asinh + - [x] atan + - [x] atan2 + - [ ] atanh + - [ ] bin + - [ ] bround + - [ ] cbrt + - [x] ceil + - [x] ceiling + - [ ] conv + - [x] cos + - [ ] cosh + - [ ] cot + - [ ] csc + - [ ] degrees + - [ ] div + - [ ] e + - [x] exp + - [ ] expm1 + - [ ] factorial + - [x] floor + - [ ] greatest + - [ ] hex + - [ ] hypot + - [ ] least + - [x] ln + - [
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455:
URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1619523325

## docs/spark_expressions_support.md: ##
@@ -0,0 +1,475 @@
+
+
+# Supported Spark Expressions
+
+### agg_funcs
+ - [x] any
+ - [x] any_value
+ - [ ] approx_count_distinct
+ - [ ] approx_percentile
+ - [ ] array_agg
+ - [x] avg
+ - [x] bit_and
+ - [x] bit_or
+ - [x] bit_xor
+ - [x] bool_and
+ - [x] bool_or
+ - [ ] collect_list
+ - [ ] collect_set
+ - [ ] corr
+ - [x] count
+ - [x] count_if
+ - [ ] count_min_sketch
+ - [x] covar_pop
+ - [x] covar_samp
+ - [x] every
+ - [x] first
+ - [x] first_value
+ - [ ] grouping
+ - [ ] grouping_id
+ - [ ] histogram_numeric
+ - [ ] kurtosis
+ - [x] last
+ - [x] last_value
+ - [x] max
+ - [ ] max_by
+ - [x] mean
+ - [ ] median
+ - [x] min
+ - [ ] min_by
+ - [ ] mode
+ - [ ] percentile
+ - [ ] percentile_approx
+ - [x] regr_avgx
+ - [x] regr_avgy
+ - [x] regr_count

Review Comment:
The test is exactly for `year`, but if only `YEAR` is supported, what are we supposed to show to the user? Not supported?
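One possible answer to the question above is to keep the checkbox and attach a short note to partially supported expressions. The following is only a minimal sketch of that idea, not the generator from this PR: the `Support` trait, the `Partial` case class, and `SupportList.render` are hypothetical names introduced for illustration, and the note format is an assumption.

```scala
// Hypothetical model for illustration; none of these names come from the PR.
sealed trait Support
case object Supported extends Support
case object Unsupported extends Support
// Partial support carries a free-form note, e.g. which arguments work.
final case class Partial(note: String) extends Support

object SupportList {
  // Render one checklist line in the style of docs/spark_expressions_support.md.
  // Partially supported expressions stay checked but carry the note in parentheses.
  def render(name: String, support: Support): String = support match {
    case Supported     => s" - [x] $name"
    case Unsupported   => s" - [ ] $name"
    case Partial(note) => s" - [x] $name ($note)"
  }

  def main(args: Array[String]): Unit = {
    // Statuses mirror the review comments in this thread; the notes are illustrative.
    val datetime: Seq[(String, Support)] = Seq(
      "date_part" -> Partial("YEAR only"),
      "extract" -> Partial("YEAR only"),
      "hour" -> Unsupported)
    datetime.foreach { case (name, support) => println(render(name, support)) }
  }
}
```

With this shape the list would still show `extract` as checked, but a reader would see the `YEAR` restriction next to it instead of having to discover it from a failing query.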
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
andygrove commented on code in PR #455:
URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1619275529

## docs/spark_expressions_support.md: ##
@@ -0,0 +1,475 @@
+
+
+# Supported Spark Expressions
+
+### agg_funcs
+ - [x] any
+ - [x] any_value
+ - [ ] approx_count_distinct
+ - [ ] approx_percentile
+ - [ ] array_agg
+ - [x] avg
+ - [x] bit_and
+ - [x] bit_or
+ - [x] bit_xor
+ - [x] bool_and
+ - [x] bool_or
+ - [ ] collect_list
+ - [ ] collect_set
+ - [ ] corr
+ - [x] count
+ - [x] count_if
+ - [ ] count_min_sketch
+ - [x] covar_pop
+ - [x] covar_samp
+ - [x] every
+ - [x] first
+ - [x] first_value
+ - [ ] grouping
+ - [ ] grouping_id
+ - [ ] histogram_numeric
+ - [ ] kurtosis
+ - [x] last
+ - [x] last_value
+ - [x] max
+ - [ ] max_by
+ - [x] mean
+ - [ ] median
+ - [x] min
+ - [ ] min_by
+ - [ ] mode
+ - [ ] percentile
+ - [ ] percentile_approx
+ - [x] regr_avgx
+ - [x] regr_avgy
+ - [x] regr_count
+ - [ ] regr_intercept
+ - [ ] regr_r2
+ - [ ] regr_slope
+ - [ ] regr_sxx
+ - [ ] regr_sxy
+ - [ ] regr_syy
+ - [ ] skewness
+ - [x] some
+ - [x] std
+ - [x] stddev
+ - [x] stddev_pop
+ - [x] stddev_samp
+ - [x] sum
+ - [ ] try_avg
+ - [ ] try_sum
+ - [x] var_pop
+ - [x] var_samp
+ - [x] variance
+
+### array_funcs
+ - [ ] array
+ - [ ] array_append
+ - [ ] array_compact
+ - [ ] array_contains
+ - [ ] array_distinct
+ - [ ] array_except
+ - [ ] array_insert
+ - [ ] array_intersect
+ - [ ] array_join
+ - [ ] array_max
+ - [ ] array_min
+ - [ ] array_position
+ - [ ] array_remove
+ - [ ] array_repeat
+ - [ ] array_union
+ - [ ] arrays_overlap
+ - [ ] arrays_zip
+ - [ ] flatten
+ - [ ] get
+ - [ ] sequence
+ - [ ] shuffle
+ - [ ] slice
+ - [ ] sort_array
+
+### bitwise_funcs
+ - [x] &
+ - [x] ^
+ - [ ] bit_count
+ - [ ] bit_get
+ - [ ] getbit
+ - [x] shiftright
+ - [ ] shiftrightunsigned
+ - [x] |
+ - [x] ~
+
+### collection_funcs
+ - [ ] array_size
+ - [ ] cardinality
+ - [ ] concat
+ - [x] reverse
+ - [ ] size
+
+### conditional_funcs
+ - [x] coalesce
+ - [x] if
+ - [x] ifnull
+ - [ ] nanvl
+ - [x] nullif
+ - [x] nvl
+ - [x] nvl2
+ - [ ] when
+
+### conversion_funcs
+ - [ ] bigint
+ - [ ] binary
+ - [ ] boolean
+ - [ ] cast
+ - [ ] date
+ - [ ] decimal
+ - [ ] double
+ - [ ] float
+ - [ ] int
+ - [ ] smallint
+ - [ ] string
+ - [ ] timestamp
+ - [ ] tinyint
+
+### csv_funcs
+ - [ ] from_csv
+ - [ ] schema_of_csv
+ - [ ] to_csv
+
+### datetime_funcs
+ - [ ] add_months
+ - [ ] convert_timezone
+ - [x] curdate
+ - [x] current_date
+ - [ ] current_timestamp
+ - [x] current_timezone
+ - [ ] date_add
+ - [ ] date_diff
+ - [ ] date_format
+ - [ ] date_from_unix_date
+ - [x] date_part
+ - [ ] date_sub
+ - [ ] date_trunc
+ - [ ] dateadd
+ - [ ] datediff
+ - [x] datepart
+ - [ ] day
+ - [ ] dayofmonth
+ - [ ] dayofweek
+ - [ ] dayofyear
+ - [x] extract

Review Comment:
We only support `extract` for `YEAR`
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
andygrove commented on code in PR #455:
URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1619275258

## docs/spark_expressions_support.md: ##
@@ -0,0 +1,475 @@
+
+
+# Supported Spark Expressions
+
+### agg_funcs
+ - [x] any
+ - [x] any_value
+ - [ ] approx_count_distinct
+ - [ ] approx_percentile
+ - [ ] array_agg
+ - [x] avg
+ - [x] bit_and
+ - [x] bit_or
+ - [x] bit_xor
+ - [x] bool_and
+ - [x] bool_or
+ - [ ] collect_list
+ - [ ] collect_set
+ - [ ] corr
+ - [x] count
+ - [x] count_if
+ - [ ] count_min_sketch
+ - [x] covar_pop
+ - [x] covar_samp
+ - [x] every
+ - [x] first
+ - [x] first_value
+ - [ ] grouping
+ - [ ] grouping_id
+ - [ ] histogram_numeric
+ - [ ] kurtosis
+ - [x] last
+ - [x] last_value
+ - [x] max
+ - [ ] max_by
+ - [x] mean
+ - [ ] median
+ - [x] min
+ - [ ] min_by
+ - [ ] mode
+ - [ ] percentile
+ - [ ] percentile_approx
+ - [x] regr_avgx
+ - [x] regr_avgy
+ - [x] regr_count
+ - [ ] regr_intercept
+ - [ ] regr_r2
+ - [ ] regr_slope
+ - [ ] regr_sxx
+ - [ ] regr_sxy
+ - [ ] regr_syy
+ - [ ] skewness
+ - [x] some
+ - [x] std
+ - [x] stddev
+ - [x] stddev_pop
+ - [x] stddev_samp
+ - [x] sum
+ - [ ] try_avg
+ - [ ] try_sum
+ - [x] var_pop
+ - [x] var_samp
+ - [x] variance
+
+### array_funcs
+ - [ ] array
+ - [ ] array_append
+ - [ ] array_compact
+ - [ ] array_contains
+ - [ ] array_distinct
+ - [ ] array_except
+ - [ ] array_insert
+ - [ ] array_intersect
+ - [ ] array_join
+ - [ ] array_max
+ - [ ] array_min
+ - [ ] array_position
+ - [ ] array_remove
+ - [ ] array_repeat
+ - [ ] array_union
+ - [ ] arrays_overlap
+ - [ ] arrays_zip
+ - [ ] flatten
+ - [ ] get
+ - [ ] sequence
+ - [ ] shuffle
+ - [ ] slice
+ - [ ] sort_array
+
+### bitwise_funcs
+ - [x] &
+ - [x] ^
+ - [ ] bit_count
+ - [ ] bit_get
+ - [ ] getbit
+ - [x] shiftright
+ - [ ] shiftrightunsigned
+ - [x] |
+ - [x] ~
+
+### collection_funcs
+ - [ ] array_size
+ - [ ] cardinality
+ - [ ] concat
+ - [x] reverse
+ - [ ] size
+
+### conditional_funcs
+ - [x] coalesce
+ - [x] if
+ - [x] ifnull
+ - [ ] nanvl
+ - [x] nullif
+ - [x] nvl
+ - [x] nvl2
+ - [ ] when
+
+### conversion_funcs
+ - [ ] bigint
+ - [ ] binary
+ - [ ] boolean
+ - [ ] cast
+ - [ ] date
+ - [ ] decimal
+ - [ ] double
+ - [ ] float
+ - [ ] int
+ - [ ] smallint
+ - [ ] string
+ - [ ] timestamp
+ - [ ] tinyint
+
+### csv_funcs
+ - [ ] from_csv
+ - [ ] schema_of_csv
+ - [ ] to_csv
+
+### datetime_funcs
+ - [ ] add_months
+ - [ ] convert_timezone
+ - [x] curdate
+ - [x] current_date
+ - [ ] current_timestamp
+ - [x] current_timezone
+ - [ ] date_add
+ - [ ] date_diff
+ - [ ] date_format
+ - [ ] date_from_unix_date
+ - [x] date_part

Review Comment:
We only support `date_part` for `YEAR`
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
viirya commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1619233974 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,475 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [x] any + - [x] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [x] avg + - [x] bit_and + - [x] bit_or + - [x] bit_xor + - [x] bool_and + - [x] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [x] count + - [x] count_if + - [ ] count_min_sketch + - [x] covar_pop + - [x] covar_samp + - [x] every + - [x] first + - [x] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [x] last + - [x] last_value + - [x] max + - [ ] max_by + - [x] mean + - [ ] median + - [x] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [x] regr_avgx + - [x] regr_avgy + - [x] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [x] some + - [x] std + - [x] stddev + - [x] stddev_pop + - [x] stddev_samp + - [x] sum + - [ ] try_avg + - [ ] try_sum + - [x] var_pop + - [x] var_samp + - [x] variance + +### array_funcs + - [ ] array + - [ ] array_append + - [ ] array_compact + - [ ] array_contains + - [ ] array_distinct + - [ ] array_except + - [ ] array_insert + - [ ] array_intersect + - [ ] array_join + - [ ] array_max + - [ ] array_min + - [ ] array_position + - [ ] array_remove + - [ ] array_repeat + - [ ] array_union + - [ ] arrays_overlap + - [ ] arrays_zip + - [ ] flatten + - [ ] get + - [ ] sequence + - [ ] shuffle + - [ ] slice + - [ ] sort_array + +### bitwise_funcs + - [x] & + - [x] ^ + - [ ] bit_count + - [ ] bit_get + - [ ] getbit + - [x] shiftright + - [ ] shiftrightunsigned + - [x] | + - [x] ~ + +### collection_funcs + - [ ] array_size + - [ ] cardinality + - [ ] concat + - [x] reverse + - [ ] size + +### conditional_funcs + - [x] coalesce + - 
[x] if + - [x] ifnull + - [ ] nanvl + - [x] nullif + - [x] nvl + - [x] nvl2 + - [ ] when + +### conversion_funcs + - [ ] bigint + - [ ] binary + - [ ] boolean + - [ ] cast + - [ ] date + - [ ] decimal + - [ ] double + - [ ] float + - [ ] int + - [ ] smallint + - [ ] string + - [ ] timestamp + - [ ] tinyint + +### csv_funcs + - [ ] from_csv + - [ ] schema_of_csv + - [ ] to_csv + +### datetime_funcs + - [ ] add_months + - [ ] convert_timezone + - [x] curdate + - [x] current_date + - [ ] current_timestamp + - [x] current_timezone + - [ ] date_add + - [ ] date_diff + - [ ] date_format + - [ ] date_from_unix_date + - [x] date_part + - [ ] date_sub + - [ ] date_trunc + - [ ] dateadd + - [ ] datediff + - [x] datepart + - [ ] day + - [ ] dayofmonth + - [ ] dayofweek + - [ ] dayofyear + - [x] extract + - [ ] from_unixtime + - [ ] from_utc_timestamp + - [ ] hour + - [ ] last_day + - [ ] localtimestamp + - [ ] make_date + - [ ] make_dt_interval + - [ ] make_interval + - [ ] make_timestamp + - [ ] make_timestamp_ltz + - [ ] make_timestamp_ntz + - [ ] make_ym_interval + - [ ] minute + - [ ] month + - [ ] months_between + - [ ] next_day + - [ ] now + - [ ] quarter + - [ ] second + - [ ] timestamp_micros + - [ ] timestamp_millis + - [ ] timestamp_seconds + - [ ] to_date + - [ ] to_timestamp + - [ ] to_timestamp_ltz + - [ ] to_timestamp_ntz + - [ ] to_unix_timestamp + - [ ] to_utc_timestamp + - [ ] trunc + - [ ] try_to_timestamp + - [ ] unix_date + - [ ] unix_micros + - [ ] unix_millis + - [ ] unix_seconds + - [ ] unix_timestamp + - [ ] weekday + - [ ] weekofyear + - [ ] year + +### generator_funcs + - [ ] explode + - [ ] explode_outer + - [ ] inline + - [ ] inline_outer + - [ ] posexplode + - [ ] posexplode_outer + - [ ] stack + +### hash_funcs + - [ ] crc32 + - [ ] hash + - [x] md5 + - [ ] sha + - [ ] sha1 + - [ ] sha2 + - [ ] xxhash64 + +### json_funcs + - [ ] from_json + - [ ] get_json_object + - [ ] json_array_length + - [ ] json_object_keys + - [ ] json_tuple + - [ ] 
schema_of_json + - [ ] to_json + +### lambda_funcs + - [ ] aggregate + - [ ] array_sort + - [ ] exists + - [ ] filter + - [ ] forall + - [ ] map_filter + - [ ] map_zip_with + - [ ] reduce + - [ ] transform + - [ ] transform_keys + - [ ] transform_values + - [ ] zip_with + +### map_funcs + - [ ] element_at + - [ ] map + - [ ] map_concat + - [ ] map_contains_key + - [ ] map_entries + - [ ] map_from_arrays + - [ ] map_from_entries + - [ ] map_keys + - [ ] map_values + - [ ] str_to_map + - [ ] try_element_at + +### math_funcs + - [x] % + - [x] * + - [x] + + - [x] - + - [x] / + - [x] abs + - [x] acos + - [ ] acosh + - [x] asin + - [ ] asinh + - [x] atan + - [x] atan2 + - [ ] atanh + - [ ] bin + - [ ] bround + - [ ] cbrt + - [x] ceil + - [x] ceiling + - [ ] conv + - [x] cos + - [ ] cosh + - [ ] cot + - [ ] csc + - [ ] degrees + - [ ] div + - [ ] e + - [x] exp + - [ ] expm1 + - [ ] factorial + - [x] floor + - [ ] greatest + - [ ] hex + - [ ] hypot + - [ ] least + - [x] ln + - [ ]
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
andygrove commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1619211189 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,475 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [x] any + - [x] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [x] avg + - [x] bit_and + - [x] bit_or + - [x] bit_xor + - [x] bool_and + - [x] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [x] count + - [x] count_if + - [ ] count_min_sketch + - [x] covar_pop + - [x] covar_samp + - [x] every + - [x] first + - [x] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [x] last + - [x] last_value + - [x] max + - [ ] max_by + - [x] mean + - [ ] median + - [x] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [x] regr_avgx + - [x] regr_avgy + - [x] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [x] some + - [x] std + - [x] stddev + - [x] stddev_pop + - [x] stddev_samp + - [x] sum + - [ ] try_avg + - [ ] try_sum + - [x] var_pop + - [x] var_samp + - [x] variance + +### array_funcs + - [ ] array + - [ ] array_append + - [ ] array_compact + - [ ] array_contains + - [ ] array_distinct + - [ ] array_except + - [ ] array_insert + - [ ] array_intersect + - [ ] array_join + - [ ] array_max + - [ ] array_min + - [ ] array_position + - [ ] array_remove + - [ ] array_repeat + - [ ] array_union + - [ ] arrays_overlap + - [ ] arrays_zip + - [ ] flatten + - [ ] get + - [ ] sequence + - [ ] shuffle + - [ ] slice + - [ ] sort_array + +### bitwise_funcs + - [x] & + - [x] ^ + - [ ] bit_count + - [ ] bit_get + - [ ] getbit + - [x] shiftright + - [ ] shiftrightunsigned + - [x] | + - [x] ~ + +### collection_funcs + - [ ] array_size + - [ ] cardinality + - [ ] concat + - [x] reverse + - [ ] size + +### conditional_funcs + - [x] coalesce + 
- [x] if + - [x] ifnull + - [ ] nanvl + - [x] nullif + - [x] nvl + - [x] nvl2 + - [ ] when + +### conversion_funcs + - [ ] bigint + - [ ] binary + - [ ] boolean + - [ ] cast + - [ ] date + - [ ] decimal + - [ ] double + - [ ] float + - [ ] int + - [ ] smallint + - [ ] string + - [ ] timestamp + - [ ] tinyint + +### csv_funcs + - [ ] from_csv + - [ ] schema_of_csv + - [ ] to_csv + +### datetime_funcs + - [ ] add_months + - [ ] convert_timezone + - [x] curdate + - [x] current_date + - [ ] current_timestamp + - [x] current_timezone + - [ ] date_add + - [ ] date_diff + - [ ] date_format + - [ ] date_from_unix_date + - [x] date_part + - [ ] date_sub + - [ ] date_trunc + - [ ] dateadd + - [ ] datediff + - [x] datepart + - [ ] day + - [ ] dayofmonth + - [ ] dayofweek + - [ ] dayofyear + - [x] extract + - [ ] from_unixtime + - [ ] from_utc_timestamp + - [ ] hour + - [ ] last_day + - [ ] localtimestamp + - [ ] make_date + - [ ] make_dt_interval + - [ ] make_interval + - [ ] make_timestamp + - [ ] make_timestamp_ltz + - [ ] make_timestamp_ntz + - [ ] make_ym_interval + - [ ] minute + - [ ] month + - [ ] months_between + - [ ] next_day + - [ ] now + - [ ] quarter + - [ ] second + - [ ] timestamp_micros + - [ ] timestamp_millis + - [ ] timestamp_seconds + - [ ] to_date + - [ ] to_timestamp + - [ ] to_timestamp_ltz + - [ ] to_timestamp_ntz + - [ ] to_unix_timestamp + - [ ] to_utc_timestamp + - [ ] trunc + - [ ] try_to_timestamp + - [ ] unix_date + - [ ] unix_micros + - [ ] unix_millis + - [ ] unix_seconds + - [ ] unix_timestamp + - [ ] weekday + - [ ] weekofyear + - [ ] year + +### generator_funcs + - [ ] explode + - [ ] explode_outer + - [ ] inline + - [ ] inline_outer + - [ ] posexplode + - [ ] posexplode_outer + - [ ] stack + +### hash_funcs + - [ ] crc32 + - [ ] hash + - [x] md5 + - [ ] sha + - [ ] sha1 + - [ ] sha2 + - [ ] xxhash64 + +### json_funcs + - [ ] from_json + - [ ] get_json_object + - [ ] json_array_length + - [ ] json_object_keys + - [ ] json_tuple + - [ ] 
schema_of_json + - [ ] to_json + +### lambda_funcs + - [ ] aggregate + - [ ] array_sort + - [ ] exists + - [ ] filter + - [ ] forall + - [ ] map_filter + - [ ] map_zip_with + - [ ] reduce + - [ ] transform + - [ ] transform_keys + - [ ] transform_values + - [ ] zip_with + +### map_funcs + - [ ] element_at + - [ ] map + - [ ] map_concat + - [ ] map_contains_key + - [ ] map_entries + - [ ] map_from_arrays + - [ ] map_from_entries + - [ ] map_keys + - [ ] map_values + - [ ] str_to_map + - [ ] try_element_at + +### math_funcs + - [x] % + - [x] * + - [x] + + - [x] - + - [x] / + - [x] abs + - [x] acos + - [ ] acosh + - [x] asin + - [ ] asinh + - [x] atan + - [x] atan2 + - [ ] atanh + - [ ] bin + - [ ] bround + - [ ] cbrt + - [x] ceil + - [x] ceiling + - [ ] conv + - [x] cos + - [ ] cosh + - [ ] cot + - [ ] csc + - [ ] degrees + - [ ] div + - [ ] e + - [x] exp + - [ ] expm1 + - [ ] factorial + - [x] floor + - [ ] greatest + - [ ] hex + - [ ] hypot + - [ ] least + - [x] ln + - [
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
andygrove commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1619210130 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,475 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [x] any + - [x] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [x] avg + - [x] bit_and + - [x] bit_or + - [x] bit_xor + - [x] bool_and + - [x] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [x] count + - [x] count_if + - [ ] count_min_sketch + - [x] covar_pop + - [x] covar_samp + - [x] every + - [x] first + - [x] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [x] last + - [x] last_value + - [x] max + - [ ] max_by + - [x] mean + - [ ] median + - [x] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [x] regr_avgx + - [x] regr_avgy + - [x] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [x] some + - [x] std + - [x] stddev + - [x] stddev_pop + - [x] stddev_samp + - [x] sum + - [ ] try_avg + - [ ] try_sum + - [x] var_pop + - [x] var_samp + - [x] variance + +### array_funcs + - [ ] array + - [ ] array_append + - [ ] array_compact + - [ ] array_contains + - [ ] array_distinct + - [ ] array_except + - [ ] array_insert + - [ ] array_intersect + - [ ] array_join + - [ ] array_max + - [ ] array_min + - [ ] array_position + - [ ] array_remove + - [ ] array_repeat + - [ ] array_union + - [ ] arrays_overlap + - [ ] arrays_zip + - [ ] flatten + - [ ] get + - [ ] sequence + - [ ] shuffle + - [ ] slice + - [ ] sort_array + +### bitwise_funcs + - [x] & + - [x] ^ + - [ ] bit_count + - [ ] bit_get + - [ ] getbit + - [x] shiftright + - [ ] shiftrightunsigned + - [x] | + - [x] ~ + +### collection_funcs + - [ ] array_size + - [ ] cardinality + - [ ] concat + - [x] reverse + - [ ] size + +### conditional_funcs + - [x] coalesce + 
- [x] if + - [x] ifnull + - [ ] nanvl + - [x] nullif + - [x] nvl + - [x] nvl2 + - [ ] when + +### conversion_funcs + - [ ] bigint + - [ ] binary + - [ ] boolean + - [ ] cast + - [ ] date + - [ ] decimal + - [ ] double + - [ ] float + - [ ] int + - [ ] smallint + - [ ] string + - [ ] timestamp + - [ ] tinyint + +### csv_funcs + - [ ] from_csv + - [ ] schema_of_csv + - [ ] to_csv + +### datetime_funcs + - [ ] add_months + - [ ] convert_timezone + - [x] curdate + - [x] current_date + - [ ] current_timestamp + - [x] current_timezone + - [ ] date_add + - [ ] date_diff + - [ ] date_format + - [ ] date_from_unix_date + - [x] date_part + - [ ] date_sub + - [ ] date_trunc + - [ ] dateadd + - [ ] datediff + - [x] datepart + - [ ] day + - [ ] dayofmonth + - [ ] dayofweek + - [ ] dayofyear + - [x] extract + - [ ] from_unixtime + - [ ] from_utc_timestamp + - [ ] hour + - [ ] last_day + - [ ] localtimestamp + - [ ] make_date + - [ ] make_dt_interval + - [ ] make_interval + - [ ] make_timestamp + - [ ] make_timestamp_ltz + - [ ] make_timestamp_ntz + - [ ] make_ym_interval + - [ ] minute + - [ ] month + - [ ] months_between + - [ ] next_day + - [ ] now + - [ ] quarter + - [ ] second + - [ ] timestamp_micros + - [ ] timestamp_millis + - [ ] timestamp_seconds + - [ ] to_date + - [ ] to_timestamp + - [ ] to_timestamp_ltz + - [ ] to_timestamp_ntz + - [ ] to_unix_timestamp + - [ ] to_utc_timestamp + - [ ] trunc + - [ ] try_to_timestamp + - [ ] unix_date + - [ ] unix_micros + - [ ] unix_millis + - [ ] unix_seconds + - [ ] unix_timestamp + - [ ] weekday + - [ ] weekofyear + - [ ] year + +### generator_funcs + - [ ] explode + - [ ] explode_outer + - [ ] inline + - [ ] inline_outer + - [ ] posexplode + - [ ] posexplode_outer + - [ ] stack + +### hash_funcs + - [ ] crc32 + - [ ] hash + - [x] md5 + - [ ] sha + - [ ] sha1 + - [ ] sha2 + - [ ] xxhash64 + +### json_funcs + - [ ] from_json + - [ ] get_json_object + - [ ] json_array_length + - [ ] json_object_keys + - [ ] json_tuple + - [ ] 
schema_of_json + - [ ] to_json + +### lambda_funcs + - [ ] aggregate + - [ ] array_sort + - [ ] exists + - [ ] filter + - [ ] forall + - [ ] map_filter + - [ ] map_zip_with + - [ ] reduce + - [ ] transform + - [ ] transform_keys + - [ ] transform_values + - [ ] zip_with + +### map_funcs + - [ ] element_at + - [ ] map + - [ ] map_concat + - [ ] map_contains_key + - [ ] map_entries + - [ ] map_from_arrays + - [ ] map_from_entries + - [ ] map_keys + - [ ] map_values + - [ ] str_to_map + - [ ] try_element_at + +### math_funcs + - [x] % + - [x] * + - [x] + + - [x] - + - [x] / + - [x] abs + - [x] acos + - [ ] acosh + - [x] asin + - [ ] asinh + - [x] atan + - [x] atan2 + - [ ] atanh + - [ ] bin + - [ ] bround + - [ ] cbrt + - [x] ceil + - [x] ceiling + - [ ] conv + - [x] cos + - [ ] cosh + - [ ] cot + - [ ] csc + - [ ] degrees + - [ ] div + - [ ] e + - [x] exp + - [ ] expm1 + - [ ] factorial + - [x] floor + - [ ] greatest + - [ ] hex + - [ ] hypot + - [ ] least + - [x] ln + - [
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
andygrove commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1619208014 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,475 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [x] any + - [x] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [x] avg + - [x] bit_and + - [x] bit_or + - [x] bit_xor + - [x] bool_and + - [x] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [x] count + - [x] count_if + - [ ] count_min_sketch + - [x] covar_pop + - [x] covar_samp + - [x] every + - [x] first + - [x] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [x] last + - [x] last_value + - [x] max + - [ ] max_by + - [x] mean + - [ ] median + - [x] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [x] regr_avgx + - [x] regr_avgy + - [x] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [x] some + - [x] std + - [x] stddev + - [x] stddev_pop + - [x] stddev_samp + - [x] sum + - [ ] try_avg + - [ ] try_sum + - [x] var_pop + - [x] var_samp + - [x] variance + +### array_funcs + - [ ] array + - [ ] array_append + - [ ] array_compact + - [ ] array_contains + - [ ] array_distinct + - [ ] array_except + - [ ] array_insert + - [ ] array_intersect + - [ ] array_join + - [ ] array_max + - [ ] array_min + - [ ] array_position + - [ ] array_remove + - [ ] array_repeat + - [ ] array_union + - [ ] arrays_overlap + - [ ] arrays_zip + - [ ] flatten + - [ ] get + - [ ] sequence + - [ ] shuffle + - [ ] slice + - [ ] sort_array + +### bitwise_funcs + - [x] & + - [x] ^ + - [ ] bit_count + - [ ] bit_get + - [ ] getbit + - [x] shiftright + - [ ] shiftrightunsigned + - [x] | + - [x] ~ + +### collection_funcs + - [ ] array_size + - [ ] cardinality + - [ ] concat + - [x] reverse + - [ ] size + +### conditional_funcs + - [x] coalesce + 
- [x] if + - [x] ifnull + - [ ] nanvl + - [x] nullif + - [x] nvl + - [x] nvl2 + - [ ] when + +### conversion_funcs + - [ ] bigint + - [ ] binary + - [ ] boolean + - [ ] cast + - [ ] date + - [ ] decimal + - [ ] double + - [ ] float + - [ ] int + - [ ] smallint + - [ ] string + - [ ] timestamp + - [ ] tinyint + +### csv_funcs + - [ ] from_csv + - [ ] schema_of_csv + - [ ] to_csv + +### datetime_funcs + - [ ] add_months + - [ ] convert_timezone + - [x] curdate + - [x] current_date + - [ ] current_timestamp + - [x] current_timezone + - [ ] date_add + - [ ] date_diff + - [ ] date_format + - [ ] date_from_unix_date + - [x] date_part + - [ ] date_sub + - [ ] date_trunc + - [ ] dateadd + - [ ] datediff + - [x] datepart + - [ ] day + - [ ] dayofmonth + - [ ] dayofweek + - [ ] dayofyear + - [x] extract + - [ ] from_unixtime + - [ ] from_utc_timestamp + - [ ] hour + - [ ] last_day + - [ ] localtimestamp + - [ ] make_date + - [ ] make_dt_interval + - [ ] make_interval + - [ ] make_timestamp + - [ ] make_timestamp_ltz + - [ ] make_timestamp_ntz + - [ ] make_ym_interval + - [ ] minute + - [ ] month + - [ ] months_between + - [ ] next_day + - [ ] now + - [ ] quarter + - [ ] second + - [ ] timestamp_micros + - [ ] timestamp_millis + - [ ] timestamp_seconds + - [ ] to_date + - [ ] to_timestamp + - [ ] to_timestamp_ltz + - [ ] to_timestamp_ntz + - [ ] to_unix_timestamp + - [ ] to_utc_timestamp + - [ ] trunc + - [ ] try_to_timestamp + - [ ] unix_date + - [ ] unix_micros + - [ ] unix_millis + - [ ] unix_seconds + - [ ] unix_timestamp + - [ ] weekday + - [ ] weekofyear + - [ ] year + +### generator_funcs + - [ ] explode + - [ ] explode_outer + - [ ] inline + - [ ] inline_outer + - [ ] posexplode + - [ ] posexplode_outer + - [ ] stack + +### hash_funcs + - [ ] crc32 + - [ ] hash + - [x] md5 + - [ ] sha + - [ ] sha1 + - [ ] sha2 + - [ ] xxhash64 + +### json_funcs + - [ ] from_json + - [ ] get_json_object + - [ ] json_array_length + - [ ] json_object_keys + - [ ] json_tuple + - [ ] 
schema_of_json + - [ ] to_json + +### lambda_funcs + - [ ] aggregate + - [ ] array_sort + - [ ] exists + - [ ] filter + - [ ] forall + - [ ] map_filter + - [ ] map_zip_with + - [ ] reduce + - [ ] transform + - [ ] transform_keys + - [ ] transform_values + - [ ] zip_with + +### map_funcs + - [ ] element_at + - [ ] map + - [ ] map_concat + - [ ] map_contains_key + - [ ] map_entries + - [ ] map_from_arrays + - [ ] map_from_entries + - [ ] map_keys + - [ ] map_values + - [ ] str_to_map + - [ ] try_element_at + +### math_funcs + - [x] % + - [x] * + - [x] + + - [x] - + - [x] / + - [x] abs + - [x] acos + - [ ] acosh + - [x] asin + - [ ] asinh + - [x] atan + - [x] atan2 + - [ ] atanh + - [ ] bin + - [ ] bround + - [ ] cbrt + - [x] ceil + - [x] ceiling + - [ ] conv + - [x] cos + - [ ] cosh + - [ ] cot + - [ ] csc + - [ ] degrees + - [ ] div + - [ ] e + - [x] exp + - [ ] expm1 + - [ ] factorial + - [x] floor + - [ ] greatest + - [ ] hex + - [ ] hypot + - [ ] least + - [x] ln + - [
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
andygrove commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1619197931 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,475 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [x] any + - [x] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [x] avg + - [x] bit_and + - [x] bit_or + - [x] bit_xor + - [x] bool_and + - [x] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [x] count + - [x] count_if + - [ ] count_min_sketch + - [x] covar_pop + - [x] covar_samp + - [x] every + - [x] first + - [x] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [x] last + - [x] last_value + - [x] max + - [ ] max_by + - [x] mean + - [ ] median + - [x] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [x] regr_avgx + - [x] regr_avgy + - [x] regr_count Review Comment: I don't think that we support these expressions
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1617434431 ## spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala: ## @@ -217,6 +325,25 @@ class CometExpressionCoverageSuite extends CometTestBase with AdaptiveSparkPlanH str shouldBe s"${getLicenseHeader()}\n# Supported Spark Expressions\n\n### group1\n - [x] f1\n - [ ] f2\n\n### group2\n - [x] f3\n - [ ] f4\n\n### group3\n - [x] f5" } + test("get sql function arguments") { +// getSqlFunctionArguments("SELECT unix_seconds(TIMESTAMP('1970-01-01 00:00:01Z'))") shouldBe Seq("TIMESTAMP('1970-01-01 00:00:01Z')") +// getSqlFunctionArguments("SELECT decode(unhex('537061726B2053514C'), 'UTF-8')") shouldBe Seq("unhex('537061726B2053514C')", "'UTF-8'") +// getSqlFunctionArguments("SELECT extract(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456')") shouldBe Seq("'YEAR'", "TIMESTAMP '2019-08-12 01:00:00.123456'") +// getSqlFunctionArguments("SELECT exists(array(1, 2, 3), x -> x % 2 == 0)") shouldBe Seq("array(1, 2, 3)") +getSqlFunctionArguments("select to_char(454, '999')") shouldBe Seq("array(1, 2, 3)") Review Comment: Correct, the test is ignored for now, which is why it passed. Annoying. Fixed that.
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
advancedxy commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1616589501 ## spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala: ## @@ -217,6 +325,25 @@ class CometExpressionCoverageSuite extends CometTestBase with AdaptiveSparkPlanH str shouldBe s"${getLicenseHeader()}\n# Supported Spark Expressions\n\n### group1\n - [x] f1\n - [ ] f2\n\n### group2\n - [x] f3\n - [ ] f4\n\n### group3\n - [x] f5" } + test("get sql function arguments") { +// getSqlFunctionArguments("SELECT unix_seconds(TIMESTAMP('1970-01-01 00:00:01Z'))") shouldBe Seq("TIMESTAMP('1970-01-01 00:00:01Z')") +// getSqlFunctionArguments("SELECT decode(unhex('537061726B2053514C'), 'UTF-8')") shouldBe Seq("unhex('537061726B2053514C')", "'UTF-8'") +// getSqlFunctionArguments("SELECT extract(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456')") shouldBe Seq("'YEAR'", "TIMESTAMP '2019-08-12 01:00:00.123456'") +// getSqlFunctionArguments("SELECT exists(array(1, 2, 3), x -> x % 2 == 0)") shouldBe Seq("array(1, 2, 3)") +getSqlFunctionArguments("select to_char(454, '999')") shouldBe Seq("array(1, 2, 3)") Review Comment: Hmm, I think it should be updated to ```scala getSqlFunctionArguments("select to_char(454, '999')") shouldBe Seq(454, "999") ``` ?
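To make the expected values in this exchange concrete, here is a minimal sketch of how the arguments of a SQL function call might be split at top-level commas. It is not the PR's `getSqlFunctionArguments`; the object `SqlArgs` and the method `topLevelArguments` are hypothetical names, and the sketch assumes plain single-quoted literals with no escaped quotes. It returns arguments as SQL text, the same shape as the commented-out expectations in the quoted test (for example `"'UTF-8'"` keeps its quotes).

```scala
object SqlArgs {
  // Hypothetical helper, not the PR's implementation: returns the arguments of
  // the outermost function call in `sql`, split at commas that are outside any
  // nested parentheses and outside single-quoted literals.
  def topLevelArguments(sql: String): Seq[String] = {
    val open = sql.indexOf('(')
    val close = sql.lastIndexOf(')')
    if (open < 0 || close <= open) {
      Seq.empty
    } else {
      val body = sql.substring(open + 1, close)
      val args = scala.collection.mutable.ArrayBuffer.empty[String]
      val current = new StringBuilder
      var depth = 0
      var inQuote = false
      body.foreach { ch =>
        if (ch == '\'') inQuote = !inQuote
        if (!inQuote) {
          if (ch == '(') depth += 1
          else if (ch == ')') depth -= 1
        }
        if (ch == ',' && depth == 0 && !inQuote) {
          // A top-level comma closes the current argument.
          args += current.toString.trim
          current.clear()
        } else {
          current.append(ch)
        }
      }
      val last = current.toString.trim
      if (last.nonEmpty) args += last
      args.toSeq
    }
  }
}
```

Forms such as `extract(YEAR FROM ...)` or lambda arguments are not comma-separated in this way and would need special handling, which is consistent with the thread's decision to cover hard-to-parse cases with manual tests.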
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
advancedxy commented on PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#issuecomment-2134246796
> Thanks @advancedxy, I fixed the flaws you mentioned. However, I'd like to do the refactoring you recommended in a follow-up PR; this PR is getting too large to review.

Of course, sounds good to me.
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#issuecomment-2134236220 Thanks @advancedxy, I fixed the flaws you mentioned. However, I'd like to do the refactoring you recommended in a follow-up PR; this PR is getting too large to review.
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1616497338 ## spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala: ## @@ -54,16 +57,79 @@ class CometExpressionCoverageSuite extends CometTestBase with AdaptiveSparkPlanH private val valuesPattern = """(?i)FROM VALUES(.+?);""".r private val selectPattern = """(i?)SELECT(.+?)FROM""".r + // exclude funcs Comet has no plans to support streaming in near future + // like spark streaming functions, java calls + private val outofRoadmapFuncs = +List("window", "session_window", "window_time", "java_method", "reflect") + private val sqlConf = Seq( +"spark.comet.exec.shuffle.enabled" -> "true", +"spark.sql.optimizer.excludedRules" -> "org.apache.spark.sql.catalyst.optimizer.ConstantFolding", +"spark.sql.adaptive.optimizer.excludedRules" -> "org.apache.spark.sql.catalyst.optimizer.ConstantFolding") + + // Tests to run manually as its syntax is different from usual or nested + val manualTests: Map[String, (String, String)] = Map( +"!" -> ("select true a", "select ! true from tbl"), +"%" -> ("select 1 a, 2 b", "select a + b from tbl"), Review Comment: Corrected
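A side note on the two patterns in the quoted hunk, which the thread itself does not discuss: `valuesPattern` starts with `(?i)`, the inline case-insensitive flag, while `selectPattern` starts with `(i?)`, which is a capturing group matching an optional literal `i`. Whether that difference is intentional is not stated. The sketch below only demonstrates the behavioural difference; the object `PatternFlags` and the helper `selectList` are hypothetical names.

```scala
import scala.util.matching.Regex

object PatternFlags {
  // Written as quoted in the hunk: "(i?)" is a group for an optional literal 'i',
  // so the match on SELECT and FROM stays case-sensitive.
  val asQuoted: Regex = """(i?)SELECT(.+?)FROM""".r

  // With the inline flag "(?i)", in the same style as valuesPattern.
  val withFlag: Regex = """(?i)SELECT(.+?)FROM""".r

  // Returns the text between SELECT and FROM (always the last capturing group).
  def selectList(pattern: Regex, sql: String): Option[String] =
    pattern.findFirstMatchIn(sql).map(m => m.group(m.groupCount).trim)
}
```

With these definitions, a lower-case query such as `select a + b from tbl` is only matched by `withFlag`; upper-case queries are matched by both.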
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1616497081 ## spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala: ## @@ -217,6 +325,25 @@ class CometExpressionCoverageSuite extends CometTestBase with AdaptiveSparkPlanH str shouldBe s"${getLicenseHeader()}\n# Supported Spark Expressions\n\n### group1\n - [x] f1\n - [ ] f2\n\n### group2\n - [x] f3\n - [ ] f4\n\n### group3\n - [x] f5" } + test("get sql function arguments") { +// getSqlFunctionArguments("SELECT unix_seconds(TIMESTAMP('1970-01-01 00:00:01Z'))") shouldBe Seq("TIMESTAMP('1970-01-01 00:00:01Z')") +// getSqlFunctionArguments("SELECT decode(unhex('537061726B2053514C'), 'UTF-8')") shouldBe Seq("unhex('537061726B2053514C')", "'UTF-8'") +// getSqlFunctionArguments("SELECT extract(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456')") shouldBe Seq("'YEAR'", "TIMESTAMP '2019-08-12 01:00:00.123456'") +// getSqlFunctionArguments("SELECT exists(array(1, 2, 3), x -> x % 2 == 0)") shouldBe Seq("array(1, 2, 3)") +getSqlFunctionArguments("select to_char(454, '999')") shouldBe Seq("array(1, 2, 3)") Review Comment: Oops, uncommented
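The first assertion in the quoted hunk shows the shape of the generated Markdown: a title, then one `###` section per function group with `[x]` / `[ ]` checklist items. As a rough illustration of that layout only, and not the PR's generator, a renderer could look like the sketch below. `SupportListRenderer` and `render` are hypothetical names, and the license header that the real output starts with is left out.

```scala
object SupportListRenderer {
  // groups: ordered (group name, ordered (function name, supported?)) pairs.
  // Hypothetical sketch of the checklist layout seen in the quoted assertion.
  def render(groups: Seq[(String, Seq[(String, Boolean)])]): String = {
    val sections = groups.map { case (group, funcs) =>
      val items = funcs.map { case (name, supported) =>
        val mark = if (supported) "x" else " "
        s" - [$mark] $name"
      }
      // Heading followed by one checklist line per function.
      (s"### $group" +: items).mkString("\n")
    }
    // Title and sections are separated by blank lines.
    ("# Supported Spark Expressions" +: sections).mkString("\n\n")
  }
}
```

Calling `render` with the `group1` to `group3` data from the assertion yields the same text as the expected string there, minus the license header.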
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
advancedxy commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1614532699 ## spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala: ## @@ -217,6 +325,25 @@ class CometExpressionCoverageSuite extends CometTestBase with AdaptiveSparkPlanH str shouldBe s"${getLicenseHeader()}\n# Supported Spark Expressions\n\n### group1\n - [x] f1\n - [ ] f2\n\n### group2\n - [x] f3\n - [ ] f4\n\n### group3\n - [x] f5" } + test("get sql function arguments") { +// getSqlFunctionArguments("SELECT unix_seconds(TIMESTAMP('1970-01-01 00:00:01Z'))") shouldBe Seq("TIMESTAMP('1970-01-01 00:00:01Z')") +// getSqlFunctionArguments("SELECT decode(unhex('537061726B2053514C'), 'UTF-8')") shouldBe Seq("unhex('537061726B2053514C')", "'UTF-8'") +// getSqlFunctionArguments("SELECT extract(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456')") shouldBe Seq("'YEAR'", "TIMESTAMP '2019-08-12 01:00:00.123456'") +// getSqlFunctionArguments("SELECT exists(array(1, 2, 3), x -> x % 2 == 0)") shouldBe Seq("array(1, 2, 3)") +getSqlFunctionArguments("select to_char(454, '999')") shouldBe Seq("array(1, 2, 3)") Review Comment: Is this test wrong? The arguments are not correct. 
## spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala: ## @@ -54,16 +57,79 @@ class CometExpressionCoverageSuite extends CometTestBase with AdaptiveSparkPlanH private val valuesPattern = """(?i)FROM VALUES(.+?);""".r private val selectPattern = """(i?)SELECT(.+?)FROM""".r + // exclude funcs Comet has no plans to support streaming in near future + // like spark streaming functions, java calls + private val outofRoadmapFuncs = +List("window", "session_window", "window_time", "java_method", "reflect") + private val sqlConf = Seq( +"spark.comet.exec.shuffle.enabled" -> "true", +"spark.sql.optimizer.excludedRules" -> "org.apache.spark.sql.catalyst.optimizer.ConstantFolding", +"spark.sql.adaptive.optimizer.excludedRules" -> "org.apache.spark.sql.catalyst.optimizer.ConstantFolding") + + // Tests to run manually as its syntax is different from usual or nested + val manualTests: Map[String, (String, String)] = Map( +"!" -> ("select true a", "select ! true from tbl"), +"%" -> ("select 1 a, 2 b", "select a + b from tbl"), Review Comment: The mapped query should be `select a % b from tbl`? 
## spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala: ## @@ -54,16 +57,79 @@ class CometExpressionCoverageSuite extends CometTestBase with AdaptiveSparkPlanH private val valuesPattern = """(?i)FROM VALUES(.+?);""".r private val selectPattern = """(i?)SELECT(.+?)FROM""".r + // exclude funcs Comet has no plans to support streaming in near future + // like spark streaming functions, java calls + private val outofRoadmapFuncs = +List("window", "session_window", "window_time", "java_method", "reflect") + private val sqlConf = Seq( +"spark.comet.exec.shuffle.enabled" -> "true", +"spark.sql.optimizer.excludedRules" -> "org.apache.spark.sql.catalyst.optimizer.ConstantFolding", +"spark.sql.adaptive.optimizer.excludedRules" -> "org.apache.spark.sql.catalyst.optimizer.ConstantFolding") + + // Tests to run manually as its syntax is different from usual or nested + val manualTests: Map[String, (String, String)] = Map( +"!" -> ("select true a", "select ! true from tbl"), +"%" -> ("select 1 a, 2 b", "select a + b from tbl"), Review Comment: Or maybe you can just generate the binary operators and their mappings in a programmatic way? Such as: ```scala Seq("%", "&", ..., "|").map(x => x -> ("select 1 a, 2 b", s"select a $x b from tbl")) ``` ## spark/src/test/scala/org/apache/comet/CometExpressionCoverageSuite.scala: ## @@ -116,20 +182,62 @@ class CometExpressionCoverageSuite extends CometTestBase with AdaptiveSparkPlanH // ConstantFolding is a operator optimization rule in Catalyst that replaces expressions // that can be statically evaluated with their equivalent literal values. 
dfMessage = runDatafusionCli(q) - testSingleLineQuery( -"select 'dummy' x", -s"${q.dropRight(1)}, x from tbl", -excludedOptimizerRules = - Some("org.apache.spark.sql.catalyst.optimizer.ConstantFolding")) + + manualTests.get(func.name) match { +// the test is manual query +case Some(test) => testSingleLineQuery(test._1, test._2, sqlConf = sqlConf) +case None => + // extract function arguments as a sql text + // example: + // cos(0) -> 0 + // explode_outer(array(10, 20)) -> array(10, 20) + val args = getSqlFunctionArguments(q.dropRight(1)) + val
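The suggestion in this message, generating the binary-operator entries of `manualTests` instead of writing each one by hand, might be sketched as follows. The operator list here is illustrative only (the operators elided with `...` in the review comment are not known from the thread), and `BinaryOperatorTests` is a hypothetical name.

```scala
object BinaryOperatorTests {
  // Illustrative list; the thread elides the full set of operators.
  val operators: Seq[String] = Seq("%", "&", "*", "+", "-", "/", "^", "|")

  // operator -> (query that builds the input table, query under test)
  val generated: Map[String, (String, String)] =
    operators.map(op => op -> (("select 1 a, 2 b", s"select a $op b from tbl"))).toMap
}
```

Because every entry is derived from its own key, a copy-paste slip such as `"%"` being mapped to `select a + b from tbl` cannot occur. Unary operators like `!` would still need hand-written entries.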
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#issuecomment-2130409303 @andygrove @advancedxy I fixed the test by implementing extra parsing, plus small manual tests for cases where the parsing is complicated. I hope we have a better picture now.
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1612223812 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,477 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [ ] any + - [ ] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [ ] avg + - [ ] bit_and + - [ ] bit_or + - [ ] bit_xor + - [ ] bool_and + - [ ] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [ ] count + - [ ] count_if + - [ ] count_min_sketch + - [ ] covar_pop + - [ ] covar_samp + - [ ] every + - [ ] first + - [ ] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [ ] last + - [ ] last_value + - [ ] max + - [ ] max_by + - [ ] mean + - [ ] median + - [ ] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [ ] regr_avgx + - [ ] regr_avgy + - [ ] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [ ] some + - [ ] std + - [ ] stddev + - [ ] stddev_pop + - [ ] stddev_samp + - [ ] sum + - [ ] try_avg + - [ ] try_sum + - [ ] var_pop + - [ ] var_samp + - [ ] variance + +### array_funcs + - [ ] array + - [ ] array_append + - [ ] array_compact + - [ ] array_contains + - [ ] array_distinct + - [ ] array_except + - [ ] array_insert + - [ ] array_intersect + - [ ] array_join + - [ ] array_max + - [ ] array_min + - [ ] array_position + - [ ] array_remove + - [ ] array_repeat + - [ ] array_union + - [ ] arrays_overlap + - [ ] arrays_zip + - [ ] flatten + - [x] get + - [ ] sequence + - [ ] shuffle + - [ ] slice + - [ ] sort_array + +### bitwise_funcs + - [x] & + - [x] ^ + - [ ] bit_count + - [ ] bit_get + - [ ] getbit + - [x] shiftright + - [ ] shiftrightunsigned + - [x] | + - [ ] ~ + +### collection_funcs + - [ ] array_size + - [ ] cardinality + - [ ] concat + - [x] reverse + - [ ] size + +### conditional_funcs + - [x] coalesce + - 
[x] if + - [ ] ifnull + - [ ] nanvl + - [x] nullif + - [ ] nvl + - [x] nvl2 + - [x] when + +### conversion_funcs + - [ ] bigint + - [ ] binary + - [ ] boolean + - [x] cast + - [ ] date + - [ ] decimal + - [ ] double + - [ ] float + - [ ] int + - [ ] smallint + - [ ] string + - [ ] timestamp + - [ ] tinyint + +### csv_funcs + - [ ] from_csv + - [ ] schema_of_csv + - [ ] to_csv + +### datetime_funcs + - [ ] add_months + - [ ] convert_timezone + - [x] curdate + - [x] current_date + - [ ] current_timestamp + - [x] current_timezone + - [ ] date_add + - [ ] date_diff + - [ ] date_format + - [ ] date_from_unix_date + - [x] date_part + - [ ] date_sub + - [ ] date_trunc + - [ ] dateadd + - [ ] datediff + - [x] datepart + - [ ] day + - [ ] dayofmonth + - [ ] dayofweek + - [ ] dayofyear + - [x] extract + - [ ] from_unixtime + - [ ] from_utc_timestamp + - [ ] hour + - [ ] last_day + - [ ] localtimestamp + - [ ] make_date + - [ ] make_dt_interval + - [ ] make_interval + - [ ] make_timestamp + - [ ] make_timestamp_ltz + - [ ] make_timestamp_ntz + - [ ] make_ym_interval + - [ ] minute + - [ ] month + - [ ] months_between + - [ ] next_day + - [ ] now + - [ ] quarter + - [ ] second + - [ ] timestamp_micros + - [ ] timestamp_millis + - [ ] timestamp_seconds + - [x] to_date Review Comment: I think the problem here is that when Spark evaluates a function of a literal, it skips Comet. The test above exercises the function on a column with Comet enabled. I'm thinking about how to fix it.
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1612087281 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,477 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [ ] any + - [ ] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [ ] avg + - [ ] bit_and + - [ ] bit_or + - [ ] bit_xor + - [ ] bool_and + - [ ] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [ ] count + - [ ] count_if + - [ ] count_min_sketch + - [ ] covar_pop + - [ ] covar_samp + - [ ] every + - [ ] first + - [ ] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [ ] last + - [ ] last_value + - [ ] max + - [ ] max_by + - [ ] mean + - [ ] median + - [ ] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [ ] regr_avgx + - [ ] regr_avgy + - [ ] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [ ] some + - [ ] std + - [ ] stddev + - [ ] stddev_pop + - [ ] stddev_samp + - [ ] sum + - [ ] try_avg + - [ ] try_sum + - [ ] var_pop + - [ ] var_samp + - [ ] variance + +### array_funcs + - [ ] array + - [ ] array_append + - [ ] array_compact + - [ ] array_contains + - [ ] array_distinct + - [ ] array_except + - [ ] array_insert + - [ ] array_intersect + - [ ] array_join + - [ ] array_max + - [ ] array_min + - [ ] array_position + - [ ] array_remove + - [ ] array_repeat + - [ ] array_union + - [ ] arrays_overlap + - [ ] arrays_zip + - [ ] flatten + - [x] get + - [ ] sequence + - [ ] shuffle + - [ ] slice + - [ ] sort_array + +### bitwise_funcs + - [x] & + - [x] ^ + - [ ] bit_count + - [ ] bit_get + - [ ] getbit + - [x] shiftright + - [ ] shiftrightunsigned + - [x] | + - [ ] ~ + +### collection_funcs + - [ ] array_size + - [ ] cardinality + - [ ] concat + - [x] reverse + - [ ] size + +### conditional_funcs + - [x] coalesce + - 
[x] if + - [ ] ifnull + - [ ] nanvl + - [x] nullif + - [ ] nvl + - [x] nvl2 + - [x] when + +### conversion_funcs + - [ ] bigint + - [ ] binary + - [ ] boolean + - [x] cast + - [ ] date + - [ ] decimal + - [ ] double + - [ ] float + - [ ] int + - [ ] smallint + - [ ] string + - [ ] timestamp + - [ ] tinyint + +### csv_funcs + - [ ] from_csv + - [ ] schema_of_csv + - [ ] to_csv + +### datetime_funcs + - [ ] add_months + - [ ] convert_timezone + - [x] curdate + - [x] current_date + - [ ] current_timestamp + - [x] current_timezone + - [ ] date_add + - [ ] date_diff + - [ ] date_format + - [ ] date_from_unix_date + - [x] date_part + - [ ] date_sub + - [ ] date_trunc + - [ ] dateadd + - [ ] datediff + - [x] datepart + - [ ] day + - [ ] dayofmonth + - [ ] dayofweek + - [ ] dayofyear + - [x] extract + - [ ] from_unixtime + - [ ] from_utc_timestamp + - [ ] hour + - [ ] last_day + - [ ] localtimestamp + - [ ] make_date + - [ ] make_dt_interval + - [ ] make_interval + - [ ] make_timestamp + - [ ] make_timestamp_ltz + - [ ] make_timestamp_ntz + - [ ] make_ym_interval + - [ ] minute + - [ ] month + - [ ] months_between + - [ ] next_day + - [ ] now + - [ ] quarter + - [ ] second + - [ ] timestamp_micros + - [ ] timestamp_millis + - [ ] timestamp_seconds + - [x] to_date Review Comment: I'm checking that. I just ran the test manually and it failed, as you mentioned:
```scala
test("to_date") {
  Seq(false, true).foreach { dictionary =>
    withSQLConf(
      "parquet.enable.dictionary" -> dictionary.toString,
      "spark.comet.exec.shuffle.enabled" -> "true",
    ) {
      val table = "test"
      withTable(table) {
        sql(s"create table $table(col string) using parquet")
        sql(s"insert into $table VALUES ('2009-07-30 04:17:52')")
        checkSparkAnswerAndOperator(s"SELECT to_date(col) FROM $table")
      }
    }
  }
}
```
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
andygrove commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1612070732 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,477 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [ ] any + - [ ] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [ ] avg + - [ ] bit_and + - [ ] bit_or + - [ ] bit_xor + - [ ] bool_and + - [ ] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [ ] count + - [ ] count_if + - [ ] count_min_sketch + - [ ] covar_pop + - [ ] covar_samp + - [ ] every + - [ ] first + - [ ] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [ ] last + - [ ] last_value + - [ ] max + - [ ] max_by + - [ ] mean + - [ ] median + - [ ] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [ ] regr_avgx + - [ ] regr_avgy + - [ ] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [ ] some + - [ ] std + - [ ] stddev + - [ ] stddev_pop + - [ ] stddev_samp + - [ ] sum + - [ ] try_avg + - [ ] try_sum + - [ ] var_pop + - [ ] var_samp + - [ ] variance + +### array_funcs + - [ ] array + - [ ] array_append + - [ ] array_compact + - [ ] array_contains + - [ ] array_distinct + - [ ] array_except + - [ ] array_insert + - [ ] array_intersect + - [ ] array_join + - [ ] array_max + - [ ] array_min + - [ ] array_position + - [ ] array_remove + - [ ] array_repeat + - [ ] array_union + - [ ] arrays_overlap + - [ ] arrays_zip + - [ ] flatten + - [x] get + - [ ] sequence + - [ ] shuffle + - [ ] slice + - [ ] sort_array + +### bitwise_funcs + - [x] & + - [x] ^ + - [ ] bit_count + - [ ] bit_get + - [ ] getbit + - [x] shiftright + - [ ] shiftrightunsigned + - [x] | + - [ ] ~ + +### collection_funcs + - [ ] array_size + - [ ] cardinality + - [ ] concat + - [x] reverse + - [ ] size + +### conditional_funcs + - [x] coalesce + 
- [x] if + - [ ] ifnull + - [ ] nanvl + - [x] nullif + - [ ] nvl + - [x] nvl2 + - [x] when + +### conversion_funcs + - [ ] bigint + - [ ] binary + - [ ] boolean + - [x] cast + - [ ] date + - [ ] decimal + - [ ] double + - [ ] float + - [ ] int + - [ ] smallint + - [ ] string + - [ ] timestamp + - [ ] tinyint + +### csv_funcs + - [ ] from_csv + - [ ] schema_of_csv + - [ ] to_csv + +### datetime_funcs + - [ ] add_months + - [ ] convert_timezone + - [x] curdate + - [x] current_date + - [ ] current_timestamp + - [x] current_timezone + - [ ] date_add + - [ ] date_diff + - [ ] date_format + - [ ] date_from_unix_date + - [x] date_part + - [ ] date_sub + - [ ] date_trunc + - [ ] dateadd + - [ ] datediff + - [x] datepart + - [ ] day + - [ ] dayofmonth + - [ ] dayofweek + - [ ] dayofyear + - [x] extract + - [ ] from_unixtime + - [ ] from_utc_timestamp + - [ ] hour + - [ ] last_day + - [ ] localtimestamp + - [ ] make_date + - [ ] make_dt_interval + - [ ] make_interval + - [ ] make_timestamp + - [ ] make_timestamp_ltz + - [ ] make_timestamp_ntz + - [ ] make_ym_interval + - [ ] minute + - [ ] month + - [ ] months_between + - [ ] next_day + - [ ] now + - [ ] quarter + - [ ] second + - [ ] timestamp_micros + - [ ] timestamp_millis + - [ ] timestamp_seconds + - [x] to_date Review Comment: With Spark 3.4, `to_date` with no format arg translates to `cast(expr as date)`, which we do not currently support (but will soon; there is a PR pending), so Comet cannot run it natively and reports `Unsupported cast from StringType to DateType`. When a format arg is supplied, Comet cannot run it natively because `gettimestamp is not supported`. Do you know why this doc says that it is supported?
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
advancedxy commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1609204351 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,477 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [ ] any + - [ ] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [ ] avg + - [ ] bit_and + - [ ] bit_or + - [ ] bit_xor + - [ ] bool_and + - [ ] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [ ] count + - [ ] count_if + - [ ] count_min_sketch + - [ ] covar_pop + - [ ] covar_samp + - [ ] every + - [ ] first + - [ ] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [ ] last + - [ ] last_value + - [ ] max + - [ ] max_by + - [ ] mean + - [ ] median + - [ ] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [ ] regr_avgx + - [ ] regr_avgy + - [ ] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [ ] some + - [ ] std + - [ ] stddev + - [ ] stddev_pop + - [ ] stddev_samp Review Comment: Yeah.. Maybe we need to enable Comet Shuffle to re-run the CometExpressionCoverageSuite. 
## docs/spark_expressions_support.md: ## @@ -0,0 +1,477 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [ ] any + - [ ] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [ ] avg + - [ ] bit_and + - [ ] bit_or + - [ ] bit_xor + - [ ] bool_and + - [ ] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [ ] count + - [ ] count_if + - [ ] count_min_sketch + - [ ] covar_pop + - [ ] covar_samp + - [ ] every + - [ ] first + - [ ] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [ ] last + - [ ] last_value + - [ ] max + - [ ] max_by + - [ ] mean + - [ ] median + - [ ] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [ ] regr_avgx + - [ ] regr_avgy + - [ ] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [ ] some + - [ ] std + - [ ] stddev + - [ ] stddev_pop + - [ ] stddev_samp + - [ ] sum + - [ ] try_avg + - [ ] try_sum + - [ ] var_pop + - [ ] var_samp + - [ ] variance + +### array_funcs + - [ ] array + - [ ] array_append + - [ ] array_compact + - [ ] array_contains + - [ ] array_distinct + - [ ] array_except + - [ ] array_insert + - [ ] array_intersect + - [ ] array_join + - [ ] array_max + - [ ] array_min + - [ ] array_position + - [ ] array_remove + - [ ] array_repeat + - [ ] array_union + - [ ] arrays_overlap + - [ ] arrays_zip + - [ ] flatten + - [x] get + - [ ] sequence + - [ ] shuffle + - [ ] slice + - [ ] sort_array + +### bitwise_funcs + - [x] & + - [x] ^ + - [ ] bit_count + - [ ] bit_get + - [ ] getbit + - [x] shiftright + - [ ] shiftrightunsigned + - [x] | + - [ ] ~ + +### collection_funcs + - [ ] array_size + - [ ] cardinality + - [ ] concat + - [x] reverse + - [ ] size + +### conditional_funcs + - [x] coalesce + - [x] if + - [ ] ifnull + - [ ] nanvl + - [x] nullif + - [ ] nvl Review Comment: hmm, it should be supported? 
It's essentially the same as `coalesce`, which is replaced during the analysis phase. Maybe we should file an issue to track this kind of problem. ## docs/spark_expressions_support.md: ## @@ -0,0 +1,477 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [ ] any + - [ ] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [ ] avg + - [ ] bit_and + - [ ] bit_or + - [ ] bit_xor + - [ ] bool_and + - [ ] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [ ] count + - [ ] count_if + - [ ] count_min_sketch + - [ ] covar_pop + - [ ] covar_samp + - [ ] every + - [ ] first + - [ ] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [ ] last + - [ ] last_value + - [ ] max + - [ ] max_by + - [ ] mean + - [ ] median + - [ ] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [ ] regr_avgx + - [ ] regr_avgy + - [ ] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [ ] some + - [ ] std + - [ ] stddev + - [ ] stddev_pop + - [ ] stddev_samp + - [ ] sum + - [ ] try_avg + - [ ] try_sum + - [ ] var_pop + - [ ] var_samp + - [ ] variance + +### array_funcs + - [ ] array + - [ ] array_append + - [ ] array_compact + - [ ] array_contains + - [ ] array_distinct + - [ ] array_except + - [ ] array_insert + - [ ] array_intersect + - [ ] array_join + - [ ] array_max + - [ ] array_min + - [ ] array_position + - [ ] array_remove + - [ ] array_repeat + - [ ] array_union + - [ ] arrays_overlap + - [ ] arrays_zip + - [ ] flatten + - [x] get + - [ ] sequence + - [ ] shuffle + - [ ] slice
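The point about `nvl` being replaced by `coalesce` before execution can be illustrated with a toy expression tree. This is only a sketch under stated assumptions: `Expr`, `Col`, `Lit`, `Nvl`, `Coalesce` and `ReplaceNvl` are hypothetical names invented for the example and are not Spark's Catalyst classes or Comet code.

```scala
// Toy expression tree (hypothetical; not Spark's Catalyst classes).
sealed trait Expr
final case class Col(name: String) extends Expr
final case class Lit(value: Int) extends Expr
final case class Nvl(left: Expr, right: Expr) extends Expr
final case class Coalesce(args: List[Expr]) extends Expr

object ReplaceNvl {
  // Rewrite every nvl(a, b) into coalesce(a, b), recursing into arguments.
  def rewrite(e: Expr): Expr = e match {
    case Nvl(l, r)      => Coalesce(List(rewrite(l), rewrite(r)))
    case Coalesce(args) => Coalesce(args.map(rewrite))
    case other          => other
  }
}
```

If a rewrite like this runs before the native-support check, a check that handles `coalesce` would be expected to handle `nvl` as well, which is why the unchecked `nvl` entry looks suspicious.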
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1608634043 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,477 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [ ] any + - [ ] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [ ] avg + - [ ] bit_and + - [ ] bit_or + - [ ] bit_xor + - [ ] bool_and + - [ ] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [ ] count + - [ ] count_if + - [ ] count_min_sketch + - [ ] covar_pop + - [ ] covar_samp + - [ ] every + - [ ] first + - [ ] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [ ] last + - [ ] last_value + - [ ] max + - [ ] max_by + - [ ] mean + - [ ] median + - [ ] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [ ] regr_avgx + - [ ] regr_avgy + - [ ] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [ ] some + - [ ] std + - [ ] stddev + - [ ] stddev_pop + - [ ] stddev_samp Review Comment: good point
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
viirya commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1608605897 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,477 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [ ] any + - [ ] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [ ] avg + - [ ] bit_and + - [ ] bit_or + - [ ] bit_xor + - [ ] bool_and + - [ ] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [ ] count + - [ ] count_if + - [ ] count_min_sketch + - [ ] covar_pop + - [ ] covar_samp + - [ ] every + - [ ] first + - [ ] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [ ] last + - [ ] last_value + - [ ] max + - [ ] max_by + - [ ] mean + - [ ] median + - [ ] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [ ] regr_avgx + - [ ] regr_avgy + - [ ] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [ ] some + - [ ] std + - [ ] stddev + - [ ] stddev_pop + - [ ] stddev_samp Review Comment: Have you enabled Comet shuffle? The upper `HashAggregate` cannot be translated to `CometHashAggregate` because Comet shuffle is not enabled.
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1608598651 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,477 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [ ] any + - [ ] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [ ] avg + - [ ] bit_and + - [ ] bit_or + - [ ] bit_xor + - [ ] bool_and + - [ ] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [ ] count + - [ ] count_if + - [ ] count_min_sketch + - [ ] covar_pop + - [ ] covar_samp + - [ ] every + - [ ] first + - [ ] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [ ] last + - [ ] last_value + - [ ] max + - [ ] max_by + - [ ] mean + - [ ] median + - [ ] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [ ] regr_avgx + - [ ] regr_avgy + - [ ] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [ ] some + - [ ] std + - [ ] stddev + - [ ] stddev_pop + - [ ] stddev_samp Review Comment: @viirya @andygrove @kazuyukitanimura @advancedxy do you guys think this is a sign that the expression is not natively supported?
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1608597228 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,477 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [ ] any + - [ ] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [ ] avg + - [ ] bit_and + - [ ] bit_or + - [ ] bit_xor + - [ ] bool_and + - [ ] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [ ] count + - [ ] count_if + - [ ] count_min_sketch + - [ ] covar_pop + - [ ] covar_samp + - [ ] every + - [ ] first + - [ ] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [ ] last + - [ ] last_value + - [ ] max + - [ ] max_by + - [ ] mean + - [ ] median + - [ ] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [ ] regr_avgx + - [ ] regr_avgy + - [ ] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [ ] some + - [ ] std + - [ ] stddev + - [ ] stddev_pop + - [ ] stddev_samp Review Comment: I ran the test manually
```
test("stddev") {
  Seq(false, true).foreach { dictionary =>
    withSQLConf("parquet.enable.dictionary" -> dictionary.toString) {
      val table = "test"
      withTable(table) {
        sql(s"create table $table(col int) using parquet")
        sql(s"insert into $table VALUES (1), (2), (3)")
        checkSparkAnswerAndOperator(s"SELECT stddev_pop(col) FROM $table")
      }
    }
  }
}
```
and it fails with `Expected only Comet native operators, but found HashAggregate.` The physical plan is
```
== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- HashAggregate(keys=[], functions=[stddev_pop(cast(col#0 as double))])
   +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=118]
      +- CometHashAggregate [col#0], Partial, [partial_stddev_pop(cast(col#0 as double))]
         +- CometScan parquet [col#0] Batched: true, DataFilters: [], Format: CometParquet, Location: InMemoryFileIndex(1 paths)[file:/tmp/stddev], PartitionFilters: [], PushedFilters: [], ReadSchema: struct
```
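To make the failure above concrete, here is a minimal, self-contained sketch of what an "only Comet native operators" check amounts to on the quoted plan. It is a toy model, not Comet's actual test helper: `PlanNode` and `NativeOperatorCheck` are hypothetical names, operators are reduced to their names, and treating `AdaptiveSparkPlan` as an ignorable wrapper is an assumption made for the example.

```scala
// Toy physical plan node (hypothetical; not Spark's SparkPlan API).
final case class PlanNode(name: String, children: List[PlanNode] = Nil)

object NativeOperatorCheck {
  // Wrapper nodes this sketch chooses to ignore (an assumption).
  private val ignored = Set("AdaptiveSparkPlan")

  // Names of operators that are neither Comet-prefixed nor ignored, top-down.
  def nonNative(plan: PlanNode): List[String] = {
    val here =
      if (plan.name.startsWith("Comet") || ignored.contains(plan.name)) Nil
      else List(plan.name)
    here ++ plan.children.flatMap(nonNative)
  }

  // The stddev_pop plan quoted above, reduced to operator names.
  val scan: PlanNode = PlanNode("CometScan")
  val partialAgg: PlanNode = PlanNode("CometHashAggregate", List(scan))
  val exchange: PlanNode = PlanNode("Exchange", List(partialAgg))
  val finalAgg: PlanNode = PlanNode("HashAggregate", List(exchange))
  val stddevPlan: PlanNode = PlanNode("AdaptiveSparkPlan", List(finalAgg))
}
```

On this toy plan, `NativeOperatorCheck.nonNative(NativeOperatorCheck.stddevPlan)` yields the final `HashAggregate` and the `Exchange` beneath it. The partial aggregate is already a `CometHashAggregate`, so the fallback sits in the final stage and the exchange, not in `stddev_pop` itself. The real check may treat `Exchange` differently from this sketch.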
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1608510461 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,477 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [ ] any + - [ ] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [ ] avg + - [ ] bit_and + - [ ] bit_or + - [ ] bit_xor + - [ ] bool_and + - [ ] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [ ] count + - [ ] count_if + - [ ] count_min_sketch + - [ ] covar_pop + - [ ] covar_samp + - [ ] every + - [ ] first + - [ ] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [ ] last + - [ ] last_value + - [ ] max + - [ ] max_by + - [ ] mean + - [ ] median + - [ ] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [ ] regr_avgx + - [ ] regr_avgy + - [ ] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [ ] some + - [ ] std + - [ ] stddev + - [ ] stddev_pop + - [ ] stddev_samp Review Comment: I'll double-check that, but `spark_builtin_expr_coverage.txt` shows ``` Unsupported: Expected only Comet native operators but found Spark fallback ``` for both of them. I'll verify whether it's a test problem or not.
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
andygrove commented on PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#issuecomment-2122853631 This is very cool @comphead but it looks like it is not detecting any of the aggregate functions that we support?
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
andygrove commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1608498362 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,477 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [ ] any + - [ ] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [ ] avg + - [ ] bit_and + - [ ] bit_or + - [ ] bit_xor + - [ ] bool_and + - [ ] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [ ] count + - [ ] count_if + - [ ] count_min_sketch + - [ ] covar_pop + - [ ] covar_samp + - [ ] every + - [ ] first + - [ ] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [ ] last + - [ ] last_value + - [ ] max + - [ ] max_by + - [ ] mean + - [ ] median + - [ ] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [ ] regr_avgx + - [ ] regr_avgy + - [ ] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [ ] some + - [ ] std + - [ ] stddev + - [ ] stddev_pop + - [ ] stddev_samp Review Comment: These are supported according to https://datafusion.apache.org/comet/user-guide/expressions.html
Re: [PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
andygrove commented on code in PR #455: URL: https://github.com/apache/datafusion-comet/pull/455#discussion_r1608493460 ## docs/spark_expressions_support.md: ## @@ -0,0 +1,477 @@ + + +# Supported Spark Expressions + +### agg_funcs + - [ ] any + - [ ] any_value + - [ ] approx_count_distinct + - [ ] approx_percentile + - [ ] array_agg + - [ ] avg + - [ ] bit_and + - [ ] bit_or + - [ ] bit_xor + - [ ] bool_and + - [ ] bool_or + - [ ] collect_list + - [ ] collect_set + - [ ] corr + - [ ] count + - [ ] count_if + - [ ] count_min_sketch + - [ ] covar_pop + - [ ] covar_samp + - [ ] every + - [ ] first + - [ ] first_value + - [ ] grouping + - [ ] grouping_id + - [ ] histogram_numeric + - [ ] kurtosis + - [ ] last + - [ ] last_value + - [ ] max + - [ ] max_by + - [ ] mean + - [ ] median + - [ ] min + - [ ] min_by + - [ ] mode + - [ ] percentile + - [ ] percentile_approx + - [ ] regr_avgx + - [ ] regr_avgy + - [ ] regr_count + - [ ] regr_intercept + - [ ] regr_r2 + - [ ] regr_slope + - [ ] regr_sxx + - [ ] regr_sxy + - [ ] regr_syy + - [ ] skewness + - [ ] some + - [ ] std + - [ ] stddev + - [ ] stddev_pop + - [ ] stddev_samp + - [ ] sum + - [ ] try_avg + - [ ] try_sum + - [ ] var_pop + - [ ] var_samp Review Comment: According to https://datafusion.apache.org/comet/user-guide/expressions.html, we do support `VariancePop` and `VarianceSamp`
[PR] Minor: Generate the supported Spark builtin expression list into MD file [datafusion-comet]
comphead opened a new pull request, #455: URL: https://github.com/apache/datafusion-comet/pull/455 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes tested?
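For readers who want a feel for what the PR title describes, the following is a rough sketch of turning per-function coverage results into the checklist layout quoted throughout this thread (`### group` headings followed by ` - [x] name` or ` - [ ] name`). It is not the PR's implementation: `ExpressionSupportMarkdown`, `render`, and the input shape are assumptions made for illustration.

```scala
object ExpressionSupportMarkdown {
  // Render (group, (function, supported)) pairs as a Markdown checklist.
  // Functions are sorted by name within each group; groups keep input order.
  def render(groups: Seq[(String, Seq[(String, Boolean)])]): String = {
    val sections = groups.map { case (group, funcs) =>
      val items = funcs.sortBy(_._1).map { case (name, supported) =>
        val mark = if (supported) "x" else " "
        s" - [$mark] $name"
      }
      (s"### $group" +: items).mkString("\n")
    }
    ("# Supported Spark Expressions" +: sections).mkString("\n\n") + "\n"
  }
}
```

Calling `ExpressionSupportMarkdown.render(Seq("conditional_funcs" -> Seq("nvl" -> false, "coalesce" -> true)))` produces a heading, one `### conditional_funcs` section, and a checked `coalesce` line followed by an unchecked `nvl` line. Whether a function counts as supported depends entirely on how the underlying coverage run was configured, which is what the review comments in this thread are about.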