Re: [PR] [Bug Fix]: Deem hash repartition unnecessary when input and output has 1 partition [datafusion]

2024-04-23 Thread via GitHub
mustafasrepo commented on PR #10095: URL: https://github.com/apache/datafusion/pull/10095#issuecomment-2074203264 > > Maybe we can insert RepartitionExec on top UnionExecs if their output partition number > config.target_partitions. By this way, we can guarantee this violation wouldn't prop

Re: [I] Make repartitioning in `PhysicalPlan` output less confusing [datafusion]

2024-04-23 Thread via GitHub
mustafasrepo commented on issue #9370: URL: https://github.com/apache/datafusion/issues/9370#issuecomment-2074192047 As you say, `RepartitionExec::pull_from_input` is `async`. However, it starts a separate task for each input partition. Hence, when the plan contains `RepartitionExec: partitioning=Hash
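
The comment above describes one unit of work per input partition feeding a hash repartition. A minimal sketch of that shape follows, under loud assumptions: it uses OS threads and a standard-library channel instead of async tasks, rows are plain strings, and `output_partition` and `hash_repartition` are hypothetical names for illustration, not DataFusion APIs.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: map a row key to one of `n_out` output partitions.
fn output_partition(key: &str, n_out: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() % n_out as u64) as usize
}

/// Hypothetical sketch: hash-repartition `inputs` into `n_out` partitions,
/// starting one worker per input partition (threads stand in for async tasks).
fn hash_repartition(inputs: Vec<Vec<String>>, n_out: usize) -> Vec<Vec<String>> {
    assert!(n_out > 0, "need at least one output partition");
    let (tx, rx) = mpsc::channel::<(usize, String)>();
    let mut workers = Vec::new();

    for partition in inputs {
        let tx = tx.clone();
        // One worker per input partition: it routes each row by its hash.
        workers.push(thread::spawn(move || {
            for row in partition {
                let target = output_partition(&row, n_out);
                tx.send((target, row)).expect("receiver is alive");
            }
        }));
    }
    // Drop the original sender so the receive loop ends once all workers finish.
    drop(tx);

    let mut outputs = vec![Vec::new(); n_out];
    for (target, row) in rx {
        outputs[target].push(row);
    }
    for worker in workers {
        worker.join().expect("worker panicked");
    }
    outputs
}

fn main() {
    let inputs = vec![
        vec!["a".to_string(), "b".to_string(), "a".to_string()],
        vec!["c".to_string(), "a".to_string()],
    ];
    let outputs = hash_repartition(inputs, 3);
    let total: usize = outputs.iter().map(Vec::len).sum();
    println!("{total} rows across {} partitions", outputs.len());
}
```

The number of workers equals the number of input partitions, regardless of how many output partitions are requested, which is the property the comment points at.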

Re: [PR] Move `create_physical_expr` to `phy-expr-common` #1 [datafusion]

2024-04-23 Thread via GitHub
berkaysynnada commented on code in PR #10144: URL: https://github.com/apache/datafusion/pull/10144#discussion_r1577356110 ## datafusion/physical-expr-common/src/intervals/cp_solver.rs: ## @@ -0,0 +1,362 @@ +// Licensed to the Apache Software Foundation (ASF) under one Review Co

Re: [PR] fix: cargo warnings of import item [datafusion]

2024-04-23 Thread via GitHub
waynexia merged PR #10196: URL: https://github.com/apache/datafusion/pull/10196 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@dataf

[I] Create `Struct` table with explicit type and name [datafusion]

2024-04-23 Thread via GitHub
jayzhan211 opened a new issue, #10207: URL: https://github.com/apache/datafusion/issues/10207 ### Is your feature request related to a problem or challenge? Currently we can build a struct table like this: ``` statement ok create table t as values (struct(1, 'a')), (struct(2,

Re: [I] `select array_concat([])` panicked [datafusion]

2024-04-23 Thread via GitHub
jayzhan211 commented on issue #10200: URL: https://github.com/apache/datafusion/issues/10200#issuecomment-2074044846 @Lordworms I think argument validity checks belong in the `signature` rather than being something the user needs to compute in `return_type`.

Re: [I] Support `Union` as a function [datafusion]

2024-04-23 Thread via GitHub
vaibhawvipul commented on issue #10206: URL: https://github.com/apache/datafusion/issues/10206#issuecomment-2074044623 I would like to work on this.

Re: [I] `select array_concat([])` panicked [datafusion]

2024-04-23 Thread via GitHub
Lordworms commented on issue #10200: URL: https://github.com/apache/datafusion/issues/10200#issuecomment-207398 > ```shell > select array_concat(make_array()); > ``` Hi @jayzhan211 I was going to implement the same behavior like DuckDB https://github.com/apache/datafusion/

[I] Support `Union` [datafusion]

2024-04-23 Thread via GitHub
jayzhan211 opened a new issue, #10206: URL: https://github.com/apache/datafusion/issues/10206 ### Is your feature request related to a problem or challenge? Recent issues show that it is time to support `Union`: #10161 #10139 `ScalarValue::iter_to_array` #10180 `comp

Re: [I] API in ParquetExec to pass in RowSelections to `ParquetExec` (enable custom indexes, finer grained pushdown) [datafusion]

2024-04-23 Thread via GitHub
waynexia commented on issue #9929: URL: https://github.com/apache/datafusion/issues/9929#issuecomment-2073949931 Related code is here https://github.com/GreptimeTeam/greptimedb/commit/9e1e4a518143236371b76ecb6f1da5c694eb867b#diff-ac43dc13456cf41e4fabb9d577101e245366687d49064aff99bf10aab20b9c

Re: [PR] Move create_physical_expr to phy-expr-common #3 [datafusion]

2024-04-23 Thread via GitHub
jayzhan211 commented on PR #10188: URL: https://github.com/apache/datafusion/pull/10188#issuecomment-2073930173 It depends on you and the other reviewers. If this PR is too large, we can close #1 first, then move on to #2, and finally this one.

Re: [I] Move coalesce function from math to core [datafusion]

2024-04-23 Thread via GitHub
jayzhan211 closed issue #10174: Move coalesce function from math to core URL: https://github.com/apache/datafusion/issues/10174

Re: [PR] Move coalesce function from math to core [datafusion]

2024-04-23 Thread via GitHub
jayzhan211 merged PR #10201: URL: https://github.com/apache/datafusion/pull/10201

Re: [PR] feat: Improve CometHashJoin statistics [datafusion-comet]

2024-04-23 Thread via GitHub
viirya commented on code in PR #309: URL: https://github.com/apache/datafusion-comet/pull/309#discussion_r1577160047 ## spark/src/test/scala/org/apache/comet/exec/CometExecSuite.scala: ## @@ -331,6 +331,43 @@ class CometExecSuite extends CometTestBase { } } + test("Co

Re: [PR] feat: Improve CometHashJoin statistics [datafusion-comet]

2024-04-23 Thread via GitHub
kazuyukitanimura commented on code in PR #309: URL: https://github.com/apache/datafusion-comet/pull/309#discussion_r1577144791 ## spark/src/test/scala/org/apache/comet/exec/CometExecSuite.scala: ## @@ -331,6 +331,43 @@ class CometExecSuite extends CometTestBase { } } +

Re: [PR] Minor: Avoid a clone in ArrayFunctionRewriter [datafusion]

2024-04-23 Thread via GitHub
jayzhan211 merged PR #10204: URL: https://github.com/apache/datafusion/pull/10204

Re: [PR] Minor: extend ci to test each package [datafusion]

2024-04-23 Thread via GitHub
github-actions[bot] commented on PR #7940: URL: https://github.com/apache/datafusion/pull/7940#issuecomment-2073837808 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or th

Re: [PR] Refactor TreeNode recursions [datafusion]

2024-04-23 Thread via GitHub
github-actions[bot] commented on PR #7942: URL: https://github.com/apache/datafusion/pull/7942#issuecomment-2073837776 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or th

Re: [PR] refactor: support bitmap for u8/16 and i8/16 in `approx_distinct` [datafusion]

2024-04-23 Thread via GitHub
github-actions[bot] closed pull request #8462: refactor: support bitmap for u8/16 and i8/16 in `approx_distinct` URL: https://github.com/apache/datafusion/pull/8462

Re: [PR] DRAFT: Resolve function calls by name during planning [datafusion]

2024-04-23 Thread via GitHub
github-actions[bot] closed pull request #8447: DRAFT: Resolve function calls by name during planning URL: https://github.com/apache/datafusion/pull/8447

Re: [PR] feat: Support Variance [datafusion-comet]

2024-04-23 Thread via GitHub
huaxingao commented on code in PR #297: URL: https://github.com/apache/datafusion-comet/pull/297#discussion_r1577085846 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -426,6 +426,42 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde {

Re: [PR] ScalarUDF: Remove `supports_zero_argument` and avoid creating null array for empty args [datafusion]

2024-04-23 Thread via GitHub
jayzhan211 commented on code in PR #10193: URL: https://github.com/apache/datafusion/pull/10193#discussion_r1577075655 ## datafusion/functions/src/math/random.rs: ## @@ -64,45 +61,9 @@ impl ScalarUDFImpl for RandomFunc { Ok(Float64) } -fn invoke(&self, args:

Re: [PR] feat: Support Variance [datafusion-comet]

2024-04-23 Thread via GitHub
huaxingao commented on code in PR #297: URL: https://github.com/apache/datafusion-comet/pull/297#discussion_r1577073541 ## core/src/execution/proto/expr.proto: ## @@ -165,6 +167,18 @@ message CovPopulation { DataType datatype = 4; } +message VarianceSample { + Expr child

[I] ScalarUDF: Remove `supports_zero_argument` and avoid creating null array for empty args [datafusion]

2024-04-23 Thread via GitHub
jayzhan211 opened a new issue, #10205: URL: https://github.com/apache/datafusion/issues/10205 ### Is your feature request related to a problem or challenge? Filing an issue to discuss the design in #10193. Previously, we always provided a null array if the function supports zero

Re: [PR] Move coalesce function from math to core [datafusion]

2024-04-23 Thread via GitHub
xxxuuu commented on PR #10201: URL: https://github.com/apache/datafusion/pull/10201#issuecomment-2073694889 > it looks like there is a small CI failure Thanks. I have fixed it!

Re: [PR] Move create_physical_expr to phy-expr-common #3 [datafusion]

2024-04-23 Thread via GitHub
alamb commented on PR #10188: URL: https://github.com/apache/datafusion/pull/10188#issuecomment-2073688322 Sorry for being dense -- would you like me to review this PR? Or do you think we should review (and merge) https://github.com/apache/datafusion/pull/10176 and https://github.com/apac

Re: [PR] ScalarUDF: Remove `supports_zero_argument` and avoid creating null array for empty args [datafusion]

2024-04-23 Thread via GitHub
alamb commented on code in PR #10193: URL: https://github.com/apache/datafusion/pull/10193#discussion_r1577039606 ## datafusion/functions/src/math/random.rs: ## @@ -64,45 +61,9 @@ impl ScalarUDFImpl for RandomFunc { Ok(Float64) } -fn invoke(&self, args: &[Col

Re: [PR] chore: Update documentation publishing domain and path [datafusion-comet]

2024-04-23 Thread via GitHub
andygrove merged PR #310: URL: https://github.com/apache/datafusion-comet/pull/310

Re: [PR] ScalarUDF: Remove `supports_zero_argument` and avoid creating null array for empty args [datafusion]

2024-04-23 Thread via GitHub
jayzhan211 commented on code in PR #10193: URL: https://github.com/apache/datafusion/pull/10193#discussion_r1577034922 ## datafusion/functions/src/math/random.rs: ## @@ -64,45 +61,9 @@ impl ScalarUDFImpl for RandomFunc { Ok(Float64) } -fn invoke(&self, args:

Re: [I] Any plan to support JSON or JSONB? [datafusion]

2024-04-23 Thread via GitHub
samuelcolvin commented on issue #7845: URL: https://github.com/apache/datafusion/issues/7845#issuecomment-2073677832 Oh, and as per the micro-benchmarks on https://github.com/pydantic/jiter/pull/84, the performance of these methods should now be on a par with duckdb. @alamb, you migh

Re: [I] Any plan to support JSON or JSONB? [datafusion]

2024-04-23 Thread via GitHub
samuelcolvin commented on issue #7845: URL: https://github.com/apache/datafusion/issues/7845#issuecomment-2073675288 https://github.com/datafusion-contrib/datafusion-functions-json now provides the following methods, and I think we're nearly ready for a first release, see https://github.com/d

Re: [I] `select array_concat([])` panicked [datafusion]

2024-04-23 Thread via GitHub
jayzhan211 commented on issue #10200: URL: https://github.com/apache/datafusion/issues/10200#issuecomment-2073668237 @Lordworms Not sure what your idea for solving this problem is. I expect this to be solved with the correct signature. I have done that at https://github.com/apache/datafusion/pu

Re: [PR] Move create_physical_expr to phy-expr-common #3 [datafusion]

2024-04-23 Thread via GitHub
jayzhan211 commented on PR #10188: URL: https://github.com/apache/datafusion/pull/10188#issuecomment-2073660015 > Hi @jayzhan211 -- if we merge this one, does it close #10176 and #10144 ? Yes!

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
viirya commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1576969288 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -103,10 +126,78 @@ impl Cast { (DataType::LargeUtf8, DataType::Boolean) => {

Re: [PR] Display: Support `preserve_partitioning` on SortExec physical plan. [datafusion]

2024-04-23 Thread via GitHub
alamb commented on PR #10153: URL: https://github.com/apache/datafusion/pull/10153#issuecomment-2073563458 I think you can update the tests using completion mode: https://github.com/apache/datafusion/tree/main/datafusion/sqllogictest#updating-tests-completion-mode That probably will g

Re: [PR] chore: Update documentation publishing domain and path [datafusion-comet]

2024-04-23 Thread via GitHub
andygrove commented on code in PR #310: URL: https://github.com/apache/datafusion-comet/pull/310#discussion_r1576960892 ## .asf.yaml: ## @@ -27,7 +27,7 @@ notifications: jira_options: link label worklog github: description: "Apache DataFusion Comet Spark Accelerator" - h

Re: [PR] chore: Update documentation publishing domain and path [datafusion-comet]

2024-04-23 Thread via GitHub
comphead commented on code in PR #310: URL: https://github.com/apache/datafusion-comet/pull/310#discussion_r1576956150 ## .asf.yaml: ## @@ -27,7 +27,7 @@ notifications: jira_options: link label worklog github: description: "Apache DataFusion Comet Spark Accelerator" - ho

[PR] chore: Update documentation publishing domain and path [datafusion-comet]

2024-04-23 Thread via GitHub
andygrove opened a new pull request, #310: URL: https://github.com/apache/datafusion-comet/pull/310 ## Which issue does this PR close? N/A ## Rationale for this change Publish documentation (eventually) to `datafusion.apache.org/comet`. ## What ch

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
andygrove commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1576944454 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -64,6 +68,25 @@ pub struct Cast { pub timezone: String, } +macro_rules! spark_cast_utf8_to_

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
andygrove commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1576919256 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -64,6 +68,25 @@ pub struct Cast { pub timezone: String, } +macro_rules! spark_cast_utf8_to_

Re: [PR] Minor: Possibility to strip datafusion error name [datafusion]

2024-04-23 Thread via GitHub
comphead commented on code in PR #10186: URL: https://github.com/apache/datafusion/pull/10186#discussion_r1576898138 ## datafusion/common/src/error.rs: ## @@ -450,6 +396,99 @@ impl DataFusionError { #[cfg(not(feature = "backtrace"))] "".to_owned() } + +

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
viirya commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1576894523 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -103,10 +126,78 @@ impl Cast { (DataType::LargeUtf8, DataType::Boolean) => {

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
viirya commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1576894815 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -103,10 +126,78 @@ impl Cast { (DataType::LargeUtf8, DataType::Boolean) => {

Re: [PR] Optimization: make the most of Hint::AcceptsSingular when call make_scalar_function to Improve performance [datafusion]

2024-04-23 Thread via GitHub
alamb commented on code in PR #10054: URL: https://github.com/apache/datafusion/pull/10054#discussion_r1576891022 ## datafusion/functions/src/string/overlay.rs: ## @@ -88,10 +105,13 @@ pub fn overlay(args: &[ArrayRef]) -> Result { let characters_array = as_generic_

Re: [I] Complete support for `Expr --> String ` [datafusion]

2024-04-23 Thread via GitHub
alamb commented on issue #9726: URL: https://github.com/apache/datafusion/issues/9726#issuecomment-2073439485 Thanks @devanbenz !

Re: [I] [EPIC] Tasks for a new Top Level Apache Project [datafusion]

2024-04-23 Thread via GitHub
alamb commented on issue #9691: URL: https://github.com/apache/datafusion/issues/9691#issuecomment-2073438447 > We should rename slack and discord channels? Update here is that @andygrove did so

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
viirya commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1576888006 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -103,10 +126,78 @@ impl Cast { (DataType::LargeUtf8, DataType::Boolean) => {

Re: [I] Allow expr_to_sql unparsing with no quotes [datafusion]

2024-04-23 Thread via GitHub
alamb closed issue #10197: Allow expr_to_sql unparsing with no quotes URL: https://github.com/apache/datafusion/issues/10197

Re: [PR] Allow expr_to_sql unparsing with no quotes [datafusion]

2024-04-23 Thread via GitHub
alamb merged PR #10198: URL: https://github.com/apache/datafusion/pull/10198

Re: [PR] [MINOR] Remove ScalarFunction from datafusion.proto #10173 [datafusion]

2024-04-23 Thread via GitHub
alamb commented on PR #10202: URL: https://github.com/apache/datafusion/pull/10202#issuecomment-2073431230 Thanks again @dmitrybugakov

Re: [I] Remove ScalarFunction from datafusion.proto [datafusion]

2024-04-23 Thread via GitHub
alamb closed issue #10173: Remove ScalarFunction from datafusion.proto URL: https://github.com/apache/datafusion/issues/10173

Re: [PR] [MINOR] Remove ScalarFunction from datafusion.proto #10173 [datafusion]

2024-04-23 Thread via GitHub
alamb merged PR #10202: URL: https://github.com/apache/datafusion/pull/10202

Re: [PR] Minor: Remove some clone in `TypeCoercion` [datafusion]

2024-04-23 Thread via GitHub
alamb commented on code in PR #10203: URL: https://github.com/apache/datafusion/pull/10203#discussion_r1576883951 ## datafusion/optimizer/src/analyzer/type_coercion.rs: ## @@ -530,31 +527,30 @@ fn coerce_window_frame( } WindowFrameUnits::Rows | WindowFrameUnits

Re: [PR] Minor: Possibility to strip datafusion error name [datafusion]

2024-04-23 Thread via GitHub
andygrove commented on code in PR #10186: URL: https://github.com/apache/datafusion/pull/10186#discussion_r1576864537 ## datafusion/common/src/error.rs: ## @@ -450,6 +396,99 @@ impl DataFusionError { #[cfg(not(feature = "backtrace"))] "".to_owned() } + +

Re: [PR] Minor: Introduce `Expr::is_volatile()`, adjust `TreeNode::exists()` [datafusion]

2024-04-23 Thread via GitHub
peter-toth commented on PR #10191: URL: https://github.com/apache/datafusion/pull/10191#issuecomment-2073394985 Thanks for the review!

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
comphead commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1576844996 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -64,6 +68,25 @@ pub struct Cast { pub timezone: String, } +macro_rules! spark_cast_utf8_to_i

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
comphead commented on PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#issuecomment-2073364294 > > Thanks @andygrove btw I'm wondering if this PR should cover scope with formatting https://spark.apache.org/docs/latest/sql-ref-number-pattern.html#the-to_number-function >

[PR] Minor: Avoid a clone in ArrayFunctionRewriter [datafusion]

2024-04-23 Thread via GitHub
alamb opened a new pull request, #10204: URL: https://github.com/apache/datafusion/pull/10204 ## Which issue does this PR close? Part of https://github.com/apache/datafusion/issues/9637 ## Rationale for this change I was looking for why ApplyFunctionRewrite takes so much time

Re: [PR] Minor: Remove some clone in `TypeCoercion` [datafusion]

2024-04-23 Thread via GitHub
comphead commented on code in PR #10203: URL: https://github.com/apache/datafusion/pull/10203#discussion_r1576834547 ## datafusion/optimizer/src/analyzer/type_coercion.rs: ## @@ -530,31 +527,30 @@ fn coerce_window_frame( } WindowFrameUnits::Rows | WindowFrameUn

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
andygrove commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1576823136 ## spark/src/test/scala/org/apache/comet/CometCastSuite.scala: ## @@ -66,19 +66,22 @@ class CometCastSuite extends CometTestBase with AdaptiveSparkPlanHelper

Re: [PR] feat: Improve CometHashJoin statistics [datafusion-comet]

2024-04-23 Thread via GitHub
viirya commented on code in PR #309: URL: https://github.com/apache/datafusion-comet/pull/309#discussion_r1576822887 ## spark/src/test/scala/org/apache/comet/exec/CometExecSuite.scala: ## @@ -331,6 +331,43 @@ class CometExecSuite extends CometTestBase { } } + test("Co

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
andygrove commented on PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#issuecomment-2073322973 > Thanks @andygrove btw I'm wondering if this PR should cover scope with formatting https://spark.apache.org/docs/latest/sql-ref-number-pattern.html#the-to_number-function

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
parthchandra commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1576817617 ## spark/src/test/scala/org/apache/comet/CometCastSuite.scala: ## @@ -66,19 +66,22 @@ class CometCastSuite extends CometTestBase with AdaptiveSparkPlanHelp

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
andygrove commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1576814855 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -64,6 +68,25 @@ pub struct Cast { pub timezone: String, } +macro_rules! spark_cast_utf8_to_

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
parthchandra commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1576810386 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -103,10 +126,78 @@ impl Cast { (DataType::LargeUtf8, DataType::Boolean) => {

Re: [I] Stop copying `LogicalPlan` during OptimizerPasses [datafusion]

2024-04-23 Thread via GitHub
alamb commented on issue #9637: URL: https://github.com/apache/datafusion/issues/9637#issuecomment-2073296008 Update here is that I am hacking on getting `TypeCoercion` (https://github.com/apache/datafusion/pull/10039) to avoid clones. It is turning out to be trickier than others as type coe

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
comphead commented on PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#issuecomment-2073292355 Thanks @andygrove btw I'm wondering if this PR should cover scope with formatting https://spark.apache.org/docs/latest/sql-ref-number-pattern.html#the-to_number-function

Re: [PR] Minor: Remove some clone in `TypeCoercion` [datafusion]

2024-04-23 Thread via GitHub
alamb commented on code in PR #10203: URL: https://github.com/apache/datafusion/pull/10203#discussion_r1576803845 ## datafusion/optimizer/src/analyzer/type_coercion.rs: ## @@ -565,58 +561,47 @@ fn coerce_arguments_for_signature( let new_types = data_types(&current_types, signatur

Re: [PR] Minor: Remove some clone in `TypeCoercion` [datafusion]

2024-04-23 Thread via GitHub
alamb commented on code in PR #10203: URL: https://github.com/apache/datafusion/pull/10203#discussion_r1576803357 ## datafusion/optimizer/src/analyzer/type_coercion.rs: ## @@ -530,31 +527,30 @@ fn coerce_window_frame( } WindowFrameUnits::Rows | WindowFrameUnits

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
comphead commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1576797068 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -103,10 +126,78 @@ impl Cast { (DataType::LargeUtf8, DataType::Boolean) => {

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
comphead commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1576796459 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -103,10 +126,78 @@ impl Cast { (DataType::LargeUtf8, DataType::Boolean) => {

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
comphead commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1576795634 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -64,6 +68,25 @@ pub struct Cast { pub timezone: String, } +macro_rules! spark_cast_utf8_to_i

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
comphead commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1576794138 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -64,6 +68,25 @@ pub struct Cast { pub timezone: String, } +macro_rules! spark_cast_utf8_to_i

Re: [PR] feat: Implement Spark-compatible CAST from string to integral types [datafusion-comet]

2024-04-23 Thread via GitHub
andygrove commented on code in PR #307: URL: https://github.com/apache/datafusion-comet/pull/307#discussion_r1576792898 ## core/src/execution/datafusion/expressions/cast.rs: ## @@ -103,10 +126,78 @@ impl Cast { (DataType::LargeUtf8, DataType::Boolean) => {

Re: [PR] Avoid copies in `TypeCoercion` via TreeNode API [datafusion]

2024-04-23 Thread via GitHub
alamb commented on PR #10039: URL: https://github.com/apache/datafusion/pull/10039#issuecomment-2073237342 An update here is twofold: 1. There are some very subtle semantics going on 2. As this pass actually changes the types of the plan (on purpose) we need some way to recalculate th

Re: [PR] chore: Ignore unused variables [datafusion-comet]

2024-04-23 Thread via GitHub
viirya commented on PR #306: URL: https://github.com/apache/datafusion-comet/pull/306#issuecomment-2073234383 Merged. Thanks.

Re: [PR] chore: Ignore unused variables [datafusion-comet]

2024-04-23 Thread via GitHub
viirya merged PR #306: URL: https://github.com/apache/datafusion-comet/pull/306

Re: [PR] Doc: Modify docs to fix old naming [datafusion]

2024-04-23 Thread via GitHub
comphead merged PR #10199: URL: https://github.com/apache/datafusion/pull/10199

Re: [PR] Doc: Modify docs to fix old naming [datafusion]

2024-04-23 Thread via GitHub
comphead commented on PR #10199: URL: https://github.com/apache/datafusion/pull/10199#issuecomment-2073227766 Thanks all for the review 👍

[PR] feat: Improve CometHashJoin statistics [datafusion-comet]

2024-04-23 Thread via GitHub
planga82 opened a new pull request, #309: URL: https://github.com/apache/datafusion-comet/pull/309 ## Which issue does this PR close? Closes #308 . ## Rationale for this change Add all statistics HashJoinExec datafusion node provides. ## What change

Re: [PR] [MINOR] Remove ScalarFunction from datafusion.proto #10173 [datafusion]

2024-04-23 Thread via GitHub
Omega359 commented on PR #10202: URL: https://github.com/apache/datafusion/pull/10202#issuecomment-2073190034 lgtm

[I] Improve CometHashJoin statistics [datafusion-comet]

2024-04-23 Thread via GitHub
planga82 opened a new issue, #308: URL: https://github.com/apache/datafusion-comet/issues/308 ### What is the problem the feature request solves? Add all statistics HashJoinExec datafusion node provides. ### Describe the potential solution Override metrics map in

Re: [PR] [Bug Fix]: Deem hash repartition unnecessary when input and output has 1 partition [datafusion]

2024-04-23 Thread via GitHub
korowa commented on PR #10095: URL: https://github.com/apache/datafusion/pull/10095#issuecomment-2073078432 > Maybe we can insert RepartitionExec on top UnionExecs if their output partition number > config.target_partitions. By this way, we can guarantee this violation wouldn't propagate to
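
The quoted proposal (wrap a union in a repartition when its output partition count exceeds `target_partitions`) can be sketched on a toy plan tree. This is only an illustration under stated assumptions: `Plan`, `output_partitions`, and `cap_union_partitions` are hypothetical names, not DataFusion types or rules, and the model assumes a union's output partition count is the sum of its children's counts.

```rust
/// Hypothetical toy plan tree; not DataFusion's `ExecutionPlan`.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { partitions: usize },
    Union(Vec<Plan>),
    Repartition { input: Box<Plan>, partitions: usize },
}

impl Plan {
    /// Number of output partitions this node produces in the toy model.
    fn output_partitions(&self) -> usize {
        match self {
            Plan::Scan { partitions } => *partitions,
            // Assumption: a union exposes every child partition as its own.
            Plan::Union(children) => children.iter().map(Plan::output_partitions).sum(),
            Plan::Repartition { partitions, .. } => *partitions,
        }
    }
}

/// Wrap every union whose output partition count exceeds `target`
/// in a repartition down to `target`, rewriting children first.
fn cap_union_partitions(plan: Plan, target: usize) -> Plan {
    match plan {
        Plan::Union(children) => {
            let rewritten = Plan::Union(
                children
                    .into_iter()
                    .map(|child| cap_union_partitions(child, target))
                    .collect(),
            );
            if rewritten.output_partitions() > target {
                Plan::Repartition { input: Box::new(rewritten), partitions: target }
            } else {
                rewritten
            }
        }
        Plan::Repartition { input, partitions } => Plan::Repartition {
            input: Box::new(cap_union_partitions(*input, target)),
            partitions,
        },
        scan @ Plan::Scan { .. } => scan,
    }
}

fn main() {
    let plan = Plan::Union(vec![
        Plan::Scan { partitions: 3 },
        Plan::Scan { partitions: 3 },
    ]);
    let capped = cap_union_partitions(plan, 4);
    println!("{capped:?}");
}
```

With the cap applied bottom-up, no node above a union sees more than `target` partitions in this model, which is the guarantee the proposal is after.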

Re: [PR] Minor: Introduce `Expr::is_volatile()`, adjust `TreeNode::exists()` [datafusion]

2024-04-23 Thread via GitHub
alamb commented on code in PR #10191: URL: https://github.com/apache/datafusion/pull/10191#discussion_r1576684997 ## datafusion/common/src/tree_node.rs: ## @@ -405,18 +405,17 @@ pub trait TreeNode: Sized { /// Returns true if `f` returns true for any node in the tree.

Re: [PR] Minor: Introduce `Expr::is_volatile()`, adjust `TreeNode::exists()` [datafusion]

2024-04-23 Thread via GitHub
alamb merged PR #10191: URL: https://github.com/apache/datafusion/pull/10191

Re: [PR] Move create_physical_expr to phy-expr-common #3 [datafusion]

2024-04-23 Thread via GitHub
alamb commented on PR #10188: URL: https://github.com/apache/datafusion/pull/10188#issuecomment-2073063352 Hi @jayzhan211 -- if we merge this one, does it close https://github.com/apache/datafusion/pull/10176 and https://github.com/apache/datafusion/pull/10144?

Re: [PR] Move coalesce function from math to core [datafusion]

2024-04-23 Thread via GitHub
alamb commented on PR #10201: URL: https://github.com/apache/datafusion/pull/10201#issuecomment-2073061325 Hi @xxxuuu -- it looks like there is a small CI failure -- https://github.com/apache/datafusion/actions/runs/8803139105/job/24166156315?pr=10201

Re: [PR] ScalarUDF: Remove `supports_zero_argument` and avoid creating null array for empty args [datafusion]

2024-04-23 Thread via GitHub
alamb commented on code in PR #10193: URL: https://github.com/apache/datafusion/pull/10193#discussion_r1576665672 ## datafusion/core/tests/user_defined/user_defined_scalar_functions.rs: ## @@ -615,6 +493,22 @@ async fn test_user_defined_functions_cast_to_i64() -> Result<()> {

Re: [I] Range/inequality joins are slow [datafusion]

2024-04-23 Thread via GitHub
korowa commented on issue #8393: URL: https://github.com/apache/datafusion/issues/8393#issuecomment-2073050255 My intention was to fix the NLJoin parallelism issue caused by the fixed build-side choice, and at the same time we also have #318 for a specialized operator implementation, so I supposed #967

Re: [PR] ScalarUDF: Remove `supports_zero_argument` and avoid creating null array for empty args [datafusion]

2024-04-23 Thread via GitHub
alamb commented on code in PR #10193: URL: https://github.com/apache/datafusion/pull/10193#discussion_r1576665520 ## datafusion/core/tests/user_defined/user_defined_scalar_functions.rs: ## @@ -403,123 +398,6 @@ async fn test_user_defined_functions_with_alias() -> Result<()> {

Re: [PR] Update links to point to datafusion.apache.org [datafusion]

2024-04-23 Thread via GitHub
alamb merged PR #10195: URL: https://github.com/apache/datafusion/pull/10195

Re: [I] Setup regular meetups for Comet development [datafusion-comet]

2024-04-23 Thread via GitHub
viirya commented on issue #217: URL: https://github.com/apache/datafusion-comet/issues/217#issuecomment-2073036221 Are we going to set up something with this?

Re: [PR] Update links to point to datafusion.apache.org [datafusion]

2024-04-23 Thread via GitHub
alamb commented on code in PR #10195: URL: https://github.com/apache/datafusion/pull/10195#discussion_r1576661083 ## dev/release/README.md: ## @@ -463,7 +463,7 @@ svn delete -m "delete old DataFusion release" https://dist.apache.org/repos/dist - Checkout the `asf-site` branch

Re: [PR] Allow expr_to_sql unparsing with no quotes [datafusion]

2024-04-23 Thread via GitHub
alamb commented on code in PR #10198: URL: https://github.com/apache/datafusion/pull/10198#discussion_r1576652693 ## datafusion/sql/src/unparser/expr.rs: ## @@ -965,4 +965,20 @@ mod tests { Ok(()) } + +#[test] +fn custom_dialect_none() -> Result<()> { Re

Re: [PR] Doc: Modify docs to fix old naming [datafusion]

2024-04-23 Thread via GitHub
alamb commented on code in PR #10199: URL: https://github.com/apache/datafusion/pull/10199#discussion_r1576650004 ## dev/release/README.md: ## @@ -223,7 +223,7 @@ Here is my vote: +1 [1]: https://github.com/apache/datafusion/tree/a5dd428f57e62db20a945e8b1895de91405958c4 -[2

Re: [PR] [MINOR] Remove ScalarFunction from datafusion.proto #10173 [datafusion]

2024-04-23 Thread via GitHub
alamb commented on PR #10202: URL: https://github.com/apache/datafusion/pull/10202#issuecomment-2073012535 Thank you @dmitrybugakov

Re: [I] `select array_concat([])` panicked [datafusion]

2024-04-23 Thread via GitHub
Lordworms commented on issue #10200: URL: https://github.com/apache/datafusion/issues/10200#issuecomment-2073013621 I can fix this one if no one is working on this

Re: [PR] Move coalesce function from math to core [datafusion]

2024-04-23 Thread via GitHub
alamb commented on PR #10201: URL: https://github.com/apache/datafusion/pull/10201#issuecomment-2073010872 Thank you for the contribution @xxxuuu ❤️

Re: [PR] Implement rewrite for EliminateOneUnion and EliminateJoin [datafusion]

2024-04-23 Thread via GitHub
alamb merged PR #10184: URL: https://github.com/apache/datafusion/pull/10184

Re: [PR] implement rewrite for FilterNullJoinKeys [datafusion]

2024-04-23 Thread via GitHub
alamb commented on PR #10166: URL: https://github.com/apache/datafusion/pull/10166#issuecomment-2073002072 Thanks again @Lordworms
