Re: [PR] chore: Use twox-hash 2.0 xxhash64 oneshot api instead of custom implementation [datafusion-comet]

2024-10-27 Thread via GitHub
NoeB commented on PR #1041: URL: https://github.com/apache/datafusion-comet/pull/1041#issuecomment-2440640745 Interesting, I just reran the bench and I get the same improvements as before. Should I compare the library implementation with the DataFusion custom one to see if there is a diffe

Re: [PR] fix `cargo run` error [datafusion-sqlparser-rs]

2024-10-27 Thread via GitHub
wugeer commented on PR #1486: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1486#issuecomment-2440620042 This one is a duplicate of another PR https://github.com/apache/datafusion-sqlparser-rs/pull/1483. -- This is an automated message from the Apache Git Service. To respon

Re: [PR] fix `cargo run` error [datafusion-sqlparser-rs]

2024-10-27 Thread via GitHub
wugeer closed pull request #1486: fix `cargo run` error URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1486

Re: [PR] POC: Vectorized hashtable for aggregation [datafusion]

2024-10-27 Thread via GitHub
Rachelint commented on code in PR #12996: URL: https://github.com/apache/datafusion/pull/12996#discussion_r1818406542 ## datafusion/physical-plan/src/aggregates/group_values/group_column.rs: ## @@ -287,6 +469,63 @@ where }; } +fn vectorized_equal_to( Review

Re: [I] Implement RightMark join [datafusion]

2024-10-27 Thread via GitHub
jonathanc-n commented on issue #13138: URL: https://github.com/apache/datafusion/issues/13138#issuecomment-2440570300 take

Re: [PR] Round robin polling between tied winners in sort preserving merge [datafusion]

2024-10-27 Thread via GitHub
jayzhan211 commented on code in PR #13133: URL: https://github.com/apache/datafusion/pull/13133#discussion_r1818373156 ## datafusion/physical-plan/src/sorts/sort_preserving_merge.rs: ## @@ -326,18 +343,87 @@ mod tests { use arrow::compute::SortOptions; use arrow::datat

Re: [PR] Round robin polling between tied winners in sort preserving merge [datafusion]

2024-10-27 Thread via GitHub
2010YOUY01 commented on code in PR #13133: URL: https://github.com/apache/datafusion/pull/13133#discussion_r1818369721 ## datafusion/physical-plan/src/sorts/sort_preserving_merge.rs: ## @@ -326,18 +343,87 @@ mod tests { use arrow::compute::SortOptions; use arrow::datat

Re: [I] feat: Add `alternative_syntax` function for docs [datafusion]

2024-10-27 Thread via GitHub
jonathanc-n commented on issue #13139: URL: https://github.com/apache/datafusion/issues/13139#issuecomment-2440489157 I'm struggling quite a bit with migrating the ntile, because I need to delete all of the builtinwindow functionality which is causing many bugs in many places that I am not

Re: [PR] Raise a plan error on union if column count is not the same between plans [datafusion]

2024-10-27 Thread via GitHub
jonahgao merged PR #13117: URL: https://github.com/apache/datafusion/pull/13117

Re: [I] Datafusion 42 does not raise plan error on some queries [datafusion]

2024-10-27 Thread via GitHub
jonahgao closed issue #13092: Datafusion 42 does not raise plan error on some queries URL: https://github.com/apache/datafusion/issues/13092

Re: [PR] Add option to FilterExec to prevent re-using input batches [datafusion]

2024-10-27 Thread via GitHub
github-actions[bot] closed pull request #12039: Add option to FilterExec to prevent re-using input batches URL: https://github.com/apache/datafusion/pull/12039

[PR] added a BallistaContext to ballista to allow for Remote or standalone [datafusion-ballista]

2024-10-27 Thread via GitHub
tbar4 opened a new pull request, #1100: URL: https://github.com/apache/datafusion-ballista/pull/1100 # Which issue does this PR close? Closes #1091 # Rationale for this change This adds a standalone context option when initializing a `BallistaContext` in Ba

Re: [PR] feat: Support Substrait's IntervalCompound type/literal instead of interval-month-day-nano UDT [datafusion]

2024-10-27 Thread via GitHub
Blizzara commented on PR #12112: URL: https://github.com/apache/datafusion/pull/12112#issuecomment-2440329051 this should be good to review, fyi @alamb @vbarua @westonpace @tokoko (feel free to lmk if you'd prefer not to be tagged :))

Re: [PR] feat: Support Substrait's IntervalCompound type/literal instead of interval-month-day-nano UDT [datafusion]

2024-10-27 Thread via GitHub
Blizzara commented on code in PR #12112: URL: https://github.com/apache/datafusion/pull/12112#discussion_r1818222135 ## datafusion/substrait/src/logical_plan/producer.rs: ## @@ -2310,39 +2306,6 @@ mod test { Ok(()) } -#[test] -fn custom_type_literal_exten

Re: [PR] feat: Support Substrait's IntervalCompound type/literal instead of interval-month-day-nano UDT [datafusion]

2024-10-27 Thread via GitHub
Blizzara commented on code in PR #12112: URL: https://github.com/apache/datafusion/pull/12112#discussion_r1818221896 ## datafusion/substrait/src/variation_const.rs: ## @@ -96,12 +96,16 @@ pub const INTERVAL_DAY_TIME_TYPE_REF: u32 = 2; /// [`ScalarValue::IntervalMonthDayNano`]:

Re: [PR] [docs]: added `alternative_syntax` function for docs [datafusion]

2024-10-27 Thread via GitHub
Omega359 commented on PR #13140: URL: https://github.com/apache/datafusion/pull/13140#issuecomment-2440208811 > ❤️ I like this! One minor suggestion: I would suggest changing the style of the alternate syntax to match the .with_syntax_example() style Actually, on a second look, maybe

Re: [PR] [docs]: added `alternative_syntax` function for docs [datafusion]

2024-10-27 Thread via GitHub
Omega359 commented on PR #13140: URL: https://github.com/apache/datafusion/pull/13140#issuecomment-2440208396 :heart: I like this! One minor suggestion: I would suggest changing the style of the alternate syntax to match the .with_syntax_example() style

Re: [PR] consider volatile function in simply_expression [datafusion]

2024-10-27 Thread via GitHub
Lordworms commented on code in PR #13128: URL: https://github.com/apache/datafusion/pull/13128#discussion_r1818194880 ## datafusion/optimizer/src/simplify_expressions/utils.rs: ## @@ -341,3 +355,49 @@ pub fn distribute_negation(expr: Expr) -> Expr { _ => Expr::Negative(

Re: [PR] Add absolute_paths clippy lint with 4 maximum segments. [datafusion]

2024-10-27 Thread via GitHub
dhegberg commented on PR #13086: URL: https://github.com/apache/datafusion/pull/13086#issuecomment-2440188367 @findepi I've added the new linter check. One thing I struggled with was the generated code in `datafusion/proto`, I ended up removing the workspace lint setting from that

[PR] Minor: Delete old cume_dist and percent_rank docs [datafusion]

2024-10-27 Thread via GitHub
jonathanc-n opened a new pull request, #13137: URL: https://github.com/apache/datafusion/pull/13137 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? Deleted cume_dist and percent_rank doc

Re: [PR] chore: Use twox-hash 2.0 xxhash64 oneshot api instead of custom implementation [datafusion-comet]

2024-10-27 Thread via GitHub
andygrove commented on PR #1041: URL: https://github.com/apache/datafusion-comet/pull/1041#issuecomment-2440185498 I got slightly different cargo bench results, but I saw no regression in overall TPC-H performance. ``` hash/xxhash64/8192 time: [306.49 µs 307.37 µs 308.27 µs

Re: [PR] [docs]: added `alternative_syntax` function for docs [datafusion]

2024-10-27 Thread via GitHub
jonathanc-n commented on PR #13140: URL: https://github.com/apache/datafusion/pull/13140#issuecomment-2440181206 @Omega359 Does this seem fine?

[PR] [docs]: added `alternative_syntax` function for docs [datafusion]

2024-10-27 Thread via GitHub
jonathanc-n opened a new pull request, #13140: URL: https://github.com/apache/datafusion/pull/13140 ## Which issue does this PR close? Closes #13139 . ## Rationale for this change Added alternative syntax argument for doc functions ## What changes are inclu

Re: [PR] feat: Implement LeftMark join to fix subquery correctness issue [datafusion]

2024-10-27 Thread via GitHub
Dandandan commented on code in PR #13134: URL: https://github.com/apache/datafusion/pull/13134#discussion_r1818178088 ## datafusion/physical-plan/src/joins/symmetric_hash_join.rs: ## @@ -708,6 +713,20 @@ where { // Store the result in a tuple let result = match (build

Re: [I] Casting existing timestamp to timestamp again strips timezone information [datafusion]

2024-10-27 Thread via GitHub
Omega359 commented on issue #12218: URL: https://github.com/apache/datafusion/issues/12218#issuecomment-2440171699 Unfortunately, as it stands, neither the `now()` function nor any other UDF has access to the timezone of the server (though it could likely be retrieved) nor the default tim

Re: [PR] feat: Implement LeftMark join to fix subquery correctness issue [datafusion]

2024-10-27 Thread via GitHub
Dandandan commented on code in PR #13134: URL: https://github.com/apache/datafusion/pull/13134#discussion_r1818178734 ## datafusion/physical-plan/src/joins/symmetric_hash_join.rs: ## @@ -708,6 +713,20 @@ where { // Store the result in a tuple let result = match (build

Re: [I] feat: Add `alternative_syntax` function for docs [datafusion]

2024-10-27 Thread via GitHub
jonathanc-n commented on issue #13139: URL: https://github.com/apache/datafusion/issues/13139#issuecomment-2440170189 take

[I] feat: Add `alternative_syntax` function for docs [datafusion]

2024-10-27 Thread via GitHub
jonathanc-n opened a new issue, #13139: URL: https://github.com/apache/datafusion/issues/13139 ### Is your feature request related to a problem or challenge? This is mentioned in #12740 from @Omega359. The doc migration is almost wrapped up, but there are a few functions that are list

Re: [PR] feat: Implement LeftMark join to fix subquery correctness issue [datafusion]

2024-10-27 Thread via GitHub
Dandandan commented on code in PR #13134: URL: https://github.com/apache/datafusion/pull/13134#discussion_r1818179027 ## datafusion/physical-plan/src/joins/utils.rs: ## @@ -1254,7 +1264,10 @@ pub(crate) fn build_batch_from_indices( let mut columns: Vec> = Vec::with_capacit

Re: [I] Remove old static function pages [datafusion]

2024-10-27 Thread via GitHub
jonathanc-n commented on issue #12741: URL: https://github.com/apache/datafusion/issues/12741#issuecomment-2440169294 Yeah that could work, I can add an issue for that.

Re: [PR] feat: Implement LeftMark join to fix subquery correctness issue [datafusion]

2024-10-27 Thread via GitHub
Dandandan commented on code in PR #13134: URL: https://github.com/apache/datafusion/pull/13134#discussion_r1818176298 ## datafusion/physical-plan/src/joins/sort_merge_join.rs: ## @@ -784,6 +790,29 @@ fn get_corrected_filter_mask( corrected_mask.extend(vec![Some(fals

Re: [PR] feat: Implement LeftMark join to fix subquery correctness issue [datafusion]

2024-10-27 Thread via GitHub
Dandandan commented on code in PR #13134: URL: https://github.com/apache/datafusion/pull/13134#discussion_r1818175721 ## datafusion/physical-plan/src/joins/sort_merge_join.rs: ## @@ -784,6 +790,29 @@ fn get_corrected_filter_mask( corrected_mask.extend(vec![Some(fals

Re: [PR] feat: Implement LeftMark join to fix subquery correctness issue [datafusion]

2024-10-27 Thread via GitHub
Dandandan commented on code in PR #13134: URL: https://github.com/apache/datafusion/pull/13134#discussion_r1818173188 ## datafusion/optimizer/src/decorrelate_predicate_subquery.rs: ## @@ -296,37 +287,26 @@ fn build_join_top( /// /// ```text /// Projection: t1.id -/// Filter

Re: [I] Remove old static function pages [datafusion]

2024-10-27 Thread via GitHub
Omega359 commented on issue #12741: URL: https://github.com/apache/datafusion/issues/12741#issuecomment-2440158910 That may work, though it doesn't help in documenting the difference in syntax. I was also wondering about adding an `alternative_syntax(...)` section to the Documentation struct fo

Re: [PR] chore: Use twox-hash 2.0 xxhash64 oneshot api instead of custom implementation [datafusion-comet]

2024-10-27 Thread via GitHub
codecov-commenter commented on PR #1041: URL: https://github.com/apache/datafusion-comet/pull/1041#issuecomment-2440158917 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1041?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] chore: Use twox-hash 2.0 xxhash64 oneshot api instead of custom implementation [datafusion-comet]

2024-10-27 Thread via GitHub
andygrove commented on PR #1041: URL: https://github.com/apache/datafusion-comet/pull/1041#issuecomment-2440157392 > How should I proceed with the pipeline failure which comes from a new clippy rule (manual_pattern_char_comparison) introduced in 1.81.0? Should I apply the suggestion or cre

Re: [PR] Improve TableScan with filters pushdown unparsing (joins) [datafusion]

2024-10-27 Thread via GitHub
sgrebnov commented on code in PR #13132: URL: https://github.com/apache/datafusion/pull/13132#discussion_r1818169716 ## datafusion/sql/tests/cases/plan_to_sql.rs: ## @@ -991,6 +991,68 @@ fn test_sort_with_push_down_fetch() -> Result<()> { Ok(()) } +#[test] +fn test_join_

Re: [PR] refactor: Use twox-hash 2.0 xxhash64 oneshot api instead of custom implementation [datafusion-comet]

2024-10-27 Thread via GitHub
NoeB commented on PR #1041: URL: https://github.com/apache/datafusion-comet/pull/1041#issuecomment-2440139941 > > I am unsure if license.txt and Notice.txt also need to be updated The references regarding the code from twox-hash got introduced with #575 > > Yes, the twox_hash referen

Re: [PR] refactor: Use twox-hash 2.0 xxhash64 oneshot api instead of custom implementation [datafusion-comet]

2024-10-27 Thread via GitHub
NoeB commented on PR #1041: URL: https://github.com/apache/datafusion-comet/pull/1041#issuecomment-2440139696 How should I proceed with the pipeline failure which comes from a new clippy rule (manual_pattern_char_comparison) introduced in 1.81.0? Should I apply the suggestion or create

Re: [PR] feat: Implement LeftMark join to fix subquery correctness issue [datafusion]

2024-10-27 Thread via GitHub
eejbyfeldt commented on code in PR #13134: URL: https://github.com/apache/datafusion/pull/13134#discussion_r1818159207 ## datafusion/common/src/join_type.rs: ## @@ -113,6 +118,9 @@ pub enum JoinSide { Left, /// Right side of the join Right, +/// Neither side o

Re: [PR] feat: Implement LeftMark join to fix subquery correctness issue [datafusion]

2024-10-27 Thread via GitHub
eejbyfeldt commented on code in PR #13134: URL: https://github.com/apache/datafusion/pull/13134#discussion_r1818159762 ## datafusion/substrait/tests/cases/roundtrip_logical_plan.rs: ## @@ -473,15 +473,15 @@ async fn roundtrip_inlist_5() -> Result<()> { // on roundtrip there

[I] Implement RightMark join [datafusion]

2024-10-27 Thread via GitHub
eejbyfeldt opened a new issue, #13138: URL: https://github.com/apache/datafusion/issues/13138 ### Is your feature request related to a problem or challenge? In https://github.com/apache/datafusion/pull/13134 we add a LeftMark join. To allow us to swap the side we build the hashtable f

Re: [PR] POC: Vectorized hashtable for aggregation [datafusion]

2024-10-27 Thread via GitHub
Dandandan commented on code in PR #12996: URL: https://github.com/apache/datafusion/pull/12996#discussion_r1818156207 ## datafusion/physical-plan/src/aggregates/group_values/group_column.rs: ## @@ -287,6 +469,63 @@ where }; } +fn vectorized_equal_to( Review

Re: [I] Remove old static function pages [datafusion]

2024-10-27 Thread via GitHub
jonathanc-n commented on issue #12741: URL: https://github.com/apache/datafusion/issues/12741#issuecomment-2440125307 @Omega359 I'm thinking we can probably put them as aliases?

Re: [I] Add spilling support for HashJoin [datafusion]

2024-10-27 Thread via GitHub
dmitrybugakov commented on issue #12952: URL: https://github.com/apache/datafusion/issues/12952#issuecomment-2440125370 take

Re: [PR] Use twox-hash 2.0 xxhash64 oneshot api instead of custom implementation [datafusion-comet]

2024-10-27 Thread via GitHub
andygrove commented on PR #1041: URL: https://github.com/apache/datafusion-comet/pull/1041#issuecomment-2440125276 > I am unsure if license.txt and Notice.txt also need to be updated The references regarding the code from twox-hash got introduced with #575 Yes, the twox_hash referenc

Re: [PR] POC: Vectorized hashtable for aggregation [datafusion]

2024-10-27 Thread via GitHub
Dandandan commented on code in PR #12996: URL: https://github.com/apache/datafusion/pull/12996#discussion_r1818154221 ## datafusion/physical-plan/src/aggregates/group_values/column.rs: ## @@ -125,6 +233,292 @@ impl GroupValuesColumn { | DataType::BinaryView

Re: [PR] feat: Implement LeftMark join to fix subquery correctness issue [datafusion]

2024-10-27 Thread via GitHub
Dandandan commented on code in PR #13134: URL: https://github.com/apache/datafusion/pull/13134#discussion_r1818152032 ## datafusion/substrait/tests/cases/roundtrip_logical_plan.rs: ## @@ -473,15 +473,15 @@ async fn roundtrip_inlist_5() -> Result<()> { // on roundtrip there

Re: [I] [DISCUSSION] Make DataFusion the fastest engine for querying parquet data in ClickBench [datafusion]

2024-10-27 Thread via GitHub
Rachelint commented on issue #12821: URL: https://github.com/apache/datafusion/issues/12821#issuecomment-2440106787 I made a poc https://github.com/apache/datafusion/pull/12996#issuecomment-2440105534 about what @Dandandan mentioned in https://github.com/apache/datafusion/issues/12821#issu

Re: [PR] POC: Vectorized hashtable for aggregation [datafusion]

2024-10-27 Thread via GitHub
Rachelint commented on PR #12996: URL: https://github.com/apache/datafusion/pull/12996#issuecomment-2440105531 It is good to see that the vectorized approach is promising! The remaining work is to find why it makes `q28` slower. ``` Benchmark clickbench_1.j

Re: [PR] POC: Vectorized hashtable for aggregation [datafusion]

2024-10-27 Thread via GitHub
Rachelint commented on PR #12996: URL: https://github.com/apache/datafusion/pull/12996#issuecomment-2440105534 It is good to see that the vectorized approach is promising! The remaining work is to find why it makes `q28` slower. ``` Benchmark clickbench_1.j

Re: [I] filter_push_down internal error [datafusion]

2024-10-27 Thread via GitHub
andygrove closed issue #3416: filter_push_down internal error URL: https://github.com/apache/datafusion/issues/3416

Re: [I] Improve DataFusion scalability as more cores are added [datafusion]

2024-10-27 Thread via GitHub
andygrove closed issue #5999: Improve DataFusion scalability as more cores are added URL: https://github.com/apache/datafusion/issues/5999

Re: [PR] [minor] overload from_unixtime func to have optional timezone parameter [datafusion]

2024-10-27 Thread via GitHub
jonathanc-n commented on PR #13130: URL: https://github.com/apache/datafusion/pull/13130#issuecomment-2440099653 I can probably finish the rest of the issue in another pr after this is merged

Re: [PR] [minor] overload from_unixtime func to have optional timezone parameter [datafusion]

2024-10-27 Thread via GitHub
buraksenn commented on code in PR #13130: URL: https://github.com/apache/datafusion/pull/13130#discussion_r1818134503 ## datafusion/functions/src/datetime/from_unixtime.rs: ## @@ -93,12 +124,59 @@ fn get_from_unixtime_doc() -> &'static Documentation { Documentation::bui

Re: [PR] Executor configuration accepts SessionState .. [datafusion-ballista]

2024-10-27 Thread via GitHub
milenkovicm commented on PR #1099: URL: https://github.com/apache/datafusion-ballista/pull/1099#issuecomment-2440075782 Thanks @andygrove, I need another pass to clean it up and add documentation. Will let you know when ready

Re: [PR] Use twox-hash 2.0 xxhash64 oneshot api instead of custom implementation [datafusion-comet]

2024-10-27 Thread via GitHub
NoeB commented on PR #1041: URL: https://github.com/apache/datafusion-comet/pull/1041#issuecomment-2440075895 Cargo Bench results: ```bash hash/xxhash64/8192 time: [414.63 µs 415.13 µs 415.67 µs] change: [-6.6364% -6.3063% -5.9526%] (p = 0.00 < 0.

Re: [PR] [minor] overload from_unixtime func to have optional timezone parameter [datafusion]

2024-10-27 Thread via GitHub
Omega359 commented on code in PR #13130: URL: https://github.com/apache/datafusion/pull/13130#discussion_r1818128858 ## datafusion/functions/src/datetime/from_unixtime.rs: ## @@ -93,12 +124,59 @@ fn get_from_unixtime_doc() -> &'static Documentation { Documentation::buil

Re: [PR] Use twox-hash 2.0 xxhash64 oneshot api instead of custom implementation [datafusion-comet]

2024-10-27 Thread via GitHub
NoeB commented on PR #1041: URL: https://github.com/apache/datafusion-comet/pull/1041#issuecomment-2440074569 I am unsure if license.txt and Notice.txt also need to be updated. They got introduced with #575

Re: [PR] [minor] overload from_unixtime func to have optional timezone parameter [datafusion]

2024-10-27 Thread via GitHub
Omega359 commented on code in PR #13130: URL: https://github.com/apache/datafusion/pull/13130#discussion_r1818126671 ## docs/source/user-guide/sql/scalar_functions_new.md: ## @@ -2003,12 +2003,13 @@ _Alias of [date_trunc](#date_trunc)._ Converts an integer to RFC3339 timestamp

[PR] Use twox-hash 2.0 xxhash64 oneshot api instead of custom implementation [datafusion-comet]

2024-10-27 Thread via GitHub
NoeB opened a new pull request, #1041: URL: https://github.com/apache/datafusion-comet/pull/1041 ## Which issue does this PR close? Closes #1032 ## Rationale for this change ## What changes are included in this PR? - Use twox-hash 2.0 oneshot API instead of

Re: [PR] [minor] overload from_unixtime func to have optional timezone parameter [datafusion]

2024-10-27 Thread via GitHub
Omega359 commented on code in PR #13130: URL: https://github.com/apache/datafusion/pull/13130#discussion_r1818128242 ## datafusion/functions/src/datetime/from_unixtime.rs: ## @@ -93,12 +124,59 @@ fn get_from_unixtime_doc() -> &'static Documentation { Documentation::buil

Re: [PR] [minor] overload from_unixtime func to have optional timezone parameter [datafusion]

2024-10-27 Thread via GitHub
Omega359 commented on code in PR #13130: URL: https://github.com/apache/datafusion/pull/13130#discussion_r1818127102 ## datafusion/functions/src/datetime/from_unixtime.rs: ## @@ -93,12 +124,59 @@ fn get_from_unixtime_doc() -> &'static Documentation { Documentation::buil

Re: [PR] Improve TableScan with filters pushdown unparsing (joins) [datafusion]

2024-10-27 Thread via GitHub
goldmedal commented on code in PR #13132: URL: https://github.com/apache/datafusion/pull/13132#discussion_r1818126338 ## datafusion/sql/tests/cases/plan_to_sql.rs: ## @@ -991,6 +991,68 @@ fn test_sort_with_push_down_fetch() -> Result<()> { Ok(()) } +#[test] +fn test_join

Re: [PR] Improve TableScan with filters pushdown unparsing (multiple filters) [datafusion]

2024-10-27 Thread via GitHub
goldmedal merged PR #13131: URL: https://github.com/apache/datafusion/pull/13131

Re: [PR] Add basic support for `unnest` unparsing [datafusion]

2024-10-27 Thread via GitHub
goldmedal commented on code in PR #13129: URL: https://github.com/apache/datafusion/pull/13129#discussion_r1818112492 ## datafusion/sql/src/unparser/expr.rs: ## @@ -1340,6 +1341,29 @@ impl Unparser<'_> { } } +/// Converts an UNNEST operation to an AST express

Re: [PR] feat: Implement LeftMark join to fix subquery correctness issue [datafusion]

2024-10-27 Thread via GitHub
Dandandan commented on code in PR #13134: URL: https://github.com/apache/datafusion/pull/13134#discussion_r1818113292 ## datafusion/common/src/join_type.rs: ## @@ -113,6 +118,9 @@ pub enum JoinSide { Left, /// Right side of the join Right, +/// Neither side of

Re: [PR] feat: Implement LeftMark join to fix subquery correctness issue [datafusion]

2024-10-27 Thread via GitHub
Dandandan commented on code in PR #13134: URL: https://github.com/apache/datafusion/pull/13134#discussion_r1818113940 ## datafusion/physical-plan/src/joins/utils.rs: ## @@ -1171,6 +1173,15 @@ pub(crate) fn get_final_indices_from_bit_map( join_type: JoinType, ) -> (UInt64Ar

[PR] fix `cargo run` error [datafusion-sqlparser-rs]

2024-10-27 Thread via GitHub
wugeer opened a new pull request, #1486: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1486 This resolves issue https://github.com/apache/datafusion-sqlparser-rs/issues/1484

Re: [PR] FFI initial implementation [datafusion]

2024-10-27 Thread via GitHub
timsaucer commented on code in PR #12920: URL: https://github.com/apache/datafusion/pull/12920#discussion_r1818111708 ## datafusion/ffi/src/plan_properties.rs: ## @@ -0,0 +1,330 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agr

[PR] Add physical plan properties to protobuf definition [datafusion]

2024-10-27 Thread via GitHub
timsaucer opened a new pull request, #13136: URL: https://github.com/apache/datafusion/pull/13136 ## Which issue does this PR close? This is to support https://github.com/apache/datafusion-python/issues/823 and to address [this comment](https://github.com/apache/datafusion/pull/12920

Re: [PR] POC: Try to optimize `columnized_output_exprs` [datafusion]

2024-10-27 Thread via GitHub
goldmedal commented on PR #13018: URL: https://github.com/apache/datafusion/pull/13018#issuecomment-2440038947 > However, then for each expr it builds up the same hash table: > > https://github.com/apache/datafusion/blob/ac827abe1b66b1dfa02ce65ae857477f68667843/datafusion/expr/src/uti

[I] Decorrelated predicate subqueries with disjunction has duplicated rows [datafusion]

2024-10-27 Thread via GitHub
eejbyfeldt opened a new issue, #13135: URL: https://github.com/apache/datafusion/issues/13135 ### Describe the bug In https://github.com/apache/datafusion/pull/12945 the emulation of a mark join has a bug when there are duplicate values in the subquery. This would be fixable by addin

[PR] feat: Support LeftMark join to fix subquery correctness issue [datafusion]

2024-10-27 Thread via GitHub
eejbyfeldt opened a new pull request, #13134: URL: https://github.com/apache/datafusion/pull/13134 ## Which issue does this PR close? Closes #. ## Rationale for this change In https://github.com/apache/datafusion/pull/12945 the emulation of a mark join has a bug w

Re: [PR] Round robin polling between tied winners in sort preserving merge [datafusion]

2024-10-27 Thread via GitHub
jayzhan211 commented on code in PR #13133: URL: https://github.com/apache/datafusion/pull/13133#discussion_r1818076266 ## datafusion/physical-plan/src/sorts/merge.rs: ## @@ -127,12 +148,18 @@ impl SortPreservingMergeStream { metrics, aborted: false,

Re: [PR] Round robin polling between tied winners in sort preserving merge [datafusion]

2024-10-27 Thread via GitHub
jayzhan211 commented on code in PR #13133: URL: https://github.com/apache/datafusion/pull/13133#discussion_r1818078524 ## datafusion/physical-plan/src/sorts/merge.rs: ## @@ -127,12 +148,18 @@ impl SortPreservingMergeStream { metrics, aborted: false,

Re: [PR] feat(logical-types): add NativeType and LogicalType [datafusion]

2024-10-27 Thread via GitHub
goldmedal commented on code in PR #12853: URL: https://github.com/apache/datafusion/pull/12853#discussion_r1818073887 ## datafusion/common/src/types/native.rs: ## @@ -0,0 +1,399 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agr

Re: [I] Error with type coercion with `CREATE TABLE AS SELECT` ... inserting `VALUES` [datafusion]

2024-10-27 Thread via GitHub
jayzhan211 commented on issue #13124: URL: https://github.com/apache/datafusion/issues/13124#issuecomment-2439984175 I would prefer not to block on the query not supported by PostgreSQL, especially when it's not intended for support in DataFusion (no tests or issues related). The que

Re: [PR] Round robin polling between tied winners in sort preserving merge [datafusion]

2024-10-27 Thread via GitHub
Dandandan commented on code in PR #13133: URL: https://github.com/apache/datafusion/pull/13133#discussion_r1818071424 ## datafusion/physical-plan/src/sorts/merge.rs: ## @@ -127,12 +148,18 @@ impl SortPreservingMergeStream { metrics, aborted: false,

Re: [PR] hive: support for special not expression `!a` and raise error for `a!` factorial operator [datafusion-sqlparser-rs]

2024-10-27 Thread via GitHub
wugeer commented on PR #1472: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1472#issuecomment-2439971661 @iffyio Sorry, my oversight caused the code not to be merged. The code has been re-pushed.

Re: [PR] hive: support for special not expression `!a` and raise error for `a!` factorial operator [datafusion-sqlparser-rs]

2024-10-27 Thread via GitHub
wugeer commented on code in PR #1472: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1472#discussion_r1818063793 ## src/parser/mod.rs: ## @@ -1174,6 +1175,14 @@ impl<'a> Parser<'a> { ), }) } +Token::Exc

Re: [PR] hive: support for special not expression `!a` and raise error for `a!` factorial operator [datafusion-sqlparser-rs]

2024-10-27 Thread via GitHub
wugeer commented on code in PR #1472: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1472#discussion_r1818063632 ## src/ast/operator.rs: ## @@ -51,6 +51,8 @@ pub enum UnaryOperator { PGPrefixFactorial, /// Absolute value, e.g. `@ -9` (PostgreSQL-specific)

Re: [PR] feat(logical-types): add NativeType and LogicalType [datafusion]

2024-10-27 Thread via GitHub
jayzhan211 commented on code in PR #12853: URL: https://github.com/apache/datafusion/pull/12853#discussion_r1818059422 ## datafusion/common/src/types/native.rs: ## @@ -0,0 +1,399 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license ag

Re: [PR] feat(logical-types): add NativeType and LogicalType [datafusion]

2024-10-27 Thread via GitHub
jayzhan211 commented on code in PR #12853: URL: https://github.com/apache/datafusion/pull/12853#discussion_r1818058563 ## datafusion/common/src/types/native.rs: ## @@ -0,0 +1,399 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license ag

[PR] Add support for PostgreSQL LISTEN/NOTIFY [datafusion-sqlparser-rs]

2024-10-27 Thread via GitHub
wugeer opened a new pull request, #1485: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1485 This PR supports the `LISTEN/NOTIFY` clause for the postgres dialect. For more information, please refer to: https://www.postgresql.org/docs/current/sql-listen.html https://www.postgresql

Re: [PR] Round robin polling between tied winners in sort preserving merge [datafusion]

2024-10-27 Thread via GitHub
jayzhan211 commented on code in PR #13133: URL: https://github.com/apache/datafusion/pull/13133#discussion_r1818032540 ## datafusion/physical-plan/src/sorts/merge.rs: ## @@ -327,16 +409,79 @@ impl SortPreservingMergeStream { self.loser_tree_adjusted = true; } -

Re: [PR] Do not push down filter through distinct on [datafusion]

2024-10-27 Thread via GitHub
epsio-banay commented on code in PR #12943: URL: https://github.com/apache/datafusion/pull/12943#discussion_r1818028101 ## datafusion/optimizer/src/push_down_filter.rs: ## @@ -628,6 +627,103 @@ fn infer_join_predicates( .collect::>>() } +/// Check whether the given e
