Re: [PR] Add generate_series() udtf (and introduce 'lazy' `MemoryExec`) [datafusion]

2024-11-29 Thread via GitHub
2010YOUY01 commented on code in PR #13540: URL: https://github.com/apache/datafusion/pull/13540#discussion_r1864147572 ## datafusion/functions-table/src/generate_series.rs: ## @@ -0,0 +1,180 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributo

Re: [PR] Fix: JOIN should require ON condition [datafusion-sqlparser-rs]

2024-11-29 Thread via GitHub
demetribu commented on code in PR #1552: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1552#discussion_r1864152593 ## src/dialect/mod.rs: ## @@ -687,6 +687,41 @@ pub trait Dialect: Debug + Any { fn is_reserved_for_identifier(&self, kw: Keyword) -> bool {

Re: [PR] Add generate_series() udtf (and introduce 'lazy' `MemoryExec`) [datafusion]

2024-11-29 Thread via GitHub
2010YOUY01 commented on PR #13540: URL: https://github.com/apache/datafusion/pull/13540#issuecomment-2508873495 > Thank you @2010YOUY01. I reviewed the changes and LGTM. I have a few minor comments and one question: I noticed another approach of `generate_series()`, which can be used like t

Re: [PR] Add generate_series() udtf (and introduce 'lazy' `MemoryExec`) [datafusion]

2024-11-29 Thread via GitHub
2010YOUY01 commented on code in PR #13540: URL: https://github.com/apache/datafusion/pull/13540#discussion_r1864147810 ## datafusion/physical-plan/src/memory.rs: ## @@ -365,8 +366,165 @@ impl RecordBatchStream for MemoryStream { } } +pub trait StreamingBatchGenerator: Se

Re: [PR] Add generate_series() udtf (and introduce 'lazy' `MemoryExec`) [datafusion]

2024-11-29 Thread via GitHub
2010YOUY01 commented on code in PR #13540: URL: https://github.com/apache/datafusion/pull/13540#discussion_r1864147468 ## datafusion/core/src/execution/session_state_defaults.rs: ## @@ -119,6 +120,11 @@ impl SessionStateDefaults { functions_window::all_default_window_fu

[PR] Replace is_sorted helper with standard one. [datafusion]

2024-11-29 Thread via GitHub
akurmustafa opened a new pull request, #13608: URL: https://github.com/apache/datafusion/pull/13608 ## Which issue does this PR close? Closes #. ## Rationale for this change With rust release `1.82.0` `is_sorted` util is available for iterators. This PR replaces exis

Re: [PR] Fix: JOIN should require ON condition [datafusion-sqlparser-rs]

2024-11-29 Thread via GitHub
iffyio commented on code in PR #1552: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1552#discussion_r1864122905 ## src/dialect/mod.rs: ## @@ -687,6 +687,41 @@ pub trait Dialect: Debug + Any { fn is_reserved_for_identifier(&self, kw: Keyword) -> bool {

Re: [PR] chore: Fix clippy issues after rust update (1.83.0) [datafusion-ballista]

2024-11-29 Thread via GitHub
andygrove merged PR #1143: URL: https://github.com/apache/datafusion-ballista/pull/1143 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr.

Re: [PR] Compare schema as logically equivalent to workaround disappearing metadata [datafusion]

2024-11-29 Thread via GitHub
github-actions[bot] commented on PR #12631: URL: https://github.com/apache/datafusion/pull/12631#issuecomment-2508785091 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] Improve unsupported compound identifier message [datafusion]

2024-11-29 Thread via GitHub
Weijun-H commented on code in PR #13605: URL: https://github.com/apache/datafusion/pull/13605#discussion_r1864040114 ## datafusion/sqllogictest/test_files/errors.slt: ## @@ -70,7 +70,7 @@ SELECT COUNT(*) FROM nonexistentschema.aggregate_test_100 statement error Error during pla

[PR] Do not merge: Implementing Unit testing for Python [datafusion-ray]

2024-11-29 Thread via GitHub
edmondop opened a new pull request, #50: URL: https://github.com/apache/datafusion-ray/pull/50 Looking at #42 I think we should fix this before modifying the Python code. I was surprised to see the second test succeeding and the first failing btw

Re: [PR] Add SimpleScalarUDF::new_with_signature [datafusion]

2024-11-29 Thread via GitHub
jayzhan211 commented on PR #13592: URL: https://github.com/apache/datafusion/pull/13592#issuecomment-2508760565 Thanks @findepi @alamb

Re: [PR] Add SimpleScalarUDF::new_with_signature [datafusion]

2024-11-29 Thread via GitHub
jayzhan211 merged PR #13592: URL: https://github.com/apache/datafusion/pull/13592

Re: [PR] [minor]: Update median implementation [datafusion]

2024-11-29 Thread via GitHub
akurmustafa merged PR #13554: URL: https://github.com/apache/datafusion/pull/13554

Re: [PR] fix: [comet-parquet-exec] Use filePath instead of pathUri [datafusion-comet]

2024-11-29 Thread via GitHub
viirya merged PR #1124: URL: https://github.com/apache/datafusion-comet/pull/1124

Re: [PR] fix: [comet-parquet-exec] Use filePath instead of pathUri [datafusion-comet]

2024-11-29 Thread via GitHub
viirya commented on PR #1124: URL: https://github.com/apache/datafusion-comet/pull/1124#issuecomment-2508703710 Thanks @andygrove

Re: [I] Ballista Python Issue(s) [datafusion-ballista]

2024-11-29 Thread via GitHub
milenkovicm commented on issue #1142: URL: https://github.com/apache/datafusion-ballista/issues/1142#issuecomment-2508681263 Draft patch to illustrate "Possible Solution (I)", for `datafusion-python` (v42) which would solve (py)ballista issues: ```diff diff --git a/Cargo.lock b/Ca

Re: [PR] [POC] Fuse operations in `equal_rows_arr` [datafusion]

2024-11-29 Thread via GitHub
LeslieKid commented on PR #13607: URL: https://github.com/apache/datafusion/pull/13607#issuecomment-2508640484 Major changes currently: - Compare arrays with indices (in a for loop) without `take+eq`. - Update a single boolean buffer instead of creating a new one every time. I am t

[PR] Encapsulate CreateFunction [datafusion-sqlparser-rs]

2024-11-29 Thread via GitHub
philipcristiano opened a new pull request, #1573: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1573 Extract CreateFunction from Statement enum into its own struct. Continue moving structs as part of https://github.com/apache/datafusion-sqlparser-rs/issues/1204

Re: [I] Improve vectorized operations of `GroupColumn` [datafusion]

2024-11-29 Thread via GitHub
LeslieKid commented on issue #13275: URL: https://github.com/apache/datafusion/issues/13275#issuecomment-2508540983 > 🤔 I think maybe we can indeed try some solutions without `take` in join? @LeslieKid seems trying it. Yes, I am working on comparing arrays with indices in a for-loop (

[PR] [POC] Fuse operations in `equal_rows_arr` [datafusion]

2024-11-29 Thread via GitHub
LeslieKid opened a new pull request, #13607: URL: https://github.com/apache/datafusion/pull/13607 ## Which issue does this PR close? Closes #12131 . ## Rationale for this change ## What changes are included in this PR? ## Are these changes t

Re: [I] [DISCUSSION] Making it easier to use DataFusion (lessons from GlareDB) [datafusion]

2024-11-29 Thread via GitHub
tustvold commented on issue #13525: URL: https://github.com/apache/datafusion/issues/13525#issuecomment-2508217233 R.e. WASM32 * Arrow support for 32-bit architectures - https://github.com/apache/arrow-rs/issues/6681 * Object Store support for 32-bit architectures - https://github

Re: [PR] More rigorous treatment of floats in tests [datafusion]

2024-11-29 Thread via GitHub
findepi commented on PR #13590: URL: https://github.com/apache/datafusion/pull/13590#issuecomment-2508177283 > * The IEEE-754 basic arithmetic operations are reproducible. > ... > * Floating-point arithmetic is not associative. These two points imply that a database's Floating-po

Re: [PR] Add SimpleScalarUDF::new_with_signature [datafusion]

2024-11-29 Thread via GitHub
findepi commented on code in PR #13592: URL: https://github.com/apache/datafusion/pull/13592#discussion_r1863791357 ## datafusion/expr/src/expr_fn.rs: ## @@ -434,10 +434,22 @@ impl SimpleScalarUDF { volatility: Volatility, fun: ScalarFunctionImplementation,

[PR] feat: Record Arrow FFI metrics [datafusion-comet]

2024-11-29 Thread via GitHub
andygrove opened a new pull request, #1128: URL: https://github.com/apache/datafusion-comet/pull/1128 ## Which issue does this PR close? N/A ## Rationale for this change This is a subset of https://github.com/apache/datafusion-comet/pull/, separated o

Re: [PR] Documentation updates: simplify examples and add section on data sources [datafusion-python]

2024-11-29 Thread via GitHub
andygrove merged PR #955: URL: https://github.com/apache/datafusion-python/pull/955

Re: [I] [DISCUSSION] More SqlLogicTest test coverage for queries, including join queries [datafusion]

2024-11-29 Thread via GitHub
Omega359 commented on issue #13470: URL: https://github.com/apache/datafusion/issues/13470#issuecomment-2508137692 I came across [this paper](https://dl.acm.org/doi/10.1145/3654991) which I found interesting for join testing. It won't work for DF yet since DF doesn't support query hints (at

[PR] chore: Fix clippy issues after rust update (1.83.0) [datafusion-ballista]

2024-11-29 Thread via GitHub
milenkovicm opened a new pull request, #1143: URL: https://github.com/apache/datafusion-ballista/pull/1143 Fix clippy issues after rust update (1.83.0), + change a logger from warn to debug, as we can't do much with those errors at the moment and they happen even with core datafusion c

Re: [PR] feat(scheduler): add csv support for get_file_metadata grpc method [datafusion-ballista]

2024-11-29 Thread via GitHub
etolbakov commented on PR #1011: URL: https://github.com/apache/datafusion-ballista/pull/1011#issuecomment-2508119888 @milenkovicm makes sense! Thanks! I’m already in the discord, will check the ballista stream. Happy if you close the issue.

Re: [PR] feat(scheduler): add csv support for get_file_metadata grpc method [datafusion-ballista]

2024-11-29 Thread via GitHub
milenkovicm commented on PR #1011: URL: https://github.com/apache/datafusion-ballista/pull/1011#issuecomment-2508116445 As the project was not actively maintained and a lot of work had accumulated, we decided to change the project's scope and remove code so maintainers have a smaller code base to maintain (

Re: [PR] feat(scheduler): add csv support for get_file_metadata grpc method [datafusion-ballista]

2024-11-29 Thread via GitHub
etolbakov commented on PR #1011: URL: https://github.com/apache/datafusion-ballista/pull/1011#issuecomment-2508078694 @milenkovicm @Dandandan sorry for the confusion, as per description I was exploring the code and came across the TODO about the csv file format support. So decided to ad

Re: [PR] More rigorous treatment of floats in tests [datafusion]

2024-11-29 Thread via GitHub
leoyvens commented on PR #13590: URL: https://github.com/apache/datafusion/pull/13590#issuecomment-2508063268 There are myths and truths to floating-point reproducibility across platforms. Some facts I've gathered while working on this: 1. f32 and f64 in Rust follow IEEE-754. 2. Th

Re: [PR] Improve unsupported compound identifier message [datafusion]

2024-11-29 Thread via GitHub
jonahgao commented on code in PR #13605: URL: https://github.com/apache/datafusion/pull/13605#discussion_r1863695929 ## datafusion/sql/src/planner.rs: ## @@ -622,24 +622,41 @@ pub fn object_name_to_table_reference( idents_to_table_reference(idents, enable_normalization) }

Re: [PR] feat: add expression array_size [datafusion-comet]

2024-11-29 Thread via GitHub
Groennbeck commented on PR #1122: URL: https://github.com/apache/datafusion-comet/pull/1122#issuecomment-2508022496 https://github.com/apache/datafusion/pull/13600 Have to wait for this to get into the next version

Re: [PR] More rigorous treatment of floats in tests [datafusion]

2024-11-29 Thread via GitHub
jonahgao commented on PR #13590: URL: https://github.com/apache/datafusion/pull/13590#issuecomment-2508009725 > If we value portability, I'd propose that we switch to `libm`, which is what I've implemented in the second commit. I think portability is not necessary, and [PostgreSQL](

Re: [PR] Fix `LogicalPlan::..._with_subqueries` methods [datafusion]

2024-11-29 Thread via GitHub
peter-toth commented on PR #13589: URL: https://github.com/apache/datafusion/pull/13589#issuecomment-2508004060 > I have one minor suggestion: if it doesn’t require much effort, could we also add a test to ensure that jump does not continue to visit subqueries also? Even though this behavio

Re: [PR] More rigorous treatment of floats in tests [datafusion]

2024-11-29 Thread via GitHub
jonahgao commented on code in PR #13590: URL: https://github.com/apache/datafusion/pull/13590#discussion_r1863646359 ## datafusion/sqllogictest/test_files/aggregate_skip_partial.slt: ## @@ -261,11 +261,11 @@ SELECT c2, min(c5), max(c5), min(c11), max(c11) FROM aggregate_test_10

Re: [PR] Tidy up join test code [datafusion]

2024-11-29 Thread via GitHub
berkaysynnada merged PR #13604: URL: https://github.com/apache/datafusion/pull/13604

Re: [I] Allow to unparse `UNNEST` plan back to a table function SQL text [datafusion]

2024-11-29 Thread via GitHub
jatin510 commented on issue #13601: URL: https://github.com/apache/datafusion/issues/13601#issuecomment-2507913083 take

[I] Missing "INFO" log level [datafusion-comet]

2024-11-29 Thread via GitHub
SalcaE opened a new issue, #1127: URL: https://github.com/apache/datafusion-comet/issues/1127 ### Describe the bug I'm running Comet 0.4.0 and I can't see the log level info to check if Comet is enabled for Spark SQL query, which I was able to do with version 0.3.0. ### Steps t

[I] Comet 0.3.0 jar file not present [datafusion-comet]

2024-11-29 Thread via GitHub
SalcaE opened a new issue, #1126: URL: https://github.com/apache/datafusion-comet/issues/1126 ### Describe the bug I noticed that the file "comet-spark-spark3.4_2.12-0.3.0.jar" is not present in the "jars" folder inside the [image](https://github.com/apache/datafusion-comet/pkgs/cont

Re: [PR] Add SimpleScalarUDF::new_with_signature [datafusion]

2024-11-29 Thread via GitHub
alamb commented on code in PR #13592: URL: https://github.com/apache/datafusion/pull/13592#discussion_r1863528647 ## datafusion/expr/src/expr_fn.rs: ## @@ -434,10 +434,22 @@ impl SimpleScalarUDF { volatility: Volatility, fun: ScalarFunctionImplementation,

Re: [I] [substrait] support try_cast [datafusion]

2024-11-29 Thread via GitHub
alamb closed issue #13419: [substrait] support try_cast URL: https://github.com/apache/datafusion/issues/13419

Re: [PR] feat(substrait): support-try-cast [datafusion]

2024-11-29 Thread via GitHub
alamb merged PR #13562: URL: https://github.com/apache/datafusion/pull/13562

Re: [PR] More rigorous treatment of floats in tests [datafusion]

2024-11-29 Thread via GitHub
findepi commented on PR #13590: URL: https://github.com/apache/datafusion/pull/13590#issuecomment-2507810984 > So I went looking to see if there was a performant but portable float math library we could use. [libm](https://github.com/rust-lang/libm) seems to be it, it's what rustc uses when

Re: [PR] More rigorous treatment of floats in tests [datafusion]

2024-11-29 Thread via GitHub
findepi commented on code in PR #13590: URL: https://github.com/apache/datafusion/pull/13590#discussion_r1863517475 ## datafusion/sqllogictest/test_files/aggregate_skip_partial.slt: ## @@ -261,11 +261,11 @@ SELECT c2, min(c5), max(c5), min(c11), max(c11) FROM aggregate_test_100

[PR] Test for string / numeric coercion [datafusion]

2024-11-29 Thread via GitHub
alamb opened a new pull request, #13606: URL: https://github.com/apache/datafusion/pull/13606 ## Which issue does this PR close? - Part of https://github.com/apache/datafusion/issues/13359 ## Rationale for this change While testing https://github.com/apache/datafusion

Re: [PR] More rigorous treatment of floats in tests [datafusion]

2024-11-29 Thread via GitHub
leoyvens commented on PR #13590: URL: https://github.com/apache/datafusion/pull/13590#issuecomment-2507780521 There was the following test failure on amd64 and win64: ``` External error: query result mismatch: [SQL] select acos(0), acos(0.5), acos(1); [Diff] (-expected|+actual)

Re: [PR] Improve unsupported compound identifier message [datafusion]

2024-11-29 Thread via GitHub
alamb commented on code in PR #13605: URL: https://github.com/apache/datafusion/pull/13605#discussion_r1863452171 ## datafusion/sqllogictest/test_files/errors.slt: ## @@ -70,7 +70,7 @@ SELECT COUNT(*) FROM nonexistentschema.aggregate_test_100 statement error Error during planni

Re: [PR] feat: Add implicit casting to `TypeSignature::String` [datafusion]

2024-11-29 Thread via GitHub
Omega359 commented on PR #13404: URL: https://github.com/apache/datafusion/pull/13404#issuecomment-2507747026 > An alternative approach is that we need to differentiate `string literal` and `varchar` like Postgres and DuckDB. Only untyped `string literal` is able to cast to any other types,

Re: [PR] feat: Add implicit casting to `TypeSignature::String` [datafusion]

2024-11-29 Thread via GitHub
Omega359 commented on PR #13404: URL: https://github.com/apache/datafusion/pull/13404#issuecomment-2507744060 > @Omega359 How about we make this configurable? Enable implicit coercion if we want the ease of use and the casting cost is acceptable, disable it if we prefer explicit casting wit

Re: [I] "recursive" Dependency Causes "section too large" Error When Compiling for wasm [datafusion]

2024-11-29 Thread via GitHub
berkaysynnada commented on issue #13513: URL: https://github.com/apache/datafusion/issues/13513#issuecomment-2507725495 I have applied the @blaginin suggestion: ``` berkaysahin@Berkays-MacBook-Pro wasmtest % clang --version Homebrew clang vers

[PR] Improve unsupported compound identifier message [datafusion]

2024-11-29 Thread via GitHub
alamb opened a new pull request, #13605: URL: https://github.com/apache/datafusion/pull/13605 ## Which issue does this PR close? - Part of https://github.com/apache/datafusion/pull/13546 ## Rationale for this change While working on the upgrade of sqlparser, the structure

Re: [I] Ballista Python Issue(s) [datafusion-ballista]

2024-11-29 Thread via GitHub
timsaucer commented on issue #1142: URL: https://github.com/apache/datafusion-ballista/issues/1142#issuecomment-2507706840 But even if that unblocks you I worry it still doesn’t resolve the core issue of trying to share that session context from one python package to another.

Re: [I] Ballista Python Issue(s) [datafusion-ballista]

2024-11-29 Thread via GitHub
timsaucer commented on issue #1142: URL: https://github.com/apache/datafusion-ballista/issues/1142#issuecomment-2507704638 I’ve been meaning to dive into this and also some work happening on `datafusion-ray` that may encounter similar problems. One thing the `datafusion-python` package is

Re: [PR] Support relation visitor to visit the `Option` field [datafusion-sqlparser-rs]

2024-11-29 Thread via GitHub
alamb merged PR #1556: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1556

Re: [PR] Support relation visitor to visit the `Option` field [datafusion-sqlparser-rs]

2024-11-29 Thread via GitHub
alamb commented on PR #1556: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1556#issuecomment-2507686514 I double checked with https://github.com/apache/datafusion/pull/13546 and this works great. Thank you again @goldmedal and @iffyio

Re: [I] Relation visitor fails to visit the `SHOW COLUMNS` statement in the latest commit of the main branch [datafusion-sqlparser-rs]

2024-11-29 Thread via GitHub
alamb closed issue #1554: Relation visitor fails to visit the `SHOW COLUMNS` statement in the latest commit of the main branch URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1554

[PR] Tidy up join test code [datafusion]

2024-11-29 Thread via GitHub
ozankabak opened a new pull request, #13604: URL: https://github.com/apache/datafusion/pull/13604 ## Which issue does this PR close? N/A. Closes #. ## Rationale for this change Using succinct test utility functions instead of verbose constructions, making some dow

Re: [I] Main branch, linter failure on new Rust version [datafusion-sqlparser-rs]

2024-11-29 Thread via GitHub
alamb commented on issue #1569: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1569#issuecomment-2507640203 Thanks for the report @demetribu -- hopefully this is fixed now

Re: [I] Main branch, linter failure on new Rust version [datafusion-sqlparser-rs]

2024-11-29 Thread via GitHub
alamb closed issue #1569: Main branch, linter failure on new Rust version URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1569

Re: [PR] Fix clippy warnings on rust 1.83 [datafusion-sqlparser-rs]

2024-11-29 Thread via GitHub
alamb merged PR #1570: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1570

Re: [PR] Handle alias when parsing sql(parse_sql_expr) [datafusion]

2024-11-29 Thread via GitHub
milenkovicm commented on code in PR #12939: URL: https://github.com/apache/datafusion/pull/12939#discussion_r1863370988 ## datafusion/sql/src/expr/mod.rs: ## @@ -1195,4 +1195,25 @@ mod tests { test_stack_overflow!(2048); test_stack_overflow!(4096); test_stack_over

[PR] support unknown col expr in proto [datafusion]

2024-11-29 Thread via GitHub
onursatici opened a new pull request, #13603: URL: https://github.com/apache/datafusion/pull/13603 ## Which issue does this PR close? Closes #. ## Rationale for this change If a projection is done on a hash partitioned input and if the projection does not include

Re: [I] Using PARTITION BY in SQL generates 'Error: External(NotImplemented("it is not yet supported to write to hive partitions with datatype Float64"))' [datafusion]

2024-11-29 Thread via GitHub
ajazam commented on issue #13602: URL: https://github.com/apache/datafusion/issues/13602#issuecomment-2507538896 I've tried rust 1.82 and 1.83

[I] Using PARTITION BY in SQL generates 'Error: External(NotImplemented("it is not yet supported to write to hive partitions with datatype Float64"))' [datafusion]

2024-11-29 Thread via GitHub
ajazam opened a new issue, #13602: URL: https://github.com/apache/datafusion/issues/13602 ### Describe the bug I am trying to create a parquet file with hive partitioning, from csv data and get error Error: External(NotImplemented("it is not yet supported to write to hive part

[I] Allow to unparse `UNNEST` plan back to a table function SQL text [datafusion]

2024-11-29 Thread via GitHub
goldmedal opened a new issue, #13601: URL: https://github.com/apache/datafusion/issues/13601 ### Is your feature request related to a problem or challenge? Given a SQL: ```SQL select * from unnest([1,2,3]) as t(c1) ``` DataFusion plans the unnest to `Projection/Unnest/Proje

Re: [I] Improve performance of db-benchmark query 8 [datafusion]

2024-11-29 Thread via GitHub
Dandandan commented on issue #13586: URL: https://github.com/apache/datafusion/issues/13586#issuecomment-2507498291 I didn't profile yet, but one potentially problematic line I found here: `concat_batches(self.input_schema(), [input_buffer, &record_batch])?` This concatenates `[

Re: [I] Improve vectorized operations of `GroupColumn` [datafusion]

2024-11-29 Thread via GitHub
Dandandan commented on issue #13275: URL: https://github.com/apache/datafusion/issues/13275#issuecomment-2507429313 Oh I saw @LeslieKid already commented on that issue 👍

Re: [I] Improve vectorized operations of `GroupColumn` [datafusion]

2024-11-29 Thread via GitHub
Dandandan commented on issue #13275: URL: https://github.com/apache/datafusion/issues/13275#issuecomment-2507428266 Yeah, some experiments around that would be nice. Also see this issue: https://github.com/apache/datafusion/issues/12131

[I] Ballista Python Issue(s) [datafusion-ballista]

2024-11-29 Thread via GitHub
milenkovicm opened a new issue, #1142: URL: https://github.com/apache/datafusion-ballista/issues/1142 First of all, I'm not an expert in rust-python (pyo3) integration; if I've done/said something stupid, my apologies. Current implementation of (py)ballista has limitation when it come

Re: [I] Improve vectorized operations of `GroupColumn` [datafusion]

2024-11-29 Thread via GitHub
Rachelint commented on issue #13275: URL: https://github.com/apache/datafusion/issues/13275#issuecomment-2507419912 > Thanks @Rachelint for summarizing, that's interesting. One big difference between `take` + `eq` in join versus grouped aggregates seems to be that the `ValueBuilder`s alread

Re: [I] Main branch, linter failure on new Rust version [datafusion-sqlparser-rs]

2024-11-29 Thread via GitHub
demetribu commented on issue #1569: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1569#issuecomment-2507377938 related #1570

Re: [PR] Fix join with sort push down [datafusion]

2024-11-29 Thread via GitHub
berkaysynnada commented on PR #13560: URL: https://github.com/apache/datafusion/pull/13560#issuecomment-2507351353 Hi @haohuaijin, and sorry for the delayed response. I have been very busy over the past few days. I have reviewed your fix and have some comments about the problem and the solu

Re: [I] Improve vectorized operations of `GroupColumn` [datafusion]

2024-11-29 Thread via GitHub
Dandandan commented on issue #13275: URL: https://github.com/apache/datafusion/issues/13275#issuecomment-2507291597 Thanks @Rachelint for summarizing, that's interesting. One big difference between `take` + `eq` in join versus grouped aggregates seems to be that the `ValueBuilder`s already

[PR] Chore: exposing ArraySize and ArrayFlatten [datafusion]

2024-11-29 Thread via GitHub
Groennbeck opened a new pull request, #13600: URL: https://github.com/apache/datafusion/pull/13600 ## Which issue does this PR close? Closes #. ## Rationale for this change Want to be able to call these functions in apache comet. But cannot create the expression because